From david.holmes at oracle.com Mon Apr 1 02:32:49 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 1 Apr 2019 12:32:49 +1000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: Message-ID: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Hi Goetz, I'm looking at this ... On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > Hi, > > Any interest in this change? I'm personally of two minds here because these VM generated exceptions are not only delivered to Java source code. I'd like to know how other language developers using the JVM runtime would view this. That aside if you're going to make a change like this then I think the full signature string has to be quoted in some way to delineate it within the larger message. > Should I split it to adapt the exceptions separately one-by-one to > make the change smaller and simplify the review? I don't think that is necessary. Thanks, David ----- > I would propose to start out with AbstractMethodError only. > > Best regards, > Goetz. > > > > From: Lindenmaier, Goetz > Sent: Tuesday, March 26, 2019 1:06 PM > To: hotspot-runtime-dev at openjdk.java.net > Subject: RFR(L): 8221470: Print methods in exception messages in java-like Syntax. > > Hi, > > A row of exceptions are thrown from the hotspot runtime. > They print methods with their JNI signatures. To increase > readability and resemblance to source code, this change proposes > to print them in a Java-like syntax. > > Some examples: > current method printouts: > > test.TeMe3_B.ma()V > test.TeMe3_B.ma(IZ[[BF)[[D > test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > > improved format: > > void test.TeMe3_B.ma() > double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > > So far, Method::name_and_sig_as_C_string() is used to print > these messages. 
> > This change implements function Method::external_name() that prints the better > format. > external_name() is chosen according to Klass::external_name(). > > Printing the better format requires parsing the signature > Symbol. This is implemented in > void Symbol::print_as_signature_external_return_type(outputStream *os); > void Symbol::print_as_signature_external_parameters(outputStream *os); > These method names are chosen according to Symbol::as_class_external_name(). > > See this partial webrev for the new functions: > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01-new_methods/ > > Also, I changed a lot of exception messages to use the new format. > This required to adapt a row of tests. I added a test to check > the signature printing does not regress. For all these changes, see > the full webrev: > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > > I hope I detected all places where method signatures are printed to > exception messages. > > Best regards, > Goetz. > > > > > > > > > > > > > From thomas.stuefe at gmail.com Mon Apr 1 04:37:11 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 1 Apr 2019 06:37:11 +0200 Subject: RFR (XS) 8221726: Multiple build failures after JDK-8221698 (Remove redundant includes from popular header files) In-Reply-To: <88a8e19d-82c3-69f7-3c53-c19c689a87af@oracle.com> References: <1f99bf26-a3e2-783e-4905-462d7635ba99@redhat.com> <88a8e19d-82c3-69f7-3c53-c19c689a87af@oracle.com> Message-ID: On Mon 1. Apr 2019 at 00:19, David Holmes wrote: > Hi Aleksey, > > On 1/04/2019 8:04 am, Aleksey Shipilev wrote: > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8221726 > > > > See bug for examples of build failures. Seems only ppc64le and x86_64 > {minimal, zero} are affected. > > Happy to fold other fixes if other platforms are failing too. 
> >
> > Fix:
> >
> > diff -r 7ad62bdfec59 src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp
> > --- a/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:29:47 2019 +0200
> > +++ b/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:52:49 2019 +0200
> > @@ -28,4 +28,5 @@
> >  #include "gc/shared/barrierSetAssembler.hpp"
> >  #include "interpreter/interp_masm.hpp"
> > +#include "runtime/jniHandles.hpp"
> >
> >  #define __ masm->
> > diff -r 7ad62bdfec59 src/hotspot/share/classfile/systemDictionary.hpp
> > --- a/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:29:47 2019 +0200
> > +++ b/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:52:49 2019 +0200
> > @@ -31,4 +31,5 @@
> >  #include "oops/symbol.hpp"
> >  #include "runtime/java.hpp"
> > +#include "runtime/mutexLocker.hpp"
> >  #include "runtime/reflectionUtils.hpp"
> >  #include "runtime/signature.hpp"
>
> I'm struggling to see what changes in JDK-8221698 led to these problems,
> but the fixes certainly look totally appropriate. I also think this
> constitutes a trivial change and can be pushed with one Review and not
> wait 24 hours. (If there are any issues I'll sort them out if needed.)
>
> Aside: are there any tools that will show where a particular declaration
> is being included from? We've obviously got some interesting transitive
> closures with conditional includes.
>

I think it would be helpful if we could have at least one zero build (eg x64) in jdk-submit.
..thomas

> Thanks,
> David
>
> > Testing: Linux x86_64 {server, minimal, zero}, ppc64le builds
> >
> > Thanks,
> > -Aleksey
> >
>

From ioi.lam at oracle.com Mon Apr 1 05:17:22 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 31 Mar 2019 22:17:22 -0700
Subject: RFR (XS) 8221726: Multiple build failures after JDK-8221698 (Remove redundant includes from popular header files)
In-Reply-To:
References: <1f99bf26-a3e2-783e-4905-462d7635ba99@redhat.com> <88a8e19d-82c3-69f7-3c53-c19c689a87af@oracle.com>
Message-ID: <734743cf-5633-a559-7b0f-232a8b90da30@oracle.com>

Hi Aleksey, thanks for fixing this! Now I realized that the hotspot header file dependency is more fragile than I thought.

Thomas, I have a couple more changesets for cleaning header files. I'll post them and let people try them out on other ports (for at least a week, etc) before pushing. I'll also test on more combinations like zero and minimal.

I'll try to write a tool to analyze how the header files are included. The current state is pretty abysmal (from a simple script that I wrote):

http://cr.openjdk.java.net/~iklam/jdk13/headers.txt

Number of headers       =   1293
Number of objs          =    942
Each obj file includes  =    279.64 headers
Each header is included =    203.73 times

Rank  1% -  10% headers are included    1.7 times
Rank 11% -  20% headers are included    3.0 times
Rank 21% -  30% headers are included    5.2 times
Rank 31% -  40% headers are included   10.6 times
Rank 41% -  50% headers are included   21.4 times
Rank 51% -  60% headers are included   42.7 times
Rank 61% -  70% headers are included   97.4 times
Rank 71% -  80% headers are included  281.1 times
Rank 81% -  90% headers are included  711.8 times
Rank 91% - 100% headers are included  866.2 times

So basically you have 20% of headers that are practically included in every object file :-(

Thanks
- Ioi

On 3/31/19 9:37 PM, Thomas Stüfe wrote:
> On Mon 1.
Apr 2019 at 00:19, David Holmes wrote:
>
>> Hi Aleksey,
>>
>> On 1/04/2019 8:04 am, Aleksey Shipilev wrote:
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8221726
>>>
>>> See bug for examples of build failures. Seems only ppc64le and x86_64 {minimal, zero} are affected.
>>> Happy to fold other fixes if other platforms are failing too.
>>>
>>> Fix:
>>>
>>> diff -r 7ad62bdfec59 src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp
>>> --- a/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:29:47 2019 +0200
>>> +++ b/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:52:49 2019 +0200
>>> @@ -28,4 +28,5 @@
>>>  #include "gc/shared/barrierSetAssembler.hpp"
>>>  #include "interpreter/interp_masm.hpp"
>>> +#include "runtime/jniHandles.hpp"
>>>
>>>  #define __ masm->
>>> diff -r 7ad62bdfec59 src/hotspot/share/classfile/systemDictionary.hpp
>>> --- a/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:29:47 2019 +0200
>>> +++ b/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:52:49 2019 +0200
>>> @@ -31,4 +31,5 @@
>>>  #include "oops/symbol.hpp"
>>>  #include "runtime/java.hpp"
>>> +#include "runtime/mutexLocker.hpp"
>>>  #include "runtime/reflectionUtils.hpp"
>>>  #include "runtime/signature.hpp"
>> I'm struggling to see what changes in JDK-8221698 led to these problems,
>> but the fixes certainly look totally appropriate. I also think this
>> constitutes a trivial change and can be pushed with one Review and not
>> wait 24 hours. (If there are any issues I'll sort them out if needed.)
>>
>> Aside: are there any tools that will show where a particular declaration
>> is being included from? We've obviously got some interesting transitive
>> closures with conditional includes.
>>
> I think it would be helpful if we could have at least one zero build (eg
> x64) in jdk-submit.
>
> ..thomas
>
>
>> Thanks,
>> David
>>
>>> Testing: Linux x86_64 {server, minimal, zero}, ppc64le builds
>>>
>>> Thanks,
>>> -Aleksey
>>>

From goetz.lindenmaier at sap.com Mon Apr 1 07:12:06 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 1 Apr 2019 07:12:06 +0000
Subject: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null.
In-Reply-To: <342a394b-9798-cd83-3c18-cb7f24da712e@gmail.com>
References: <7c4b0bc27961471e91195bef9e767226@sap.com> <01361236-c046-0cac-e09d-be59ea6499e0@oracle.com> <2d38e96dcd214dd091f4d79d2a9e71e3@sap.com> <440e685b-b528-056d-385f-9dc010d65e97@oracle.com> <7189ff5f-a73f-5109-1d6b-aa8a2635543a@oracle.com> <20190314143804.400279193@eggemoggin.niobe.net> <172abe6c-e95d-515b-9e8c-8cfa402f4a7c@oracle.com> <3265336b-b483-dc69-b8f7-787139a44183@oracle.com> <33a092ca-9949-bac0-3160-6d018b5e27c4@oracle.com> <850cce9d-619c-b6c0-495c-eebbfff801cc@gmail.com> <342a394b-9798-cd83-3c18-cb7f24da712e@gmail.com>
Message-ID:

Hi Peter,

> -----Original Message-----
> From: Peter Levart
> Sent: Freitag, 29. März 2019 16:44
> To: Lindenmaier, Goetz ; 'Mandy Chung'
>
> Cc: core-libs-dev at openjdk.java.net; maurizio.cimadamore at oracle.com;
> hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L): 8218628: Add detailed message to NullPointerException
> describing what is null.
>
>
>
> On 3/29/19 4:36 PM, Peter Levart wrote:
> >
> >
> > On 3/29/19 8:49 AM, Lindenmaier, Goetz wrote:
> >> So I want to withdraw my claim that NPEs are thrown frequently.
> >> Probably I was biased by my compiler construction background,
> >> remembering NPE checks are all over the place in the code.
> >>
> >> But I think I can still keep the claim that the message is
> >> printed rarely.
> >>
> >> I'll adapt the JEP saying something like this:
> >> "While exceptions are supposed to be thrown rarely, i.e., only
> >> In exceptional situations, most are swallowed without ever
> >> looking at the message.
Thus, overhead in getMessage() will > >> not fall into account." > > > > Is this really a realistic assumption? That NPE exceptions are mostly > > swallowed in most programs despite the fact that swallowing exceptions > > (and throwing them to control the flow) is an anti-pattern? Is > > majority of code really so badly written? I would expect that most > > programs contain an exception handler of some kind that at least logs > > all unexpected exceptions. > > > > I think JDK should assume that NPEs are not frequent in most well > > written programs. Because in well written programs all unexpected > > exceptions are at least logged somewhere and this alone guarantees > > that programs are eventually "fixed" to not throw them frequently... > > > > Regards, Peter > > So I would say that there are two kinds of programs (which kind is in > majority doesn't matter): > > a - programs that throw and catch exceptions for exceptional situations > only (i.e. non frequently) - they also print the exceptions' messages > b - programs that throw and swallow exceptions frequently, but they > mostly don't print their messages > > In either case .getMessage() is not called frequently for kind (a) and > hopefully also for kind (b). > > Regards, Peter Hi, I agree with this, and my numbers show that the message is not printed frequently in any case. Best regards, Goetz. From goetz.lindenmaier at sap.com Mon Apr 1 08:02:20 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 1 Apr 2019 08:02:20 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi David, > I'm looking at this ... Great, thanks! > I'm personally of two minds here because these VM generated exceptions > are not only delivered to Java source code. 
I'd like to know how other > language developers using the JVM runtime would view this. I thought about the other languages, too. But then I think why would the one be better than the other, if they don't reflect the actual code of that language? > That aside if you're going to make a change like this then I think the > full signature string has to be quoted in some way to delineate it > within the larger message. Quoting would especially help because of the space between return type and the method name. It would be easier to capture that they belong together. But for consistency, we should then quote all class, field and method names, which is currently not the case as you can easily see by looking at the updated messages in the tests. I thought about leaving out the return type, but that would mean to drop important information. So I'm not sure here ... Best regards, Goetz. > > Should I split it to adapt the exceptions separately one-by-one to > > make the change smaller and simplify the review? > > I don't think that is necessary. > > Thanks, > David > ----- > > > I would propose to start out with AbstractMethodError only. > > > > Best regards, > > Goetz. > > > > > > > > From: Lindenmaier, Goetz > > Sent: Tuesday, March 26, 2019 1:06 PM > > To: hotspot-runtime-dev at openjdk.java.net > > Subject: RFR(L): 8221470: Print methods in exception messages in java-like > Syntax. > > > > Hi, > > > > A row of exceptions are thrown from the hotspot runtime. > > They print methods with their JNI signatures. To increase > > readability and resemblance to source code, this change proposes > > to print them in a Java-like syntax. 
> > > > Some examples: > > current method printouts: > > > > test.TeMe3_B.ma()V > > test.TeMe3_B.ma(IZ[[BF)[[D > > test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > > > > improved format: > > > > void test.TeMe3_B.ma() > > double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > > test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > > > > So far, Method::name_and_sig_as_C_string() is used to print > > these messages. > > > > This change implements function Method::external_name() that prints the > better > > format. > > external_name() is chosen according to Klass::external_name(). > > > > Printing the better format requires parsing the signature > > Symbol. This is implemented in > > void Symbol::print_as_signature_external_return_type(outputStream *os); > > void Symbol::print_as_signature_external_parameters(outputStream *os); > > These method names are chosen according to > Symbol::as_class_external_name(). > > > > See this partial webrev for the new functions: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > new_methods/ > > > > Also, I changed a lot of exception messages to use the new format. > > This required to adapt a row of tests. I added a test to check > > the signature printing does not regress. For all these changes, see > > the full webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > > > > I hope I detected all places where method signatures are printed to > > exception messages. > > > > Best regards, > > Goetz. 
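
For readers who want to try the mapping from JNI-style descriptors to the Java-like format quoted above, here is a minimal standalone sketch in Java. It is not the webrev's code: the actual change is C++ inside HotSpot (Method::external_name() and the two Symbol::print_as_signature_external_* functions), and the class name DescriptorPrinter and method externalName below are hypothetical names chosen only for illustration. The sketch assumes the input has the form shown in the RFR examples, i.e. a dotted method name directly followed by the descriptor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of HotSpot or the JDK.
final class DescriptorPrinter {
    private final String sig;   // the descriptor part, e.g. "(IZ[[BF)[[D"
    private int pos = 0;        // current read position within sig

    private DescriptorPrinter(String sig) {
        this.sig = sig;
    }

    // Turns "test.TeMe3_B.ma(IZ[[BF)[[D" into
    // "double[][] test.TeMe3_B.ma(int, boolean, byte[][], float)".
    static String externalName(String nameAndSig) {
        int open = nameAndSig.indexOf('(');
        if (open < 0) {
            throw new IllegalArgumentException("no descriptor in: " + nameAndSig);
        }
        String name = nameAndSig.substring(0, open);
        DescriptorPrinter p = new DescriptorPrinter(nameAndSig.substring(open));

        p.expect('(');
        List<String> params = new ArrayList<>();
        while (p.peek() != ')') {
            params.add(p.nextType());
        }
        p.expect(')');
        String returnType = p.nextType();
        if (p.pos != p.sig.length()) {
            throw new IllegalArgumentException("trailing characters in: " + nameAndSig);
        }
        return returnType + " " + name + "(" + String.join(", ", params) + ")";
    }

    private char peek() {
        if (pos >= sig.length()) {
            throw new IllegalArgumentException("truncated descriptor: " + sig);
        }
        return sig.charAt(pos);
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("expected '" + c + "' at " + pos + " in " + sig);
        }
        pos++;
    }

    // Reads one field type (or V) and returns it in source-like form.
    // Note: this sketch does not check that void appears only as a return type.
    private String nextType() {
        int dims = 0;
        while (peek() == '[') {   // each leading '[' is one array dimension
            dims++;
            pos++;
        }
        char c = peek();
        pos++;
        String base;
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                // Class type: "Ljava/lang/Object;" becomes "java.lang.Object"
                int end = sig.indexOf(';', pos);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class type in: " + sig);
                }
                base = sig.substring(pos, end).replace('/', '.');
                pos = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("unknown type char '" + c + "' in " + sig);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```

Calling DescriptorPrinter.externalName on the three "current method printouts" from the RFR should give the three lines listed under "improved format". Modifiers such as "abstract" and any quoting of the result are left out of this sketch.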
From shade at redhat.com Mon Apr 1 08:06:42 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 1 Apr 2019 10:06:42 +0200
Subject: RFR (XS) 8221726: Multiple build failures after JDK-8221698 (Remove redundant includes from popular header files)
In-Reply-To:
References: <1f99bf26-a3e2-783e-4905-462d7635ba99@redhat.com> <88a8e19d-82c3-69f7-3c53-c19c689a87af@oracle.com>
Message-ID:

On 4/1/19 6:37 AM, Thomas Stüfe wrote:
> On Mon 1. Apr 2019 at 00:19, David Holmes wrote:
> Aside: are there any tools that will show where a particular declaration
> is being included from? We've obviously got some interesting transitive
> closures with conditional includes.

Yes. I need to add that my CI builds without PCH, which exposes this kind of thing too.

> I think it would be helpful if we could have at least one zero build (eg x64) in jdk-submit.

In this case, x86_64-minimal would be nice. And it builds much faster than x86_64-zero, which calls jmod/jlink with Zero, and that gets quite slow.

Pushed the fix.

Thanks,
-Aleksey

From stefan.karlsson at oracle.com Mon Apr 1 08:33:53 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 1 Apr 2019 10:33:53 +0200
Subject: RFR: 8221149: os::malloc checks MallocCatchPtr outside of ifdef ASSERT block
In-Reply-To: <94f2b38c-4e13-d51b-5b98-2fd2f4ed5151@oracle.com>
References: <87c44919-7587-3380-3f60-a177342edd6b@oracle.com> <94f2b38c-4e13-d51b-5b98-2fd2f4ed5151@oracle.com>
Message-ID:

Thanks, David.

StefanK

On 2019-03-29 01:31, David Holmes wrote:
> Hi Stefan,
>
> This looks good to me too.
>
> Thanks,
> David
>
> On 28/03/2019 10:33 pm, Stefan Karlsson wrote:
>> Hi Thomas,
>>
>> On 2019-03-28 09:21, Thomas Stüfe wrote:
>>> Hi Stefan,
>>>
>>> On Thu, Mar 28, 2019 at 8:47 AM Stefan Karlsson wrote:
>>>
>>>     Hi all,
>>>
>>>     Please review this patch to move the MallocCatchPtr check into the ifdef
>>>
ASSERT block, just like the other usages of it.
>>>
>>>     http://cr.openjdk.java.net/~stefank/8221149/webrev.01/
>>>     https://bugs.openjdk.java.net/browse/JDK-8221149
>>>
>>>
>>> Looks good. Note that you also could remove the test completely from
>>> os::realloc for the oldptr!=NULL case (lines 764ff) since we
>>> allocated using os::malloc(), which already does the test.
>>
>> Thanks for reviewing.
>>
>> I've incorporated your proposal:
>>   http://cr.openjdk.java.net/~stefank/8221149/webrev.02.delta
>>   http://cr.openjdk.java.net/~stefank/8221149/webrev.02
>>
>> StefanK
>>
>>>
>>>     A side note: Is the intention that MallocCatchPtr should find pointers
>>>     to the memory address returned from ::malloc, or the memory address we
>>>     hand out from os::malloc? Currently it's the latter and it's not obvious
>>>     from the code if this was the intention from the beginning.
>>>
>>>        704   // Wrap memory with guard
>>>        705   GuardedMemory guarded(ptr, size + nmt_header_size);
>>>        706   ptr = guarded.get_user_ptr();
>>>        707
>>>        708   if ((intptr_t)ptr == (intptr_t)MallocCatchPtr) {
>>>
>>>
>>> I find the test as it is is more useful since usually one wants to
>>> follow pointers one sees in the hotspot.
>>>
>>> However, we may just test both pointers, yes? So, break if
>>> MallocCatchPtr is either the outside or the inside pointer.
>>>
>>> I am not really emotionally invested though :)
>>>
>>> Cheers Thomas
>>>
>>>     Thanks,
>>>     StefanK
>>>

From david.holmes at oracle.com Mon Apr 1 11:18:05 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 1 Apr 2019 21:18:05 +1000
Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
In-Reply-To:
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com>
Message-ID: <353eb85c-5752-5679-23fd-a5b82191cdc5@oracle.com>

On 1/04/2019 6:02 pm, Lindenmaier, Goetz wrote:
> Hi David,
>
>> I'm looking at this ...
> Great, thanks!
>
>> I'm personally of two minds here because these VM generated exceptions
>> are not only delivered to Java source code. I'd like to know how other
>> language developers using the JVM runtime would view this.
> I thought about the other languages, too. But then I think why would the
> one be better than the other, if they don't reflect the actual
> code of that language?

"VM language" is at least a "common denominator".

>> That aside if you're going to make a change like this then I think the
>> full signature string has to be quoted in some way to delineate it
>> within the larger message.
> Quoting would especially help because of the space between return
> type and the method name. It would be easier to capture that they
> belong together.
> But for consistency, we should then quote all class, field and method
> names, which is currently not the case as you can easily see by looking
> at the updated messages in the tests.

It's mainly the spaces caused by return types that is the issue so I don't see we need to quote everything to address that issue.

David
-----

> I thought about leaving out the return type, but that would mean to
> drop important information.
> So I'm not sure here ...
>
> Best regards,
> Goetz.
>
>
>>> Should I split it to adapt the exceptions separately one-by-one to
>>> make the change smaller and simplify the review?
>>
>> I don't think that is necessary.
>>
>> Thanks,
>> David
>> -----
>>
>>> I would propose to start out with AbstractMethodError only.
>>>
>>> Best regards,
>>> Goetz.
>>>
>>>
>>>
>>> From: Lindenmaier, Goetz
>>> Sent: Tuesday, March 26, 2019 1:06 PM
>>> To: hotspot-runtime-dev at openjdk.java.net
>>> Subject: RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
>>>
>>> Hi,
>>>
>>> A row of exceptions are thrown from the hotspot runtime.
>>> They print methods with their JNI signatures.
To increase >>> readability and resemblance to source code, this change proposes >>> to print them in a Java-like syntax. >>> >>> Some examples: >>> current method printouts: >>> >>> test.TeMe3_B.ma()V >>> test.TeMe3_B.ma(IZ[[BF)[[D >>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; >>> >>> improved format: >>> >>> void test.TeMe3_B.ma() >>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) >>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) >>> >>> So far, Method::name_and_sig_as_C_string() is used to print >>> these messages. >>> >>> This change implements function Method::external_name() that prints the >> better >>> format. >>> external_name() is chosen according to Klass::external_name(). >>> >>> Printing the better format requires parsing the signature >>> Symbol. This is implemented in >>> void Symbol::print_as_signature_external_return_type(outputStream *os); >>> void Symbol::print_as_signature_external_parameters(outputStream *os); >>> These method names are chosen according to >> Symbol::as_class_external_name(). >>> >>> See this partial webrev for the new functions: >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- >> new_methods/ >>> >>> Also, I changed a lot of exception messages to use the new format. >>> This required to adapt a row of tests. I added a test to check >>> the signature printing does not regress. For all these changes, see >>> the full webrev: >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ >>> >>> I hope I detected all places where method signatures are printed to >>> exception messages. >>> >>> Best regards, >>> Goetz. 
From claes.redestad at oracle.com Mon Apr 1 13:12:19 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 1 Apr 2019 15:12:19 +0200
Subject: RFR: 8221724: Enable archiving of Strings with hash 0
Message-ID:

Hi,

the current implementation of String archiving explicitly excludes Strings with hashcode 0, including "". The reason for this is not explicitly stated, but is likely to be due to the fact that String::hashCode currently stores the calculated 0 to String.hash every time, which doesn't play well with read-only memory.

This behavior is a blocker for archiving things where such strings might exist and there is code depending on the identity of the same, so I propose dropping this restriction.

This doesn't _break_ anything functionally: all tests pass with this patch, regardless of whether the patch I'm proposing to change String::hashCode is applied, but some pages could be dirtied and copied into non-shared memory.

Bug: https://bugs.openjdk.java.net/browse/JDK-8221724
Webrev: http://cr.openjdk.java.net/~redestad/8221724/open.00/

Testing: tier1-3

Thanks!

/Claes

[1] https://bugs.openjdk.java.net/browse/JDK-8221723

From coleen.phillimore at oracle.com Mon Apr 1 14:02:34 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 1 Apr 2019 10:02:34 -0400
Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends
In-Reply-To:
References:
Message-ID:

http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/src/hotspot/share/memory/allocation.cpp.udiff.html

Apparently comparing this == NULL or other values is undefined behaviour. Luckily, I think there's just one call that can be made static.

The rest looks good.
Coleen

On 3/27/19 5:01 PM, Thomas Stüfe wrote:
> Hi all,
>
> May I please have reviews for this small optimization:
>
> cr:
> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html
> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539
>
> There are several functions which, given an unknown pointer assumed to be a
> metaspace object, check if the pointer is indeed a metaspace object by
> walking the VirtualSpaceList and checking ranges.
>
> This patch adds checks which weed out the obvious cases to avoid needlessly
> walking the vs list.
>
> Patch also adds verifications for the VirtualSpaceList in debug cases.
> Those run only when a new node has been added to the list, or when a node
> has been purged, so very sparingly.
>
> When purging nodes, I removed a small unnecessary and inefficient check
> which checked whether (one of the) purged nodes was still in the list.
> Since we now as part of the new VirtualSpaceNode::verify() walk this list,
> the check is unnecessary.
>
> Thanks, Thomas

From goetz.lindenmaier at sap.com Mon Apr 1 14:25:16 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 1 Apr 2019 14:25:16 +0000
Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
In-Reply-To: <353eb85c-5752-5679-23fd-a5b82191cdc5@oracle.com>
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <353eb85c-5752-5679-23fd-a5b82191cdc5@oracle.com>
Message-ID:

Hi David,

> > But for consistency, we should then quote all class, field and method
> > names, which is currently not the case as you can easily see by looking
> > at the updated messages in the tests.
> It's mainly the spaces caused by return types that is the issue so I
> don't see we need to quote everything to address that issue.

I added single quotes around the method.
I included the decorators like: 'abstract void AME3_B.ma()' An incremental webrev: http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02-incremental_quoting/ the full webrev: http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02/ Best regards, Goetz. > > David > ----- > > > I thought about leaving out the return type, but that would mean to > > drop important information. > > So I'm not sure here ... > > > > Best regards, > > Goetz. > > > > > >>> Should I split it to adapt the exceptions separately one-by-one to > >>> make the change smaller and simplify the review? > >> > >> I don't think that is necessary. > >> > >> Thanks, > >> David > >> ----- > >> > >>> I would propose to start out with AbstractMethodError only. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>> > >>> From: Lindenmaier, Goetz > >>> Sent: Tuesday, March 26, 2019 1:06 PM > >>> To: hotspot-runtime-dev at openjdk.java.net > >>> Subject: RFR(L): 8221470: Print methods in exception messages in java-like > >> Syntax. > >>> > >>> Hi, > >>> > >>> A row of exceptions are thrown from the hotspot runtime. > >>> They print methods with their JNI signatures. To increase > >>> readability and resemblance to source code, this change proposes > >>> to print them in a Java-like syntax. > >>> > >>> Some examples: > >>> current method printouts: > >>> > >>> test.TeMe3_B.ma()V > >>> test.TeMe3_B.ma(IZ[[BF)[[D > >>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>> > >>> improved format: > >>> > >>> void test.TeMe3_B.ma() > >>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>> > >>> So far, Method::name_and_sig_as_C_string() is used to print > >>> these messages. > >>> > >>> This change implements function Method::external_name() that prints the > >> better > >>> format. > >>> external_name() is chosen according to Klass::external_name(). 
> >>> > >>> Printing the better format requires parsing the signature > >>> Symbol. This is implemented in > >>> void Symbol::print_as_signature_external_return_type(outputStream *os); > >>> void Symbol::print_as_signature_external_parameters(outputStream *os); > >>> These method names are chosen according to > >> Symbol::as_class_external_name(). > >>> > >>> See this partial webrev for the new functions: > >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > >> new_methods/ > >>> > >>> Also, I changed a lot of exception messages to use the new format. > >>> This required to adapt a row of tests. I added a test to check > >>> the signature printing does not regress. For all these changes, see > >>> the full webrev: > >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>> > >>> I hope I detected all places where method signatures are printed to > >>> exception messages. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> From thomas.stuefe at gmail.com Mon Apr 1 14:44:57 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 1 Apr 2019 16:44:57 +0200 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: Message-ID: On Mon, Apr 1, 2019 at 4:03 PM wrote: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/src/hotspot/share/memory/allocation.cpp.udiff.html > > Apparently comparing this == NULL or other values is undefined > behaviour. Luckily, I think there's just one call that can be made static. > > Hi Coleen, What would be the "right" way to do this? A static helper taking a MetaspaceObj* and comparing it to NULL? Note that we seem to compare this == NULL in a couple of places. If this is UB those should be fixed too. Thank you, Thomas > The rest looks good. 
> Coleen
>
>
>
> On 3/27/19 5:01 PM, Thomas Stüfe wrote:
> > Hi all,
> >
> > May I please have reviews for this small optimization:
> >
> > cr:
> > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8221539
> >
> > There are several functions which, given an unknown pointer assumed to be a
> > metaspace object, check if the pointer is indeed a metaspace object by
> > walking the VirtualSpaceList and checking ranges.
> >
> > This patch adds checks which weed out the obvious cases to avoid needlessly
> > walking the vs list.
> >
> > Patch also adds verifications for the VirtualSpaceList in debug cases.
> > Those run only when a new node has been added to the list, or when a node
> > has been purged, so very sparingly.
> >
> > When purging nodes, I removed a small unnecessary and inefficient check
> > which checked whether (one of the) purged nodes was still in the list.
> > Since we now as part of the new VirtualSpaceNode::verify() walk this list,
> > the check is unnecessary.
> >
> > Thanks, Thomas
>
>

From coleen.phillimore at oracle.com Mon Apr 1 15:05:49 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 1 Apr 2019 11:05:49 -0400
Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends
In-Reply-To:
References:
Message-ID:

On 4/1/19 10:44 AM, Thomas Stüfe wrote:
>
>
> On Mon, Apr 1, 2019 at 4:03 PM wrote:
>
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/src/hotspot/share/memory/allocation.cpp.udiff.html
>
> Apparently comparing this == NULL or other values is undefined
> behaviour. Luckily, I think there's just one call that can be
> made static.
>
>
> Hi Coleen,
>
> What would be the "right" way to do this?
A static helper taking a > MetaspaceObj* and comparing it to NULL? Yes. > > Note that we seem to compare this == NULL in a couple of places. If > this is UB those should be fixed too. We had a pass at fixing some of these places.? I agree we should fix them all (not this patch of course). thanks, Coleen > > Thank you, Thomas > > > > The rest looks good. > Coleen > > > > On 3/27/19 5:01 PM, Thomas St?fe wrote: > > Hi all, > > > > May I please have reviews for this small optimization: > > > > cr: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html > > Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 > > > > There are several functions which, given an unknown pointer > assumed to be a > > metaspace object, check if the pointer is indeed a metaspace > object by > > walking the VirtualSpaceList and checking ranges. > > > > This patch adds checks which weed out the obvious cases to avoid > needlessly > > walking the vs list. > > > > Patch also adds verifications for the VirtualSpaceList in debug > cases. > > Those run only when a new node has been added to the list, or > when a node > > has been purged, so very sparingly. > > > > When purging nodes, I removed a small unnecessary and > inefficient check > > which checked whether (one of the) purged nodes was still in the > list. > > Since we now as part of the new VirtualSpaceNode::verify() walk > this list, > > the check is unnecessary. > > > > Thanks, Thomas > From coleen.phillimore at oracle.com Mon Apr 1 15:22:17 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 1 Apr 2019 11:22:17 -0400 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. 
In-Reply-To:
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <353eb85c-5752-5679-23fd-a5b82191cdc5@oracle.com>
Message-ID: <889ac2fa-53d1-8131-57c4-621da1f494b5@oracle.com>

This looks really good. The quotes around the method name are an improvement. This change is a good improvement!

Coleen

On 4/1/19 10:25 AM, Lindenmaier, Goetz wrote:
> Hi David,
>
>>> But for consistency, we should then quote all class, field and method
>>> names, which is currently not the case as you can easily see by looking
>>> at the updated messages in the tests.
>> It's mainly the spaces caused by return types that is the issue so I
>> don't see we need to quote everything to address that issue.
> I added single quotes around the method.
> I included the decorators like:
> 'abstract void AME3_B.ma()'
>
> An incremental webrev:
> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02-incremental_quoting/
> the full webrev:
> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02/
>
> Best regards,
> Goetz.
>
>> David
>> -----
>>
>>> I thought about leaving out the return type, but that would mean to
>>> drop important information.
>>> So I'm not sure here ...
>>>
>>> Best regards,
>>> Goetz.

From jianglizhou at google.com Mon Apr 1 21:47:49 2019
From: jianglizhou at google.com (Jiangli Zhou)
Date: Mon, 1 Apr 2019 14:47:49 -0700
Subject: RFR: 8221724: Enable archiving of Strings with hash 0
In-Reply-To:
References:
Message-ID:

Hi Claes,

The changes look great to me. I especially like the new test case for 0-hash non-empty string.

There was this hidden issue for shared Strings introduced by object graph archiving, which I didn't realize earlier. Although 0-hash strings were excluded from archiving when walking the string table and constant pool resolved references arrays, they could be archived into the 'Open' archive heap region (writes are okay for the region) during the walk for object graphs. That most likely was the cause of the problem that you observed when archiving the constant BaseLocales. Thanks for fixing it!

Thanks and regards,
Jiangli

On Mon, Apr 1, 2019 at 6:14 AM Claes Redestad wrote:
> Hi,
>
> the current implementation of String archiving explicitly excludes
> Strings with hashcode 0, including "". The reason for this is not
> explicitly stated, but is likely to be due to the fact that
> String::hashCode currently stores the calculated 0 to String.hash every
> time, which doesn't play well with read-only memory.
>
> This behavior is a blocker for archiving things where such strings might
> exist and there is code depending on the identity of the same, so I
> propose dropping this restriction. This doesn't _break_ anything
> functionally: all tests pass with this patch, regardless of whether
> the patch I'm proposing to change String::hashCode is applied, but some
> pages could be dirtied and copied into non-shared memory.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8221724
> Webrev: http://cr.openjdk.java.net/~redestad/8221724/open.00/
>
> Testing: tier1-3
>
> Thanks!
>
> /Claes
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8221723
>

From leonid.mesnik at oracle.com Mon Apr 1 21:50:32 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Mon, 1 Apr 2019 14:50:32 -0700
Subject: RFR: 8221437: assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m()) failed: Should not change after link resolution
Message-ID: <35182AD1-781F-4882-B194-C8FF8D63CE58@oracle.com>

Hi

Could you please review the following fix, which just relaxes an assertion in methodHandles.cpp.
The assertion
319 assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m(), ..)
is too strict and fails if the class was redefined/retransformed and method MethodHandles::init_method_MemberName(Handle mname, CallInfo& info)
is called for the old copy of the method.
So this assertion is incorrect for old methods and shouldn't be checked.

I hit this assertion with an internal stress test and verified that it is not reproduced after the fix. I also ran tier1 sanity testing.

webrev: http://cr.openjdk.java.net/~lmesnik/8221437/webrev.00/
bug: https://bugs.openjdk.java.net/browse/JDK-8221437

Leonid

From coleen.phillimore at oracle.com Mon Apr 1 21:55:37 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 1 Apr 2019 17:55:37 -0400
Subject: RFR: 8221437: assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m()) failed: Should not change after link resolution
In-Reply-To: <35182AD1-781F-4882-B194-C8FF8D63CE58@oracle.com>
References: <35182AD1-781F-4882-B194-C8FF8D63CE58@oracle.com>
Message-ID:

Leonid, Looks good! Thank you for diagnosing the problem.

Coleen

From claes.redestad at oracle.com Mon Apr 1 22:03:46 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 2 Apr 2019 00:03:46 +0200
Subject: RFR: 8221724: Enable archiving of Strings with hash 0
In-Reply-To:
References:
Message-ID:

Hi Jiangli,

On 2019-04-01 23:47, Jiangli Zhou wrote:
> Hi Claes,
>
> The changes look great to me. I especially like the new test case for
> 0-hash non-empty string.

thanks for reviewing this, and glad you like it!

Thanks!

/Claes

From david.holmes at oracle.com Mon Apr 1 22:30:55 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Apr 2019 08:30:55 +1000
Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5"
Message-ID: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8218483
webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/

A bug in Thread.setDaemon (JDK-8221657) means that the daemon state of a thread can change after the thread is !isAlive() at the Java level. If this happens before the VM call to ThreadService::remove_thread then we have a situation where we incremented the thread counters when the thread was not a daemon, and we decrement the thread counters when the thread is a daemon - and so the counters are out of sync and the assertion fires.
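
To make the failure mode concrete, here is a minimal sketch of counters that are updated from a thread's daemon flag on both add and remove. It is not the HotSpot code: `FakeThread`, `ThreadCounters` and both `run_*` functions are made-up names for illustration only. The first variant re-reads the flag at removal time and ends up inconsistent once the flag flips; the second remembers the value it used when adding and stays balanced.

```cpp
#include <cassert>

// Made-up stand-in for a Java thread whose daemon flag can still change.
struct FakeThread {
  bool daemon = false;
};

// Made-up counters, loosely modelled on the description above.
struct ThreadCounters {
  int live = 0;
  int daemon = 0;

  void add(bool is_daemon) {
    ++live;
    if (is_daemon) ++daemon;
  }

  // Returns false if removing would drive the daemon count below zero,
  // i.e. the add/remove pair disagreed about the daemon state.
  bool remove(bool is_daemon) {
    if (is_daemon && daemon == 0) return false;
    --live;
    if (is_daemon) --daemon;
    return true;
  }
};

// Variant that reads the flag again at removal time.
inline bool run_rereading_flag(ThreadCounters& c) {
  FakeThread t;               // starts as non-daemon
  c.add(t.daemon);
  t.daemon = true;            // flag flips after the thread was counted
  return c.remove(t.daemon);  // counted as non-daemon, removed as daemon
}

// Variant that captures the flag once and reuses the captured value.
inline bool run_with_captured_flag(ThreadCounters& c) {
  FakeThread t;
  const bool was_daemon = t.daemon;  // captured before any later change
  c.add(was_daemon);
  t.daemon = true;                   // a later flip no longer matters
  return c.remove(was_daemon);
}
```

In this toy model `run_rereading_flag` reports the inconsistency by returning false, whereas the real VM detects it through the assertion quoted in the subject line.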
The simple fix is to capture the daemon state of the thread while it is still alive and to pass that through to Threads::remove and thus ThreadService::remove_thread.

Testing:
- manual test with modified VM (to delay Threads::remove call) as per the bug report
- mach 5 tiers 1-3

Thanks,
David

From ioi.lam at oracle.com Tue Apr 2 00:18:01 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 1 Apr 2019 17:18:01 -0700
Subject: RFR: 8221724: Enable archiving of Strings with hash 0
In-Reply-To:
References:
Message-ID:

Hi Claes,

The changes look good.

On 4/1/19 6:12 AM, Claes Redestad wrote:
> Hi,
>
> the current implementation of String archiving explicitly excludes
> Strings with hashcode 0, including "". The reason for this is not
> explicitly stated, but is likely to be due to the fact that
> String::hashCode currently stores the calculated 0 to String.hash every
> time, which doesn't play well with read-only memory.
>

Actually, archived strings are not stored in read-only memory. Instead, the pages are mmap'ed copy-on-write.

So for a non-empty archive string S with zero hashcode, if you call S.hashCode, with your patch on top of the current JDK code, we will end up writing a zero to S.hash (which was zero to begin with). The net effect is no changes in memory content, but the page does get dirtied, and will no longer be sharable with other processes that map the same CDS archive.

When your other patch JDK-8221723 is applied with this patch (8221724), then we will no longer have any dirty pages when you call S.hashCode for any archived string S.

Thanks
- Ioi

From david.holmes at oracle.com Tue Apr 2 00:33:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Apr 2019 10:33:28 +1000
Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
In-Reply-To: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com>
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com>
Message-ID:

Hi Goetz,

Overall this looks good to me - a few minor nits/comments below.

I've applied the patch and am running it through our internal build and test system (tiers 1-3 initially).

I have a suspicion there will be other tests that need to be updated - possibly even JCK tests. Discovering those a-priori will be difficult (simply running all the tests would take an extremely long time). Will have a discussion about how best to handle those internally.

---

src/hotspot/share/oops/method.cpp

Please put a blank line after each new method.

---

src/hotspot/share/oops/symbol.cpp

+       os->print(".");
+     } else {
+       os->print("%c", start[i]);

Please use os->put(char c) for individual characters.

--

The "start" name would seem better as "buf" to me.

--

+     } else if (start[i] == 'L') {
+       print_class(os, start+i+1, len-i-2);

Can you insert a comment that helps explain the -2:

      } else if (start[i] == 'L') {
+       // Expected format: L;
        print_class(os, start+i+1, len-i-2);

--

+   for(SignatureStream ss(this); !ss.is_done(); ss.next()) {

space after for (2 occurrences)

---

test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMethods.java

Not sure the special characters can be used directly in the sources. Can they not be put in as unicode escapes at all places?

---

Thanks,
David
-------

On 1/04/2019 12:32 pm, David Holmes wrote:
> Hi Goetz,
>
> I'm looking at this ...

From david.holmes at oracle.com Tue Apr 2 01:41:33 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 2 Apr 2019 11:41:33 +1000
Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
In-Reply-To:
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com>
Message-ID: <43d389d9-289f-e6b9-2380-c63ca634ccb4@oracle.com>

Hi Goetz,

Your new test fails to compile on some systems:

error: error while writing Strange\u20ac\u00a3Named: bad filename RelativeFile[test/Strange\u20ac\u00a3Named.class]
class Strange\u20ac\u00a3Named {
^
1 error
result: Failed. Compilation failed: Compilation failed

This was linux-x64 - only seems to occur on Oracle Linux Server 7.1.

We also have some closed tests that also need updating so I'll need to coordinate with you on the push.

Thanks,
David
-----

On 2/04/2019 10:33 am, David Holmes wrote:
> Hi Goetz,
>
> Overall this looks good to me - a few minor nits/comments below.
>
> I've applied the patch and am running it through our internal build and
> test system (tiers 1-3 initially).

From ioi.lam at oracle.com Tue Apr 2 04:51:57 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 1 Apr 2019 21:51:57 -0700
Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax.
In-Reply-To: <43d389d9-289f-e6b9-2380-c63ca634ccb4@oracle.com>
References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <43d389d9-289f-e6b9-2380-c63ca634ccb4@oracle.com>
Message-ID: <59f344d6-d1b7-58ce-b127-26dca4033cb4@oracle.com>

Hi Goetz,

I think you can use this class to avoid writing classfiles with non-ascii names.

http://hg.openjdk.java.net/jdk/jdk/file/2221f042556d/test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler.java

You'd need to write a class loader to load the byte array returned by InMemoryJavaCompiler.compile().

Thanks
- Ioi

On 4/1/19 6:41 PM, David Holmes wrote:
> Hi Goetz,
>
> Your new test fails to compile on some systems:
>
> error: error while writing Strange\u20ac\u00a3Named: bad filename
> RelativeFile[test/Strange\u20ac\u00a3Named.class]
> class Strange\u20ac\u00a3Named {
> ^
> 1 error
> result: Failed. Compilation failed: Compilation failed
>
> This was linux-x64 - only seems to occur on Oracle Linux Server 7.1.
>
> We also have some closed tests that also need updating so I'll need to
> coordinate with you on the push.
>
> Thanks,
> David
> -----
>
> On 2/04/2019 10:33 am, David Holmes wrote:
>> Hi Goetz,
>>
>> Overall this looks good to me - a few minor nits/comments below.
>>
>> I've applied the patch and am running it through our internal build
>> and test system (tiers 1-3 initially).
>>
>> I have a suspicion there will be other tests that need to be updated
>> - possibly even JCK tests.

From thomas.stuefe at gmail.com Tue Apr 2 05:47:04 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 2 Apr 2019 07:47:04 +0200
Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends
In-Reply-To:
References:
Message-ID:

Hi Coleen, Andrew,

thank you for reviewing my little change.

Unfortunately, I had an error in the space list verification method which needed fixing, so here is a second version:

http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/

Differences:

- As Coleen requested: in allocation.cpp I replaced the comparison this==NULL with a static helper method
- I had mistyped "envelope" as "envolope" in "expand_envelope_to_include_node()". Since that sounded funny I changed it.
- The real bug was in VirtualSpaceList::verify() where I checked that the extension of the envelope is as large as the current nodes. But that is wrong, since the envelope never is shrunk (by design) and nodes at the border of the envelope may have been unmapped. So the real test should be to test if no node is outside the envelope.
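
As a rough illustration of the first and third points, the sketch below shows a static null-safe helper in place of a `this == NULL` comparison, and an envelope check that only requires every node to lie inside the envelope. This is not the code from the webrev: `Obj`, `Node`, `NodeList` and all member names are hypothetical, and nodes are modelled as plain integer ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a metaspace object; not the HotSpot class.
struct Obj {
  int payload;

  // Static helper: the null check is done on an ordinary pointer argument,
  // so no member function is ever called on a null 'this'.
  static bool is_valid(const Obj* p) { return p != nullptr; }
};

// Hypothetical node: a half-open address range [bottom, end).
struct Node {
  std::uintptr_t bottom;
  std::uintptr_t end;
};

// Hypothetical list that tracks an "envelope" around all nodes ever added.
// The envelope only grows; it is never shrunk when nodes are purged.
class NodeList {
  std::vector<Node> _nodes;
  std::uintptr_t _envelope_lo = UINTPTR_MAX;
  std::uintptr_t _envelope_hi = 0;

public:
  void add(Node n) {
    _nodes.push_back(n);
    if (n.bottom < _envelope_lo) _envelope_lo = n.bottom;
    if (n.end > _envelope_hi) _envelope_hi = n.end;
  }

  // Remove the node at index i; the envelope is deliberately left alone.
  void purge(std::size_t i) {
    _nodes.erase(_nodes.begin() + static_cast<std::ptrdiff_t>(i));
  }

  // Cheap pre-check: anything outside the envelope cannot be in any node.
  bool may_contain(std::uintptr_t p) const {
    return p >= _envelope_lo && p < _envelope_hi;
  }

  // Full check: walk the nodes only if the cheap pre-check passes.
  bool contains(std::uintptr_t p) const {
    if (!may_contain(p)) return false;
    for (const Node& n : _nodes) {
      if (p >= n.bottom && p < n.end) return true;
    }
    return false;
  }

  // Verification: no node may lie outside the envelope. The envelope is
  // allowed to be larger than the union of the current nodes.
  bool verify() const {
    for (const Node& n : _nodes) {
      if (n.bottom < _envelope_lo || n.end > _envelope_hi) return false;
    }
    return true;
  }
};
```

After a border node is purged, `verify()` in this sketch still succeeds because it never compares the envelope size against the remaining nodes; an address in the purged range passes `may_contain()` but is rejected by the full walk in `contains()`.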

Thanks, Thomas

On Wed, Mar 27, 2019 at 10:01 PM Thomas Stüfe wrote:
> Hi all,
>
> May I please have reviews for this small optimization:
>
> cr:
> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html
> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539
>
> There are several functions which, given an unknown pointer assumed to be
> a metaspace object, check if the pointer is indeed a metaspace object by
> walking the VirtualSpaceList and checking ranges.
>
> This patch adds checks which weed out the obvious cases to avoid
> needlessly walking the vs list.
>
> Patch also adds verifications for the VirtualSpaceList in debug cases.
> Those run only when a new node has been added to the list, or when a node
> has been purged, so very sparingly.
>
> When purging nodes, I removed a small unnecessary and inefficient check
> which checked whether (one of the) purged nodes was still in the list.
> Since we now as part of the new VirtualSpaceNode::verify() walk this list,
> the check is unnecessary.
>
> Thanks, Thomas
>

From thomas.stuefe at gmail.com Tue Apr 2 07:13:45 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 2 Apr 2019 09:13:45 +0200
Subject: RFR (XS) 8221726: Multiple build failures after JDK-8221698 (Remove redundant includes from popular header files)
In-Reply-To: <734743cf-5633-a559-7b0f-232a8b90da30@oracle.com>
References: <1f99bf26-a3e2-783e-4905-462d7635ba99@redhat.com> <88a8e19d-82c3-69f7-3c53-c19c689a87af@oracle.com> <734743cf-5633-a559-7b0f-232a8b90da30@oracle.com>
Message-ID:

Hi Ioi,

On Mon, Apr 1, 2019 at 7:17 AM Ioi Lam wrote:
> Hi Aleksey, thanks for fixing this!
>
> Now I realized that the hotspot header file dependency is more fragile
> than I thought.
>
> Thomas, I have a couple more changesets for cleaning header files. I'll
> post them and let people try them out on other ports (for at least a
> week, etc) before pushing. I'll also test on more combinations like zero
> and minimal.
>
> I'll try to write a tool to analyze how the header files are included.
> The current state is pretty abysmal (from a simple script that I wrote):
>
> http://cr.openjdk.java.net/~iklam/jdk13/headers.txt
>
> Number of headers = 1293
> Number of objs = 942
> Each obj file includes = 279.64 headers
> Each header is included = 203.73 times
> Rank 1% - 10% headers are included 1.7 times
> Rank 11% - 20% headers are included 3.0 times
> Rank 21% - 30% headers are included 5.2 times
> Rank 31% - 40% headers are included 10.6 times
> Rank 41% - 50% headers are included 21.4 times
> Rank 51% - 60% headers are included 42.7 times
> Rank 61% - 70% headers are included 97.4 times
> Rank 71% - 80% headers are included 281.1 times
> Rank 81% - 90% headers are included 711.8 times
> Rank 91% - 100% headers are included 866.2 times
>
> So basically you have 20% of headers that are practically included in
> every object file :-(
>

Yeah that is not good. Cleaning this up is certainly valuable, my little laptop thanks you :)

I also see a lot of functionality unnecessarily implemented inline in headers. For example, I just changed something in JvmtiExport::post_array_size_exhausted() and got almost a complete rebuild since it is defined in jvmtiExport.hpp.

..Thomas

>
> Thanks
> - Ioi
>
> On 3/31/19 9:37 PM, Thomas Stüfe wrote:
> > On Mon 1. Apr 2019 at 00:19, David Holmes wrote:
> >
> >> Hi Aleksey,
> >>
> >> On 1/04/2019 8:04 am, Aleksey Shipilev wrote:
> >>> Bug:
> >>> https://bugs.openjdk.java.net/browse/JDK-8221726
> >>>
> >>> See bug for examples of build failures. Seems only ppc64le and x86_64 {minimal, zero} are affected.
> >>> Happy to fold other fixes if other platforms are failing too.
> >>>
> >>> Fix:
> >>>
> >>> diff -r 7ad62bdfec59 src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp
> >>> --- a/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:29:47 2019 +0200
> >>> +++ b/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp Sun Mar 31 23:52:49 2019 +0200
> >>> @@ -28,4 +28,5 @@
> >>> #include "gc/shared/barrierSetAssembler.hpp"
> >>> #include "interpreter/interp_masm.hpp"
> >>> +#include "runtime/jniHandles.hpp"
> >>>
> >>> #define __ masm->
> >>> diff -r 7ad62bdfec59 src/hotspot/share/classfile/systemDictionary.hpp
> >>> --- a/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:29:47 2019 +0200
> >>> +++ b/src/hotspot/share/classfile/systemDictionary.hpp Sun Mar 31 23:52:49 2019 +0200
> >>> @@ -31,4 +31,5 @@
> >>> #include "oops/symbol.hpp"
> >>> #include "runtime/java.hpp"
> >>> +#include "runtime/mutexLocker.hpp"
> >>> #include "runtime/reflectionUtils.hpp"
> >>> #include "runtime/signature.hpp"
> >> I'm struggling to see what changes in JDK-8221698 led to these problems,
> >> but the fixes certainly look totally appropriate. I also think this
> >> constitutes a trivial change and can be pushed with one Review and not
> >> wait 24 hours. (If there are any issues I'll sort them out if needed.)
> >>
> >> Aside: are there any tools that will show where a particular declaration
> >> is being included from? We've obviously got some interesting transitive
> >> closures with conditional includes.
> >>
> > I think it would be helpful if we could have at least one zero build (eg
> > x64) in jdk-submit.
> > > > ..thomas > > > > > >> Thanks, > >> David > >> > >>> Testing: Linux x86_64 {server, minimal, zero}, ppc64le builds > >>> > >>> Thanks, > >>> -Aleksey > >>> > > From claes.redestad at oracle.com Tue Apr 2 07:55:35 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 2 Apr 2019 09:55:35 +0200 Subject: RFR: 8221724: Enable archiving of Strings with hash 0 In-Reply-To: References: Message-ID: <7eec5b8c-a24c-3e5a-408e-42563435590f@oracle.com> Hi Ioi, On 2019-04-02 02:18, Ioi Lam wrote: > Hi Claes, > > The changes look good. thanks! > > On 4/1/19 6:12 AM, Claes Redestad wrote: >> Hi, >> >> the current implementation of String archiving explicitly excludes >> Strings with hashcode 0, including "". The reason for this is not >> explicitly stated, but is likely to be due the fact that >> String::hashCode currently stores the calculated 0 to String.hash every >> time, which doesn't play well with read-only memory. >> > Actually, archived strings are not stored in read-only memory. Instead, > the pages are mmap'ed copy-on-write. So for a non-empty archive string S > with zero hashcode, if you call S.hashCode, with your patch on top of > the current JDK code, we will end up writing a zero to S.hash (which was > zero to begin with). The net effect is no changes in memory content, but > the page does get dirtied, and will no longer be sharable with other > processes that map the same CDS archive. > > When your other patch JDK-8221723 is applied with this patch (8221724), > then we will no longer have any dirty pages when you call S.hashCode for > any archived string S. Thanks for elaborating. I guess using real read-only memory instead of COW is still not feasible since archived objects could be used as locks, in which case we'd write to the object header. 
/Claes From thomas.stuefe at gmail.com Tue Apr 2 07:56:56 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 2 Apr 2019 09:56:56 +0200 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: Message-ID: p.s. nightlies at SAP ok, submit tests are ok Cheers Thomas On Tue, Apr 2, 2019 at 7:47 AM Thomas St?fe wrote: > Hi Coleen, Andrew, > > thank you for reviewing my little change. Unfortunately, I had an error in > the space list verification method which needed fixing, so here is a second > version: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ > > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I changed it. > - The real bug was in VirtualSpaceList::verify() where I checked that the > extension of the envelope is as large as the current nodes. But that is > wrong, since the envelope never is shrunk (by design) and nodes at the > border of the envelope may have been unmapped. So the real test should be > to test if no node is outside the envelope. > > Thanks, Thomas > > > On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: > >> Hi all, >> >> May I please have reviews for this small optimization: >> >> cr: >> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html >> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 >> >> There are several functions which, given an unknown pointer assumed to be >> a metaspace object, check if the pointer is indeed a metaspace object by >> walking the VirtualSpaceList and checking ranges. 
>> >> This patch adds checks which weed out the obvious cases to avoid >> needlessly walking the vs list. >> >> Patch also adds verifications for the VirtualSpaceList in debug cases. >> Those run only when a new node has been added to the list, or when a node >> has been purged, so very sparingly. >> >> When purging nodes, I removed a small unnecessary and inefficient check >> which checked whether (one of the) purged nodes was still in the list. >> Since we now as part of the new VirtualSpaceNode::verify() walk this list, >> the check is unnecessary. >> >> Thanks, Thomas >> >> >> From adinn at redhat.com Tue Apr 2 09:30:13 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 2 Apr 2019 10:30:13 +0100 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: Message-ID: On 02/04/2019 08:56, Thomas St?fe wrote: > p.s. nightlies at SAP ok, submit tests are ok > . . . > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I > changed it. > - The real bug was in VirtualSpaceList::verify() where I checked > that the extension of the envelope is as large as the current nodes. > But that is wrong, since the envelope never is shrunk (by design) > and nodes at the border of the envelope may have been unmapped. So > the real test should be to test if no node is outside the envelope. This version looks ok. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From martin.doerr at sap.com Tue Apr 2 10:26:05 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 Apr 2019 10:26:05 +0000 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: Message-ID: Hi Thomas, can you replace the NULL check by a check for address below page size, please? oopDesc::is_valid and Symbol::is_valid do it the following way: if (!is_aligned(s, sizeof(MetaWord))) return false; if ((size_t)s < os::min_page_size()) return false; Besides this, webrev.01 looks good to me. Thanks, Martin -----Original Message----- From: hotspot-runtime-dev On Behalf Of Andrew Dinn Sent: Dienstag, 2. April 2019 11:30 To: Thomas Stüfe ; Coleen Phillmore Cc: Hotspot dev runtime Subject: Re: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends On 02/04/2019 08:56, Thomas Stüfe wrote: > p.s. nightlies at SAP ok, submit tests are ok > . . . > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I > changed it. > - The real bug was in VirtualSpaceList::verify() where I checked > that the extension of the envelope is as large as the current nodes. > But that is wrong, since the envelope never is shrunk (by design) > and nodes at the border of the envelope may have been unmapped. So > the real test should be to test if no node is outside the envelope. This version looks ok. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No.
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From coleen.phillimore at oracle.com Tue Apr 2 12:36:19 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 2 Apr 2019 08:36:19 -0400 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: Message-ID: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> On 4/2/19 1:47 AM, Thomas Stüfe wrote: > Hi Coleen, Andrew, > > thank you for reviewing my little change. Unfortunately, I had an > error in the space list verification method which needed fixing, so > here is a second version: > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ > > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method I think you have to change the callers to not pass this as null. So you can't do metaspaceobj->is_metaspace_object() because you're calling with "this" potentially NULL. So remove this function: bool MetaspaceObj::is_metaspace_object() const { - return Metaspace::contains((void*)this); + return MetaspaceObj::is_metaspace_object(this); } > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I > changed it. > - The real bug was in VirtualSpaceList::verify() where I checked that > the extension of the envelope is as large as the current nodes. But > that is wrong, since the envelope never is shrunk (by design) and > nodes at the border of the envelope may have been unmapped. So the > real test should be to test if no node is outside the envelope. So this envelope is an interesting concept and name. It seems okay. I guess over time, it won't give you a very good answer. Maybe you'll have to fix the boundaries someday. Looks good though. Thank you for making this improvement for performance.
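The concern about calling a member function with a potentially NULL "this" can be illustrated with a small sketch. It is not the code from the webrev: `MetaObj`, `Range` and `is_meta_object` are hypothetical names, and the address range is passed in explicitly only to keep the example self-contained. The point is that the check lives in a static helper taking a plain pointer, so callers never invoke a member function on a pointer that may be null.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the address range that holds metadata objects.
struct Range {
  std::uintptr_t lo;
  std::uintptr_t hi;
};

class MetaObj {
 public:
  // Static helper: safe to call with any pointer, including nullptr,
  // because no member function is invoked on the pointee.
  static bool is_meta_object(const void* p, const Range& r) {
    assert(r.lo <= r.hi);
    if (p == nullptr) return false;
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return a >= r.lo && a < r.hi;
  }
};
```

A caller holding a possibly-null `MetaObj* m` would then write `MetaObj::is_meta_object(m, range)` rather than calling a member function through `m`.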
Coleen > > Thanks, Thomas > > > On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: > > Hi all, > > May I please have reviews for this small optimization: > > cr: > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html > Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 > > There are several functions which, given an unknown pointer > assumed to be a metaspace object, check if the pointer is indeed a > metaspace object by walking the VirtualSpaceList and checking ranges. > > This patch adds checks which weed out the obvious cases to avoid > needlessly walking the vs list. > > Patch also adds verifications for the VirtualSpaceList in debug > cases. Those run only when a new node has been added to the list, > or when a node has been purged, so very sparingly. > > When purging nodes, I removed a small unnecessary and inefficient > check which checked whether (one of the) purged nodes was still in > the list. Since we now as part of the new > VirtualSpaceNode::verify() walk this list, the check is unnecessary. > > Thanks, Thomas > > From martin.doerr at sap.com Tue Apr 2 13:05:10 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 Apr 2019 13:05:10 +0000 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses Message-ID: Hi, I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: Address computation can overflow leading to skipped readability check. Bug: https://bugs.openjdk.java.net/browse/JDK-8221833 Webrev: http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ Please review. Best regards, Martin From goetz.lindenmaier at sap.com Tue Apr 2 13:07:52 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 2 Apr 2019 13:07:52 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. 
In-Reply-To: <59f344d6-d1b7-58ce-b127-26dca4033cb4@oracle.com> References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <43d389d9-289f-e6b9-2380-c63ca634ccb4@oracle.com> <59f344d6-d1b7-58ce-b127-26dca4033cb4@oracle.com> Message-ID: Hi, I'll try to fix the test that way, thanks Ioi! Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev > On Behalf Of Ioi Lam > Sent: Dienstag, 2. April 2019 06:52 > To: hotspot-runtime-dev at openjdk.java.net > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > Hi Goetz, > > I think you can use this class to avoid writing classfiles with > non-ascii names. > > http://hg.openjdk.java.net/jdk/jdk/file/2221f042556d/test/lib/jdk/test/lib/co > mpiler/InMemoryJavaCompiler.java > > You'd need to write a class loader to load the byte array returned by > InMemoryJavaCompiler.compile(). > > Thanks > - Ioi > > > On 4/1/19 6:41 PM, David Holmes wrote: > > Hi Goetz, > > > > Your new test fails to compile on some systems: > > > > error: error while writing Strange\u20ac\u00a3Named: bad filename > > RelativeFile[test/Strange\u20ac\u00a3Named.class] > > class Strange\u20ac\u00a3Named { > > ^ > > 1 error > > result: Failed. Compilation failed: Compilation failed > > > > This was linux-x64 - only seems to occur on Oracle Linux Server 7.1. > > > > We also have some closed tests that also need updating so I'll need to > > coordinate with you on the push. > > > > Thanks, > > David > > ----- > > > > On 2/04/2019 10:33 am, David Holmes wrote: > >> Hi Goetz, > >> > >> Overall this looks good to me - a few minor nits/comments below. > >> > >> I've applied the patch and am running it through our internal build > >> and test system (tiers 1-3 initially). > >> > >> I have a suspicion there will be other tests that need to be updated > >> - possibly even JCK tests. Discovering those a-priori will be > >> difficult (simply running all the tests would take an extremely long > >> time). 
Will have a discussion about how best to handle those internally. > >> > >> --- > >> > >> src/hotspot/share/oops/method.cpp > >> > >> Please put a blank line after each new method. > >> > >> --- > >> > >> src/hotspot/share/oops/symbol.cpp > >> > >> +       os->print("."); > >> +     } else { > >> +       os->print("%c", start[i]); > >> > >> Please use os->put(char c) for individual characters. > >> > >> -- > >> > >> The "start" name would seem better as "buf" to me. > >> > >> -- > >> > >> +     } else if (start[i] == 'L') { > >> +       print_class(os, start+i+1, len-i-2); > >> > >> Can you insert a comment that help explains the -2: > >> > >>       } else if (start[i] == 'L') { > >> +      // Expected format: L; > >>         print_class(os, start+i+1, len-i-2); > >> > >> -- > >> > >> +   for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > >> > >> space after for (2 occurrences) > >> > >> --- > >> > >> > >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMethods.java > >> > >> > >> Not sure the special characters can be used directly in the sources. > >> Can they not be put in as unicode escapes at all places? > >> > >> --- > >> > >> Thanks, > >> David > >> ------- > >> > >> > >> On 1/04/2019 12:32 pm, David Holmes wrote: > >>> Hi Goetz, > >>> > >>> I'm looking at this ... > >>> > >>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> Any interest in this change? > >>> > >>> I'm personally of two minds here because these VM generated > >>> exceptions are not only delivered to Java source code. I'd like to > >>> know how other language developers using the JVM runtime would view > >>> this. > >>> > >>> That aside if you're going to make a change like this then I think > >>> the full signature string has to be quoted in some way to delineate > >>> it within the larger message. > >>> > >>>> Should I split it to adapt the exceptions separately one-by-one to > >>>> make the change smaller and simplify the review?
> >>> > >>> I don't think that is necessary. > >>> > >>> Thanks, > >>> David > >>> ----- > >>> > >>>> I would propose to start out with AbstractMethodError only. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> From: Lindenmaier, Goetz > >>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>> To: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>> java-like Syntax. > >>>> > >>>> Hi, > >>>> > >>>> A row of exceptions are thrown from the hotspot runtime. > >>>> They print methods with their JNI signatures. To increase > >>>> readability and resemblance to source code, this change proposes > >>>> to print them in a Java-like syntax. > >>>> > >>>> Some examples: > >>>> current method printouts: > >>>> > >>>> test.TeMe3_B.ma()V > >>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>> > >>>> improved format: > >>>> > >>>> void test.TeMe3_B.ma() > >>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>> > >>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>> these messages. > >>>> > >>>> This change implements function Method::external_name() that prints > >>>> the better > >>>> format. > >>>> external_name() is chosen according to Klass::external_name(). > >>>> > >>>> Printing the better format requires parsing the signature > >>>> Symbol. This is implemented in > >>>> void Symbol::print_as_signature_external_return_type(outputStream > >>>> *os); > >>>> void Symbol::print_as_signature_external_parameters(outputStream > *os); > >>>> These method names are chosen according to > >>>> Symbol::as_class_external_name(). > >>>> > >>>> See this partial webrev for the new functions: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > new_methods/ > >>>> > >>>> > >>>> Also, I changed a lot of exception messages to use the new format. 
> >>>> This required to adapt a row of tests. I added a test to check > >>>> the signature printing does not regress.? For all these changes, see > >>>> the full webrev: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>>> > >>>> I hope I detected all places where method signatures are printed to > >>>> exception messages. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> From goetz.lindenmaier at sap.com Tue Apr 2 13:26:29 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 2 Apr 2019 13:26:29 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi David, > Overall this looks good to me - a few minor nits/comments below. thanks! > I've applied the patch and am running it through our internal build and > test system (tiers 1-3 initially). > > I have a suspicion there will be other tests that need to be updated - > possibly even JCK tests. Discovering those a-priori will be difficult > (simply running all the tests would take an extremely long time). Will > have a discussion about how best to handle those internally. I ran most JCK test without problem. They usually don't check messages. I ran all hotspot, jdk, langtools, nashorn and jaxp test (except for headful tests). > src/hotspot/share/oops/method.cpp > Please put a blank line after each new method. Fixed. > src/hotspot/share/oops/symbol.cpp > > + os->print("."); > + } else { > + os->print("%c", start[i]); > > Please use os->put(char c) for individual characters. Fixed. > The "start" name would seem better as "buf" to me. Hmm, buf to me is a local chunk of memory used temporarily. What about array_sig, class_sig? > + } else if (start[i] == 'L') { > + print_class(os, start+i+1, len-i-2); > Can you insert a comment that help explains the -2: Done. 
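The conversion from JNI-style signatures to the Java-like format quoted above, including the reason for skipping the leading 'L' and the trailing ';' of a class descriptor, can be sketched as follows. This is a standalone illustration using std::string and not the Symbol/outputStream code from the webrev; `external_type` and `external_method` are hypothetical names, and the input is assumed to be a well-formed method descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Convert one field descriptor starting at sig[pos] to Java-like syntax,
// e.g. "[[Ljava/lang/Object;" -> "java.lang.Object[][]".
// On return, pos points just past the descriptor that was consumed.
std::string external_type(const std::string& sig, std::size_t& pos) {
  std::size_t dims = 0;
  while (pos < sig.size() && sig[pos] == '[') { ++dims; ++pos; }
  assert(pos < sig.size());
  std::string out;
  switch (sig[pos]) {
    case 'B': out = "byte"; break;
    case 'C': out = "char"; break;
    case 'D': out = "double"; break;
    case 'F': out = "float"; break;
    case 'I': out = "int"; break;
    case 'J': out = "long"; break;
    case 'S': out = "short"; break;
    case 'Z': out = "boolean"; break;
    case 'V': out = "void"; break;
    case 'L': {
      const std::size_t semi = sig.find(';', pos);
      assert(semi != std::string::npos);
      // Class descriptor: skip the leading 'L' and drop the trailing ';'.
      out = sig.substr(pos + 1, semi - pos - 1);
      for (char& c : out) if (c == '/') c = '.';
      pos = semi;
      break;
    }
    default: assert(false && "unexpected descriptor character"); break;
  }
  ++pos;
  for (std::size_t i = 0; i < dims; ++i) out += "[]";
  return out;
}

// Combine holder.name and a method descriptor into
// "<return type> <holder.name>(<parameter types>)".
std::string external_method(const std::string& holder_and_name,
                            const std::string& sig) {
  assert(!sig.empty() && sig[0] == '(');
  std::size_t pos = 1;
  std::string params;
  while (pos < sig.size() && sig[pos] != ')') {
    if (!params.empty()) params += ", ";
    params += external_type(sig, pos);
  }
  assert(pos < sig.size());
  ++pos;  // skip ')'
  const std::string ret = external_type(sig, pos);
  return ret + " " + holder_and_name + "(" + params + ")";
}
```

Called with the three signatures from the original proposal, this sketch reproduces the three "improved format" lines quoted there.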
> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > space after for (2 occurrences) Fixed. > test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth > ods.java > > Not sure the special characters can be used directly in the sources. Can > they not be put in as unicode escapes at all places? I'll try what Ioi proposed. I'll post a new webrev including that. Best regards, Goetz. > > --- > > Thanks, > David > ------- > > > On 1/04/2019 12:32 pm, David Holmes wrote: > > Hi Goetz, > > > > I'm looking at this ... > > > > On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >> Hi, > >> > >> Any interest in this change? > > > > I'm personally of two minds here because these VM generated exceptions > > are not only delivered to Java source code. I'd like to know how other > > language developers using the JVM runtime would view this. > > > > That aside if you're going to make a change like this then I think the > > full signature string has to be quoted in some way to delineate it > > within the larger message. > > > >> Should I split it to adapt the exceptions separately one-by-one to > >> make the change smaller and simplify the review? > > > > I don't think that is necessary. > > > > Thanks, > > David > > ----- > > > >> I would propose to start out with AbstractMethodError only. > >> > >> Best regards, > >> ?? Goetz. > >> > >> > >> > >> From: Lindenmaier, Goetz > >> Sent: Tuesday, March 26, 2019 1:06 PM > >> To: hotspot-runtime-dev at openjdk.java.net > >> Subject: RFR(L): 8221470: Print methods in exception messages in > >> java-like Syntax. > >> > >> Hi, > >> > >> A row of exceptions are thrown from the hotspot runtime. > >> They print methods with their JNI signatures. To increase > >> readability and resemblance to source code, this change proposes > >> to print them in a Java-like syntax. 
> >> > >> Some examples: > >> current method printouts: > >> > >> test.TeMe3_B.ma()V > >> test.TeMe3_B.ma(IZ[[BF)[[D > >> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >> > >> improved format: > >> > >> void test.TeMe3_B.ma() > >> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >> > >> So far, Method::name_and_sig_as_C_string() is used to print > >> these messages. > >> > >> This change implements function Method::external_name() that prints > >> the better > >> format. > >> external_name() is chosen according to Klass::external_name(). > >> > >> Printing the better format requires parsing the signature > >> Symbol. This is implemented in > >> void Symbol::print_as_signature_external_return_type(outputStream *os); > >> void Symbol::print_as_signature_external_parameters(outputStream *os); > >> These method names are chosen according to > >> Symbol::as_class_external_name(). > >> > >> See this partial webrev for the new functions: > >> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > new_methods/ > >> > >> > >> Also, I changed a lot of exception messages to use the new format. > >> This required to adapt a row of tests. I added a test to check > >> the signature printing does not regress.? For all these changes, see > >> the full webrev: > >> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >> > >> I hope I detected all places where method signatures are printed to > >> exception messages. > >> > >> Best regards, > >> ?? Goetz. > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> From zgu at redhat.com Tue Apr 2 14:01:23 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 2 Apr 2019 10:01:23 -0400 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: References: Message-ID: Hi Martin, Would it be more proper to do the check in os::is_readable_range()? 
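The kind of overflow named in the bug description can be illustrated with a short sketch. The webrev is not reproduced in this thread, so the code below is only a generic illustration of an overflow-safe range check: `range_end`, `Probe` and `is_readable_range_sketch` are hypothetical names and do not represent the actual `os::is_readable_range()` or `Symbol::is_valid` code. If `addr + len` wraps around, a naive loop `for (p = addr; p < addr + len; ...)` never executes and the range is wrongly accepted without being probed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Computes addr + len into *end_out and returns true, or returns false
// if the addition would wrap around the address space.
bool range_end(std::uintptr_t addr, std::size_t len, std::uintptr_t* end_out) {
  if (len > UINTPTR_MAX - addr) return false;
  *end_out = addr + len;
  return true;
}

// Hypothetical per-address readability probe supplied by the caller.
using Probe = bool (*)(std::uintptr_t);

// Probes [from, from + len) in increments of step. A wrapping range is
// rejected up front instead of silently skipping the probe loop.
bool is_readable_range_sketch(std::uintptr_t from, std::size_t len,
                              std::size_t step, Probe probe) {
  assert(step > 0);
  std::uintptr_t to = 0;
  if (!range_end(from, len, &to)) return false;
  for (std::uintptr_t p = from; p < to; ) {
    if (!probe(p)) return false;
    if (to - p <= step) break;  // next step would reach or pass the end
    p += step;
  }
  return true;
}
```

Whether such a check belongs in the caller or in the range helper itself is exactly the question raised above; the sketch places it in the helper.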
Thanks, -Zhengyu On 4/2/19 9:05 AM, Doerr, Martin wrote: > Hi, > > I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: > Address computation can overflow leading to skipped readability check. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8221833 > > Webrev: > http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ > > Please review. > > Best regards, > Martin > From martin.doerr at sap.com Tue Apr 2 14:33:43 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 Apr 2019 14:33:43 +0000 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: References: Message-ID: Hi Zhengyu, that would be fine, too. I'll put it there if other reviewers prefer that, too. Thanks and best regards, Martin -----Original Message----- From: Zhengyu Gu Sent: Dienstag, 2. April 2019 16:01 To: Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses Hi Martin, Would it be more proper to do the check in os::is_readable_range()? Thanks, -Zhengyu On 4/2/19 9:05 AM, Doerr, Martin wrote: > Hi, > > I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: > Address computation can overflow leading to skipped readability check. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8221833 > > Webrev: > http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ > > Please review. > > Best regards, > Martin > From daniel.daugherty at oracle.com Tue Apr 2 19:17:45 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 2 Apr 2019 15:17:45 -0400 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> Message-ID: <4896c4f5-c31e-a4c0-ec59-5a4dba93615e@oracle.com> On 4/1/19 6:30 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 > webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ src/hotspot/share/prims/jni.cpp No comments. src/hotspot/share/runtime/thread.hpp No comments. src/hotspot/share/runtime/thread.cpp No comments. test/hotspot/gtest/threadHelper.inline.hpp No comments. Thumbs up! I sanity checked uses of java_lang_Thread::set_daemon() and didn't see any issues there. I also checked uses of java_lang_Thread::is_daemon() and the most I found was a possible issue with thread dump output racing with a Thread.setDaemon(false) call after the thread has died. However, that would require that the ThreadsList captured for the thread dump contain the JavaThread that's about to exit, the JavaThread exiting, another thread calling Thread.setDaemon(false) before the thread dump gets to querying the JavaThread that is trying to exit... Pretty rare if possible at all and most that could go wrong with the output is a missing "daemon" marker in the thread dump... Dan > > A bug in Thread.setDaemon (JDK-8221657) means that the daemon state of > a thread can change after the thread is !isAlive() at the Java level. > If this happens before the VM call to ThreadService::remove_thread > then we have a situation where we incremented the thread counters when > the thread was not a daemon, and we decrement the thread counters when > the thread is a daemon - and so the counters are out of sync and the > assertion fires.
> > The simple fix is to capture the daemon state of the thread while it > is still alive and to pass that through to Threads::remove and thus > ThreadService::remove_thread. > > Testing: > ? - manual test with modified VM (to delay Threads::remove call) as > per the bug report > ? - mach 5 tiers 1-3 > > Thanks, > David > From thomas.stuefe at gmail.com Tue Apr 2 19:41:19 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 2 Apr 2019 21:41:19 +0200 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> Message-ID: Hi David, first thanks for the good analysis! Is this not a problem with the usage of setDaemon(): https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean) "This method must be invoked before the thread is started." I think the real solution would be for setDaemon to distinguish between not-yet-started, running and finished. It should not use isAlive(). It should throw an exception if it has been started, regardless of whether it finished already or not. Not sure. Its late, I may not be thinking straight. Cheers, Thomas On Tue, Apr 2, 2019 at 12:33 AM David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 > webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ > > A bug in Thread.setDaemon (JDK-8221657) means that the daemon state of a > thread can change after the thread is !isAlive() at the Java level. If > this happens before the VM call to ThreadService::remove_thread then we > have a situation where we incremented the thread counters when the > thread was not a daemon, and we decrement the thread counters when the > thread is a daemon - and so the counters are out of sync and the > assertion fires. 
> > The simple fix is to capture the daemon state of the thread while it is > still alive and to pass that through to Threads::remove and thus > ThreadService::remove_thread. > > Testing: > - manual test with modified VM (to delay Threads::remove call) as per > the bug report > - mach 5 tiers 1-3 > > Thanks, > David > > From leonid.mesnik at oracle.com Tue Apr 2 19:47:41 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 2 Apr 2019 12:47:41 -0700 Subject: RFR: 8221437: assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m()) failed: Should not change after link resolution In-Reply-To: References: <35182AD1-781F-4882-B194-C8FF8D63CE58@oracle.com> Message-ID: <1903AA5B-E362-4AC8-82D8-B5EDB698C8DA@oracle.com> Thank you for review. I still need a second review I suppose. Leonid > On Apr 1, 2019, at 2:55 PM, coleen.phillimore at oracle.com wrote: > > > Leonid, Looks good! Thank you for diagnosing the problem. > > Coleen > > On 4/1/19 5:50 PM, Leonid Mesnik wrote: >> Hi >> >> Could you please review following fix which just relax assertion in methodHandles.cpp. >> The assertion >> 319 assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m(), ..) >> is to strict and fails in the case if class was redefined/retransformed and method MethodHandles::init_method_MemberName(Handle mname, CallInfo& info) >> is called for old copy of method. >> So this assertion is incorrect for old methods and shouldn't be checked. >> >> I hit this assertion by internal stress test and verified that it is not reproduced after fix. Also run tier1 sanity testing. 
>> >> webrev: http://cr.openjdk.java.net/~lmesnik/8221437/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8221437 >> >> Leonid > From david.holmes at oracle.com Tue Apr 2 20:55:01 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Apr 2019 06:55:01 +1000 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> Message-ID: <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com> Hi Thomas, Thanks for taking a look at this. On 3/04/2019 5:41 am, Thomas St?fe wrote: > Hi David, > > first thanks for the good analysis! > > Is this not a problem with the usage of setDaemon(): > > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean) > > "This method must be invoked before the thread is started." Not the usage as such, but there is a problem with setDaemon - as per: https://bugs.openjdk.java.net/browse/JDK-8221657 The test that causes the crash in the VM deliberately tests a case where it expects to get the IllegalThreadStateException. > I think the real solution would be for setDaemon to distinguish between > not-yet-started, running and finished. It should not use isAlive(). It > should throw an exception if it has been started, regardless of whether > it finished already or not. Yes that fix is needed at the Java level. The use of isAlive() pre-dates the existence of Thread.State. But a change at the Java level may be some time coming given this is a day one bug in the spec and implementation of Thread.setDaemon, so I wanted to address this quickly in the VM as we are seeing these crashes in testing. Thanks, David > Not sure. Its late, I may not be thinking straight. 
> > Cheers, Thomas > > > > On Tue, Apr 2, 2019 at 12:33 AM David Holmes > wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 > webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ > > A bug in Thread.setDaemon (JDK-8221657) means that the daemon state > of a > thread can change after the thread is !isAlive() at the Java level. If > this happens before the VM call to ThreadService::remove_thread then we > have a situation where we incremented the thread counters when the > thread was not a daemon, and we decrement the thread counters when the > thread is a daemon - and so the counters are out of sync and the > assertion fires. > > The simple fix is to capture the daemon state of the thread while it is > still alive and to pass that through to Threads::remove and thus > ThreadService::remove_thread. > > Testing: > ? ?- manual test with modified VM (to delay Threads::remove call) > as per > the bug report > ? ?- mach 5 tiers 1-3 > > Thanks, > David > From david.holmes at oracle.com Tue Apr 2 20:59:44 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Apr 2019 06:59:44 +1000 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: <4896c4f5-c31e-a4c0-ec59-5a4dba93615e@oracle.com> References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <4896c4f5-c31e-a4c0-ec59-5a4dba93615e@oracle.com> Message-ID: Hi Dan, Thanks for taking a look at this. On 3/04/2019 5:17 am, Daniel D. Daugherty wrote: > On 4/1/19 6:30 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 >> webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ > > src/hotspot/share/prims/jni.cpp > ??? No comments. > > src/hotspot/share/runtime/thread.hpp > ??? No comments. > > src/hotspot/share/runtime/thread.cpp > ??? No comments. > > test/hotspot/gtest/threadHelper.inline.hpp > ??? No comments. > > Thumbs up! Thanks. 
> I sanity checked uses of java_lang_Thread::set_daemon() and didn't see any > issues there. I also checked uses of java_lang_Thread::is_daemon() and the > most I found was a possible issue with thread dump output racing with a > Thread.setDaemon(false) call after the thread has died. However, that > would require that the ThreadsList captured for the thread dump contain > the JavaThread that's about to exit, the JavaThread exiting, another > thread calling Thread.setDaemon(false) before the thread dump gets to > querying the JavaThread that is trying to exit... Pretty rare if possible > at all and most that could go wrong with the output is a missing "daemon" > marker in the thread dump... Thanks for looking more deeply at that. There's another race between setDaemon(true) and start() that could lead to the same problem resurfacing, due to a lack of synchronization in setDaemon. I decided to let JDK-8221657 fix that, as there's no test that exposes the issue. Thanks, David > Dan > > >> >> A bug in Thread.setDaemon (JDK-8221657) means that the daemon state of >> a thread can change after the thread is !isAlive() at the Java level. >> If this happens before the VM call to ThreadService::remove_thread >> then we have a situation where we incremented the thread counters when >> the thread was not a daemon, and we decrement the thread counters when >> the thread is a daemon - and so the counters are out of sync and the >> assertion fires. >> >> The simple fix is to capture the daemon state of the thread while it >> is still alive and to pass that through to Threads::remove and thus >> ThreadService::remove_thread. >> >> Testing: >> ? - manual test with modified VM (to delay Threads::remove call) as >> per the bug report >> ? 
- mach 5 tiers 1-3 >> >> Thanks, >> David >> > From david.holmes at oracle.com Tue Apr 2 22:42:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Apr 2019 08:42:02 +1000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Two follow ups ... On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: > Hi David, > >> Overall this looks good to me - a few minor nits/comments below. > thanks! > >> I've applied the patch and am running it through our internal build and >> test system (tiers 1-3 initially). >> >> I have a suspicion there will be other tests that need to be updated - >> possibly even JCK tests. Discovering those a-priori will be difficult >> (simply running all the tests would take an extremely long time). Will >> have a discussion about how best to handle those internally. > > I ran most JCK test without problem. They usually don't check messages. > I ran all hotspot, jdk, langtools, nashorn and jaxp test (except > for headful tests). Thanks for the additional testing info. I duplicated some of that but found no issues, other than a couple of closed tests. >> src/hotspot/share/oops/method.cpp >> Please put a blank line after each new method. > Fixed. > >> src/hotspot/share/oops/symbol.cpp >> >> + os->print("."); >> + } else { >> + os->print("%c", start[i]); >> >> Please use os->put(char c) for individual characters. > Fixed. > >> The "start" name would seem better as "buf" to me. > Hmm, buf to me is a local chunk of memory used temporarily. > What about array_sig, class_sig? Not really "sigs". str? Else just leave it. Thanks, David ----- >> + } else if (start[i] == 'L') { >> + print_class(os, start+i+1, len-i-2); >> Can you insert a comment that help explains the -2: > Done. > >> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { >> space after for (2 occurrences) > Fixed. 
> >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth >> ods.java >> >> Not sure the special characters can be used directly in the sources. Can >> they not be put in as unicode escapes at all places? > I'll try what Ioi proposed. I'll post a new webrev including that. > > Best regards, > Goetz. > > >> >> --- >> >> Thanks, >> David >> ------- >> >> >> On 1/04/2019 12:32 pm, David Holmes wrote: >>> Hi Goetz, >>> >>> I'm looking at this ... >>> >>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> Any interest in this change? >>> >>> I'm personally of two minds here because these VM generated exceptions >>> are not only delivered to Java source code. I'd like to know how other >>> language developers using the JVM runtime would view this. >>> >>> That aside if you're going to make a change like this then I think the >>> full signature string has to be quoted in some way to delineate it >>> within the larger message. >>> >>>> Should I split it to adapt the exceptions separately one-by-one to >>>> make the change smaller and simplify the review? >>> >>> I don't think that is necessary. >>> >>> Thanks, >>> David >>> ----- >>> >>>> I would propose to start out with AbstractMethodError only. >>>> >>>> Best regards, >>>> ?? Goetz. >>>> >>>> >>>> >>>> From: Lindenmaier, Goetz >>>> Sent: Tuesday, March 26, 2019 1:06 PM >>>> To: hotspot-runtime-dev at openjdk.java.net >>>> Subject: RFR(L): 8221470: Print methods in exception messages in >>>> java-like Syntax. >>>> >>>> Hi, >>>> >>>> A row of exceptions are thrown from the hotspot runtime. >>>> They print methods with their JNI signatures. To increase >>>> readability and resemblance to source code, this change proposes >>>> to print them in a Java-like syntax. 
>>>> >>>> Some examples: >>>> current method printouts: >>>> >>>> test.TeMe3_B.ma()V >>>> test.TeMe3_B.ma(IZ[[BF)[[D >>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; >>>> >>>> improved format: >>>> >>>> void test.TeMe3_B.ma() >>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) >>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) >>>> >>>> So far, Method::name_and_sig_as_C_string() is used to print >>>> these messages. >>>> >>>> This change implements function Method::external_name() that prints >>>> the better >>>> format. >>>> external_name() is chosen according to Klass::external_name(). >>>> >>>> Printing the better format requires parsing the signature >>>> Symbol. This is implemented in >>>> void Symbol::print_as_signature_external_return_type(outputStream *os); >>>> void Symbol::print_as_signature_external_parameters(outputStream *os); >>>> These method names are chosen according to >>>> Symbol::as_class_external_name(). >>>> >>>> See this partial webrev for the new functions: >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- >> new_methods/ >>>> >>>> >>>> Also, I changed a lot of exception messages to use the new format. >>>> This required to adapt a row of tests. I added a test to check >>>> the signature printing does not regress.? For all these changes, see >>>> the full webrev: >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ >>>> >>>> I hope I detected all places where method signatures are printed to >>>> exception messages. >>>> >>>> Best regards, >>>> ?? Goetz. 
>>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> From serguei.spitsyn at oracle.com Wed Apr 3 00:07:03 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 2 Apr 2019 17:07:03 -0700 Subject: RFR: 8221437: assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m()) failed: Should not change after link resolution In-Reply-To: <1903AA5B-E362-4AC8-82D8-B5EDB698C8DA@oracle.com> References: <35182AD1-781F-4882-B194-C8FF8D63CE58@oracle.com> <1903AA5B-E362-4AC8-82D8-B5EDB698C8DA@oracle.com> Message-ID: Hi Leonid, +1 Looks good. Nice catch! Thank you for taking care about this problem. Thanks, Serguei On 4/2/19 12:47 PM, Leonid Mesnik wrote: > Thank you for review. I still need a second review I suppose. > > Leonid > >> On Apr 1, 2019, at 2:55 PM, coleen.phillimore at oracle.com wrote: >> >> >> Leonid, Looks good! Thank you for diagnosing the problem. >> >> Coleen >> >> On 4/1/19 5:50 PM, Leonid Mesnik wrote: >>> Hi >>> >>> Could you please review following fix which just relax assertion in methodHandles.cpp. >>> The assertion >>> 319 assert(java_lang_invoke_ResolvedMethodName::vmtarget(resolved_method()) == m(), ..) >>> is to strict and fails in the case if class was redefined/retransformed and method MethodHandles::init_method_MemberName(Handle mname, CallInfo& info) >>> is called for old copy of method. >>> So this assertion is incorrect for old methods and shouldn't be checked. >>> >>> I hit this assertion by internal stress test and verified that it is not reproduced after fix. Also run tier1 sanity testing. 
>>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8221437/webrev.00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8221437 >>> >>> Leonid From nick.gasson at arm.com Wed Apr 3 05:58:37 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Wed, 3 Apr 2019 13:58:37 +0800 Subject: RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: <5C9CFF6F.3060804@oracle.com> References: <5C9CFF6F.3060804@oracle.com> Message-ID: Thanks Misha. I think I still need another reviewer to look at it before it's ok to push? Nick On 29/03/2019 01:07, Mikhailo Seledtsov wrote: > Looks good to me, > Thank you for this fix. > > Misha > > On 3/28/19, 3:05 AM, Nick Gasson wrote: >> Hi, >> >> This is a follow on from 8221342 to update the default Docker image >> used on AArch64 from "aarch64/ubuntu" to "arm64v8/ubuntu". According >> to Docker Hub the former is deprecated and hasn't been updated since >> Ubuntu 16.04. This causes symbol resolution failures if the JDK image >> being tested was built against a recent glibc. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221529 >> Webrev: http://cr.openjdk.java.net/~ngasson/8221529/webrev.0/ >> >> Tested using the runtime/containers/docker hotspot jtreg tests on >> AArch64 and x86. >> >> Thanks, >> Nick >> >> From thomas.stuefe at gmail.com Wed Apr 3 06:37:54 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 3 Apr 2019 08:37:54 +0200 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com> References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com> Message-ID: Hi David, On Tue, Apr 2, 2019 at 10:57 PM David Holmes wrote: > Hi Thomas, > > Thanks for taking a look at this. > > On 3/04/2019 5:41 am, Thomas St?fe wrote: > > Hi David, > > > > first thanks for the good analysis! 
> > > > Is this not a problem with the usage of setDaemon(): > > > > > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean) > > > > "This method must be invoked before the thread is started." > > Not the usage as such, but there is a problem with setDaemon - as per: > > https://bugs.openjdk.java.net/browse/JDK-8221657 > > The test that causes the crash in the VM deliberately tests a case where > it expects to get the IllegalThreadStateException. > > > I think the real solution would be for setDaemon to distinguish between > > not-yet-started, running and finished. It should not use isAlive(). It > > should throw an exception if it has been started, regardless of whether > > it finished already or not. > > Yes that fix is needed at the Java level. The use of isAlive() pre-dates > the existence of Thread.State. > > But a change at the Java level may be some time coming given this is a > day one bug in the spec and implementation of Thread.setDaemon, so I > wanted to address this quickly in the VM as we are seeing these crashes > in testing. > I think a simple patch could be very simply using if (threadStatus != 0) instead of isAlive() in Thread.setDaemon? We do this in other places in Thread.java too. -- Also I think it makes sense to scan for similar errors in the code base (isAlive being used as "has-been-started") and fix those too. For example: ApplicationShutdownHook.java: static synchronized void add(Thread hook) { if(hooks == null) throw new IllegalStateException("Shutdown in progress"); if (hook.isAlive()) throw new IllegalArgumentException("Hook already running"); if (hooks.containsKey(hook)) throw new IllegalArgumentException("Hook previously registered"); hooks.put(hook, hook); } would register a terminated thread as shutdown hook. I found similar looking code in ThreadPoolExecutor. I really think the jdk would be really the right place to fix this. > > Thanks, > David > > > Not sure. Its late, I may not be thinking straight. 
> >
> > Cheers, Thomas
> >
> >
> >
> > On Tue, Apr 2, 2019 at 12:33 AM David Holmes wrote:
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483
> > webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/
> >
> > A bug in Thread.setDaemon (JDK-8221657) means that the daemon state of a
> > thread can change after the thread is !isAlive() at the Java level. If
> > this happens before the VM call to ThreadService::remove_thread then we
> > have a situation where we incremented the thread counters when the
> > thread was not a daemon, and we decrement the thread counters when the
> > thread is a daemon - and so the counters are out of sync and the
> > assertion fires.
> >
> > The simple fix is to capture the daemon state of the thread while it is
> > still alive and to pass that through to Threads::remove and thus
> > ThreadService::remove_thread.
> >
> > Testing:
> > - manual test with modified VM (to delay Threads::remove call) as per
> >   the bug report
> > - mach 5 tiers 1-3
> >
> > Thanks,
> > David
> >
>

From robbin.ehn at oracle.com Wed Apr 3 07:04:52 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 3 Apr 2019 09:04:52 +0200
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <1015f61b-71a3-3c81-09a2-159d16c0b24b@oracle.com> <4e262c9a-6a21-b213-27f9-d8e59c27ba84@oracle.com> <0f4ab494-57d1-d202-5ffa-f2416031c5ff@oracle.com>
Message-ID:

Hi Dan, Carsten,

>>> However, moving deflate_idle_monitors() from the safepoint cleanup phase
>>> to before the actual garbage collection can wait until we do the work to
>>> decouple triggering of monitor deflation to be independent of the
>>> safepoint cleanup phase.
>>>

If I got the context correct here:
All cleanups are done before the VM op is executed. It is the last thing we
do in SS::begin(), so anything added to
SS::do_cleanup_tasks/ParallelSPCleanupTask is done first in any safepoint.
/Robbin >> >> SGTM. >> >>> Speaking of optimizations, it sure would be nice if little changes to >>>> java threads could be combined and performed on the way out of the >>>> safepoint in one go instead of having lots of iterations of the thread list >>>> in various places. Some people have thousands of threads and each traversal >>>> of the thread list hurts. >>>> >>>> >>>> Do you have a specific example in mind? >>>> >>> >>> No concrete example for a public mailing list. :(. But do notice that >>> independent tasks that require traversals of the thread list are already >>> fused in ParallelSPCleanupThreadClosure >>> . >>> If you made deflate_thread_local_monitors >>> set jt->omShouldDeflateIdleMonitors to true, then you wouldn't need to >>> iterator over all java threads in do_safepoint_work. >>> >>> >>> I think I see what you mean... so when ParallelSPCleanupThreadClosure:: >>> do_thread() calls deflate_thread_local_monitors(): >>> >>> 2250 if (AsyncDeflateIdleMonitors) { >>> 2251 // Nothing to do when idle ObjectMonitors are deflated using a >>> 2252 // JavaThread unless a special cleanup has been requested. >>> >>> Replace L2251-2 with: >>> // Mark the JavaThread for idle monitor cleanup unless a >>> // special cleanup has been requested. >>> 2253 if (!is_cleanup_requested()) { >>> >>> Add these three lines: >>> if (thread->omInUseCount > 0) { >>> // This JavaThread is using monitors so mark it. >>> thread->omShouldDeflateIdleMonitors = true; >>> } >>> 2254 return; >>> 2255 } >>> >>> That will allow this block to go away: >>> >>> 1695 // Request deflation of per-thread idle monitors by each >>> JavaThread: >>> 1696 for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = >>> jtiwh.next(); ) { >>> 1697 if (jt->omInUseCount > 0) { >>> 1698 // This JavaThread is using monitors so check it. >>> 1699 jt->omShouldDeflateIdleMonitors = true; >>> 1700 } >>> 1701 } >>> >>> Please let me know if I understand what you meant... >>> >> >> This is exactly what I meant. 
>> >> >> Good. This will be in the next round of code review. >> > > Nice. > > Carsten > From david.holmes at oracle.com Wed Apr 3 07:14:01 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Apr 2019 17:14:01 +1000 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com> Message-ID: Hi Thomas, On 3/04/2019 4:37 pm, Thomas St?fe wrote: > Hi David, > > > On Tue, Apr 2, 2019 at 10:57 PM David Holmes > wrote: > > Hi Thomas, > > Thanks for taking a look at this. > > On 3/04/2019 5:41 am, Thomas St?fe wrote: > > Hi David, > > > > first thanks for the good analysis! > > > > Is this not a problem with the usage of setDaemon(): > > > > > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean) > > > > "This method must be invoked before the thread is started." > > Not the usage as such, but there is a problem with setDaemon - as per: > > https://bugs.openjdk.java.net/browse/JDK-8221657 > > The test that causes the crash in the VM deliberately tests a case > where > it expects to get the IllegalThreadStateException. > > > I think the real solution would be for setDaemon to distinguish > between > > not-yet-started, running and finished. It should not use > isAlive(). It > > should throw an exception if it has been started, regardless of > whether > > it finished already or not. > > Yes that fix is needed at the Java level. The use of isAlive() > pre-dates > the existence of Thread.State. > > But a change at the Java level may be some time coming given this is a > day one bug in the spec and implementation of Thread.setDaemon, so I > wanted to address this quickly in the VM as we are seeing these crashes > in testing. 
>
>
> I think a simple patch could be very simply using
>
> if (threadStatus != 0)
>
> instead of
>
> isAlive()
>
> in Thread.setDaemon?

Sure the fix is trivial (plus the method needs to be synchronized), but
that assumes that this spec inconsistency:

     * <p>
This method must be invoked before the thread is started. * * @throws IllegalThreadStateException * if this thread is {@linkplain #isAlive alive} is resolved in favour of the first statement. They may decide that after 25 years it's better to maintain the "not alive" semantics and permit you to modify a terminated thread. > We do this in other places in Thread.java too. > > -- > > Also I think it makes sense to scan for similar errors in the code base > (isAlive being used as "has-been-started") and fix those too. > > For example: > > ApplicationShutdownHook.java: > > static synchronized void add(Thread hook) { > ? ? if(hooks == null) > ? ? ? ? throw new IllegalStateException("Shutdown in progress"); > > ? ? if (hook.isAlive()) > ? ? ? ? throw new IllegalArgumentException("Hook already running"); > > ? ? if (hooks.containsKey(hook)) > ? ? ? ? throw new IllegalArgumentException("Hook previously registered"); > > ? ? hooks.put(hook, hook); > } > > would register a terminated thread as shutdown hook. I found similar > looking code in ThreadPoolExecutor. Yeah that's a nasty bug - you can register a shutdown hook that will result in other shutdown hooks not getting started! > I really think the jdk would be really the right place to fix this. And it may get fixed there eventually. Meanwhile I just want to stop these fairly new assertions from triggering. Thanks, David > > Thanks, > David > > > Not sure. Its late, I may not be thinking straight. > > > > Cheers, Thomas > > > > > > > > On Tue, Apr 2, 2019 at 12:33 AM David Holmes > > > >> wrote: > > > >? ? ?Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 > >? ? ?webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ > > > >? ? ?A bug in Thread.setDaemon (JDK-8221657) means that the daemon > state > >? ? ?of a > >? ? ?thread can change after the thread is !isAlive() at the Java > level. If > >? ? ?this happens before the VM call to > ThreadService::remove_thread then we > >? ? 
have a situation where we incremented the thread counters when the
> >     thread was not a daemon, and we decrement the thread counters when the
> >     thread is a daemon - and so the counters are out of sync and the
> >     assertion fires.
> >
> >     The simple fix is to capture the daemon state of the thread while it is
> >     still alive and to pass that through to Threads::remove and thus
> >     ThreadService::remove_thread.
> >
> >     Testing:
> >     - manual test with modified VM (to delay Threads::remove call) as per
> >       the bug report
> >     - mach 5 tiers 1-3
> >
> >     Thanks,
> >     David
> >
>

From david.holmes at oracle.com Wed Apr 3 07:31:18 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 3 Apr 2019 17:31:18 +1000
Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5"
In-Reply-To:
References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com>
Message-ID: <4bfe6c5e-2ea4-2d8a-bea2-db92ae7d5d0a@oracle.com>

PS. I filed:

https://bugs.openjdk.java.net/browse/JDK-8221893
https://bugs.openjdk.java.net/browse/JDK-8221892

for the two bugs you reported.

Thanks,
David

On 3/04/2019 5:14 pm, David Holmes wrote:
> Hi Thomas,
>
> On 3/04/2019 4:37 pm, Thomas Stüfe wrote:
>> Hi David,
>>
>>
>> On Tue, Apr 2, 2019 at 10:57 PM David Holmes wrote:
>>
>>     Hi Thomas,
>>
>>     Thanks for taking a look at this.
>>
>>     On 3/04/2019 5:41 am, Thomas Stüfe wrote:
>>      > Hi David,
>>      >
>>      > first thanks for the good analysis!
>>      >
>>      > Is this not a problem with the usage of setDaemon():
>>      >
>>      > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean)
>>      >
>>      > "This method must be invoked before the thread is started."
>>
>>     Not the usage as such, but there is a problem with setDaemon - as per:
>>
>>
https://bugs.openjdk.java.net/browse/JDK-8221657 >> >> ??? The test that causes the crash in the VM deliberately tests a case >> ??? where >> ??? it expects to get the IllegalThreadStateException. >> >> ???? > I think the real solution would be for setDaemon to distinguish >> ??? between >> ???? > not-yet-started, running and finished. It should not use >> ??? isAlive(). It >> ???? > should throw an exception if it has been started, regardless of >> ??? whether >> ???? > it finished already or not. >> >> ??? Yes that fix is needed at the Java level. The use of isAlive() >> ??? pre-dates >> ??? the existence of Thread.State. >> >> ??? But a change at the Java level may be some time coming given this >> is a >> ??? day one bug in the spec and implementation of Thread.setDaemon, so I >> ??? wanted to address this quickly in the VM as we are seeing these >> crashes >> ??? in testing. >> >> >> I think a simple patch could be very simply using >> >> if (threadStatus != 0) >> >> instead of >> >> isAlive() >> >> in Thread.setDaemon? > > Sure the fix is trivial (plus the method needs to be synchronized), but > that assumes that this spec inconsistency: > > ???? *
<p>
This method must be invoked before the thread is started. > ???? * > ???? * @throws? IllegalThreadStateException > ???? *????????? if this thread is {@linkplain #isAlive alive} > > is resolved in favour of the first statement. They may decide that after > 25 years it's better to maintain the "not alive" semantics and permit > you to modify a terminated thread. > >> We do this in other places in Thread.java too. >> >> -- >> >> Also I think it makes sense to scan for similar errors in the code >> base (isAlive being used as "has-been-started") and fix those too. >> >> For example: >> >> ApplicationShutdownHook.java: >> >> static synchronized void add(Thread hook) { >> ?? ? if(hooks == null) >> ?? ? ? ? throw new IllegalStateException("Shutdown in progress"); >> >> ?? ? if (hook.isAlive()) >> ?? ? ? ? throw new IllegalArgumentException("Hook already running"); >> >> ?? ? if (hooks.containsKey(hook)) >> ?? ? ? ? throw new IllegalArgumentException("Hook previously >> registered"); >> >> ?? ? hooks.put(hook, hook); >> } >> >> would register a terminated thread as shutdown hook. I found similar >> looking code in ThreadPoolExecutor. > > Yeah that's a nasty bug - you can register a shutdown hook that will > result in other shutdown hooks not getting started! > >> I really think the jdk would be really the right place to fix this. > > And it may get fixed there eventually. Meanwhile I just want to stop > these fairly new assertions from triggering. > > Thanks, > David > >> >> ??? Thanks, >> ??? David >> >> ???? > Not sure. Its late, I may not be thinking straight. >> ???? > >> ???? > Cheers, Thomas >> ???? > >> ???? > >> ???? > >> ???? > On Tue, Apr 2, 2019 at 12:33 AM David Holmes >> ??? >> ???? > > ??? >> wrote: >> ???? > >> ???? >? ? ?Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 >> ???? >? ? ?webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ >> ???? > >> ???? >? ? ?A bug in Thread.setDaemon (JDK-8221657) means that the daemon >> ??? state >> ???? >? ? 
?of a >> ???? >? ? ?thread can change after the thread is !isAlive() at the Java >> ??? level. If >> ???? >? ? ?this happens before the VM call to >> ??? ThreadService::remove_thread then we >> ???? >? ? ?have a situation where we incremented the thread counters >> ??? when the >> ???? >? ? ?thread was not a daemon, and we decrement the thread counters >> ??? when the >> ???? >? ? ?thread is a daemon - and so the counters are out of sync >> and the >> ???? >? ? ?assertion fires. >> ???? > >> ???? >? ? ?The simple fix is to capture the daemon state of the thread >> ??? while it is >> ???? >? ? ?still alive and to pass that through to Threads::remove and >> thus >> ???? >? ? ?ThreadService::remove_thread. >> ???? > >> ???? >? ? ?Testing: >> ???? >? ? ? ? ?- manual test with modified VM (to delay Threads::remove >> ??? call) >> ???? >? ? ?as per >> ???? >? ? ?the bug report >> ???? >? ? ? ? ?- mach 5 tiers 1-3 >> ???? > >> ???? >? ? ?Thanks, >> ???? >? ? ?David >> ???? > >> From thomas.stuefe at gmail.com Wed Apr 3 07:37:34 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 3 Apr 2019 09:37:34 +0200 Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5" In-Reply-To: References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com> Message-ID: Hi David, On Wed, Apr 3, 2019 at 9:20 AM David Holmes wrote: > Hi Thomas, > > On 3/04/2019 4:37 pm, Thomas St?fe wrote: > > Hi David, > > > > > > On Tue, Apr 2, 2019 at 10:57 PM David Holmes > > wrote: > > > > Hi Thomas, > > > > Thanks for taking a look at this. > > > > On 3/04/2019 5:41 am, Thomas St?fe wrote: > > > Hi David, > > > > > > first thanks for the good analysis! 
> > >
> > > Is this not a problem with the usage of setDaemon():
> > >
> > > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean)
> > >
> > > "This method must be invoked before the thread is started."
> >
> > Not the usage as such, but there is a problem with setDaemon - as per:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8221657
> >
> > The test that causes the crash in the VM deliberately tests a case where
> > it expects to get the IllegalThreadStateException.
> >
> > > I think the real solution would be for setDaemon to distinguish between
> > > not-yet-started, running and finished. It should not use isAlive(). It
> > > should throw an exception if it has been started, regardless of whether
> > > it finished already or not.
> >
> > Yes that fix is needed at the Java level. The use of isAlive() pre-dates
> > the existence of Thread.State.
> >
> > But a change at the Java level may be some time coming given this is a
> > day one bug in the spec and implementation of Thread.setDaemon, so I
> > wanted to address this quickly in the VM as we are seeing these crashes
> > in testing.
> >
> >
> > I think a simple patch could be very simply using
> >
> > if (threadStatus != 0)
> >
> > instead of
> >
> > isAlive()
> >
> > in Thread.setDaemon?
>
> Sure the fix is trivial (plus the method needs to be synchronized), but
> that assumes that this spec inconsistency:
>
>      * This method must be invoked before the thread is started.
>      *
>      * @throws IllegalThreadStateException
>      *         if this thread is {@linkplain #isAlive alive}
>
> is resolved in favour of the first statement. They may decide that after
> 25 years it's better to maintain the "not alive" semantics and permit
> you to modify a terminated thread.
>

Okay, I get it now. You are worried about backward compatibility. Someone
calling setDaemon() in this way would now get an exception where beforehand
he would not.

But how about this then:

public final void setDaemon(boolean on) {
    checkAccess();
    if (isAlive()) {
        throw new IllegalThreadStateException();
    } else if (threadStatus != 0) {
        // Not alive but not NEW - terminated?
        // do not change daemon state. Do not throw to not break backward compatibility.
    } else {
        daemon = on;
    }
}

Of course that would be observable from the outside (Thread::isDaemon()).

At the expense of some complexity (e.g. two variables, one "real", one
outward facing as source for isDaemon), this could be fixed.

--

But I do not want to stop your change. I think it is fine, I cannot see
anything wrong with it.

For a moment I wondered whether we are exposed to a similar thing here:

thread.cpp:1996 ThreadService::current_thread_exiting(this, is_daemon(threadObj()));

But at this point isAlive() would still return true, yes? Since it seems it
only gets reset in ensure_join().

--

Cheers, Thomas

> We do this in other places in Thread.java too.
> >
> > --
> >
> > Also I think it makes sense to scan for similar errors in the code base
> > (isAlive being used as "has-been-started") and fix those too.
> > > > For example: > > > > ApplicationShutdownHook.java: > > > > static synchronized void add(Thread hook) { > > if(hooks == null) > > throw new IllegalStateException("Shutdown in progress"); > > > > if (hook.isAlive()) > > throw new IllegalArgumentException("Hook already running"); > > > > if (hooks.containsKey(hook)) > > throw new IllegalArgumentException("Hook previously > registered"); > > > > hooks.put(hook, hook); > > } > > > > would register a terminated thread as shutdown hook. I found similar > > looking code in ThreadPoolExecutor. > > Yeah that's a nasty bug - you can register a shutdown hook that will > result in other shutdown hooks not getting started! > > > I really think the jdk would be really the right place to fix this. > > And it may get fixed there eventually. Meanwhile I just want to stop > these fairly new assertions from triggering. > > Thanks, > David > > > > > Thanks, > > David > > > > > Not sure. Its late, I may not be thinking straight. > > > > > > Cheers, Thomas > > > > > > > > > > > > On Tue, Apr 2, 2019 at 12:33 AM David Holmes > > > > > > >> wrote: > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8218483 > > > webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/ > > > > > > A bug in Thread.setDaemon (JDK-8221657) means that the daemon > > state > > > of a > > > thread can change after the thread is !isAlive() at the Java > > level. If > > > this happens before the VM call to > > ThreadService::remove_thread then we > > > have a situation where we incremented the thread counters > > when the > > > thread was not a daemon, and we decrement the thread counters > > when the > > > thread is a daemon - and so the counters are out of sync and > the > > > assertion fires. > > > > > > The simple fix is to capture the daemon state of the thread > > while it is > > > still alive and to pass that through to Threads::remove and > thus > > > ThreadService::remove_thread. 
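The counter mismatch described in the quoted RFR text can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `DaemonCounterSketch` and its `add`, `remove` and `simulate` methods are hypothetical stand-ins, not the real ThreadService code, and the late setDaemon call is modelled by simply flipping a boolean.

```java
// Hypothetical model of the live/daemon thread counters; not HotSpot code.
final class DaemonCounterSketch {
    private int liveCount;
    private int daemonCount;

    // Called when a thread is registered.
    void add(boolean isDaemon) {
        liveCount++;
        if (isDaemon) daemonCount++;
    }

    // Called when a thread is removed; the caller decides which daemon flag to pass.
    void remove(boolean isDaemon) {
        liveCount--;
        if (isDaemon) daemonCount--;
    }

    // Returns {liveCount, daemonCount} after one add/remove cycle.
    static int[] simulate(boolean useCapturedState) {
        DaemonCounterSketch counters = new DaemonCounterSketch();
        boolean daemonFlag = false;      // thread starts out as non-daemon
        counters.add(daemonFlag);
        boolean captured = daemonFlag;   // state captured while the thread is still alive
        daemonFlag = true;               // models a late setDaemon(true) after !isAlive()
        counters.remove(useCapturedState ? captured : daemonFlag);
        return new int[] { counters.liveCount, counters.daemonCount };
    }

    public static void main(String[] args) {
        int[] reRead = simulate(false);
        int[] captured = simulate(true);
        System.out.println("flag re-read at removal: live=" + reRead[0] + " daemon=" + reRead[1]);
        System.out.println("flag captured while alive: live=" + captured[0] + " daemon=" + captured[1]);
    }
}
```

Re-reading the flag at removal time leaves the daemon counter unbalanced, while passing the captured value keeps both counters in step, which is the idea behind handing the daemon state through to the removal call.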
> > > > > > Testing: > > > - manual test with modified VM (to delay Threads::remove > > call) > > > as per > > > the bug report > > > - mach 5 tiers 1-3 > > > > > > Thanks, > > > David > > > > > > From thomas.stuefe at gmail.com Wed Apr 3 08:57:29 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 3 Apr 2019 10:57:29 +0200 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: Hi all, new version: Delta: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html Full: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ Changes: - As Coleen wished, I completely removed the non-static variant of MetaspaceObj::is_metaspace_obj() and fixed the callers. - I also renamed the static variant of MetaspaceObj::is_metaspace_obj() to MetaspaceObj::is_valid() to be in line with similar calls, e.g. Symbol::is_valid(). @Coleen: This envelope should only weed out obvious non-null bogus values and hopefully stack and C-heap addresses; my hope is that nodes come and go but that the total envelope size will be always minuscule compare to the 64bit address range and outside C-heap and stacks. Usually mmap regions are clustered, as are C-Heap allocations and stacks. But if that turns out to be inefficient after a while, we may recalculate the envelope; just have to make sure no concurrent lock-less walks happen. Thanks, Thomas On Tue, Apr 2, 2019 at 2:21 PM wrote: > > > On 4/2/19 1:47 AM, Thomas St?fe wrote: > > Hi Coleen, Andrew, > > thank you for reviewing my little change. 
Unfortunately, I had an error in > the space list verification method which needed fixing, so here is a second > version: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ > > > > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method > > > I think you have to change the callers to not pass this as null. So you > can't do metaspaceobj->is_metaspace_object() because you're calling with > "this" potentially NULL. > > So remove this function: > > bool MetaspaceObj::is_metaspace_object() const {- return Metaspace::contains((void*)this);+ return MetaspaceObj::is_metaspace_object(this); > } > > > > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I changed it. > - The real bug was in VirtualSpaceList::verify() where I checked that the > extension of the envelope is as large as the current nodes. But that is > wrong, since the envelope never is shrunk (by design) and nodes at the > border of the envelope may have been unmapped. So the real test should be > to test if no node is outside the envelope. > > > So this envelope is an interesting concept and name. It seems okay. I > guess over time, it won't give you a very good answer. Maybe you'll have > to fix the boundaries someday. > > Looks good though. Thank you for making this improvement for performance. 
> > Coleen > > > Thanks, Thomas > > > On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: > >> Hi all, >> >> May I please have reviews for this small optimization: >> >> cr: >> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html >> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 >> >> There are several functions which, given an unknown pointer assumed to be >> a metaspace object, check if the pointer is indeed a metaspace object by >> walking the VirtualSpaceList and checking ranges. >> >> This patch adds checks which weed out the obvious cases to avoid >> needlessly walking the vs list. >> >> Patch also adds verifications for the VirtualSpaceList in debug cases. >> Those run only when a new node has been added to the list, or when a node >> has been purged, so very sparingly. >> >> When purging nodes, I removed a small unnecessary and inefficient check >> which checked whether (one of the) purged nodes was still in the list. >> Since we now as part of the new VirtualSpaceNode::verify() walk this list, >> the check is unnecessary. >> >> Thanks, Thomas >> >> >> > From martin.doerr at sap.com Wed Apr 3 09:22:24 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 3 Apr 2019 09:22:24 +0000 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: Hi Thomas, thank you for doing all the work. I appreciate the improvement. Only question I have is do we want the now redundant NULL checks before calling MetaspaceObj::is_valid? (I don?t need to see another webrev if you just want to get rid of them.) Best regards, Martin From: Thomas St?fe Sent: Mittwoch, 3. 
April 2019 10:57 To: Coleen Phillmore ; Andrew Dinn ; Doerr, Martin Cc: Hotspot dev runtime Subject: Re: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends Hi all, new version: Delta: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html Full: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ Changes: - As Coleen wished, I completely removed the non-static variant of MetaspaceObj::is_metaspace_obj() and fixed the callers. - I also renamed the static variant of MetaspaceObj::is_metaspace_obj() to MetaspaceObj::is_valid() to be in line with similar calls, e.g. Symbol::is_valid(). @Coleen: This envelope should only weed out obvious non-null bogus values and hopefully stack and C-heap addresses; my hope is that nodes come and go but that the total envelope size will be always minuscule compare to the 64bit address range and outside C-heap and stacks. Usually mmap regions are clustered, as are C-Heap allocations and stacks. But if that turns out to be inefficient after a while, we may recalculate the envelope; just have to make sure no concurrent lock-less walks happen. Thanks, Thomas On Tue, Apr 2, 2019 at 2:21 PM > wrote: On 4/2/19 1:47 AM, Thomas St?fe wrote: Hi Coleen, Andrew, thank you for reviewing my little change. Unfortunately, I had an error in the space list verification method which needed fixing, so here is a second version: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ Differences: - As Coleen requested: in allocation.cpp I replaced the comparison this==NULL with a static helper method I think you have to change the callers to not pass this as null. So you can't do metaspaceobj->is_metaspace_object() because you're calling with "this" potentially NULL. 
So remove this function: bool MetaspaceObj::is_metaspace_object() const { - return Metaspace::contains((void*)this); + return MetaspaceObj::is_metaspace_object(this); } - I had mistype "envelope" as "envolope" in "expand_envelope_to_include_node()". Since that sounded funny I changed it. - The real bug was in VirtualSpaceList::verify() where I checked that the extension of the envelope is as large as the current nodes. But that is wrong, since the envelope never is shrunk (by design) and nodes at the border of the envelope may have been unmapped. So the real test should be to test if no node is outside the envelope. So this envelope is an interesting concept and name. It seems okay. I guess over time, it won't give you a very good answer. Maybe you'll have to fix the boundaries someday. Looks good though. Thank you for making this improvement for performance. Coleen Thanks, Thomas On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: Hi all, May I please have reviews for this small optimization: cr: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 There are several functions which, given an unknown pointer assumed to be a metaspace object, check if the pointer is indeed a metaspace object by walking the VirtualSpaceList and checking ranges. This patch adds checks which weed out the obvious cases to avoid needlessly walking the vs list. Patch also adds verifications for the VirtualSpaceList in debug cases. Those run only when a new node has been added to the list, or when a node has been purged, so very sparingly. When purging nodes, I removed a small unnecessary and inefficient check which checked whether (one of the) purged nodes was still in the list. Since we now as part of the new VirtualSpaceNode::verify() walk this list, the check is unnecessary. 
Thanks, Thomas From thomas.stuefe at gmail.com Wed Apr 3 09:42:52 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 3 Apr 2019 11:42:52 +0200 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: On Wed, Apr 3, 2019 at 11:22 AM Doerr, Martin wrote: > Hi Thomas, > > > > thank you for doing all the work. I appreciate the improvement. > > > > Only question I have is do we want the now redundant NULL checks before > calling MetaspaceObj::is_valid? > > (I don?t need to see another webrev if you just want to get rid of them.) > > > Yeah, I left them in out of a vague notion of "documentation". Can remove them if you care. Thanks, Thomas > Best regards, > > Martin > > > > > > > > *From:* Thomas St?fe > *Sent:* Mittwoch, 3. April 2019 10:57 > *To:* Coleen Phillmore ; Andrew Dinn < > adinn at redhat.com>; Doerr, Martin > *Cc:* Hotspot dev runtime > *Subject:* Re: RFR(s): 8221539: [metaspace] Improve > MetaspaceObj::is_metaspace_obj() and friends > > > > Hi all, > > > > new version: > > > > Delta: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html > > > > Full: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ > > > > Changes: > > > > - As Coleen wished, I completely removed the non-static variant of > MetaspaceObj::is_metaspace_obj() and fixed the callers. > > - I also renamed the static variant of MetaspaceObj::is_metaspace_obj() to > MetaspaceObj::is_valid() to be in line with similar calls, e.g. > Symbol::is_valid(). 
> > > > @Coleen: This envelope should only weed out obvious non-null bogus values > and hopefully stack and C-heap addresses; my hope is that nodes come and go > but that the total envelope size will be always minuscule compare to the > 64bit address range and outside C-heap and stacks. Usually mmap regions are > clustered, as are C-Heap allocations and stacks. > > > > But if that turns out to be inefficient after a while, we may recalculate > the envelope; just have to make sure no concurrent lock-less walks happen. > > > > Thanks, Thomas > > > > > > > > On Tue, Apr 2, 2019 at 2:21 PM wrote: > > > > On 4/2/19 1:47 AM, Thomas St?fe wrote: > > Hi Coleen, Andrew, > > > > thank you for reviewing my little change. Unfortunately, I had an error in > the space list verification method which needed fixing, so here is a second > version: > > > > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ > > > > > Differences: > - As Coleen requested: in allocation.cpp I replaced the comparison > this==NULL with a static helper method > > > I think you have to change the callers to not pass this as null. So you > can't do metaspaceobj->is_metaspace_object() because you're calling with > "this" potentially NULL. > > So remove this function: > > > bool MetaspaceObj::is_metaspace_object() const { > > - return Metaspace::contains((void*)this); > > + return MetaspaceObj::is_metaspace_object(this); > > } > > > > > > - I had mistype "envelope" as "envolope" in > "expand_envelope_to_include_node()". Since that sounded funny I changed it. > - The real bug was in VirtualSpaceList::verify() where I checked that the > extension of the envelope is as large as the current nodes. But that is > wrong, since the envelope never is shrunk (by design) and nodes at the > border of the envelope may have been unmapped. So the real test should be > to test if no node is outside the envelope. 
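The envelope idea discussed above (a cheap range pre-check in front of the list walk, an envelope that only ever grows, and a verification that no node lies outside it) might look roughly like this. It is a sketch under assumptions, written in Java rather than HotSpot C++: `EnvelopeSketch`, `Node`, `addNode`, `purgeNode`, `contains` and `verify` are hypothetical names, and addresses are modelled as plain `long` values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a node list guarded by an address envelope; not HotSpot code.
final class EnvelopeSketch {
    // A node covers the half-open address range [base, end).
    static final class Node {
        final long base;
        final long end;
        Node(long base, long end) { this.base = base; this.end = end; }
    }

    private final List<Node> nodes = new ArrayList<>();
    private long envelopeLo = Long.MAX_VALUE;
    private long envelopeHi = Long.MIN_VALUE;

    // Adding a node expands the envelope so that it includes the node.
    void addNode(Node n) {
        nodes.add(n);
        envelopeLo = Math.min(envelopeLo, n.base);
        envelopeHi = Math.max(envelopeHi, n.end);
    }

    // Purging a node deliberately does not shrink the envelope.
    void purgeNode(Node n) {
        nodes.remove(n);
    }

    boolean contains(long addr) {
        // Cheap pre-check: anything outside the envelope cannot be in any node.
        if (addr < envelopeLo || addr >= envelopeHi) return false;
        for (Node n : nodes) {
            if (addr >= n.base && addr < n.end) return true;
        }
        return false;
    }

    // The envelope may be larger than the live nodes, but no node may lie outside it.
    boolean verify() {
        for (Node n : nodes) {
            if (n.base < envelopeLo || n.end > envelopeHi) return false;
        }
        return true;
    }
}
```

Because the envelope is never shrunk, `verify()` checks containment of every node rather than equality with the outermost nodes, which mirrors the corrected check described in the mail.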
> > > So this envelope is an interesting concept and name. It seems okay. I > guess over time, it won't give you a very good answer. Maybe you'll have > to fix the boundaries someday. > > Looks good though. Thank you for making this improvement for performance. > > Coleen > > > > Thanks, Thomas > > > > > > On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: > > Hi all, > > > > May I please have reviews for this small optimization: > > > > cr: > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html > > Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 > > > > There are several functions which, given an unknown pointer assumed to be > a metaspace object, check if the pointer is indeed a metaspace object by > walking the VirtualSpaceList and checking ranges. > > This patch adds checks which weed out the obvious cases to avoid > needlessly walking the vs list. > > > > Patch also adds verifications for the VirtualSpaceList in debug cases. > Those run only when a new node has been added to the list, or when a node > has been purged, so very sparingly. > > > > When purging nodes, I removed a small unnecessary and inefficient check > which checked whether (one of the) purged nodes was still in the list. > Since we now as part of the new VirtualSpaceNode::verify() walk this list, > the check is unnecessary. > > > > Thanks, Thomas > > > > > > > > From goetz.lindenmaier at sap.com Wed Apr 3 10:23:22 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 3 Apr 2019 10:23:22 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: <59f344d6-d1b7-58ce-b127-26dca4033cb4@oracle.com> References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <43d389d9-289f-e6b9-2380-c63ca634ccb4@oracle.com> <59f344d6-d1b7-58ce-b127-26dca4033cb4@oracle.com> Message-ID: Hi Ioi, I have thought about your proposal. 
Compiling the problematic class with the InMemoryJavaCompiler would work.
But then the other test classes won't compile, because the problematic
class referenced by them is not available when they are compiled.
If I use the InMemoryJavaCompiler for all test classes, the trick with
injecting a broken class by jasm will fail, though.

I think I'll remove this test case. What do you think?

Best regards,
  Goetz.

> -----Original Message-----
> From: hotspot-runtime-dev On Behalf Of Ioi Lam
> Sent: Tuesday, 2 April 2019 06:52
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in
> java-like Syntax.
>
> Hi Goetz,
>
> I think you can use this class to avoid writing classfiles with
> non-ascii names.
>
> http://hg.openjdk.java.net/jdk/jdk/file/2221f042556d/test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler.java
>
> You'd need to write a class loader to load the byte array returned by
> InMemoryJavaCompiler.compile().
>
> Thanks
> - Ioi
>
>
> On 4/1/19 6:41 PM, David Holmes wrote:
> > Hi Goetz,
> >
> > Your new test fails to compile on some systems:
> >
> > error: error while writing Strange\u20ac\u00a3Named: bad filename
> > RelativeFile[test/Strange\u20ac\u00a3Named.class]
> > class Strange\u20ac\u00a3Named {
> > ^
> > 1 error
> > result: Failed. Compilation failed: Compilation failed
> >
> > This was linux-x64 - only seems to occur on Oracle Linux Server 7.1.
> >
> > We also have some closed tests that also need updating so I'll need to
> > coordinate with you on the push.
> >
> > Thanks,
> > David
> > -----
> >
> > On 2/04/2019 10:33 am, David Holmes wrote:
> >> Hi Goetz,
> >>
> >> Overall this looks good to me - a few minor nits/comments below.
> >>
> >> I've applied the patch and am running it through our internal build
> >> and test system (tiers 1-3 initially).
> >>
> >> I have a suspicion there will be other tests that need to be updated
> >> - possibly even JCK tests. Discovering those a-priori will be
> >> difficult (simply running all the tests would take an extremely long
> >> time). Will have a discussion about how best to handle those internally.
> >>
> >> ---
> >>
> >> src/hotspot/share/oops/method.cpp
> >>
> >> Please put a blank line after each new method.
> >>
> >> ---
> >>
> >> src/hotspot/share/oops/symbol.cpp
> >>
> >> +       os->print(".");
> >> +     } else {
> >> +       os->print("%c", start[i]);
> >>
> >> Please use os->put(char c) for individual characters.
> >>
> >> --
> >>
> >> The "start" name would seem better as "buf" to me.
> >>
> >> --
> >>
> >> +     } else if (start[i] == 'L') {
> >> +       print_class(os, start+i+1, len-i-2);
> >>
> >> Can you insert a comment that help explains the -2:
> >>
> >>       } else if (start[i] == 'L') {
> >> +      // Expected format: L;
> >>         print_class(os, start+i+1, len-i-2);
> >>
> >> --
> >>
> >> +   for(SignatureStream ss(this); !ss.is_done(); ss.next()) {
> >>
> >> space after for (2 occurrences)
> >>
> >> ---
> >>
> >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMethods.java
> >>
> >> Not sure the special characters can be used directly in the sources.
> >> Can they not be put in as unicode escapes at all places?
> >>
> >> ---
> >>
> >> Thanks,
> >> David
> >> -------
> >>
> >>
> >> On 1/04/2019 12:32 pm, David Holmes wrote:
> >>> Hi Goetz,
> >>>
> >>> I'm looking at this ...
> >>>
> >>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote:
> >>>> Hi,
> >>>>
> >>>> Any interest in this change?
> >>>
> >>> I'm personally of two minds here because these VM generated
> >>> exceptions are not only delivered to Java source code. I'd like to
> >>> know how other language developers using the JVM runtime would view
> >>> this.
> >>>
> >>> That aside if you're going to make a change like this then I think
> >>> the full signature string has to be quoted in some way to delineate
> >>> it within the larger message.
> >>> > >>>> Should I split it to adapt the exceptions separately one-by-one to > >>>> make the change smaller and simplify the review? > >>> > >>> I don't think that is necessary. > >>> > >>> Thanks, > >>> David > >>> ----- > >>> > >>>> I would propose to start out with AbstractMethodError only. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> From: Lindenmaier, Goetz > >>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>> To: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>> java-like Syntax. > >>>> > >>>> Hi, > >>>> > >>>> A row of exceptions are thrown from the hotspot runtime. > >>>> They print methods with their JNI signatures. To increase > >>>> readability and resemblance to source code, this change proposes > >>>> to print them in a Java-like syntax. > >>>> > >>>> Some examples: > >>>> current method printouts: > >>>> > >>>> test.TeMe3_B.ma()V > >>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>> > >>>> improved format: > >>>> > >>>> void test.TeMe3_B.ma() > >>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>> > >>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>> these messages. > >>>> > >>>> This change implements function Method::external_name() that prints > >>>> the better > >>>> format. > >>>> external_name() is chosen according to Klass::external_name(). > >>>> > >>>> Printing the better format requires parsing the signature > >>>> Symbol. This is implemented in > >>>> void Symbol::print_as_signature_external_return_type(outputStream > >>>> *os); > >>>> void Symbol::print_as_signature_external_parameters(outputStream > *os); > >>>> These method names are chosen according to > >>>> Symbol::as_class_external_name(). 
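To make the mapping from signature strings to the Java-like form in the quoted examples concrete, here is a small parser sketch. It is written in Java purely for illustration and is not the HotSpot implementation: `SignatureSketch`, `appendType` and `externalName` are hypothetical names, and the input is assumed to be a well-formed method descriptor such as `(IZ[[BF)[[D`.

```java
// Hypothetical descriptor-to-Java-syntax converter; not the Symbol/Method code from the webrev.
final class SignatureSketch {
    // Appends the Java-like name of the type starting at index i and
    // returns the index of the first character after that type.
    private static int appendType(String d, int i, StringBuilder out) {
        int dims = 0;
        while (d.charAt(i) == '[') {   // count array dimensions
            dims++;
            i++;
        }
        char c = d.charAt(i++);
        switch (c) {
            case 'B': out.append("byte"); break;
            case 'C': out.append("char"); break;
            case 'D': out.append("double"); break;
            case 'F': out.append("float"); break;
            case 'I': out.append("int"); break;
            case 'J': out.append("long"); break;
            case 'S': out.append("short"); break;
            case 'Z': out.append("boolean"); break;
            case 'V': out.append("void"); break;
            case 'L': {
                // Class type: 'L', internal name with '/', then ';'.
                int semi = d.indexOf(';', i);
                if (semi < 0) throw new IllegalArgumentException("bad descriptor: " + d);
                out.append(d.substring(i, semi).replace('/', '.'));
                i = semi + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("bad descriptor: " + d);
        }
        for (int k = 0; k < dims; k++) out.append("[]");
        return i;
    }

    // Builds "<return type> <holder>.<name>(<parameter types>)".
    static String externalName(String holder, String name, String descriptor) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        StringBuilder params = new StringBuilder();
        int i = 1;                                    // skip '('
        while (descriptor.charAt(i) != ')') {
            if (params.length() > 0) params.append(", ");
            i = appendType(descriptor, i, params);
        }
        StringBuilder ret = new StringBuilder();
        appendType(descriptor, i + 1, ret);           // return type follows ')'
        return ret + " " + holder + "." + name + "(" + params + ")";
    }

    public static void main(String[] args) {
        System.out.println(externalName("test.TeMe3_B", "ma", "()V"));
        System.out.println(externalName("test.TeMe3_B", "ma", "(IZ[[BF)[[D"));
        System.out.println(externalName("test.TeMe3_B", "ma", "([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;"));
    }
}
```

The three calls in `main` use the descriptors quoted in the RFR and are intended to reproduce the "improved format" lines shown there.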
> >>>> > >>>> See this partial webrev for the new functions: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > new_methods/ > >>>> > >>>> > >>>> Also, I changed a lot of exception messages to use the new format. > >>>> This required to adapt a row of tests. I added a test to check > >>>> the signature printing does not regress.? For all these changes, see > >>>> the full webrev: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>>> > >>>> I hope I detected all places where method signatures are printed to > >>>> exception messages. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> From martin.doerr at sap.com Wed Apr 3 10:34:40 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 3 Apr 2019 10:34:40 +0000 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: Hi Thomas, > Yeah, I left them in out of a vague notion of "documentation". Can remove them if you care. I?d prefer shorter code. Best regards, Martin From: Thomas St?fe Sent: Mittwoch, 3. April 2019 11:43 To: Doerr, Martin Cc: Coleen Phillmore ; Andrew Dinn ; Hotspot dev runtime Subject: Re: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends On Wed, Apr 3, 2019 at 11:22 AM Doerr, Martin > wrote: Hi Thomas, thank you for doing all the work. I appreciate the improvement. Only question I have is do we want the now redundant NULL checks before calling MetaspaceObj::is_valid? (I don?t need to see another webrev if you just want to get rid of them.) Yeah, I left them in out of a vague notion of "documentation". Can remove them if you care. Thanks, Thomas Best regards, Martin From: Thomas St?fe > Sent: Mittwoch, 3. 
April 2019 10:57 To: Coleen Phillmore >; Andrew Dinn >; Doerr, Martin > Cc: Hotspot dev runtime > Subject: Re: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends Hi all, new version: Delta: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html Full: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ Changes: - As Coleen wished, I completely removed the non-static variant of MetaspaceObj::is_metaspace_obj() and fixed the callers. - I also renamed the static variant of MetaspaceObj::is_metaspace_obj() to MetaspaceObj::is_valid() to be in line with similar calls, e.g. Symbol::is_valid(). @Coleen: This envelope should only weed out obvious non-null bogus values and hopefully stack and C-heap addresses; my hope is that nodes come and go but that the total envelope size will be always minuscule compare to the 64bit address range and outside C-heap and stacks. Usually mmap regions are clustered, as are C-Heap allocations and stacks. But if that turns out to be inefficient after a while, we may recalculate the envelope; just have to make sure no concurrent lock-less walks happen. Thanks, Thomas On Tue, Apr 2, 2019 at 2:21 PM > wrote: On 4/2/19 1:47 AM, Thomas St?fe wrote: Hi Coleen, Andrew, thank you for reviewing my little change. Unfortunately, I had an error in the space list verification method which needed fixing, so here is a second version: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ Differences: - As Coleen requested: in allocation.cpp I replaced the comparison this==NULL with a static helper method I think you have to change the callers to not pass this as null. 
So you can't do metaspaceobj->is_metaspace_object() because you're calling with "this" potentially NULL. So remove this function: bool MetaspaceObj::is_metaspace_object() const { - return Metaspace::contains((void*)this); + return MetaspaceObj::is_metaspace_object(this); } - I had mistype "envelope" as "envolope" in "expand_envelope_to_include_node()". Since that sounded funny I changed it. - The real bug was in VirtualSpaceList::verify() where I checked that the extension of the envelope is as large as the current nodes. But that is wrong, since the envelope never is shrunk (by design) and nodes at the border of the envelope may have been unmapped. So the real test should be to test if no node is outside the envelope. So this envelope is an interesting concept and name. It seems okay. I guess over time, it won't give you a very good answer. Maybe you'll have to fix the boundaries someday. Looks good though. Thank you for making this improvement for performance. Coleen Thanks, Thomas On Wed, Mar 27, 2019 at 10:01 PM Thomas St?fe > wrote: Hi all, May I please have reviews for this small optimization: cr: http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 There are several functions which, given an unknown pointer assumed to be a metaspace object, check if the pointer is indeed a metaspace object by walking the VirtualSpaceList and checking ranges. This patch adds checks which weed out the obvious cases to avoid needlessly walking the vs list. Patch also adds verifications for the VirtualSpaceList in debug cases. Those run only when a new node has been added to the list, or when a node has been purged, so very sparingly. When purging nodes, I removed a small unnecessary and inefficient check which checked whether (one of the) purged nodes was still in the list. 
Since we now as part of the new VirtualSpaceNode::verify() walk this list, the check is unnecessary.

Thanks, Thomas

From david.holmes at oracle.com Wed Apr 3 10:38:41 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 3 Apr 2019 20:38:41 +1000
Subject: RFR (S): 8218483: Crash in "assert(_daemon_threads_count->get_value() > daemon_count) failed: thread count mismatch 5 : 5"
In-Reply-To:
References: <2b339bc4-6701-5d23-ab45-997c2342f8c6@oracle.com> <6dfb811e-df61-4b12-041d-5721a2282ba6@oracle.com>
Message-ID: <8a682a43-45de-619a-bc50-2d9aeecb6ae3@oracle.com>

On 3/04/2019 5:37 pm, Thomas Stüfe wrote:
> Hi David,
>
> On Wed, Apr 3, 2019 at 9:20 AM David Holmes wrote:
>
> Hi Thomas,
>
> On 3/04/2019 4:37 pm, Thomas Stüfe wrote:
> > Hi David,
> >
> >
> > On Tue, Apr 2, 2019 at 10:57 PM David Holmes wrote:
> >
> >     Hi Thomas,
> >
> >     Thanks for taking a look at this.
> >
> >     On 3/04/2019 5:41 am, Thomas Stüfe wrote:
> >      > Hi David,
> >      >
> >      > first thanks for the good analysis!
> >      >
> >      > Is this not a problem with the usage of setDaemon():
> >      >
> >      > https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDaemon(boolean)
> >      >
> >      > "This method must be invoked before the thread is started."
> >
> >     Not the usage as such, but there is a problem with setDaemon - as per:
> >
> >     https://bugs.openjdk.java.net/browse/JDK-8221657
> >
> >     The test that causes the crash in the VM deliberately tests a case where
> >     it expects to get the IllegalThreadStateException.
> >
> >      > I think the real solution would be for setDaemon to distinguish between
> >      > not-yet-started, running and finished. It should not use isAlive(). It
> >      > should throw an exception if it has been started, regardless of whether
> >      > it finished already or not.
> >
> >     Yes that fix is needed at the Java level. The use of isAlive() pre-dates
> >     the existence of Thread.State.
> >
> >     But a change at the Java level may be some time coming given this is a
> >     day one bug in the spec and implementation of Thread.setDaemon, so I
> >     wanted to address this quickly in the VM as we are seeing these crashes
> >     in testing.
> >
> >
> > I think a simple patch could be very simply using
> >
> > if (threadStatus != 0)
> >
> > instead of
> >
> > isAlive()
> >
> > in Thread.setDaemon?
>
> Sure the fix is trivial (plus the method needs to be synchronized), but
> that assumes that this spec inconsistency:
>
>       *

This method must be invoked before the thread is started.
>       *
>       * @throws  IllegalThreadStateException
>       *          if this thread is {@linkplain #isAlive alive}
>
> is resolved in favour of the first statement. They may decide that after
> 25 years it's better to maintain the "not alive" semantics and permit
> you to modify a terminated thread.
>
>
> Okay, I get it now. You are worried about backward compatibility.
> Someone calling setDaemon() in this way would now get an exception where
> beforehand he would not.
>
> But how about this then:
>
> public final void setDaemon(boolean on) {
>     checkAccess();
>     if (isAlive()) {
>         throw new IllegalThreadStateException();
>     } else if (threadStatus != 0) {
>       // Not alive but not NEW - terminated?
>       // do not change daemon state. Do not throw to not break backward compatibility.
>     } else {
>       daemon = on;
>     }
> }
>
> Of course that would be observable from the outside (Thread::isDaemon()).
>
> At the expense of some complexity (e.g. two variables, one "real", one
> outward facing as source for isDaemon), this could be fixed.
>
> --
>
> But I do not want to stop your change. I think it is fine, I cannot see
> anything wrong with it.

Okay thanks for clarifying.

> For a moment I wondered whether we are exposed to a similar thing here:
>
> thread.cpp:1996  ThreadService::current_thread_exiting(this, is_daemon(threadObj()));
>
> But at this point isAlive() would still return true, yes? Since it seems
> it only gets reset in ensure_join().

Correct. The window for this bug is a call to setDaemon between the ensure_join and the Threads::remove.

Thanks,
David

> --
>
> Cheers, Thomas
>
>
> > We do this in other places in Thread.java too.
> >
> > --
> >
> > Also I think it makes sense to scan for similar errors in the code base
> > (isAlive being used as "has-been-started") and fix those too.
> >
> > For example:
> >
> > ApplicationShutdownHook.java:
> >
> > static synchronized void add(Thread hook) {
> >     if(hooks == null)
> >         throw new IllegalStateException("Shutdown in progress");
> >
> >     if (hook.isAlive())
> >         throw new IllegalArgumentException("Hook already running");
> >
> >     if (hooks.containsKey(hook))
> >         throw new IllegalArgumentException("Hook previously registered");
> >
> >     hooks.put(hook, hook);
> > }
> >
> > would register a terminated thread as shutdown hook. I found similar
> > looking code in ThreadPoolExecutor.
>
> Yeah that's a nasty bug - you can register a shutdown hook that will
> result in other shutdown hooks not getting started!
>
> > I really think the jdk would be really the right place to fix this.
>
> And it may get fixed there eventually. Meanwhile I just want to stop
> these fairly new assertions from triggering.
>
> Thanks,
> David
>
> >
> >     Thanks,
> >     David
> >
> >      > Not sure. Its late, I may not be thinking straight.
> >      >
> >      > Cheers, Thomas
> >      >
> >      >
> >      >
> >      > On Tue, Apr 2, 2019 at 12:33 AM David Holmes wrote:
> >      >
> >      >     Bug: https://bugs.openjdk.java.net/browse/JDK-8218483
> >      >     webrev: http://cr.openjdk.java.net/~dholmes/8218483/webrev/
> >      >
> >      >     A bug in Thread.setDaemon (JDK-8221657) means that the daemon state
> >      >     of a thread can change after the thread is !isAlive() at the Java level. If
> >      >     this happens before the VM call to ThreadService::remove_thread then we
> >      >     have a situation where we incremented the thread counters when the
> >      >     thread was not a daemon, and we decrement the thread counters when the
> >      >     thread is a daemon - and so the counters are out of sync and the
> >      >     assertion fires.
> >      >
> >      >     The simple fix is to capture the daemon state of the thread while it is
> >      >     still alive and to pass that through to Threads::remove and thus
> >      >     ThreadService::remove_thread.
> >      >
> >      >     Testing:
> >      >         - manual test with modified VM (to delay Threads::remove call)
> >      >           as per the bug report
> >      >         - mach 5 tiers 1-3
> >      >
> >      >     Thanks,
> >      >     David
> >      >
> >

From adinn at redhat.com Wed Apr 3 10:47:42 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 3 Apr 2019 11:47:42 +0100
Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends
In-Reply-To:
References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com>
Message-ID: <06383817-1d6e-9ad0-82f5-5a3f211d39cf@redhat.com>

On 03/04/2019 11:34, Doerr, Martin wrote:
>> Yeah, I left them in out of a vague notion of "documentation". Can remove them if you care.
>
> I'd prefer shorter code.

Me too.
Otherwise the latest updates look fine.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From thomas.stuefe at gmail.com Wed Apr 3 10:48:41 2019
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 3 Apr 2019 12:48:41 +0200
Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends
In-Reply-To: <06383817-1d6e-9ad0-82f5-5a3f211d39cf@redhat.com>
References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> <06383817-1d6e-9ad0-82f5-5a3f211d39cf@redhat.com>
Message-ID:

On Wed, Apr 3, 2019 at 12:47 PM Andrew Dinn wrote:

> On 03/04/2019 11:34, Doerr, Martin wrote:
> >> Yeah, I left them in out of a vague notion of "documentation". Can remove them if you care.
> >
> > I'd prefer shorter code.
> Me too. Otherwise the latest updates look fine.
>
>

Okay guys, I'll remove the comparisons before pushing. Thanks!

..Thomas

> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From lutz.schmidt at sap.com Wed Apr 3 13:18:21 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Wed, 3 Apr 2019 13:18:21 +0000
Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode
In-Reply-To: <0dfb7424-3595-4709-b6ed-33db4bdfc34d@oracle.com>
References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> <0dfb7424-3595-4709-b6ed-33db4bdfc34d@oracle.com>
Message-ID:

Hi Vladimir,

thanks so much for your clarifying comments. And sorry for reacting with such delay. I was distracted by other tasks.

I'll go ahead now and push - after rebasing, of course.

Thanks,
Lutz

On 29.03.19, 19:53, "Vladimir Kozlov" wrote:

    On 3/28/19 8:59 PM, David Holmes wrote:
    > Hi Lutz,
    >
    > cc'd the compiler team
    >
    > On 28/03/2019 9:14 pm, Schmidt, Lutz wrote:
    >> Dear Community,
    >>
    >> may I please request reviews for this tiny change. The purpose is to initialize the regName[]
    >> array earlier during VM init.
    >
    > I can see that will fix the assertion for you, but then begs the question as to whether
    > VMRegImpl::set_regName itself has any initialization dependencies. The answer to that is not obvious
    > to me. I _think_ the Register setup only depends on C++ static initialization.
    >
    > Hopefully someone from compiler team can confirm this change is in fact safe.

    The array is static:

    http://hg.openjdk.java.net/jdk/jdk/file/6a1406c718ec/src/hotspot/share/code/vmreg.cpp#l37

    And register's names are encoded:

    http://hg.openjdk.java.net/jdk/jdk/file/6a1406c718ec/src/hotspot/cpu/x86/register_x86.cpp#l41

    There are no initialization dependencies.
Vladimir > > Thanks, > David > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ >> >> Submit-repo tests pending... >> >> Thanks, >> Lutz >> >> From goetz.lindenmaier at sap.com Wed Apr 3 13:31:10 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 3 Apr 2019 13:31:10 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: <889ac2fa-53d1-8131-57c4-621da1f494b5@oracle.com> References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> <353eb85c-5752-5679-23fd-a5b82191cdc5@oracle.com> <889ac2fa-53d1-8131-57c4-621da1f494b5@oracle.com> Message-ID: Hi Coleen, thanks for your feedback! Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev > On Behalf Of coleen.phillimore at oracle.com > Sent: Montag, 1. April 2019 17:22 > To: hotspot-runtime-dev at openjdk.java.net > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > > This looks really good.?? The quotes around the method name are an > improvement.? This change is a good improvement! > Coleen > > On 4/1/19 10:25 AM, Lindenmaier, Goetz wrote: > > Hi David, > > > >>> But for consistency, we should then quote all class, field and method > >>> names, which is currently not the case as you can easily see by looking > >>> at the updated messages in the tests. > >> It's mainly the spaces caused by return types that is the issue so I > >> don't see we need to quote everything to address that issue. > > I added single quotes around the method. > > I included the decorators like: > > 'abstract void AME3_B.ma()' > > > > An incremental webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02- > incremental_quoting/ > > the full webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/02/ > > > > Best regards, > > Goetz. 
> > > > > > > > > > > >> David > >> ----- > >> > >>> I thought about leaving out the return type, but that would mean to > >>> drop important information. > >>> So I'm not sure here ... > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>>>> Should I split it to adapt the exceptions separately one-by-one to > >>>>> make the change smaller and simplify the review? > >>>> I don't think that is necessary. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>> I would propose to start out with AbstractMethodError only. > >>>>> > >>>>> Best regards, > >>>>> Goetz. > >>>>> > >>>>> > >>>>> > >>>>> From: Lindenmaier, Goetz > >>>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: RFR(L): 8221470: Print methods in exception messages in java- > like > >>>> Syntax. > >>>>> Hi, > >>>>> > >>>>> A row of exceptions are thrown from the hotspot runtime. > >>>>> They print methods with their JNI signatures. To increase > >>>>> readability and resemblance to source code, this change proposes > >>>>> to print them in a Java-like syntax. > >>>>> > >>>>> Some examples: > >>>>> current method printouts: > >>>>> > >>>>> test.TeMe3_B.ma()V > >>>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>>> > >>>>> improved format: > >>>>> > >>>>> void test.TeMe3_B.ma() > >>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>>> > >>>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>>> these messages. > >>>>> > >>>>> This change implements function Method::external_name() that prints > the > >>>> better > >>>>> format. > >>>>> external_name() is chosen according to Klass::external_name(). > >>>>> > >>>>> Printing the better format requires parsing the signature > >>>>> Symbol. 
This is implemented in > >>>>> void Symbol::print_as_signature_external_return_type(outputStream > *os); > >>>>> void Symbol::print_as_signature_external_parameters(outputStream > *os); > >>>>> These method names are chosen according to > >>>> Symbol::as_class_external_name(). > >>>>> See this partial webrev for the new functions: > >>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > >>>> new_methods/ > >>>>> Also, I changed a lot of exception messages to use the new format. > >>>>> This required to adapt a row of tests. I added a test to check > >>>>> the signature printing does not regress. For all these changes, see > >>>>> the full webrev: > >>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>>>> > >>>>> I hope I detected all places where method signatures are printed to > >>>>> exception messages. > >>>>> > >>>>> Best regards, > >>>>> Goetz. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> From sgehwolf at redhat.com Wed Apr 3 13:48:27 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 03 Apr 2019 15:48:27 +0200 Subject: RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: References: <5C9CFF6F.3060804@oracle.com> Message-ID: <46e037be844ebeb214a9847a78f89e274f871640.camel@redhat.com> On Wed, 2019-04-03 at 13:58 +0800, Nick Gasson wrote: > Thanks Misha. I think I still need another reviewer to look at it before > it's ok to push? This looks fine to me. I'm not a Reviewer, though. Thanks, Severin > > Nick > > On 29/03/2019 01:07, Mikhailo Seledtsov wrote: > > Looks good to me, > > Thank you for this fix. > > > > Misha > > > > On 3/28/19, 3:05 AM, Nick Gasson wrote: > > > Hi, > > > > > > This is a follow on from 8221342 to update the default Docker image > > > used on AArch64 from "aarch64/ubuntu" to "arm64v8/ubuntu". 
According > > > to Docker Hub the former is deprecated and hasn't been updated since > > > Ubuntu 16.04. This causes symbol resolution failures if the JDK image > > > being tested was built against a recent glibc. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221529 > > > Webrev: http://cr.openjdk.java.net/~ngasson/8221529/webrev.0/ > > > > > > Tested using the runtime/containers/docker hotspot jtreg tests on > > > AArch64 and x86. > > > > > > Thanks, > > > Nick > > > > > > From goetz.lindenmaier at sap.com Wed Apr 3 15:18:49 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 3 Apr 2019 15:18:49 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi, here a new webrev: http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ I have removed the test with the bad class name. I named the variables name_str/array_str now. I'll push it to jdk-submit for further testing. Best regards, Goetz. > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 3. April 2019 00:42 > To: Lindenmaier, Goetz ; 'hotspot-runtime- > dev at openjdk.java.net' > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > Two follow ups ... > > On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: > > Hi David, > > > >> Overall this looks good to me - a few minor nits/comments below. > > thanks! > > > >> I've applied the patch and am running it through our internal build and > >> test system (tiers 1-3 initially). > >> > >> I have a suspicion there will be other tests that need to be updated - > >> possibly even JCK tests. Discovering those a-priori will be difficult > >> (simply running all the tests would take an extremely long time). Will > >> have a discussion about how best to handle those internally. > > > > I ran most JCK test without problem. 
They usually don't check messages. > > I ran all hotspot, jdk, langtools, nashorn and jaxp test (except > > for headful tests). > > Thanks for the additional testing info. I duplicated some of that but > found no issues, other than a couple of closed tests. > > >> src/hotspot/share/oops/method.cpp > >> Please put a blank line after each new method. > > Fixed. > > > >> src/hotspot/share/oops/symbol.cpp > >> > >> + os->print("."); > >> + } else { > >> + os->print("%c", start[i]); > >> > >> Please use os->put(char c) for individual characters. > > Fixed. > > > >> The "start" name would seem better as "buf" to me. > > Hmm, buf to me is a local chunk of memory used temporarily. > > What about array_sig, class_sig? > > Not really "sigs". > > str? Else just leave it. > > Thanks, > David > ----- > > >> + } else if (start[i] == 'L') { > >> + print_class(os, start+i+1, len-i-2); > >> Can you insert a comment that help explains the -2: > > Done. > > > >> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > >> space after for (2 occurrences) > > Fixed. > > > >> > test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth > >> ods.java > >> > >> Not sure the special characters can be used directly in the sources. Can > >> they not be put in as unicode escapes at all places? > > I'll try what Ioi proposed. I'll post a new webrev including that. > > > > Best regards, > > Goetz. > > > > > >> > >> --- > >> > >> Thanks, > >> David > >> ------- > >> > >> > >> On 1/04/2019 12:32 pm, David Holmes wrote: > >>> Hi Goetz, > >>> > >>> I'm looking at this ... > >>> > >>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> Any interest in this change? > >>> > >>> I'm personally of two minds here because these VM generated exceptions > >>> are not only delivered to Java source code. I'd like to know how other > >>> language developers using the JVM runtime would view this. 
> >>> > >>> That aside if you're going to make a change like this then I think the > >>> full signature string has to be quoted in some way to delineate it > >>> within the larger message. > >>> > >>>> Should I split it to adapt the exceptions separately one-by-one to > >>>> make the change smaller and simplify the review? > >>> > >>> I don't think that is necessary. > >>> > >>> Thanks, > >>> David > >>> ----- > >>> > >>>> I would propose to start out with AbstractMethodError only. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> From: Lindenmaier, Goetz > >>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>> To: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>> java-like Syntax. > >>>> > >>>> Hi, > >>>> > >>>> A row of exceptions are thrown from the hotspot runtime. > >>>> They print methods with their JNI signatures. To increase > >>>> readability and resemblance to source code, this change proposes > >>>> to print them in a Java-like syntax. > >>>> > >>>> Some examples: > >>>> current method printouts: > >>>> > >>>> test.TeMe3_B.ma()V > >>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>> > >>>> improved format: > >>>> > >>>> void test.TeMe3_B.ma() > >>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>> > >>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>> these messages. > >>>> > >>>> This change implements function Method::external_name() that prints > >>>> the better > >>>> format. > >>>> external_name() is chosen according to Klass::external_name(). > >>>> > >>>> Printing the better format requires parsing the signature > >>>> Symbol. 
This is implemented in > >>>> void Symbol::print_as_signature_external_return_type(outputStream > *os); > >>>> void Symbol::print_as_signature_external_parameters(outputStream > *os); > >>>> These method names are chosen according to > >>>> Symbol::as_class_external_name(). > >>>> > >>>> See this partial webrev for the new functions: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > >> new_methods/ > >>>> > >>>> > >>>> Also, I changed a lot of exception messages to use the new format. > >>>> This required to adapt a row of tests. I added a test to check > >>>> the signature printing does not regress.? For all these changes, see > >>>> the full webrev: > >>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>>> > >>>> I hope I detected all places where method signatures are printed to > >>>> exception messages. > >>>> > >>>> Best regards, > >>>> ?? Goetz. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> From shade at redhat.com Wed Apr 3 15:24:25 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 3 Apr 2019 17:24:25 +0200 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found Message-ID: Test bug: https://bugs.openjdk.java.net/browse/JDK-8221918 Fix: http://cr.openjdk.java.net/~shade/8221918/webrev.01/ It seems the test is running with -Xshare:auto, which expects the CDS archive to be generated by default during the build? x86_32 fails with "Shared archive not found" in that test, and it fails tier1 then. Dumping the CDS archive before starting the test helps those configs where archive is not created automatically. I copy-pasted the new block from the test (later in the same file) and dropped some non-essentials. 
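
[Editorial sketch, not part of the original message.] The idea of dumping a CDS archive before the test runs can be illustrated roughly as below. This is not the code from the webrev; the class name `CdsDumpCommandSketch`, its helper methods and the archive file name are hypothetical. Only the HotSpot options `-Xshare:dump`, `-Xshare:auto` and `-XX:SharedArchiveFile=` are real, and the sketch merely assembles the command lines without launching a JVM.

```java
import java.util.List;

// Hypothetical helper: builds the command lines for "dump first, then run".
final class CdsDumpCommandSketch {

    // Command that would create a CDS archive at the given path.
    static List<String> dumpCommand(String javaExe, String archivePath) {
        return List.of(javaExe, "-Xshare:dump", "-XX:SharedArchiveFile=" + archivePath);
    }

    // Command that would run a main class against that archive.
    // With -Xshare:auto the VM continues without sharing if the archive cannot be mapped.
    static List<String> runCommand(String javaExe, String archivePath, String mainClass) {
        return List.of(javaExe, "-Xshare:auto", "-XX:SharedArchiveFile=" + archivePath, mainClass);
    }

    public static void main(String[] args) {
        // Assumed names, for illustration only.
        String java = "java";
        String archive = "test-specific.jsa";
        System.out.println(String.join(" ", dumpCommand(java, archive)));
        System.out.println(String.join(" ", runCommand(java, archive, "Hello")));
    }
}
```

A real jtreg test would more likely go through the test library's process helpers than assemble strings by hand; the point here is only the ordering of the two steps.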
Testing: affected test on Linux {x86_64, x86_32} fastdebug, jdk-submit (running)

--
Thanks,
-Aleksey

From gerard.ziemski at oracle.com Wed Apr 3 15:24:49 2019
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Wed, 3 Apr 2019 10:24:49 -0500
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To:
References:
Message-ID: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>

Hi all,

Please review this feature, which adds tracing events for the internal hash tables.

The following attributes are implemented:

This event was implemented for the following system tables:

SymbolTable
StringTable
Placeholder Table
LoaderConstraints Table
ProtectionDomainCache Table

Webrev:  http://cr.openjdk.java.net/~gziemski/8185525_rev1/
Bug:     https://bugs.openjdk.java.net/browse/JDK-8185525
Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in progress...)

Cheers

From goetz.lindenmaier at sap.com Wed Apr 3 15:55:29 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 3 Apr 2019 15:55:29 +0000
Subject: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null.
In-Reply-To: <13d21753-689e-547e-a0c7-9dbf9c9bee7f@oracle.com>
References: <7c4b0bc27961471e91195bef9e767226@sap.com> <5c445ea9-24fb-0007-78df-41b94aadde2a@oracle.com> <8d1cc0b0-4a01-4564-73a9-4c635bfbfbaf@oracle.com> <01361236-c046-0cac-e09d-be59ea6499e0@oracle.com> <2d38e96dcd214dd091f4d79d2a9e71e3@sap.com> <440e685b-b528-056d-385f-9dc010d65e97@oracle.com> <7189ff5f-a73f-5109-1d6b-aa8a2635543a@oracle.com> <13d21753-689e-547e-a0c7-9dbf9c9bee7f@oracle.com>
Message-ID:

Hi Maurizio,

I put your java PoC into a webrev, and added some tests:
http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/06-java-Maurizio/

Please have a look at NullPointerExceptionTestMini4.java. It is failing, as the class does not reside in a file. Do you have an idea how to fix this?

Further, I extended the implementation to cover all necessary bytecodes.
I had hoped it would print a message on each test case in NullPointerExceptionTest. But it seems to lose track of the byte code index at some point if the method gets bigger. I have an idea how to fix this, though, so I'm not concerned about that.

I'm currently looking at how to identify hidden frames...

Best regards,
Goetz.

> -----Original Message-----
> From: Maurizio Cimadamore
> Sent: Friday, 15 March 2019 12:33
> To: Lindenmaier, Goetz ; Mandy Chung ; Roger Riggs
> Cc: Java Core Libs ; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L): 8218628: Add detailed message to NullPointerException
> describing what is null.
>
> Hi Goetz,
> please find the attached ASM-based patch. It is just a PoC, as such it
> does not provide as fine-grained messages as the one discussed in the
> RFE/JEP, but can be enhanced to cover custom debugging attribute, I believe.
>
> When running this:
>
> Object o = null;
> o.toString();
>
> you get:
>
> Exception in thread "main" java.lang.NullPointerException: attempt to
> dereference 'null' when calling method 'toString'
>     at org.oracle.npe.NPEHandler.main(NPEHandler.java:103)
>
> While when running this:
>
> Foo foo = null;
> int y = foo.x;
>
> You get this:
>
> Exception in thread "main" java.lang.NullPointerException: attempt to
> dereference 'null' when accessing field 'x'
>     at org.oracle.npe.NPEHandler.main(NPEHandler.java:105)
>
> One problem I had is that ASM provides no way to get the instruction
> given a program counter - which means we have to scan all the bytecodes
> and update the sizes as we go along, and, ASM unfortunately doesn't
> expose opcode sizes either. A more robust solution would be to have a
> big switch which returned the opcode size of any given opcode. Also,
> accessing to StackWalker API on exception creation might not be
> desirable in terms of performances, so this might be one of these area
> where some VM help could be beneficial.
Another problem is that we > cannot distinguish between user-generated exceptions (e.g. `throw new > NullPointerException`) and genuine NPE issued by the VM. > > But I guess the upshot is that it works to leave all the gory detail of > bytecode grovelling to a bytecode API - if the logic is applied lazily, > then the impact on performances should be minimal, and the solution more > maintainable longer term. > > Cheers > Maurizio > > On 15/03/2019 07:59, Lindenmaier, Goetz wrote: > > Yes, it would be nice if you shared that. From thomas.stuefe at gmail.com Wed Apr 3 16:41:43 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 3 Apr 2019 18:41:43 +0200 Subject: RFR(s): 8221925: [metaspace] provide size histogram for jcmd VM.metaspace Message-ID: Hi all, could I get reviews please for this small enhancement: rfe: https://bugs.openjdk.java.net/browse/JDK-8221925 cr: http://cr.openjdk.java.net/~stuefe/webrevs/8221925-metaspace-histogram/webrev.00/webrev/index.html This patch adds a small feature to the VM.metaspace command, a size histogram. That one is useful to get an idea of the size distribution of allocations, as a base for improving chunk allocation strategies. 
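
[Editorial sketch, not part of the original message.] The kind of bucketing behind such a size histogram can be sketched as follows. This is not the HotSpot patch, which is C++; the class `SizeHistogramSketch` and its methods are hypothetical, and the unit of the sizes is an assumption. The bucket bounds simply mirror the column headers of the example output (`<16` up to `<256k`, then `larger`).

```java
import java.util.Arrays;

// Hypothetical illustration of power-of-two size buckets.
final class SizeHistogramSketch {

    // Exclusive upper bounds of the first 15 buckets; sizes at or above
    // the last bound are counted in the final "larger" bucket.
    static final long[] BOUNDS = {
        16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192,
        16384, 32768, 65536, 131072, 262144
    };

    private final long[] counts = new long[BOUNDS.length + 1];

    // Returns the index of the first bucket whose bound exceeds the size.
    static int bucketIndex(long size) {
        for (int i = 0; i < BOUNDS.length; i++) {
            if (size < BOUNDS[i]) {
                return i;
            }
        }
        return BOUNDS.length; // "larger"
    }

    // Records one allocation of the given size.
    void add(long size) {
        counts[bucketIndex(size)]++;
    }

    // Returns a copy of the per-bucket counts.
    long[] counts() {
        return counts.clone();
    }

    public static void main(String[] args) {
        SizeHistogramSketch h = new SizeHistogramSketch();
        for (long s : new long[] {8, 24, 24, 100, 4096, 1L << 20}) {
            h.add(s);
        }
        System.out.println(Arrays.toString(h.counts()));
    }
}
```

One such counter array per object type and per space type would be enough to produce rows like the ones in the example output; a linear scan over 15 bounds keeps the sketch simple at the cost of a few comparisons per allocation.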
----------------
Example output:

$ jcmd stuefe VM.metaspace histo
12835:

Size histogram:

Non-Class:
                      <16    <32    <64   <128   <256   <512    <1k    <2k    <4k    <8k   <16k   <32k   <64k  <128k  <256k larger
-- by object type: --
Class                   0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Symbol                  0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU1             0   4056   2281   1098   1068    251    116     40     10      1      0      0      0      0      0      0
TypeArrayU2          2609   1698   1604    732    321    102     41     11      2      0      0      0      0      0      0      0
TypeArrayU4            78    193     49      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU8             0   3563   2307   2298   1233    267    113     27      2      0      0      0      0      0      0      0
TypeArrayOther          0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Method                  0      0      0  32151      0      0      0      0      0      0      0      0      0      0      0      0
ConstMethod             0      0   1225  18968   9242   1958    526    202     23      4      1      2      0      0      0      0
MethodData              0      0      0      0      0    424   2323    364    103     26      8      1      0      0      0      0
ConstantPool            0      0      0     89    286   1577   1297    450    225    110     40     10      1      0      0      0
ConstantPoolCache       0      0    187    223    344   1432    936    229    145     60     23      4      1      0      0      0
Annotations             0      0     67      0      0      0      0      0      0      0      0      0      0      0      0      0
MethodCounters          0      0      0  12838      0      0      0      0      0      0      0      0      0      0      0      0
-- by space type: --
Standard              489   4008   3222  25473   4066   1458   2239    743    227     87     44     15      2      0      0      0
Boot                  539   3380   2992  33203   5796   2351   2065    580    283    114     28      2      0      0      0      0
UnsafeAnonymous       159    622    506   1221    132    202     48      0      0      0      0      0      0      0      0      0
Reflection           1500   1500   1000   8500   2500   2000   1000      0      0      0      0      0      0      0      0      0
-- total: --
                     2687   9510   7720  68397  12494   6011   5352   1323    510    201     72     17      2      0      0      0

Class:
                      <16    <32    <64   <128   <256   <512    <1k    <2k    <4k    <8k   <16k   <32k   <64k  <128k  <256k larger
-- by object type: --
Class                   0      0      0      0      0      0   4013    219      4      0      0      0      0      0      0      0
Symbol                  0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU1             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU2             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU4             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayU8             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
TypeArrayOther          0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Method                  0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
ConstMethod             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
MethodData              0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
ConstantPool            0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
ConstantPoolCache       0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Annotations             0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
MethodCounters          0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
-- by space type: --
Standard                0      0      0      0      0      0   1129    143      4      0      0      0      0      0      0      0
Boot                    0      0      0      0      0      0   1200     76      0      0      0      0      0      0      0      0
UnsafeAnonymous         0      0      0      0      0      0    184      0      0      0      0      0      0      0      0      0
Reflection              0      0      0      0      0      0   1500      0      0      0      0      0      0      0      0      0
-- total: --
                        0      0      0      0      0      0   4013    219      4      0      0      0      0      0      0      0

--------------------------

I kept the coding as simple and lean as possible. I think the patch is reasonably small.

I ran all jtreg Metaspace tests locally; will run the usual jdk-submit tests and put this through our nightlies at SAP.

Thank you!

Best Regards, Thomas

From mikhailo.seledtsov at oracle.com Wed Apr 3 16:58:10 2019
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Wed, 03 Apr 2019 09:58:10 -0700
Subject: RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64
In-Reply-To:
References: <5C9CFF6F.3060804@oracle.com>
Message-ID: <5CA4E622.4080305@oracle.com>

Hi Nick,

As far as I know you need a "capital R" Reviewer as well (unless rules are different for platform-specific changes, which I am not sure). Adding David and Goetz to the change, perhaps one of them could Review the change.

Misha

On 4/2/19, 10:58 PM, Nick Gasson wrote:
> Thanks Misha. I think I still need another reviewer to look at it
> before it's ok to push?
>
> Nick
>
> On 29/03/2019 01:07, Mikhailo Seledtsov wrote:
>> Looks good to me,
>> Thank you for this fix.
>>
>> Misha
>>
>> On 3/28/19, 3:05 AM, Nick Gasson wrote:
>>> Hi,
>>>
>>> This is a follow on from 8221342 to update the default Docker image
>>> used on AArch64 from "aarch64/ubuntu" to "arm64v8/ubuntu". According
>>> to Docker Hub the former is deprecated and hasn't been updated since
>>> Ubuntu 16.04. This causes symbol resolution failures if the JDK
>>> image being tested was built against a recent glibc.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221529
>>> Webrev: http://cr.openjdk.java.net/~ngasson/8221529/webrev.0/
>>>
>>> Tested using the runtime/containers/docker hotspot jtreg tests on
>>> AArch64 and x86.
>>> >>> Thanks, >>> Nick >>> >>> From mikhailo.seledtsov at oracle.com Wed Apr 3 17:00:52 2019 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Wed, 03 Apr 2019 10:00:52 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> Message-ID: <5CA4E6C4.9040704@oracle.com> Ping... On 3/29/19, 4:41 PM, mikhailo.seledtsov at oracle.com wrote: > These new parameters are introduced to help in development and > troubleshooting of the Docker tests. > > > 1. Docker command: jdk.test.docker.command > On some systems docker is installed in locations other than /bin > or /usr/bin. JTreg harness sets PATH to these locations, hence other > locations such as /usr/local/bin/ is not visible/executable within > JTReg tests. A good practice in this case is to provide the full path > to the executable for the test. > > 2. Retaining image after test: jdk.test.docker.retain.image > This is very useful for diagnostic purposes, for trouble shooting. > By default, docker images created by the tests are removed at the end > of the test. > Specifying this option to "true" provides an ability to inspect > the image, run the image, etc. > > 3. Overriding JDK under test just for docker tests: > jdk.test.docker.jdk.under.test > This feature is useful when developing tests on non-Linux > platform. In such cases, the default JDK under test is non-Linux, > hence will not run inside a docker container. This property allows > user to point the docker tests to JDK-under-test that is built for Linux. > > Also, now that jtreg.SkippedException is available started using it. 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ > Testing: ran docker tests > > > Thank you, > Misha From mikhailo.seledtsov at oracle.com Wed Apr 3 17:04:11 2019 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Wed, 03 Apr 2019 10:04:11 -0700 Subject: RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: <5CA4E622.4080305@oracle.com> References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> Message-ID: <5CA4E78B.9030705@oracle.com> Also adding aarch64-port-dev at openjdk.java.net On 4/3/19, 9:58 AM, Mikhailo Seledtsov wrote: > Hi Nick, > > As far as I know you need a "capital R" Reviewer as well (unless > rules are different for platform-specific changes, which I am not sure). > Adding David and Goetz to the change, perhaps one of them could Review > the change. > > Misha > > On 4/2/19, 10:58 PM, Nick Gasson wrote: >> Thanks Misha. I think I still need another reviewer to look at it >> before it's ok to push? >> >> Nick >> >> On 29/03/2019 01:07, Mikhailo Seledtsov wrote: >>> Looks good to me, >>> Thank you for this fix. >>> >>> Misha >>> >>> On 3/28/19, 3:05 AM, Nick Gasson wrote: >>>> Hi, >>>> >>>> This is a follow on from 8221342 to update the default Docker image >>>> used on AArch64 from "aarch64/ubuntu" to "arm64v8/ubuntu". >>>> According to Docker Hub the former is deprecated and hasn't been >>>> updated since Ubuntu 16.04. This causes symbol resolution failures >>>> if the JDK image being tested was built against a recent glibc. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221529 >>>> Webrev: http://cr.openjdk.java.net/~ngasson/8221529/webrev.0/ >>>> >>>> Tested using the runtime/containers/docker hotspot jtreg tests on >>>> AArch64 and x86. 
>>>> >>>> Thanks, >>>> Nick >>>> >>>> From shade at redhat.com Wed Apr 3 17:19:33 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 3 Apr 2019 19:19:33 +0200 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: Message-ID: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> On 4/3/19 5:24 PM, Aleksey Shipilev wrote: > Test bug: > https://bugs.openjdk.java.net/browse/JDK-8221918 > > Fix: > http://cr.openjdk.java.net/~shade/8221918/webrev.01/ > > It seems the test is running with -Xshare:auto, which expects the CDS archive to be generated by > default during the build? x86_32 fails with "Shared archive not found" in that test, and it fails > tier1 then. > > Dumping the CDS archive before starting the test helps those configs where archive is not created > automatically. I copy-pasted the new block from the test (later in the same file) and dropped some > non-essentials. On a second thought, I think we are better using test-specific archive to avoid problems with other CDS tests that might also expect archive to be generated (which would break them when they run standalone) or read the shared archive while it is being generated. New webrev: http://cr.openjdk.java.net/~shade/8221918/webrev.02/ Testing: affected test on Linux {x86_64, x86_32} fastdebug, jdk-submit (running) -Aleksey From jianglizhou at google.com Wed Apr 3 17:30:03 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 3 Apr 2019 10:30:03 -0700 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: Message-ID: Hi Aleksey, Looks okay to me. This also addresses the issue for ReplaceCriticalClassesForSubgraphs.java automatically. Theses tests intend to check the interactions between JVMTI agent and CDS for shared classes like Object, String, etc. 
The JVMTI interaction issue is not specific to the default CDS archive; it just becomes more prominent with the enabling of the default CDS archive. I'd suggest dumping the archive like the following and also removing the '.setUseSystemArchive(true)' from the CDSOptions in launchChild() (an archive generated in the JTwork/scratch is used in this case). That would also help avoid any potential issue in case the JDK's lib/ directory is read-only in some of the environments.

    CDSOptions opts = new CDSOptions()
        .setXShareMode("dump")
        .setUseVersion(false)
        .addSuffix("-showversion");
    CDSTestUtils.createArchive(opts);

Thanks and regards, Jiangli On Wed, Apr 3, 2019 at 8:25 AM Aleksey Shipilev wrote: > Test bug: > https://bugs.openjdk.java.net/browse/JDK-8221918 > > Fix: > http://cr.openjdk.java.net/~shade/8221918/webrev.01/ > > It seems the test is running with -Xshare:auto, which expects the CDS > archive to be generated by > default during the build? x86_32 fails with "Shared archive not found" in > that test, and it fails > tier1 then. > > Dumping the CDS archive before starting the test helps those configs where > archive is not created > automatically. I copy-pasted the new block from the test (later in the > same file) and dropped some > non-essentials. > > Testing: affected test on Linux {x86_64, x86_32} fastdebug, jdk-submit > (running) > > -- > Thanks, > -Aleksey > > > From jianglizhou at google.com Wed Apr 3 17:37:23 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 3 Apr 2019 10:37:23 -0700 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> References: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> Message-ID: Looks like we were going to the same direction. I sent out the review before seeing your second version. This looks okay to me.
Thanks and regards, Jiangli On Wed, Apr 3, 2019 at 10:20 AM Aleksey Shipilev wrote: > On 4/3/19 5:24 PM, Aleksey Shipilev wrote: > > Test bug: > > https://bugs.openjdk.java.net/browse/JDK-8221918 > > > > Fix: > > http://cr.openjdk.java.net/~shade/8221918/webrev.01/ > > > > It seems the test is running with -Xshare:auto, which expects the CDS > archive to be generated by > > default during the build? x86_32 fails with "Shared archive not found" > in that test, and it fails > > tier1 then. > > > > Dumping the CDS archive before starting the test helps those configs > where archive is not created > > automatically. I copy-pasted the new block from the test (later in the > same file) and dropped some > > non-essentials. > > On a second thought, I think we are better using test-specific archive to > avoid problems with other > CDS tests that might also expect archive to be generated (which would > break them when they run > standalone) or read the shared archive while it is being generated. > > New webrev: > http://cr.openjdk.java.net/~shade/8221918/webrev.02/ > > Testing: affected test on Linux {x86_64, x86_32} fastdebug, jdk-submit > (running) > > -Aleksey > > From shade at redhat.com Wed Apr 3 17:45:40 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 3 Apr 2019 19:45:40 +0200 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> Message-ID: On 4/3/19 7:37 PM, Jiangli Zhou wrote: > Looks like we were going to the same direction. I sent out the review before seeing your second > version. This looks okay to me. Yes! Thanks for review. I have not noticed there CDSTestUtils.createArchive, I guess explicit setArchiveName is as good. I'll wait for jdk-submit to clear webrev.02 for me and wait for another review maybe. -Aleksey > New webrev: >
http://cr.openjdk.java.net/~shade/8221918/webrev.02/

From erik.gahlin at oracle.com Wed Apr 3 17:44:47 2019
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Wed, 3 Apr 2019 19:44:47 +0200
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>
References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>
Message-ID: <5CA4F10F.2010805@oracle.com>

Hi Gerard,

Here are some comments about the metadata (to make it consistent with other events).

The events should not be in the "Java Application" category since they are JVM events. You could perhaps put them in "Java Virtual Machine, Runtime, Tables".

Some comments about the names and labels of fields.

- Label: Number of buckets => Bucket Count
- Label: Number of entries => Entry Count
- Label: Total footprint => Total Footprint

Could you remove descriptions that are exactly the same as the label?

- Label: Maximum bucket size => Maximum Bucket Size
- Label: Average bucket size => Average Bucket Size
- Label: Variance of bucket size => Bucket Size Variance
- Name: stdDevOfBucketSize => bucketSizeStandardDeviation
- Label: Standard deviation of bucket size => Bucket Size Standard Deviation

Instead of using the word "size", it may make more sense to use the word "count" here as well, i.e. "Average Bucket Count", or maybe I'm missing something? Is there a difference?

I wonder how useful standard deviation and variance are? If support engineers are looking at a recording, or JMC adds a rule for the events, what would a good or bad value be? Is it possible to use the information for troubleshooting?

- Name: addRate => insertionRate
- Label: Rate of addition => Insertion Rate
- Name: removeRate => removalRate
- Label: Rate of removal => Removal Rate

I'm missing unit tests for the events. Could you please add some in /test/jdk/jdk/jfr/event/runtime? They can be sanity tests, i.e. the average not exceeding max, no negative values etc.

Thanks!
Erik > Hi all, > > Please review this feature, which adds tracing events for the internal > hash tables. > > The following attributes are implemented: > > description="Number of buckets" /> > description="Number of all entries" /> > label="Total footprint" description="Total memory footprint (the table > itself plus all of the entries)" /> > > > > > description="How many items were added since last event (per second)" /> > description="How many items were removed since last event (per > second)" /> > > This event was implemented for the following system tables: > > SymbolTable > StringTable > Placeholder Table > LoaderConstraints Table > ProtectionDomainCache Table > > Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 > Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in progress?) > > > Cheers > From leonid.mesnik at oracle.com Wed Apr 3 23:24:50 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 03 Apr 2019 16:24:50 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> Message-ID: <409fb2a0ba32837c448ca7e2bbf5a92f19a47c8b.camel@oracle.com> Hi Looks good to me. Please get approval from 'R'eviewer also. Leonid On Fri, 2019-03-29 at 16:41 -0700, mikhailo.seledtsov at oracle.com wrote: > These new parameters are introduced to help in development and > troubleshooting of the Docker tests. > > > 1. Docker command: jdk.test.docker.command > On some systems docker is installed in locations other than /bin > or > /usr/bin. JTreg harness sets PATH to these locations, hence other > locations such as /usr/local/bin/ is not visible/executable within > JTReg > tests. A good practice in this case is to provide the full path to > the > executable for the test. > > 2. 
Retaining image after test: jdk.test.docker.retain.image > This is very useful for diagnostic purposes, for trouble > shooting. > By default, docker images created by the tests are removed at the end > of > the test. > Specifying this option to "true" provides an ability to inspect > the > image, run the image, etc. > > 3. Overriding JDK under test just for docker tests: > jdk.test.docker.jdk.under.test > This feature is useful when developing tests on non-Linux > platform. > In such cases, the default JDK under test is non-Linux, hence will > not > run inside a docker container. This property allows user to point > the > docker tests to JDK-under-test that is built for Linux. > > Also, now that jtreg.SkippedException is available started using it. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ > Testing: ran docker tests > > > Thank you, > Misha From david.holmes at oracle.com Wed Apr 3 23:52:48 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 09:52:48 +1000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Looks good. I'm re-running through our test system. Thanks, David On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: > Hi, > > here a new webrev: > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ > > I have removed the test with the bad class name. > I named the variables name_str/array_str now. > > I'll push it to jdk-submit for further testing. > > Best regards, > Goetz. > > >> -----Original Message----- >> From: David Holmes >> Sent: Mittwoch, 3. April 2019 00:42 >> To: Lindenmaier, Goetz ; 'hotspot-runtime- >> dev at openjdk.java.net' >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in >> java-like Syntax. >> >> Two follow ups ... 
>> >> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>>> Overall this looks good to me - a few minor nits/comments below. >>> thanks! >>> >>>> I've applied the patch and am running it through our internal build and >>>> test system (tiers 1-3 initially). >>>> >>>> I have a suspicion there will be other tests that need to be updated - >>>> possibly even JCK tests. Discovering those a-priori will be difficult >>>> (simply running all the tests would take an extremely long time). Will >>>> have a discussion about how best to handle those internally. >>> >>> I ran most JCK test without problem. They usually don't check messages. >>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except >>> for headful tests). >> >> Thanks for the additional testing info. I duplicated some of that but >> found no issues, other than a couple of closed tests. >> >>>> src/hotspot/share/oops/method.cpp >>>> Please put a blank line after each new method. >>> Fixed. >>> >>>> src/hotspot/share/oops/symbol.cpp >>>> >>>> + os->print("."); >>>> + } else { >>>> + os->print("%c", start[i]); >>>> >>>> Please use os->put(char c) for individual characters. >>> Fixed. >>> >>>> The "start" name would seem better as "buf" to me. >>> Hmm, buf to me is a local chunk of memory used temporarily. >>> What about array_sig, class_sig? >> >> Not really "sigs". >> >> str? Else just leave it. >> >> Thanks, >> David >> ----- >> >>>> + } else if (start[i] == 'L') { >>>> + print_class(os, start+i+1, len-i-2); >>>> Can you insert a comment that help explains the -2: >>> Done. >>> >>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { >>>> space after for (2 occurrences) >>> Fixed. >>> >>>> >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth >>>> ods.java >>>> >>>> Not sure the special characters can be used directly in the sources. Can >>>> they not be put in as unicode escapes at all places? >>> I'll try what Ioi proposed. 
I'll post a new webrev including that. >>> >>> Best regards, >>> Goetz. >>> >>> >>>> >>>> --- >>>> >>>> Thanks, >>>> David >>>> ------- >>>> >>>> >>>> On 1/04/2019 12:32 pm, David Holmes wrote: >>>>> Hi Goetz, >>>>> >>>>> I'm looking at this ... >>>>> >>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> Any interest in this change? >>>>> >>>>> I'm personally of two minds here because these VM generated exceptions >>>>> are not only delivered to Java source code. I'd like to know how other >>>>> language developers using the JVM runtime would view this. >>>>> >>>>> That aside if you're going to make a change like this then I think the >>>>> full signature string has to be quoted in some way to delineate it >>>>> within the larger message. >>>>> >>>>>> Should I split it to adapt the exceptions separately one-by-one to >>>>>> make the change smaller and simplify the review? >>>>> >>>>> I don't think that is necessary. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> I would propose to start out with AbstractMethodError only. >>>>>> >>>>>> Best regards, >>>>>> Goetz. >>>>>> >>>>>> >>>>>> >>>>>> From: Lindenmaier, Goetz >>>>>> Sent: Tuesday, March 26, 2019 1:06 PM >>>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in >>>>>> java-like Syntax. >>>>>> >>>>>> Hi, >>>>>> >>>>>> A row of exceptions are thrown from the hotspot runtime. >>>>>> They print methods with their JNI signatures. To increase >>>>>> readability and resemblance to source code, this change proposes >>>>>> to print them in a Java-like syntax.
>>>>>>
>>>>>> Some examples:
>>>>>> current method printouts:
>>>>>>
>>>>>> test.TeMe3_B.ma()V
>>>>>> test.TeMe3_B.ma(IZ[[BF)[[D
>>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;
>>>>>>
>>>>>> improved format:
>>>>>>
>>>>>> void test.TeMe3_B.ma()
>>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float)
>>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][])
>>>>>>
>>>>>> So far, Method::name_and_sig_as_C_string() is used to print
>>>>>> these messages.
>>>>>>
>>>>>> This change implements function Method::external_name() that prints
>>>>>> the better format.
>>>>>> external_name() is chosen according to Klass::external_name().
>>>>>>
>>>>>> Printing the better format requires parsing the signature
>>>>>> Symbol. This is implemented in
>>>>>> void Symbol::print_as_signature_external_return_type(outputStream *os);
>>>>>> void Symbol::print_as_signature_external_parameters(outputStream *os);
>>>>>> These method names are chosen according to
>>>>>> Symbol::as_class_external_name().
>>>>>>
>>>>>> See this partial webrev for the new functions:
>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01-new_methods/
>>>>>>
>>>>>>
>>>>>> Also, I changed a lot of exception messages to use the new format.
>>>>>> This required to adapt a row of tests. I added a test to check
>>>>>> the signature printing does not regress. For all these changes, see
>>>>>> the full webrev:
>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/
>>>>>>
>>>>>> I hope I detected all places where method signatures are printed to
>>>>>> exception messages.
>>>>>>
>>>>>> Best regards,
>>>>>> Goetz.
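The mapping from JNI signatures to the Java-like format quoted above can be sketched as follows. This is an illustration in Java only; the actual change implements the parsing in C++ inside HotSpot (Method::external_name() and the two Symbol::print_as_signature_external_* functions). The class name SignaturePrinter and the method externalName are hypothetical, and the sketch assumes a well-formed method descriptor.

```java
// Illustrative sketch only: converts a JNI-style method descriptor into the
// Java-like form shown in the examples. Not the HotSpot implementation.
final class SignaturePrinter {

    // Parses one type starting at pos[0] and advances pos[0] past it.
    private static String parseType(String sig, int[] pos) {
        int dims = 0;
        while (sig.charAt(pos[0]) == '[') {  // each '[' is one array dimension
            dims++;
            pos[0]++;
        }
        String base;
        char c = sig.charAt(pos[0]++);
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                // Class names run up to the terminating ';' and use '/' separators.
                int end = sig.indexOf(';', pos[0]);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class name: " + sig);
                }
                base = sig.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("bad descriptor: " + sig);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    // Builds "<return type> <class>.<method>(<parameter types>)".
    static String externalName(String className, String methodName, String sig) {
        if (sig.isEmpty() || sig.charAt(0) != '(') {
            throw new IllegalArgumentException("not a method descriptor: " + sig);
        }
        int[] pos = {1};
        StringBuilder params = new StringBuilder();
        while (sig.charAt(pos[0]) != ')') {
            if (params.length() > 0) {
                params.append(", ");
            }
            params.append(parseType(sig, pos));
        }
        pos[0]++;  // skip ')'
        String returnType = parseType(sig, pos);
        return returnType + " " + className + "." + methodName + "(" + params + ")";
    }

    public static void main(String[] args) {
        System.out.println(externalName("test.TeMe3_B", "ma", "()V"));
        System.out.println(externalName("test.TeMe3_B", "ma", "(IZ[[BF)[[D"));
        System.out.println(externalName("test.TeMe3_B", "ma",
                "([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;"));
    }
}
```

Run on the three descriptors from the message, this reproduces the three "improved format" lines listed there.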
>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> From coleen.phillimore at oracle.com Thu Apr 4 00:00:36 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Apr 2019 20:00:36 -0400 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef Message-ID: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> Summary: Make consistent with StringTable and ResolvedMethodTable We decided to not have this typedef because StringTable doesn't have it and the new concurrent hashtable for ResolvedMethodTable won't either. This will make it consistent. Tested with hs tier1 on Oracle platforms. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8221872 Thanks, Coleen From coleen.phillimore at oracle.com Thu Apr 4 00:04:05 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Apr 2019 20:04:05 -0400 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: This looks good, with the unnecessary null checks removed. I don't need to see another version but do a sanity build before pushing please! Thanks! Coleen On 4/3/19 4:57 AM, Thomas Stüfe wrote: > Hi all, > > new version: > > Delta: > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html > > Full: > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ > > Changes: > > - As Coleen wished, I completely removed the non-static variant of > MetaspaceObj::is_metaspace_obj() and fixed the callers. > - I also renamed the static variant of > MetaspaceObj::is_metaspace_obj() to MetaspaceObj::is_valid() to be in > line with similar calls, e.g.
Symbol::is_valid(). > > @Coleen: This envelope should only weed out obvious non-null bogus > values and hopefully stack and C-heap addresses; my hope is that nodes > come and go but that the total envelope size will be always minuscule > compared to the 64bit address range and outside C-heap and stacks. > Usually mmap regions are clustered, as are C-Heap allocations and stacks. > > But if that turns out to be inefficient after a while, we may > recalculate the envelope; just have to make sure no concurrent > lock-less walks happen. > > Thanks, Thomas > > > > On Tue, Apr 2, 2019 at 2:21 PM > wrote: > > > > On 4/2/19 1:47 AM, Thomas Stüfe wrote: >> Hi Coleen, Andrew, >> >> thank you for reviewing my little change. Unfortunately, I had an >> error in the space list verification method which needed fixing, >> so here is a second version: >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ > >> >> Differences: >> - As Coleen requested: in allocation.cpp I replaced the >> comparison this==NULL with a static helper method > > I think you have to change the callers to not pass this as null. > So you can't do metaspaceobj->is_metaspace_object() because you're > calling with "this" potentially NULL. > > So remove this function: > > bool MetaspaceObj::is_metaspace_object() const { > - return Metaspace::contains((void*)this); > + return MetaspaceObj::is_metaspace_object(this); > } > > >> - I had mistyped "envelope" as "envolope" in >> "expand_envelope_to_include_node()". Since that sounded funny I >> changed it. >> - The real bug was in VirtualSpaceList::verify() where I checked >> that the extension of the envelope is as large as the current >> nodes. But that is wrong, since the envelope never is shrunk (by >> design) and nodes at the border of the envelope may have been >> unmapped. So the real test should be to test if no node is >> outside the envelope.
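The envelope idea discussed above (a cheap bounds check before walking the node list, an envelope that only grows, and a verification that no node lies outside it) can be sketched as follows. This is a simplified illustration in Java and not the HotSpot code: the class EnvelopedRangeList, its method names, and the use of signed long values as stand-ins for addresses are all assumptions, and the sketch ignores concurrency entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a list of [base, end) ranges with a never-shrinking
// envelope used to reject obviously foreign addresses without walking the list.
final class EnvelopedRangeList {
    private static final class Node {
        final long base;  // inclusive
        final long end;   // exclusive
        Node(long base, long end) { this.base = base; this.end = end; }
    }

    private final List<Node> nodes = new ArrayList<>();
    private long envLow = Long.MAX_VALUE;   // empty envelope rejects everything
    private long envHigh = Long.MIN_VALUE;
    int walks = 0;                          // number of full list walks, for demonstration

    void add(long base, long end) {
        nodes.add(new Node(base, end));
        envLow = Math.min(envLow, base);    // the envelope only ever expands
        envHigh = Math.max(envHigh, end);
    }

    // Purging a node deliberately leaves the envelope untouched.
    void purge(long base) {
        nodes.removeIf(n -> n.base == base);
    }

    boolean contains(long addr) {
        if (addr < envLow || addr >= envHigh) {
            return false;                   // cheap reject, no list walk needed
        }
        walks++;
        for (Node n : nodes) {
            if (addr >= n.base && addr < n.end) {
                return true;
            }
        }
        return false;
    }

    // The check described in the message: no node may lie outside the envelope.
    // The envelope itself may be larger than the current nodes after a purge.
    boolean verify() {
        for (Node n : nodes) {
            if (n.base < envLow || n.end > envHigh) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        EnvelopedRangeList list = new EnvelopedRangeList();
        list.add(0x1000, 0x2000);
        list.add(0x8000, 0x9000);
        System.out.println(list.contains(0x10));    // rejected by the envelope
        System.out.println(list.contains(0x8800));  // found by walking the list
        list.purge(0x8000);
        System.out.println(list.contains(0x8800));  // envelope is stale, walk finds nothing
        System.out.println(list.verify());
    }
}
```

The purge case shows why comparing the envelope size with the current nodes would be the wrong verification: after a border node is removed, the envelope is legitimately larger than what the remaining nodes cover.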
> > So this envelope is an interesting concept and name. It seems > okay. I guess over time, it won't give you a very good answer. > Maybe you'll have to fix the boundaries someday. > > Looks good though. Thank you for making this improvement for > performance. > > Coleen >> >> Thanks, Thomas >> >> >> On Wed, Mar 27, 2019 at 10:01 PM Thomas Stüfe >> > wrote: >> >> Hi all, >> >> May I please have reviews for this small optimization: >> >> cr: >> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html >> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 >> >> There are several functions which, given an unknown pointer >> assumed to be a metaspace object, check if the pointer is >> indeed a metaspace object by walking the VirtualSpaceList and >> checking ranges. >> >> This patch adds checks which weed out the obvious cases to >> avoid needlessly walking the vs list. >> >> Patch also adds verifications for the VirtualSpaceList in >> debug cases. Those run only when a new node has been added to >> the list, or when a node has been purged, so very sparingly. >> >> When purging nodes, I removed a small unnecessary and >> inefficient check which checked whether (one of the) purged >> nodes was still in the list. Since we now as part of the new >> VirtualSpaceNode::verify() walk this list, the check is >> unnecessary.
>> >> Thanks, Thomas >> >> > From david.holmes at oracle.com Thu Apr 4 00:12:48 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 10:12:48 +1000 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> Message-ID: Hi Aleksey, There's no error checking in the dump part - if something goes wrong we won't see anything to indicate what it was, and AFAICS we won't even notice the failure (till the second part of the test). Aside: not sure why we can't just use: @run main/othervm -XX:SharedArchiveFile=... -Xshare:dump and let jtreg deal with error? Please update copyright to "2018, 2019," Thanks, David On 4/04/2019 3:45 am, Aleksey Shipilev wrote: > On 4/3/19 7:37 PM, Jiangli Zhou wrote: >> Looks like we were going to the same direction. I sent out the review before seeing your second >> version. This looks okay to me. > > Yes! Thanks for review. > > I have not noticed there CDSTestUtils.createArchive, I guess explicit setArchiveName is as good. > I'll wait for jdk-submit to clear webrev.02 for me and wait for another review maybe. > > -Aleksey > >> New webrev: >> http://cr.openjdk.java.net/~shade/8221918/webrev.02/ > From david.holmes at oracle.com Thu Apr 4 00:19:51 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 10:19:51 +1000 Subject: RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: <5CA4E78B.9030705@oracle.com> References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> <5CA4E78B.9030705@oracle.com> Message-ID: Sorry I'll have to defer to someone more experienced with this as I can't comment on the validity of the change. David On 4/04/2019 3:04 am, Mikhailo Seledtsov wrote: > Also adding aarch64-port-dev at openjdk.java.net > > > On 4/3/19, 9:58 AM, Mikhailo Seledtsov wrote: >> Hi Nick, >> >>
As far as I know you need a "capital R" Reviewer as well (unless >> rules are different for platform-specific changes, which I am not sure). >> Adding David and Goetz to the change, perhaps one of them could Review >> the change. >> >> Misha >> >> On 4/2/19, 10:58 PM, Nick Gasson wrote: >>> Thanks Misha. I think I still need another reviewer to look at it >>> before it's ok to push? >>> >>> Nick >>> >>> On 29/03/2019 01:07, Mikhailo Seledtsov wrote: >>>> Looks good to me, >>>> Thank you for this fix. >>>> >>>> Misha >>>> >>>> On 3/28/19, 3:05 AM, Nick Gasson wrote: >>>>> Hi, >>>>> >>>>> This is a follow on from 8221342 to update the default Docker image >>>>> used on AArch64 from "aarch64/ubuntu" to "arm64v8/ubuntu". >>>>> According to Docker Hub the former is deprecated and hasn't been >>>>> updated since Ubuntu 16.04. This causes symbol resolution failures >>>>> if the JDK image being tested was built against a recent glibc. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221529 >>>>> Webrev: http://cr.openjdk.java.net/~ngasson/8221529/webrev.0/ >>>>> >>>>> Tested using the runtime/containers/docker hotspot jtreg tests on >>>>> AArch64 and x86. >>>>> >>>>> Thanks, >>>>> Nick >>>>> >>>>> From coleen.phillimore at oracle.com Thu Apr 4 00:33:19 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Apr 2019 20:33:19 -0400 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: References: Message-ID: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> On 4/2/19 10:33 AM, Doerr, Martin wrote: > Hi Zhengyu, > > that would be fine, too. I'll put it there if other reviewers prefer that, too. Yes, I prefer that too. Coleen > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Zhengyu Gu > Sent: Dienstag, 2. 
April 2019 16:01 > To: Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses > > Hi Martin, > > Would it be more proper to do the check in os::is_readable_range()? > > Thanks, > > -Zhengyu > > On 4/2/19 9:05 AM, Doerr, Martin wrote: >> Hi, >> >> I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: >> Address computation can overflow leading to skipped readability check. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8221833 >> >> Webrev: >> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ >> >> Please review. >> >> Best regards, >> Martin >> From coleen.phillimore at oracle.com Thu Apr 4 00:42:40 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Apr 2019 20:42:40 -0400 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef In-Reply-To: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> References: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> Message-ID: I changed the description of the bug to: 8221872: Remove uses of ClassLoaderWeakHandle typedef in protection domain table The typedef will be removed with this change JDK-8221393 ResolvedMethodTable too small for StackWalking applications Thanks, Coleen On 4/3/19 8:00 PM, coleen.phillimore at oracle.com wrote: > Summary: Make consistent with StringTable and ResolvedMethodTable > > We decided to not have this typedef because StringTable doesn't have > it and the new concurrent hashtable for ResolvedMethodTable won't > either. This will make it consistent. > > Tested with hs tier1 on Oracle platforms.
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8221872 > > Thanks, > Coleen > From david.holmes at oracle.com Thu Apr 4 00:47:30 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 10:47:30 +1000 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef In-Reply-To: References: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> Message-ID: Hi Coleen, On 4/04/2019 10:42 am, coleen.phillimore at oracle.com wrote: > > I changed the description of the bug to: > > 8221872: Remove uses of ClassLoaderWeakHandle typedef in protection > domain table > > The typedef will be removed with this change JDK-8221393 > ResolvedMethodTable > too small for StackWalking applications Personally I find the typedef improves readability - especially with parameterized types. But okay ... the removal of these uses of the typedef in anticipation of its removal seems fine. Thanks, David > Thanks, > Coleen > > On 4/3/19 8:00 PM, coleen.phillimore at oracle.com wrote: >> Summary: Make consistent with StringTable and ResolvedMethodTable >> >> We decided to not have this typedef because StringTable doesn't have >> it and the new concurrent hashtable for ResolvedMethodTable won't >> either. This will make it consistent. >> >> Tested with hs tier1 on Oracle platforms.
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8221872 >> >> Thanks, >> Coleen >> > From coleen.phillimore at oracle.com Thu Apr 4 01:12:40 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Apr 2019 21:12:40 -0400 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef In-Reply-To: References: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> Message-ID: <732641cb-507b-289b-91fe-3a61ab549322@oracle.com> On 4/3/19 8:47 PM, David Holmes wrote: > Hi Coleen, > > On 4/04/2019 10:42 am, coleen.phillimore at oracle.com wrote: >> >> I changed the description of the bug to: >> >> 8221872: Remove uses of ClassLoaderWeakHandle typedef in protection >> domain table >> >> The typedef will be removed with this change JDK-8221393 >> ResolvedMethodTable >> too small for StackWalking applications > > Personally I find the typedef improves readability - especially with > parameterized types. But okay ... the removal of these uses of the > typedef in anticipation of its removal seems fine. Thanks for the code review. This doesn't have too many <> parameterized types so it's not so bad. Coleen > > Thanks, > David > >> Thanks, >> Coleen >> >> On 4/3/19 8:00 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Make consistent with StringTable and ResolvedMethodTable >>> >>> We decided to not have this typedef because StringTable doesn't have >>> it and the new concurrent hashtable for ResolvedMethodTable won't >>> either. This will make it consistent. >>> >>> Tested with hs tier1 on Oracle platforms.
>>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8221872 >>> >>> Thanks, >>> Coleen >>> >> From mikhailo.seledtsov at oracle.com Thu Apr 4 01:30:58 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 3 Apr 2019 18:30:58 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <409fb2a0ba32837c448ca7e2bbf5a92f19a47c8b.camel@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <409fb2a0ba32837c448ca7e2bbf5a92f19a47c8b.camel@oracle.com> Message-ID: Thank you Leonid, Misha On 4/3/19 4:24 PM, Leonid Mesnik wrote: > Hi > > Looks good to me. Please get approval from 'R'eviewer also. > > Leonid > On Fri, 2019-03-29 at 16:41 -0700, mikhailo.seledtsov at oracle.com wrote: >> These new parameters are introduced to help in development and >> troubleshooting of the Docker tests. >> >> >> 1. Docker command: jdk.test.docker.command >> On some systems docker is installed in locations other than /bin >> or >> /usr/bin. JTreg harness sets PATH to these locations, hence other >> locations such as /usr/local/bin/ is not visible/executable within >> JTReg >> tests. A good practice in this case is to provide the full path to >> the >> executable for the test. >> >> 2. Retaining image after test: jdk.test.docker.retain.image >> This is very useful for diagnostic purposes, for trouble >> shooting. >> By default, docker images created by the tests are removed at the end >> of >> the test. >> Specifying this option to "true" provides an ability to inspect >> the >> image, run the image, etc. >> >> 3. Overriding JDK under test just for docker tests: >> jdk.test.docker.jdk.under.test >> This feature is useful when developing tests on non-Linux >> platform. >> In such cases, the default JDK under test is non-Linux, hence will >> not >> run inside a docker container. 
This property allows user to point >> the >> docker tests to JDK-under-test that is built for Linux. >> >> Also, now that jtreg.SkippedException is available started using it. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >> Testing: ran docker tests >> >> >> Thank you, >> Misha From igor.ignatyev at oracle.com Thu Apr 4 01:59:21 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 3 Apr 2019 18:59:21 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> Message-ID: <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> Hi Misha, overall looks good to me. I'd use 'jdk.test.docker.jdk' property name instead of 'jdk.test.docker.jdk.under.test' though, but I don't insist. -- Igor > On Mar 29, 2019, at 4:41 PM, mikhailo.seledtsov at oracle.com wrote: > > These new parameters are introduced to help in development and troubleshooting of the Docker tests. > > > 1. Docker command: jdk.test.docker.command > On some systems docker is installed in locations other than /bin or /usr/bin. JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. > > 2. Retaining image after test: jdk.test.docker.retain.image > This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. > Specifying this option to "true" provides an ability to inspect the image, run the image, etc. > > 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test > This feature is useful when developing tests on non-Linux platform. 
In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. > > Also, now that jtreg.SkippedException is available started using it. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ > Testing: ran docker tests > > > Thank you, > Misha From igor.ignatyev at oracle.com Thu Apr 4 02:06:49 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 3 Apr 2019 19:06:49 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> Message-ID: took another look at removeDockerImage, and I don't like the new version. you made an assumption that all usages of this will be just to clean after a test, which might not be true, I can imagine someone needing to remove docker image as part of their test, so I'd prefer to revert changes in DockerTestUtils::removeDockerImage and just replaced DockerBasicTest::removeImageAfterTest with DockerTestUtils::RETAIN_IMAGE_AFTER_TEST. also it might make sense to make all these new constants (all but DOCKER_COMMAND?) public. -- Igor > On Apr 3, 2019, at 6:59 PM, Igor Ignatyev wrote: > > Hi Misha, > > overall looks good to me. I'd use 'jdk.test.docker.jdk' property name instead of 'jdk.test.docker.jdk.under.test' though, but I don't insist. > > -- Igor > >> On Mar 29, 2019, at 4:41 PM, mikhailo.seledtsov at oracle.com wrote: >> >> These new parameters are introduced to help in development and troubleshooting of the Docker tests. >> >> >> 1. Docker command: jdk.test.docker.command >> On some systems docker is installed in locations other than /bin or /usr/bin. 
JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. >> >> 2. Retaining image after test: jdk.test.docker.retain.image >> This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. >> Specifying this option to "true" provides an ability to inspect the image, run the image, etc. >> >> 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test >> This feature is useful when developing tests on non-Linux platform. In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. >> >> Also, now that jtreg.SkippedException is available started using it. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >> Testing: ran docker tests >> >> >> Thank you, >> Misha > From thomas.stuefe at gmail.com Thu Apr 4 04:08:42 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 4 Apr 2019 06:08:42 +0200 Subject: RFR(s): 8221539: [metaspace] Improve MetaspaceObj::is_metaspace_obj() and friends In-Reply-To: References: <46665f4c-1a12-28b2-46cb-275fab56ea57@oracle.com> Message-ID: On Thu, Apr 4, 2019 at 2:04 AM wrote: > > This looks good, with the unnecessary null checks removed. I don't need > to see another version but do a sanity build before pushing please! > Thanks! > Coleen > Thanks Coleen. I run jdk-submit before every push. Which is why I usually delay running it for after the reviews, to avoid uncessary re-tests. 
Thanks, Thomas > > > On 4/3/19 4:57 AM, Thomas Stüfe wrote: > > Hi all, > > new version: > > Delta: > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/delta_to_4/webrev/index.html > > Full: > > http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.04/webrev/ > > Changes: > > - As Coleen wished, I completely removed the non-static variant of > MetaspaceObj::is_metaspace_obj() and fixed the callers. > - I also renamed the static variant of MetaspaceObj::is_metaspace_obj() to > MetaspaceObj::is_valid() to be in line with similar calls, e.g. > Symbol::is_valid(). > > @Coleen: This envelope should only weed out obvious non-null bogus values > and hopefully stack and C-heap addresses; my hope is that nodes come and go > but that the total envelope size will be always minuscule compare to the > 64bit address range and outside C-heap and stacks. Usually mmap regions are > clustered, as are C-Heap allocations and stacks. > > But if that turns out to be inefficient after a while, we may recalculate > the envelope; just have to make sure no concurrent lock-less walks happen. > > Thanks, Thomas > > > > On Tue, Apr 2, 2019 at 2:21 PM wrote: > >> >> >> On 4/2/19 1:47 AM, Thomas Stüfe wrote: >> >> Hi Coleen, Andrew, >> >> thank you for reviewing my little change. Unfortunately, I had an error >> in the space list verification method which needed fixing, so here is a >> second version: >> >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.01/webrev/ >> >> >> >> Differences: >> - As Coleen requested: in allocation.cpp I replaced the comparison >> this==NULL with a static helper method >> >> >> I think you have to change the callers to not pass this as null.
So you >> can't do metaspaceobj->is_metaspace_object() because you're calling with >> "this" potentially NULL. >> >> So remove this function: >> >> bool MetaspaceObj::is_metaspace_object() const { - return Metaspace::contains((void*)this); + return MetaspaceObj::is_metaspace_object(this); >> } >> >> >> >> - I had mistype "envelope" as "envolope" in >> "expand_envelope_to_include_node()". Since that sounded funny I changed it. >> - The real bug was in VirtualSpaceList::verify() where I checked that the >> extension of the envelope is as large as the current nodes. But that is >> wrong, since the envelope never is shrunk (by design) and nodes at the >> border of the envelope may have been unmapped. So the real test should be >> to test if no node is outside the envelope. >> >> >> So this envelope is an interesting concept and name. It seems okay. I >> guess over time, it won't give you a very good answer. Maybe you'll have >> to fix the boundaries someday. >> >> Looks good though. Thank you for making this improvement for performance. >> >> Coleen >> >> >> Thanks, Thomas >> >> >> On Wed, Mar 27, 2019 at 10:01 PM Thomas Stüfe >> wrote: >> >>> Hi all, >>> >>> May I please have reviews for this small optimization: >>> >>> cr: >>> http://cr.openjdk.java.net/~stuefe/webrevs/8221539--%5bmetaspace%5d-improve-metaspaceobj--is_metaspace_obj()-and-friends/webrev.00/webrev/index.html >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8221539 >>> >>> There are several functions which, given an unknown pointer assumed to >>> be a metaspace object, check if the pointer is indeed a metaspace object by >>> walking the VirtualSpaceList and checking ranges. >>> >>> This patch adds checks which weed out the obvious cases to avoid >>> needlessly walking the vs list. >>> >>> Patch also adds verifications for the VirtualSpaceList in debug cases. >>> Those run only when a new node has been added to the list, or when a node >>> has been purged, so very sparingly.
>>> >>> When purging nodes, I removed a small unnecessary and inefficient check >>> which checked whether (one of the) purged nodes was still in the list. >>> Since we now as part of the new VirtualSpaceNode::verify() walk this list, >>> the check is unnecessary. >>> >>> Thanks, Thomas >>> >>> >>> >> > From david.holmes at oracle.com Thu Apr 4 05:53:29 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 15:53:29 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output Message-ID: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ The actual stack trace reported by NMT detail is affected by the inlining decisions of the native compiler, and on the type of build. So we define an "ideal" stacktrace and then allow for some frames to be missing based on empirical observations. So to date we have seen two frames that may or may not be inlined and so we allow for 2 non-matching entries. The special-casing of AllocateHeap is removed as now it is just an optional frame. Chris: does this maintain the "spirit" of the test as you intended? Zhengyu: can you test this on your system(s) please. Thanks, David From chris.plummer at oracle.com Thu Apr 4 06:12:57 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 3 Apr 2019 23:12:57 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> Message-ID: Hi David, I have concerns that this will hide some of the other bugs I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs result in 1 or two frames appearing in the stacktrace that should be skipped. 
Notably NativeCallStack::NativeCallStack() and os::get_native_stack(). Also, AllocateHeap() should normally not be in the stack trace, but the test has specifically allowed for it for windows and solaris slowdebug builds. Although these builds should have honored the ALWAYSINLINE directive, it was deemed acceptable that it was not in slowdebug builds. However, I would not want to allow AllocateHeap() to appear in a product build, and best not to see it in fastdebug either. Given the changes you made to allow more flexibly in which frames appear, I think you need to now also make sure the above 3 mentioned frames are not present, except for allowing AllocateHeap() in slowdebug builds. thanks, Chris On 4/3/19 10:53 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 > Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ > > The actual stack trace reported by NMT detail is affected by the > inlining decisions of the native compiler, and on the type of build. > So we define an "ideal" stacktrace and then allow for some frames to > be missing based on empirical observations. So to date we have seen > two frames that may or may not be inlined and so we allow for 2 > non-matching entries. > > The special-casing of AllocateHeap is removed as now it is just an > optional frame. > > Chris: does this maintain the "spirit" of the test as you intended? > > Zhengyu: can you test this on your system(s) please. 
> > Thanks, > David From david.holmes at oracle.com Thu Apr 4 06:23:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 16:23:08 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> Message-ID: Hi Chris, On 4/04/2019 4:12 pm, Chris Plummer wrote: > Hi David, > > I have concerns that this will hide some of the other bugs I've > mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs result > in 1 or two frames appearing in the stacktrace that should be skipped. > Notably NativeCallStack::NativeCallStack() and os::get_native_stack(). The test still checks those are not present first: 73 // We should never see either of these frames because they are supposed to be skipped. */ 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); 75 output.shouldNotContain("os::get_native_stack"); > Also, AllocateHeap() should normally not be in the stack trace, but the > test has specifically allowed for it for windows and solaris slowdebug > builds. Although these builds should have honored the ALWAYSINLINE > directive, it was deemed acceptable that it was not in slowdebug builds. > However, I would not want to allow AllocateHeap() to appear in a product > build, and best not to see it in fastdebug either. This is a test of NMT detail not a test of whether a given compiler chooses to inline something like AllocateHeap. I don't think it is the job of this test to be checking for something specific to the native compiler. The previous handling of AllocateHeap seemed to be there simply because it was the only way to deal with an optional frame - but now that's handled generically. 
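[Editorial sketch, not taken from the webrev] The generic handling of optional frames described above could look roughly like the following: define the "ideal" trace as a list of per-frame patterns, count how many occur in the NMT output, and accept the output when no more than a given number are missing. The class and method names, the `locked_create_entry` pattern, and the sample output lines are assumptions made up for illustration; only the AllocateHeap, new_entry and allocate_new_entry patterns and the tolerance of 2 come from this thread. The actual test may be structured differently.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of "ideal stack trace with optional frames".
public class OptionalFrameMatch {

    /** Counts how many of the ideal-trace frame patterns occur somewhere in the output. */
    public static int countMatchingFrames(String output, List<String> framePatterns) {
        int matched = 0;
        for (String frame : framePatterns) {
            // '.' does not match line terminators by default, so each pattern
            // has to be satisfied within a single line of the output.
            if (Pattern.compile(frame).matcher(output).find()) {
                matched++;
            }
        }
        return matched;
    }

    /** Accepts the output if at most allowedMissing frames of the ideal trace are absent. */
    public static boolean traceAcceptable(String output, List<String> framePatterns,
                                          int allowedMissing) {
        return countMatchingFrames(output, framePatterns)
                >= framePatterns.size() - allowedMissing;
    }

    public static void main(String[] args) {
        // Frames that may be inlined away depending on compiler and build type.
        List<String> idealTrace = List.of(
                ".*AllocateHeap.*",
                ".*Hashtable.*allocate_new_entry.*",
                ".*ModuleEntryTable.*new_entry.*",
                ".*ModuleEntryTable.*locked_create_entry.*"); // assumed frame name

        // Made-up sample text standing in for NMT detail output.
        String withInlining =
                "[0x01] Hashtable<Symbol*, (MemoryType)17>::allocate_new_entry(unsigned int, Symbol*)+0x2a\n"
              + "[0x02] ModuleEntryTable::new_entry(unsigned int, Handle, bool, Symbol*)+0x40\n"
              + "[0x03] ModuleEntryTable::locked_create_entry(Handle, bool, Symbol*)+0x55\n";
        String unrelated = "[0x01] os::malloc(unsigned long, MemoryType)+0x10\n";

        System.out.println(traceAcceptable(withInlining, idealTrace, 2));
        System.out.println(traceAcceptable(unrelated, idealTrace, 2));
    }
}
```

A check of this shape does not care which particular frame was inlined, only how many are absent, which is why a separate special case for AllocateHeap would no longer be needed.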
Thanks, David > Given the changes you made to allow more flexibly in which frames > appear, I think you need to now also make sure the above 3 mentioned > frames are not present, except for allowing AllocateHeap() in slowdebug > builds. > > thanks, > > Chris > > On 4/3/19 10:53 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >> >> The actual stack trace reported by NMT detail is affected by the >> inlining decisions of the native compiler, and on the type of build. >> So we define an "ideal" stacktrace and then allow for some frames to >> be missing based on empirical observations. So to date we have seen >> two frames that may or may not be inlined and so we allow for 2 >> non-matching entries. >> >> The special-casing of AllocateHeap is removed as now it is just an >> optional frame. >> >> Chris: does this maintain the "spirit" of the test as you intended? >> >> Zhengyu: can you test this on your system(s) please. >> >> Thanks, >> David > > From chris.plummer at oracle.com Thu Apr 4 06:35:35 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 3 Apr 2019 23:35:35 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> Message-ID: <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> On 4/3/19 11:23 PM, David Holmes wrote: > Hi Chris, > > On 4/04/2019 4:12 pm, Chris Plummer wrote: >> Hi David, >> >> I have concerns that this will hide some of the other bugs I've >> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs >> result in 1 or two frames appearing in the stacktrace that should be >> skipped. Notably NativeCallStack::NativeCallStack() and >> os::get_native_stack(). > > The test still checks those are not present first: > > 73
// We should never see either of these frames because they > are supposed to be skipped. */ > 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); > 75 output.shouldNotContain("os::get_native_stack"); Ah yes. I skimmed over the test looking for it but missed it. > >> Also, AllocateHeap() should normally not be in the stack trace, but >> the test has specifically allowed for it for windows and solaris >> slowdebug builds. Although these builds should have honored the >> ALWAYSINLINE directive, it was deemed acceptable that it was not in >> slowdebug builds. However, I would not want to allow AllocateHeap() >> to appear in a product build, and best not to see it in fastdebug >> either. > > This is a test of NMT detail not a test of whether a given compiler > chooses to inline something like AllocateHeap. I don't think it is the > job of this test to be checking for something specific to the native > compiler. The previous handling of AllocateHeap seemed to be there > simply because it was the only way to deal with an optional frame - > but now that's handled generically. Its appearance means you effectively only have 3 frames to identify callsites instead of 4. If it does appear in a product build, a solution should be looked into to get rid of it. If the port owner decides it can't get rid of it (or is unwilling to), then an exception should be added to the test like was done for solaris and windows slowdebug builds. thanks, Chris > > Thanks, > David > >> Given the changes you made to allow more flexibly in which frames >> appear, I think you need to now also make sure the above 3 mentioned >> frames are not present, except for allowing AllocateHeap() in >> slowdebug builds.
>> >> thanks, >> >> Chris >> >> On 4/3/19 10:53 PM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>> >>> The actual stack trace reported by NMT detail is affected by the >>> inlining decisions of the native compiler, and on the type of build. >>> So we define an "ideal" stacktrace and then allow for some frames to >>> be missing based on empirical observations. So to date we have seen >>> two frames that may or may not be inlined and so we allow for 2 >>> non-matching entries. >>> >>> The special-casing of AllocateHeap is removed as now it is just an >>> optional frame. >>> >>> Chris: does this maintain the "spirit" of the test as you intended? >>> >>> Zhengyu: can you test this on your system(s) please. >>> >>> Thanks, >>> David >> >> From stefan.karlsson at oracle.com Thu Apr 4 06:37:47 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 4 Apr 2019 08:37:47 +0200 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef In-Reply-To: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> References: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> Message-ID: <4f7177ab-70c6-7188-516f-47a5ca2e14db@oracle.com> Looks good. StefanK On 2019-04-04 02:00, coleen.phillimore at oracle.com wrote: > Summary: Make consistent with StringTable and ResolvedMethodTable > > We decided to not have this typedef because StringTable doesn't have it > and the new concurrent hashtable for ResolvedMethodTable won't either. > This will make it consistent. > > Tested with hs tier1 on Oracle platforms. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8221872 > > Thanks, > Coleen > From thomas.stuefe at gmail.com Thu Apr 4 07:12:24 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 4 Apr 2019 09:12:24 +0200 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> Message-ID: Hi David, Chris, I think this is an improvement and goes in the right direction. Those hard-wired inline guesses always made me twitch a bit. The patch looks fine to me in its current form, since it is already an improvement. So the following remarks are "optional": - Since all we want to do is to test that NMT detail printing works, we do not have to use one of the malloc paths; I have the feeling the mmap paths are more "inline stable" since they usually end up in one of the ReservedSpace child class constructors which do not get inlined. 
Like this: 74 [0x0000000706400000 - 0x0000000800000000] reserved 4091904KB for Java Heap from 75 [0x00007f9b514cff07] ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, bool)+0xb7 76 [0x00007f9b514d08d8] ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned long, bool)+0x5f8 77 [0x00007f9b514d0f3a] ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, bool, char const*) [clone .part.29]+0x9a 78 [0x00007f9b51450331] Universe::reserve_heap(unsigned long, unsigned long)+0xe1 Or this: 256 [0x00007f9b308c5000 - 0x00007f9b3f8c5000] reserved 245760KB for Code from 257 [0x00007f9b514cad02] ReservedCodeSpace::ReservedCodeSpace(unsigned long, unsigned long, bool)+0xa2 258 [0x00007f9b505bcfb7] CodeCache::reserve_heap_memory(unsigned long)+0xe7 259 [0x00007f9b505bd75b] CodeCache::initialize_heaps()+0x2db 260 [0x00007f9b505bde45] CodeCache::initialize()+0x1b5 will be stacks you always will see. - I do not like scanning the whole output for each single stack frame. The test may give false positives. I would like it more if we were to read the file line by line, and when the first pattern line matches, check that subsequent lines match too. This is how we do call stack matching at SAP for similar tests. This is also more efficient since you do not re-scan the whole output each time. In general: NMT is really very useful. We could think about increasing NMT_TrackingStackDepth, since 4 is obviously not a lot. 6 or 8 would be better. I do not believe the memory footprint increase would be significant, but of course we would have to measure. Thanks! Thomas On Thu, Apr 4, 2019 at 8:36 AM Chris Plummer wrote: > On 4/3/19 11:23 PM, David Holmes wrote: > > Hi Chris, > > > > On 4/04/2019 4:12 pm, Chris Plummer wrote: > >> Hi David, > >> > >> I have concerns that this will hide some of the other bugs I've > >> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. 
These bugs > >> result in 1 or two frames appearing in the stacktrace that should be > >> skipped. Notably NativeCallStack::NativeCallStack() and > >> os::get_native_stack(). > > > > The test still checks those are not present first: > > > > 73 // We should never see either of these frames because they > > are supposed to be skipped. */ > > 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); > > 75 output.shouldNotContain("os::get_native_stack"); > Ah yes. I skimmed over the test looking for it but missed it. > > > >> Also, AllocateHeap() should normally not be in the stack trace, but > >> the test has specifically allowed for it for windows and solaris > >> slowdebug builds. Although these builds should have honored the > >> ALWAYSINLINE directive, it was deemed acceptable that it was not in > >> slowdebug builds. However, I would not want to allow AllocateHeap() > >> to appear in a product build, and best not to see it in fastdebug > >> either. > > > > This is a test of NMT detail not a test of whether a given compiler > > chooses to inline something like AllocateHeap. I don't think it is the > > job of this test to be checking for something specific to the native > > compiler. The previous handling of AllocateHeap seemed to be there > > simply because it was the only way to deal with an optional frame - > > but now that's handled generically. > It's appearance means you effectively only have 3 frames to identity > callsites instead of 4. If it does appear in a product build, a solution > should be looked into to get rid of it. If the port owner decides it > can't get rid of it (or is unwilling to), then an exception should be > added to the test like was done for solaris and windows slowdebug builds. 
> thanks, > > Chris > > > > Thanks, > > David > > > >> Given the changes you made to allow more flexibly in which frames > >> appear, I think you need to now also make sure the above 3 mentioned > >> frames are not present, except for allowing AllocateHeap() in > >> slowdebug builds. > >> > >> thanks, > >> > >> Chris > >> > >> On 4/3/19 10:53 PM, David Holmes wrote: > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 > >>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ > >>> > >>> The actual stack trace reported by NMT detail is affected by the > >>> inlining decisions of the native compiler, and on the type of build. > >>> So we define an "ideal" stacktrace and then allow for some frames to > >>> be missing based on empirical observations. So to date we have seen > >>> two frames that may or may not be inlined and so we allow for 2 > >>> non-matching entries. > >>> > >>> The special-casing of AllocateHeap is removed as now it is just an > >>> optional frame. > >>> > >>> Chris: does this maintain the "spirit" of the test as you intended? > >>> > >>> Zhengyu: can you test this on your system(s) please. > >>> > >>> Thanks, > >>> David > >> > >> > > > From david.holmes at oracle.com Thu Apr 4 07:14:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Apr 2019 17:14:54 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> Message-ID: <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> On 4/04/2019 4:35 pm, Chris Plummer wrote: > On 4/3/19 11:23 PM, David Holmes wrote: >> Hi Chris, >> >> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>> Hi David, >>> >>> I have concerns that this will hide some of the other bugs I've >>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. 
These bugs >>> result in 1 or two frames appearing in the stacktrace that should be >>> skipped. Notably NativeCallStack::NativeCallStack() and >>> os::get_native_stack(). >> >> The test still checks those are not present first: >> >> 73 // We should never see either of these frames because they >> are supposed to be skipped. */ >> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >> 75 output.shouldNotContain("os::get_native_stack"); > Ah yes. I skimmed over the test looking for it but missed it. >> >>> Also, AllocateHeap() should normally not be in the stack trace, but >>> the test has specifically allowed for it for windows and solaris >>> slowdebug builds. Although these builds should have honored the >>> ALWAYSINLINE directive, it was deemed acceptable that it was not in >>> slowdebug builds. However, I would not want to allow AllocateHeap() >>> to appear in a product build, and best not to see it in fastdebug >>> either. >> >> This is a test of NMT detail not a test of whether a given compiler >> chooses to inline something like AllocateHeap. I don't think it is the >> job of this test to be checking for something specific to the native >> compiler. The previous handling of AllocateHeap seemed to be there >> simply because it was the only way to deal with an optional frame - >> but now that's handled generically. > Its appearance means you effectively only have 3 frames to identify > callsites instead of 4. Both stacktraces in the old test had 4 elements and expected 4 matches. The current bug is that one of those (new_entry) could actually be inlined as well, resulting in only 3 matches. So that is what the revised test checks for: at least 3 matches. Often there will be 4 matches. Hmmm but now I'm wondering why this trace: 50 public static String stackTraceAllocateHeap = 51 ".*AllocateHeap.*\n" + 52 ".*ModuleEntryTable.*new_entry.*\n" + doesn't include ".*Hashtable.*allocate_new_entry.*"?
Was it getting inlined already when AllocateHeap was not? Even so we still end up with 4 frames matching normally. > If it does appear in a product build, a solution > should be looked into to get rid of it. If the port owner decides it > can't get rid of it (or is unwilling to), then an exception should be > added to the test like was done for solaris and windows slowdebug builds. Are we specifically trying to test the compiler's ability to inline that function and just happen to be using this test to verify that? Doesn't seem like a suitable place to do this - and why do we need to do it? The Visual Studio docs state: "You cannot force the compiler to inline a particular function, even with the __forceinline keyword." so ALWAYSINLINE is just a hint even in product builds and could change with any update to the compiler. For Solaris Studio it is again not guaranteed to inline - specifically -xinline only has an effect at -xO3 or higher. Which likely explains why it is ignored in slowdebug. And there are other cases where it won't honour the ALWAYSINLINE. Even with gcc we seem to be misusing the attribute if we want to ensure inlining when not optimising: "GCC does not inline any functions when not optimizing unless you specify the 'always_inline' attribute for the function, like this: /* Prototype. */ inline void foo (const char) __attribute__((always_inline));" and we don't write it that way. So if we're that concerned about release builds guaranteeing to inline AllocateHeap then I think we need something a bit more explicit than this test to determine that. Thanks, David > thanks, > > Chris >> >> Thanks, >> David >> >>> Given the changes you made to allow more flexibly in which frames >>> appear, I think you need to now also make sure the above 3 mentioned >>> frames are not present, except for allowing AllocateHeap() in >>> slowdebug builds.
>>> >>> thanks, >>> >>> Chris >>> >>> On 4/3/19 10:53 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>> >>>> The actual stack trace reported by NMT detail is affected by the >>>> inlining decisions of the native compiler, and on the type of build. >>>> So we define an "ideal" stacktrace and then allow for some frames to >>>> be missing based on empirical observations. So to date we have seen >>>> two frames that may or may not be inlined and so we allow for 2 >>>> non-matching entries. >>>> >>>> The special-casing of AllocateHeap is removed as now it is just an >>>> optional frame. >>>> >>>> Chris: does this maintain the "spirit" of the test as you intended? >>>> >>>> Zhengyu: can you test this on your system(s) please. >>>> >>>> Thanks, >>>> David >>> >>> > > From shade at redhat.com Thu Apr 4 08:21:14 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 4 Apr 2019 10:21:14 +0200 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> Message-ID: On 4/4/19 2:12 AM, David Holmes wrote: > There's no error checking in the dump part - if something goes wrong we won't see anything to > indicate what it was, and AFAICS we won't even notice the failure (till the second part of the test). Right. This should help: - CDSTestUtils.run(opts); + CDSTestUtils.run(opts).assertNormalExit(""); > Aside: not sure why we can't just use: > @run main/othervm -XX:SharedArchiveFile=... -Xshare:dump > and let jtreg deal with error? We technically can, but then we would lose the ability to generate shared archive file name, and would need to add the same line to the test subclasses, e.g ReplaceCriticalClassesForSubgraphs.java > Please update copyright to "2018, 2019," Updated. 
New webrev: http://cr.openjdk.java.net/~shade/8221918/webrev.03/ -Aleksey From aph at redhat.com Thu Apr 4 09:16:43 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 4 Apr 2019 10:16:43 +0100 Subject: [aarch64-port-dev ] RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> <5CA4E78B.9030705@oracle.com> Message-ID: <25e5eafb-2859-d270-18b4-a910b5eb2302@redhat.com> On 4/4/19 1:19 AM, David Holmes wrote: > Sorry I'll have to defer to someone more experienced with this as I > can't comment on the validity of the change. This is the proposal: diff --git a/test/lib/jdk/test/lib/containers/docker/DockerfileConfig.java b/test/lib/jdk/test/lib/containers/docker/DockerfileConfig.java --- a/test/lib/jdk/test/lib/containers/docker/DockerfileConfig.java +++ b/test/lib/jdk/test/lib/containers/docker/DockerfileConfig.java @@ -46,7 +46,7 @@ public class DockerfileConfig { switch (Platform.getOsArch()) { case "aarch64": - return "aarch64/ubuntu"; + return "arm64v8/ubuntu"; case "ppc64le": return "ppc64le/ubuntu"; case "s390x": Nick will have to explain what it's supposed to do, and why. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.gahlin at oracle.com Thu Apr 4 09:25:17 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 4 Apr 2019 11:25:17 +0200 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <5CA4E6C4.9040704@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <5CA4E6C4.9040704@oracle.com> Message-ID: <7E31FB1E-7EC3-49CE-B13C-CC6AADBA3FA2@oracle.com> Looks good. Thanks for fixing this Misha. I don't have a preference if the removal override should be in DockerTestUtils or in DockerBasicTest. Erik > On 3 Apr 2019, at 19:00, Mikhailo Seledtsov wrote: > > Ping... 
> > On 3/29/19, 4:41 PM, mikhailo.seledtsov at oracle.com wrote: >> These new parameters are introduced to help in development and troubleshooting of the Docker tests. >> >> >> 1. Docker command: jdk.test.docker.command >> On some systems docker is installed in locations other than /bin or /usr/bin. JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. >> >> 2. Retaining image after test: jdk.test.docker.retain.image >> This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. >> Specifying this option to "true" provides an ability to inspect the image, run the image, etc. >> >> 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test >> This feature is useful when developing tests on non-Linux platform. In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. >> >> Also, now that jtreg.SkippedException is available started using it. 
>> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >> Testing: ran docker tests >> >> >> Thank you, >> Misha From nick.gasson at arm.com Thu Apr 4 09:59:00 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 4 Apr 2019 17:59:00 +0800 Subject: [aarch64-port-dev ] RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: <25e5eafb-2859-d270-18b4-a910b5eb2302@redhat.com> References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> <5CA4E78B.9030705@oracle.com> <25e5eafb-2859-d270-18b4-a910b5eb2302@redhat.com> Message-ID: <862a6ce7-0544-a4a0-5736-08b27089c876@arm.com> Hi Andrew, > > Nick will have to explain what it's supposed to do, and why. > By default on all non-x86 Linux platforms the Docker tests are supposed to use the official Ubuntu "latest" (=18.04) image from Docker Hub. But for AArch64 the image used is "aarch64/ubuntu" which according to [1] is deprecated in favour of "arm64v8/ubuntu" and hasn't been updated since 16.04: "The aarch64 organization is deprecated in favor of the more-specific arm64v8 organization, as per https://github.com/docker-library/official-images#architectures-other-than-amd64. Please adjust your usages accordingly." Practically, this causes problems if your JDK image is linked against a recent glibc: the Docker tests will fail with symbol resolution errors when these binaries are run in the Ubuntu 16.04 container. 
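To make the effect of the one-line change concrete, here is a minimal, self-contained sketch of the architecture-to-image mapping. It is not the real DockerfileConfig source: the class name BaseImageSketch, the method baseImageFor, and the fallback branch are hypothetical and exist only to keep the example complete. Only the "aarch64" and "ppc64le" image names are taken from the diff quoted earlier in this thread.

```java
// Minimal sketch, NOT the real DockerfileConfig: the class name, method name
// and the default branch are assumptions made for illustration only.
final class BaseImageSketch {

    // Maps an architecture string to a Docker Hub repository name.
    static String baseImageFor(String osArch) {
        switch (osArch) {
            case "aarch64":
                // "aarch64/ubuntu" is deprecated and has not moved past 16.04,
                // so the patch points at the arm64v8 organization instead.
                return "arm64v8/ubuntu";
            case "ppc64le":
                return "ppc64le/ubuntu";
            default:
                // Hypothetical fallback; the real default is not shown in the thread.
                return "ubuntu";
        }
    }

    public static void main(String[] args) {
        // Prints the image this sketch would pick for the current machine.
        System.out.println(baseImageFor(System.getProperty("os.arch")));
    }
}
```

With the old mapping, a JDK built against a recent glibc would be run inside a 16.04 userland, which is where the symbol resolution errors come from.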
[1] https://hub.docker.com/r/aarch64/ubuntu Thanks, Nick From aph at redhat.com Thu Apr 4 10:41:00 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 4 Apr 2019 11:41:00 +0100 Subject: [aarch64-port-dev ] RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: <862a6ce7-0544-a4a0-5736-08b27089c876@arm.com> References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> <5CA4E78B.9030705@oracle.com> <25e5eafb-2859-d270-18b4-a910b5eb2302@redhat.com> <862a6ce7-0544-a4a0-5736-08b27089c876@arm.com> Message-ID: On 4/4/19 10:59 AM, Nick Gasson wrote: > Hi Andrew, > > > > > Nick will have to explain what it's supposed to do, and why. > > > > By default on all non-x86 Linux platforms the Docker tests are supposed > to use the official Ubuntu "latest" (=18.04) image from Docker Hub. But > for AArch64 the image used is "aarch64/ubuntu" which according to [1] is > deprecated in favour of "arm64v8/ubuntu" and hasn't been updated since > 16.04: > > "The aarch64 organization is deprecated in favor of the more-specific > arm64v8 organization, as per > https://github.com/docker-library/official-images#architectures-other-than-amd64. > Please adjust your usages accordingly." > > Practically, this causes problems if your JDK image is linked against a > recent glibc: the Docker tests will fail with symbol resolution errors > when these binaries are run in the Ubuntu 16.04 container. > > [1] https://hub.docker.com/r/aarch64/ubuntu The patch is OK. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Thu Apr 4 10:53:32 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 4 Apr 2019 10:53:32 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi Götz, looks good to me, too. 
Seems like there are no objections against using this syntax. I don't have any ones, either. I've looked over the code and haven't found any issues. Most of the hotspot change is only related to the signature change from JNI to external Java syntax. In addition, I have found: 1. linkResolver.cpp: 1246 to print class name in external format 2. klassVtable.cpp: 1230 to print external_type instead of just "type". I think these are nice improvements. I hope I haven't missed anything important to notice. Best regards, Martin -----Original Message----- From: hotspot-runtime-dev On Behalf Of David Holmes Sent: Donnerstag, 4. April 2019 01:53 To: Lindenmaier, Goetz ; 'hotspot-runtime-dev at openjdk.java.net' Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. Looks good. I'm re-running through our test system. Thanks, David On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: > Hi, > > here a new webrev: > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ > > I have removed the test with the bad class name. > I named the variables name_str/array_str now. > > I'll push it to jdk-submit for further testing. > > Best regards, > Goetz. > > >> -----Original Message----- >> From: David Holmes >> Sent: Mittwoch, 3. April 2019 00:42 >> To: Lindenmaier, Goetz ; 'hotspot-runtime- >> dev at openjdk.java.net' >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in >> java-like Syntax. >> >> Two follow ups ... >> >> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: >>> Hi David, >>> >>>> Overall this looks good to me - a few minor nits/comments below. >>> thanks! >>> >>>> I've applied the patch and am running it through our internal build and >>>> test system (tiers 1-3 initially). >>>> >>>> I have a suspicion there will be other tests that need to be updated - >>>> possibly even JCK tests. Discovering those a-priori will be difficult >>>> (simply running all the tests would take an extremely long time). 
Will >>>> have a discussion about how best to handle those internally. >>> >>> I ran most JCK test without problem. They usually don't check messages. >>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except >>> for headful tests). >> >> Thanks for the additional testing info. I duplicated some of that but >> found no issues, other than a couple of closed tests. >> >>>> src/hotspot/share/oops/method.cpp >>>> Please put a blank line after each new method. >>> Fixed. >>> >>>> src/hotspot/share/oops/symbol.cpp >>>> >>>> + os->print("."); >>>> + } else { >>>> + os->print("%c", start[i]); >>>> >>>> Please use os->put(char c) for individual characters. >>> Fixed. >>> >>>> The "start" name would seem better as "buf" to me. >>> Hmm, buf to me is a local chunk of memory used temporarily. >>> What about array_sig, class_sig? >> >> Not really "sigs". >> >> str? Else just leave it. >> >> Thanks, >> David >> ----- >> >>>> + } else if (start[i] == 'L') { >>>> + print_class(os, start+i+1, len-i-2); >>>> Can you insert a comment that help explains the -2: >>> Done. >>> >>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { >>>> space after for (2 occurrences) >>> Fixed. >>> >>>> >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth >>>> ods.java >>>> >>>> Not sure the special characters can be used directly in the sources. Can >>>> they not be put in as unicode escapes at all places? >>> I'll try what Ioi proposed. I'll post a new webrev including that. >>> >>> Best regards, >>> Goetz. >>> >>> >>>> >>>> --- >>>> >>>> Thanks, >>>> David >>>> ------- >>>> >>>> >>>> On 1/04/2019 12:32 pm, David Holmes wrote: >>>>> Hi Goetz, >>>>> >>>>> I'm looking at this ... >>>>> >>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> Any interest in this change? >>>>> >>>>> I'm personally of two minds here because these VM generated exceptions >>>>> are not only delivered to Java source code. 
I'd like to know how other >>>>> language developers using the JVM runtime would view this. >>>>> >>>>> That aside if you're going to make a change like this then I think the >>>>> full signature string has to be quoted in some way to delineate it >>>>> within the larger message. >>>>> >>>>>> Should I split it to adapt the exceptions separately one-by-one to >>>>>> make the change smaller and simplify the review? >>>>> >>>>> I don't think that is necessary. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> I would propose to start out with AbstractMethodError only. >>>>>> >>>>>> Best regards, >>>>>> ?? Goetz. >>>>>> >>>>>> >>>>>> >>>>>> From: Lindenmaier, Goetz >>>>>> Sent: Tuesday, March 26, 2019 1:06 PM >>>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in >>>>>> java-like Syntax. >>>>>> >>>>>> Hi, >>>>>> >>>>>> A row of exceptions are thrown from the hotspot runtime. >>>>>> They print methods with their JNI signatures. To increase >>>>>> readability and resemblance to source code, this change proposes >>>>>> to print them in a Java-like syntax. >>>>>> >>>>>> Some examples: >>>>>> current method printouts: >>>>>> >>>>>> test.TeMe3_B.ma()V >>>>>> test.TeMe3_B.ma(IZ[[BF)[[D >>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; >>>>>> >>>>>> improved format: >>>>>> >>>>>> void test.TeMe3_B.ma() >>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) >>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) >>>>>> >>>>>> So far, Method::name_and_sig_as_C_string() is used to print >>>>>> these messages. >>>>>> >>>>>> This change implements function Method::external_name() that prints >>>>>> the better >>>>>> format. >>>>>> external_name() is chosen according to Klass::external_name(). >>>>>> >>>>>> Printing the better format requires parsing the signature >>>>>> Symbol. 
This is implemented in >>>>>> void Symbol::print_as_signature_external_return_type(outputStream >> *os); >>>>>> void Symbol::print_as_signature_external_parameters(outputStream >> *os); >>>>>> These method names are chosen according to >>>>>> Symbol::as_class_external_name(). >>>>>> >>>>>> See this partial webrev for the new functions: >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- >>>> new_methods/ >>>>>> >>>>>> >>>>>> Also, I changed a lot of exception messages to use the new format. >>>>>> This required to adapt a row of tests. I added a test to check >>>>>> the signature printing does not regress.? For all these changes, see >>>>>> the full webrev: >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ >>>>>> >>>>>> I hope I detected all places where method signatures are printed to >>>>>> exception messages. >>>>>> >>>>>> Best regards, >>>>>> ?? Goetz. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> From goetz.lindenmaier at sap.com Thu Apr 4 12:05:49 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 4 Apr 2019 12:05:49 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi Martin, thanks for looking at my change! > In addition, I have found: > 1. linkResolver.cpp: 1246 to print class name in external format Yes, other IllegalAccessErrors use the '.' format, too. This is for consistency. > 2. klassVtable.cpp: 1230 to print external_type instead of just "type". Also for consistency, see LinkageError in systemDictionary.cpp:2101. Best regards, Goetz. > I think these are nice improvements. > I hope I haven't missed anything important to notice. > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-runtime-dev > On Behalf Of David Holmes > Sent: Donnerstag, 4. 
April 2019 01:53 > To: Lindenmaier, Goetz ; 'hotspot-runtime- > dev at openjdk.java.net' > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > Looks good. > > I'm re-running through our test system. > > Thanks, > David > > On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: > > Hi, > > > > here a new webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ > > > > I have removed the test with the bad class name. > > I named the variables name_str/array_str now. > > > > I'll push it to jdk-submit for further testing. > > > > Best regards, > > Goetz. > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Mittwoch, 3. April 2019 00:42 > >> To: Lindenmaier, Goetz ; 'hotspot-runtime- > >> dev at openjdk.java.net' > >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > >> java-like Syntax. > >> > >> Two follow ups ... > >> > >> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: > >>> Hi David, > >>> > >>>> Overall this looks good to me - a few minor nits/comments below. > >>> thanks! > >>> > >>>> I've applied the patch and am running it through our internal build and > >>>> test system (tiers 1-3 initially). > >>>> > >>>> I have a suspicion there will be other tests that need to be updated - > >>>> possibly even JCK tests. Discovering those a-priori will be difficult > >>>> (simply running all the tests would take an extremely long time). Will > >>>> have a discussion about how best to handle those internally. > >>> > >>> I ran most JCK test without problem. They usually don't check messages. > >>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except > >>> for headful tests). > >> > >> Thanks for the additional testing info. I duplicated some of that but > >> found no issues, other than a couple of closed tests. > >> > >>>> src/hotspot/share/oops/method.cpp > >>>> Please put a blank line after each new method. > >>> Fixed. 
> >>> > >>>> src/hotspot/share/oops/symbol.cpp > >>>> > >>>> + os->print("."); > >>>> + } else { > >>>> + os->print("%c", start[i]); > >>>> > >>>> Please use os->put(char c) for individual characters. > >>> Fixed. > >>> > >>>> The "start" name would seem better as "buf" to me. > >>> Hmm, buf to me is a local chunk of memory used temporarily. > >>> What about array_sig, class_sig? > >> > >> Not really "sigs". > >> > >> str? Else just leave it. > >> > >> Thanks, > >> David > >> ----- > >> > >>>> + } else if (start[i] == 'L') { > >>>> + print_class(os, start+i+1, len-i-2); > >>>> Can you insert a comment that help explains the -2: > >>> Done. > >>> > >>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > >>>> space after for (2 occurrences) > >>> Fixed. > >>> > >>>> > >> > test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth > >>>> ods.java > >>>> > >>>> Not sure the special characters can be used directly in the sources. Can > >>>> they not be put in as unicode escapes at all places? > >>> I'll try what Ioi proposed. I'll post a new webrev including that. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>>> > >>>> --- > >>>> > >>>> Thanks, > >>>> David > >>>> ------- > >>>> > >>>> > >>>> On 1/04/2019 12:32 pm, David Holmes wrote: > >>>>> Hi Goetz, > >>>>> > >>>>> I'm looking at this ... > >>>>> > >>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Any interest in this change? > >>>>> > >>>>> I'm personally of two minds here because these VM generated > exceptions > >>>>> are not only delivered to Java source code. I'd like to know how other > >>>>> language developers using the JVM runtime would view this. > >>>>> > >>>>> That aside if you're going to make a change like this then I think the > >>>>> full signature string has to be quoted in some way to delineate it > >>>>> within the larger message. 
> >>>>> > >>>>>> Should I split it to adapt the exceptions separately one-by-one to > >>>>>> make the change smaller and simplify the review? > >>>>> > >>>>> I don't think that is necessary. > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>>> I would propose to start out with AbstractMethodError only. > >>>>>> > >>>>>> Best regards, > >>>>>> ?? Goetz. > >>>>>> > >>>>>> > >>>>>> > >>>>>> From: Lindenmaier, Goetz > >>>>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>>>> java-like Syntax. > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> A row of exceptions are thrown from the hotspot runtime. > >>>>>> They print methods with their JNI signatures. To increase > >>>>>> readability and resemblance to source code, this change proposes > >>>>>> to print them in a Java-like syntax. > >>>>>> > >>>>>> Some examples: > >>>>>> current method printouts: > >>>>>> > >>>>>> test.TeMe3_B.ma()V > >>>>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>>>> > >>>>>> improved format: > >>>>>> > >>>>>> void test.TeMe3_B.ma() > >>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>>>> > >>>>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>>>> these messages. > >>>>>> > >>>>>> This change implements function Method::external_name() that prints > >>>>>> the better > >>>>>> format. > >>>>>> external_name() is chosen according to Klass::external_name(). > >>>>>> > >>>>>> Printing the better format requires parsing the signature > >>>>>> Symbol. 
This is implemented in > >>>>>> void Symbol::print_as_signature_external_return_type(outputStream > >> *os); > >>>>>> void Symbol::print_as_signature_external_parameters(outputStream > >> *os); > >>>>>> These method names are chosen according to > >>>>>> Symbol::as_class_external_name(). > >>>>>> > >>>>>> See this partial webrev for the new functions: > >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > >>>> new_methods/ > >>>>>> > >>>>>> > >>>>>> Also, I changed a lot of exception messages to use the new format. > >>>>>> This required to adapt a row of tests. I added a test to check > >>>>>> the signature printing does not regress.? For all these changes, see > >>>>>> the full webrev: > >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg- > signature/01/ > >>>>>> > >>>>>> I hope I detected all places where method signatures are printed to > >>>>>> exception messages. > >>>>>> > >>>>>> Best regards, > >>>>>> ?? Goetz. > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> From goetz.lindenmaier at sap.com Thu Apr 4 12:19:42 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 4 Apr 2019 12:19:42 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi David, I ran it through jdk-submit, no failures. I now added reviewer information to the patch. I updated the webrev in-place; nothing changed except the reviewer information. You said we need to sync on the push. Do you just want to sponsor the change to make sure it works out? Or do you want to announce when I should push? Feel free to just push it in case you want to... Best regards, Goetz. > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 4. 
April 2019 01:53 > To: Lindenmaier, Goetz ; 'hotspot-runtime- > dev at openjdk.java.net' > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > Looks good. > > I'm re-running through our test system. > > Thanks, > David > > On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: > > Hi, > > > > here a new webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ > > > > I have removed the test with the bad class name. > > I named the variables name_str/array_str now. > > > > I'll push it to jdk-submit for further testing. > > > > Best regards, > > Goetz. > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Mittwoch, 3. April 2019 00:42 > >> To: Lindenmaier, Goetz ; 'hotspot-runtime- > >> dev at openjdk.java.net' > >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > >> java-like Syntax. > >> > >> Two follow ups ... > >> > >> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: > >>> Hi David, > >>> > >>>> Overall this looks good to me - a few minor nits/comments below. > >>> thanks! > >>> > >>>> I've applied the patch and am running it through our internal build and > >>>> test system (tiers 1-3 initially). > >>>> > >>>> I have a suspicion there will be other tests that need to be updated - > >>>> possibly even JCK tests. Discovering those a-priori will be difficult > >>>> (simply running all the tests would take an extremely long time). Will > >>>> have a discussion about how best to handle those internally. > >>> > >>> I ran most JCK test without problem. They usually don't check messages. > >>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except > >>> for headful tests). > >> > >> Thanks for the additional testing info. I duplicated some of that but > >> found no issues, other than a couple of closed tests. > >> > >>>> src/hotspot/share/oops/method.cpp > >>>> Please put a blank line after each new method. > >>> Fixed. 
> >>> > >>>> src/hotspot/share/oops/symbol.cpp > >>>> > >>>> + os->print("."); > >>>> + } else { > >>>> + os->print("%c", start[i]); > >>>> > >>>> Please use os->put(char c) for individual characters. > >>> Fixed. > >>> > >>>> The "start" name would seem better as "buf" to me. > >>> Hmm, buf to me is a local chunk of memory used temporarily. > >>> What about array_sig, class_sig? > >> > >> Not really "sigs". > >> > >> str? Else just leave it. > >> > >> Thanks, > >> David > >> ----- > >> > >>>> + } else if (start[i] == 'L') { > >>>> + print_class(os, start+i+1, len-i-2); > >>>> Can you insert a comment that help explains the -2: > >>> Done. > >>> > >>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > >>>> space after for (2 occurrences) > >>> Fixed. > >>> > >>>> > >> > test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth > >>>> ods.java > >>>> > >>>> Not sure the special characters can be used directly in the sources. Can > >>>> they not be put in as unicode escapes at all places? > >>> I'll try what Ioi proposed. I'll post a new webrev including that. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>>> > >>>> --- > >>>> > >>>> Thanks, > >>>> David > >>>> ------- > >>>> > >>>> > >>>> On 1/04/2019 12:32 pm, David Holmes wrote: > >>>>> Hi Goetz, > >>>>> > >>>>> I'm looking at this ... > >>>>> > >>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >>>>>> Hi, > >>>>>> > >>>>>> Any interest in this change? > >>>>> > >>>>> I'm personally of two minds here because these VM generated > exceptions > >>>>> are not only delivered to Java source code. I'd like to know how other > >>>>> language developers using the JVM runtime would view this. > >>>>> > >>>>> That aside if you're going to make a change like this then I think the > >>>>> full signature string has to be quoted in some way to delineate it > >>>>> within the larger message. 
> >>>>> > >>>>>> Should I split it to adapt the exceptions separately one-by-one to > >>>>>> make the change smaller and simplify the review? > >>>>> > >>>>> I don't think that is necessary. > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> ----- > >>>>> > >>>>>> I would propose to start out with AbstractMethodError only. > >>>>>> > >>>>>> Best regards, > >>>>>> ?? Goetz. > >>>>>> > >>>>>> > >>>>>> > >>>>>> From: Lindenmaier, Goetz > >>>>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>>>> java-like Syntax. > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> A row of exceptions are thrown from the hotspot runtime. > >>>>>> They print methods with their JNI signatures. To increase > >>>>>> readability and resemblance to source code, this change proposes > >>>>>> to print them in a Java-like syntax. > >>>>>> > >>>>>> Some examples: > >>>>>> current method printouts: > >>>>>> > >>>>>> test.TeMe3_B.ma()V > >>>>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>>>> > >>>>>> improved format: > >>>>>> > >>>>>> void test.TeMe3_B.ma() > >>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>>>> > >>>>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>>>> these messages. > >>>>>> > >>>>>> This change implements function Method::external_name() that prints > >>>>>> the better > >>>>>> format. > >>>>>> external_name() is chosen according to Klass::external_name(). > >>>>>> > >>>>>> Printing the better format requires parsing the signature > >>>>>> Symbol. 
This is implemented in > >>>>>> void Symbol::print_as_signature_external_return_type(outputStream > >> *os); > >>>>>> void Symbol::print_as_signature_external_parameters(outputStream > >> *os); > >>>>>> These method names are chosen according to > >>>>>> Symbol::as_class_external_name(). > >>>>>> > >>>>>> See this partial webrev for the new functions: > >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01- > >>>> new_methods/ > >>>>>> > >>>>>> > >>>>>> Also, I changed a lot of exception messages to use the new format. > >>>>>> This required to adapt a row of tests. I added a test to check > >>>>>> the signature printing does not regress.? For all these changes, see > >>>>>> the full webrev: > >>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg- > signature/01/ > >>>>>> > >>>>>> I hope I detected all places where method signatures are printed to > >>>>>> exception messages. > >>>>>> > >>>>>> Best regards, > >>>>>> ?? Goetz. > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> From coleen.phillimore at oracle.com Thu Apr 4 13:12:44 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 4 Apr 2019 09:12:44 -0400 Subject: RFR (T) 8221872: Remove ClassLoaderWeakHandle typedef In-Reply-To: <4f7177ab-70c6-7188-516f-47a5ca2e14db@oracle.com> References: <1d1affc5-41bb-90c4-a252-9e3b04bf28d2@oracle.com> <4f7177ab-70c6-7188-516f-47a5ca2e14db@oracle.com> Message-ID: <2d5bef2c-a3bc-3a88-9827-d343b76ae997@oracle.com> Thanks Stefan. Coleen On 4/4/19 2:37 AM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-04-04 02:00, coleen.phillimore at oracle.com wrote: >> Summary: Make consistent with StringTable and ResolvedMethodTable >> >> We decided to not have this typedef because StringTable doesn't have >> it and the new concurrent hashtable for ResolvedMethodTable won't >> either. This will make it consistent. 
>> >> Tested with hs tier1 on Oracle platforms. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8221872.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8221872 >> >> Thanks, >> Coleen >> From zgu at redhat.com Thu Apr 4 14:36:01 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 4 Apr 2019 10:36:01 -0400 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> Message-ID: <7014f0ab-1374-00db-a038-a4399d24f90c@redhat.com> On 4/4/19 1:53 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 > Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ > > The actual stack trace reported by NMT detail is affected by the > inlining decisions of the native compiler, and on the type of build. So > we define an "ideal" stacktrace and then allow for some frames to be > missing based on empirical observations. So to date we have seen two > frames that may or may not be inlined and so we allow for 2 non-matching > entries. > > The special-casing of AllocateHeap is removed as now it is just an > optional frame. > > Chris: does this maintain the "spirit" of the test as you intended? > > Zhengyu: can you test this on your system(s) please. Passed! Thanks, -Zhengyu > > Thanks, > David From martin.doerr at sap.com Thu Apr 4 14:42:25 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 4 Apr 2019 14:42:25 +0000 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> References: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> Message-ID: Hi Coleen and Zhengyu, thanks for your feedback. I've also replaced pointer comparison by numeric comparison to avoid undefined behavior. 
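The overflow problem can be illustrated with a small model. The sketch below is written in Java rather than HotSpot's C++ and is not the code from the webrev: RangeCheckSketch, Probe, wraps and isReadableRange are hypothetical names, and addresses are modelled as unsigned 64-bit values held in a long. It only shows the general idea of comparing addresses numerically and rejecting a range whose end wraps around, instead of letting the readability probe be skipped.

```java
// Minimal model of an overflow-aware range check; all names are hypothetical
// and this is not the HotSpot implementation.
final class RangeCheckSketch {

    /** Stand-in for a platform-specific "can this address be read?" probe. */
    interface Probe {
        boolean isReadable(long address);
    }

    // Returns true if from + length wraps around the 64-bit address space.
    // Conservatively, a range ending exactly at 2^64 is also treated as wrapped.
    static boolean wraps(long from, long length) {
        long to = from + length; // may wrap around
        return Long.compareUnsigned(to, from) < 0;
    }

    // Rejects empty or wrapped ranges up front, then probes the first and
    // last byte. Probing only two addresses keeps the sketch short.
    static boolean isReadableRange(long from, long length, Probe probe) {
        if (length <= 0 || wraps(from, length)) {
            return false; // do not silently skip the readability check
        }
        long last = from + length - 1;
        return probe.isReadable(from) && probe.isReadable(last);
    }

    public static void main(String[] args) {
        Probe everythingReadable = address -> true;
        // A range starting 16 bytes below the top of the address space wraps.
        System.out.println(isReadableRange(-16L, 32L, everythingReadable));
        // An ordinary range does not.
        System.out.println(isReadableRange(0x1000L, 32L, everythingReadable));
    }
}
```

The design point is that the wrap test happens before any probing, so a bogus pointer near the top of the address space is reported as invalid rather than accepted by default.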
New webrev: http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.01/ Best regards, Martin -----Original Message----- From: hotspot-runtime-dev On Behalf Of coleen.phillimore at oracle.com Sent: Donnerstag, 4. April 2019 02:33 To: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses On 4/2/19 10:33 AM, Doerr, Martin wrote: > Hi Zhengyu, > > that would be fine, too. I'll put it there if other reviewers prefer that, too. Yes, I prefer that too. Coleen > > Thanks and best regards, > Martin > > > -----Original Message----- > From: Zhengyu Gu > Sent: Dienstag, 2. April 2019 16:01 > To: Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses > > Hi Martin, > > Would it be more proper to do the check in os::is_readable_range()? > > Thanks, > > -Zhengyu > > On 4/2/19 9:05 AM, Doerr, Martin wrote: >> Hi, >> >> I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: >> Address computation can overflow leading to skipped readability check. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8221833 >> >> Webrev: >> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ >> >> Please review. >> >> Best regards, >> Martin >> From mikhailo.seledtsov at oracle.com Thu Apr 4 14:52:28 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 4 Apr 2019 07:52:28 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> Message-ID: <0fc9b999-2024-ad4d-ce8e-7dc6260638e0@oracle.com> Hi Igor, Thank you for review. I can do the rename, makes it shorter/simpler w/o losing the meaning.
I presume no need to re-post webrev for this small change. Misha On 4/3/19 6:59 PM, Igor Ignatyev wrote: > Hi Misha, > > overall looks good to me. I'd use 'jdk.test.docker.jdk' property name instead of 'jdk.test.docker.jdk.under.test' though, but I don't insist. > > -- Igor > >> On Mar 29, 2019, at 4:41 PM, mikhailo.seledtsov at oracle.com wrote: >> >> These new parameters are introduced to help in development and troubleshooting of the Docker tests. >> >> >> 1. Docker command: jdk.test.docker.command >> On some systems docker is installed in locations other than /bin or /usr/bin. JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. >> >> 2. Retaining image after test: jdk.test.docker.retain.image >> This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. >> Specifying this option to "true" provides an ability to inspect the image, run the image, etc. >> >> 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test >> This feature is useful when developing tests on non-Linux platform. In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. >> >> Also, now that jtreg.SkippedException is available started using it. 
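The three properties described above could be read by a test helper roughly as follows. This is a minimal sketch, not the code from the webrev: the property names come from the message, while the class name `DockerTestConfigSketch`, the default command `docker`, and the fallback to the `test.jdk` property and then `java.home` are assumptions made for illustration.

```java
// Hypothetical helper; only the three property names are taken from the thread.
final class DockerTestConfigSketch {

    /** Docker executable to run; assumed to default to "docker" found on PATH. */
    static String dockerCommand() {
        return System.getProperty("jdk.test.docker.command", "docker");
    }

    /** True only if the property is set to "true" (case-insensitive). */
    static boolean retainImageAfterTest() {
        return Boolean.getBoolean("jdk.test.docker.retain.image");
    }

    /** JDK to copy into the image; the fallback chain is an assumption. */
    static String jdkForDocker() {
        String fallback = System.getProperty("test.jdk", System.getProperty("java.home"));
        return System.getProperty("jdk.test.docker.jdk.under.test", fallback);
    }
}
```

With such a helper, a run on a system where docker lives in /usr/local/bin would pass `-Djdk.test.docker.command=/usr/local/bin/docker` to the test JVM, and cleanup code would skip image removal whenever `retainImageAfterTest()` returns true.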
>> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >> Testing: ran docker tests >> >> >> Thank you, >> Misha From mikhailo.seledtsov at oracle.com Thu Apr 4 14:55:24 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 4 Apr 2019 07:55:24 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <6CB2736D-D389-4B87-B31E-6008641A15BD@oracle.com> Message-ID: On 4/3/19 7:06 PM, Igor Ignatyev wrote: > took another look at removeDockerImage, and I don't like the new version. you made an assumption that all usages of this will be just to clean after a test, which might not be true, I can imagine someone needing to remove docker image as part of their test, OK, makes sense. > so I'd prefer to revert changes in DockerTestUtils::removeDockerImage and just replaced DockerBasicTest::removeImageAfterTest with DockerTestUtils::RETAIN_IMAGE_AFTER_TEST. I can do that. > > also it might make sense to make all these new constants (all but DOCKER_COMMAND?) public. This as well. Thank you, Misha > > -- Igor > >> On Apr 3, 2019, at 6:59 PM, Igor Ignatyev wrote: >> >> Hi Misha, >> >> overall looks good to me. I'd use 'jdk.test.docker.jdk' property name instead of 'jdk.test.docker.jdk.under.test' though, but I don't insist. >> >> -- Igor >> >>> On Mar 29, 2019, at 4:41 PM, mikhailo.seledtsov at oracle.com wrote: >>> >>> These new parameters are introduced to help in development and troubleshooting of the Docker tests. >>> >>> >>> 1. Docker command: jdk.test.docker.command >>> On some systems docker is installed in locations other than /bin or /usr/bin. JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. 
>>> >>> 2. Retaining image after test: jdk.test.docker.retain.image >>> This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. >>> Specifying this option to "true" provides an ability to inspect the image, run the image, etc. >>> >>> 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test >>> This feature is useful when developing tests on non-Linux platform. In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. >>> >>> Also, now that jtreg.SkippedException is available started using it. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >>> Testing: ran docker tests >>> >>> >>> Thank you, >>> Misha From mikhailo.seledtsov at oracle.com Thu Apr 4 14:55:58 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 4 Apr 2019 07:55:58 -0700 Subject: RFR(S): 8221710: [TESTBUG] more configurable parameters for docker testing In-Reply-To: <7E31FB1E-7EC3-49CE-B13C-CC6AADBA3FA2@oracle.com> References: <47eec1d6-51de-cdb3-dc9d-d6d7c4064af9@oracle.com> <5CA4E6C4.9040704@oracle.com> <7E31FB1E-7EC3-49CE-B13C-CC6AADBA3FA2@oracle.com> Message-ID: Hi Erik, Thank you for review, Misha On 4/4/19 2:25 AM, Erik Gahlin wrote: > Looks good. > > Thanks for fixing this Misha. > > I don't have a preference if the removal override should be in DockerTestUtils or in DockerBasicTest. > > Erik > >> On 3 Apr 2019, at 19:00, Mikhailo Seledtsov wrote: >> >> Ping... >> >> On 3/29/19, 4:41 PM, mikhailo.seledtsov at oracle.com wrote: >>> These new parameters are introduced to help in development and troubleshooting of the Docker tests. >>> >>> >>> 1.
Docker command: jdk.test.docker.command >>> On some systems docker is installed in locations other than /bin or /usr/bin. JTreg harness sets PATH to these locations, hence other locations such as /usr/local/bin/ is not visible/executable within JTReg tests. A good practice in this case is to provide the full path to the executable for the test. >>> >>> 2. Retaining image after test: jdk.test.docker.retain.image >>> This is very useful for diagnostic purposes, for trouble shooting. By default, docker images created by the tests are removed at the end of the test. >>> Specifying this option to "true" provides an ability to inspect the image, run the image, etc. >>> >>> 3. Overriding JDK under test just for docker tests: jdk.test.docker.jdk.under.test >>> This feature is useful when developing tests on non-Linux platform. In such cases, the default JDK under test is non-Linux, hence will not run inside a docker container. This property allows user to point the docker tests to JDK-under-test that is built for Linux. >>> >>> Also, now that jtreg.SkippedException is available started using it. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8221710 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8221710.00/ >>> Testing: ran docker tests >>> >>> >>> Thank you, >>> Misha From zgu at redhat.com Thu Apr 4 14:55:14 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 4 Apr 2019 10:55:14 -0400 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: References: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> Message-ID: On 4/4/19 10:42 AM, Doerr, Martin wrote: > Hi Coleen and Zhengyu, > > thanks for your feedback. I've also replaced pointer comparison by numeric comparison to avoid undefined behavior. > > New webrev: > http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.01/ Looks good. 
Thanks, -Zhengyu > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-runtime-dev On Behalf Of coleen.phillimore at oracle.com > Sent: Donnerstag, 4. April 2019 02:33 > To: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses > > > > On 4/2/19 10:33 AM, Doerr, Martin wrote: >> Hi Zhengyu, >> >> that would be fine, too. I'll put it there if other reviewers prefer that, too. > > Yes, I prefer that too. > Coleen > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: Zhengyu Gu >> Sent: Dienstag, 2. April 2019 16:01 >> To: Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses >> >> Hi Martin, >> >> Would it be more proper to do the check in os::is_readable_range()? >> >> Thanks, >> >> -Zhengyu >> >> On 4/2/19 9:05 AM, Doerr, Martin wrote: >>> Hi, >>> >>> I'd like to fix a minor bug in Symbol::is_valid which can cause errors during error reporting: >>> Address computation can overflow leading to skipped readability check. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8221833 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ >>> >>> Please review. >>> >>> Best regards, >>> Martin >>> > From gerard.ziemski at oracle.com Thu Apr 4 15:39:11 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Thu, 4 Apr 2019 10:39:11 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CA4F10F.2010805@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> Message-ID: hi Erik, On 4/3/19 12:44 PM, Erik Gahlin wrote: > Hi Gerard, > > Here are some comments about the metadata (to make it consistent with > other events). 
> > The events should not be in the "Java Application" category since they > are JVM events. You could perhaps put them in "Java Virtual Machine, > Runtime, Tables". Some comments about the names and labels of fields. > > - Label: Number of buckets => Bucket Count > - Label: Number of entries => Entry Count > - Label: Total footprint => Total Footprint > > Could you remove descriptions that are exactly the same as the label. > > - Label: Maximum bucket size => Maximum Bucket Size > - Label: Average bucket size => Average Bucket Size > - Label: Variance of bucket size => Bucket Size Variance > - Name: stdDevOfBucketSize => bucketSizeStandardDeviation > - Label: Standard deviation of bucket size => Bucket Size Standard > Deviation > > Instead of using the word "size", it may make more sense to use the > word "count" here as well, i.e "Average Bucket Count", or maybe I'm > missing something? Is there a difference? > > I wonder how useful standard deviation and variance is? If support > engineers are looking at a recording, or JMC adds a rule for the > events, what would a good or bad value be? Is it possible to use the > information for troubleshooting? While I'm working on all the above changes you suggested, we can discuss the standard deviation and variance. I added them because they are part of the jcmd "VM.symboltable -verbose" command, so we are consistent. Now, regarding how useful they are, I always understood them as a sign of imbalanced table distribution, and without a proper histogram, this is the best description of the histogram shape. In reality, however, I think that if they identify an issue, then we might have a very curious distribution (some sort of hash table attack), or we have an issue with our hash function for the particular usage case. Still, I'd personally elect to keep them. Let me ask you a different question though, Is it expensive to have 2 doubles as part of an event (5 events per second)?
And if so, is there currently (or planned) granularity for controlling not just which events to record, but also which attributes? > > - Name: addRate => insertionRate > - Label: Rate of addition => Insertation Rate > - Name: removeRate => removalRate > - Label: Rate of removal => Removal Rate Will do. > > I'm missing unit tests for the events. Could you please add in > /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the > average not exceeding max, no negative values etc. Working on it, do we need separate test per each event (table), or just one table will suffice (ex. StringTable)? Thank you for the feedback! cheers > > Thanks! > Erik > >> Hi all, >> >> Please review this feature, which adds tracing events for the >> internal hash tables. >> >> The following attributes are implemented: >> >> > description="Number of buckets" /> >> > description="Number of all entries" /> >> > label="Total footprint" description="Total memory footprint (the >> table itself plus all of the entries)" /> >> >> > /> >> >> > description="How many items were added since last event (per second)" /> >> > description="How many items were removed since last event (per >> second)" /> >> >> This event was implemented for the following system tables: >> >> SymbolTable >> StringTable >> Placeholder Table >> LoaderConstraints Table >> ProtectionDomainCache Table >> >> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in progress?)
>> >> >> Cheers >> > > From chris.plummer at oracle.com Thu Apr 4 15:48:44 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 4 Apr 2019 08:48:44 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> Message-ID: <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> Hi David, On 4/4/19 12:14 AM, David Holmes wrote: > On 4/04/2019 4:35 pm, Chris Plummer wrote: >> On 4/3/19 11:23 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> I have concerns that this will hide some of the other bugs I've >>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs >>>> result in one or two frames appearing in the stacktrace that should >>>> be skipped. Notably NativeCallStack::NativeCallStack() and >>>> os::get_native_stack(). >>> >>> The test still checks those are not present first: >>> >>> 73 // We should never see either of these frames because >>> they are supposed to be skipped. */ >>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>> 75 output.shouldNotContain("os::get_native_stack"); >> Ah yes. I skimmed over the test looking for it but missed it. >>> >>>> Also, AllocateHeap() should normally not be in the stack trace, but >>>> the test has specifically allowed for it for windows and solaris >>>> slowdebug builds. Although these builds should have honored the >>>> ALWAYSINLINE directive, it was deemed acceptable that it was not in >>>> slowdebug builds. However, I would not want to allow AllocateHeap() >>>> to appear in a product build, and best not to see it in fastdebug >>>> either.
>>> >>> This is a test of NMT detail not a test of whether a given compiler >>> chooses to inline something like AllocateHeap. I don't think it is >>> the job of this test to be checking for something specific to the >>> native compiler. The previous handling of AllocateHeap seemed to be >>> there simply because it was the only way to deal with an optional >>> frame - but now that's handled generically. >> Its appearance means you effectively only have 3 frames to identify >> callsites instead of 4. > > Both stacktraces in the old test had 4 elements and expected 4 > matches. The current bug is that one of those (new_entry) could > actually be inlined as well, resulting in only 3 matches. So that is > what the revised test checks for: at least 3 matches. Often there will > be 4 matches. I think you misunderstood my "3 frames" comment. I was referring to how many frames NMT uses to identify the callsite. It wants to use 4, but if AllocateHeap() doesn't get inlined, it effectively is using 3. The test should detect when this happens so the NMT implementation can address the issue. > > Hmmm but now I'm wondering why this trace: > > 50 public static String stackTraceAllocateHeap = > 51 ".*AllocateHeap.*\n" + > 52 ".*ModuleEntryTable.*new_entry.*\n" + > > doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it getting > inlined already when AllocateHeap was not? Even so we still end up > with 4 frames matching normally. I noticed that last night also and scratched my head over it for a while and then went to bed. The only explanation I could come up with is that allocate_new_entry() is getting inlined, and as a result (due to being a slowdebug build and doing minimal inlining) AllocateHeap() was not inlined. > >> If it does appear in a product build, a solution should be looked >> into to get rid of it.
If the port owner decides it can't get rid of >> it (or is unwilling to), then an exception should be added to the >> test like was done for solaris and windows slowdebug builds. > > Are we specifically trying to test the compiler's ability to inline > that function and just happen to be using this test to verify that? > Doesn't seem like a suitable place to do this - and why do we need to > do it? The Visual Studio docs state: > > "You cannot force the compiler to inline a particular function, even > with the __forceinline keyword." > > so ALWAYSINLINE is just a hint even in product builds and could change > with any update to the compiler. > > For Solaris Studio it is again not guaranteed to inline - specifically > -xinline only has an effect at -xO3 or higher. Which likely explains > why it is ignored in slowdebug. And there are other cases where it > won't honour the ALWAYSINLINE. > > Even with gcc we seem to be misusing the attribute if we want to > ensure inlining when not optimising: > > "GCC does not inline any functions when not optimizing unless you > specify the 'always_inline' attribute for the function, like this: > > /* Prototype. */ > inline void foo (const char) __attribute__((always_inline));" > > and we don't write it that way. > > So if we're that concerned about release builds guaranteeing to inline > AllocateHeap then I think we need something a bit more explicit than > this test to determine that. With respect to the 3 methods/functions we don't want to see in the callsite stacktrace, NMT has made a number of assumptions on inlining. One of the things the test is doing is making sure those assumptions are correct. If incorrect, then you run into issues like I mentioned above where callsite backtraces effectively only have 3 unique frames rather than 4 (actually before some bug fixes it was often just 2 unique frames). So I think it's appropriate to have a test to make sure we are not seeing any of these 3 methods/functions.
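The two kinds of check discussed in this thread (frames that must never appear, and an "ideal" stack trace of which a few frames may be missing because of inlining) can be sketched as below. This is a hedged illustration and not the code of CheckForProperDetailStackTrace.java: the class `FrameCheckSketch`, its method names, and the representation of one stack trace as a list of lines are assumptions. The forbidden frame names are the two quoted from the test.

```java
import java.util.List;

// Hypothetical sketch; operates on the lines of ONE stack trace.
final class FrameCheckSketch {

    // Frames that NMT is supposed to skip, as quoted in the thread.
    static final List<String> FORBIDDEN =
            List.of("NativeCallStack::NativeCallStack", "os::get_native_stack");

    /** True if any line of the trace mentions a frame that should have been skipped. */
    static boolean containsForbiddenFrame(List<String> trace) {
        for (String line : trace) {
            for (String frame : FORBIDDEN) {
                if (line.contains(frame)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Greedily counts how many ideal frames occur in the trace, in the given order. */
    static int framesMatchedInOrder(List<String> ideal, List<String> trace) {
        int matched = 0;
        int next = 0;                       // first trace line still available for matching
        for (String frame : ideal) {
            for (int i = next; i < trace.size(); i++) {
                if (trace.get(i).contains(frame)) {
                    matched++;
                    next = i + 1;           // later frames must appear further down
                    break;
                }
            }
        }
        return matched;
    }

    /** Accepts a trace with no forbidden frames and at most allowedMissing ideal frames absent. */
    static boolean acceptable(List<String> ideal, List<String> trace, int allowedMissing) {
        return !containsForbiddenFrame(trace)
                && framesMatchedInOrder(ideal, trace) >= ideal.size() - allowedMissing;
    }
}
```

Because the matching runs over the lines of a single trace rather than over the whole NMT output, frames from unrelated call sites cannot combine into a false match. Whether an optional frame such as AllocateHeap should be tolerated per build type would be a policy decision layered on top of `allowedMissing`.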
Now the test also has made inlining assumptions beyond what NMT has made, and that is really what this bug is about. In general I think your fix is fine in the way it relaxes which frames are actually found, but as Thomas points out, it suffers from not actually looking at a single stacktrace, but just looking for the specified frames somewhere in the output (and in the order specified.) You should probably address this. thanks, Chris > > Thanks, > David > >> thanks, >> >> Chris >>> >>> Thanks, >>> David >>> >>>> Given the changes you made to allow more flexibly in which frames >>>> appear, I think you need to now also make sure the above 3 >>>> mentioned frames are not present, except for allowing >>>> AllocateHeap() in slowdebug builds. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>> >>>>> The actual stack trace reported by NMT detail is affected by the >>>>> inlining decisions of the native compiler, and on the type of >>>>> build. So we define an "ideal" stacktrace and then allow for some >>>>> frames to be missing based on empirical observations. So to date >>>>> we have seen two frames that may or may not be inlined and so we >>>>> allow for 2 non-matching entries. >>>>> >>>>> The special-casing of AllocateHeap is removed as now it is just an >>>>> optional frame. >>>>> >>>>> Chris: does this maintain the "spirit" of the test as you intended? >>>>> >>>>> Zhengyu: can you test this on your system(s) please. 
>>>>> >>>>> Thanks, >>>>> David >>>> >>>> >> >> From erik.gahlin at oracle.com Thu Apr 4 18:16:30 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 4 Apr 2019 20:16:30 +0200 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> Message-ID: <5CA649FE.9070904@oracle.com> On 2019-04-04 17:39, gerard ziemski wrote: > hi Erik, > > > On 4/3/19 12:44 PM, Erik Gahlin wrote: >> Hi Gerard, >> >> Here are some comments about the metadata (to make it consistent with >> other events). >> >> The events should not be in the "Java Application" category since >> they are JVM events. You could perhaps put them in "Java Virtual >> Machine, Runtime, Tables". Some comments about the names and labels >> of fields. >> >> - Label: Number of buckets => Bucket Count >> - Label: Number of entries => Entry Count >> - Label: Total footprint => Total Footprint >> >> Could you remove descriptions that are exactly the same as the label. >> >> - Label: Maximum bucket size => Maximum Bucket Size >> - Label: Average bucket size => Average Bucket Size >> - Label: Variance of bucket size => Bucket Size Variance >> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >> - Label: Standard deviation of bucket size => Bucket Size Standard >> Deviation" >> >> Instead of using the word "size", it may make more sense to use the >> word "count" here as well, i.e "Average Bucket Count", or maybe I'm >> missing something? Is there a difference? >> >> I wonder how useful standard deviation and variance is? If support >> engineers are looking at a recording, or JMC adds a rule for the >> events, what would a good or bad value be? Is it possible to use the >> information for troubleshooting? > > While I'm working on all the above changes you suggested, we can > discuss the standard devation and variance. 
> > I added them because they are part of the jcmd "VM.symboltable > -verbose" command, so we are consistent. OK > > Now, regarding how useful they are, I always understood them as a sign > of imbalanced table distribution, and without a proper histogram, this > is the best description of the histogram shape. In reality, however, I > think that if they identify an issue, then we might have a very > curious distribution (some sort of hash table attack), or we have an > issue with our hash function for the particular usage case. > > Still, I'd personally elect to keep them. > > Let me ask you a different question though, Is it expensive to have 2 > doubles as part of an event (5 events per second)? Doubles can't be compressed so each value will take 8 bytes. I don't think the precision of a double is needed, so you could change it into a float and save a few bytes. Most user will not care about JVM internals and a lower rate than once per second is probably sufficient for support engineers to spot that something is wrong. The Thread Context Switch Rate event is emitted once every ten seconds. I think the same rate could be used here. > And if so, is there currently (or planned) granularity for controlling > not just which events to record, but also which attributes? > No. If overhead becomes an issues, it's usually better to emit all the information, but at a lower rate. That way, users can find out that the information exists, and increase the rate if a higher resolution is needed to solve their specific issue. >> >> - Name: addRate => insertionRate >> - Label: Rate of addition => Insertation Rate >> - Name: removeRate => removalRate >> - Label: Rate of removal => Removal Rate > > Will do. > >> >> I'm missing unit tests for the events. Could you please add in >> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the >> average not exceeding max, no negative values etc. 
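The sanity checks requested here ("the average not exceeding max, no negative values etc.") could look roughly like the sketch below. It deliberately works on plain values rather than on recorded JFR events, so it makes no claim about the event or field names in the webrev; `TableStatisticsSanitySketch` and its parameter names are hypothetical and only loosely follow the labels proposed in the review.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; parameter names are illustrative, not the final JFR field names.
final class TableStatisticsSanitySketch {

    /** Returns a description of every violated sanity rule; empty means the data looks sane. */
    static List<String> violations(long bucketCount, long entryCount,
                                   long maximumBucketSize, double averageBucketSize,
                                   double insertionRate, double removalRate) {
        List<String> problems = new ArrayList<>();
        if (bucketCount < 0)        problems.add("bucket count is negative");
        if (entryCount < 0)         problems.add("entry count is negative");
        if (maximumBucketSize < 0)  problems.add("maximum bucket size is negative");
        if (averageBucketSize < 0)  problems.add("average bucket size is negative");
        if (insertionRate < 0)      problems.add("insertion rate is negative");
        if (removalRate < 0)        problems.add("removal rate is negative");
        if (averageBucketSize > maximumBucketSize) {
            problems.add("average bucket size exceeds maximum bucket size");
        }
        return problems;
    }
}
```

A single test file could extract these values from each table's event and fail if `violations(...)` returns a non-empty list, which matches the suggestion of one test that sanity checks the data for all events.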
> > Working on it, do we need separate test per each event (table), or > just one table will suffice (ex. StringTable)? They are kind of similar, so I think one test file is sufficient, but we should sanity check data for all events. Thanks Erik > > Thank you for the feedback! > > > cheers >> >> Thanks! >> Erik >> >>> Hi all, >>> >>> Please review this feature, which adds tracing events for the >>> internal hash tables. >>> >>> The following attributes are implemented: >>> >>> >> description="Number of buckets" /> >>> >> description="Number of all entries" /> >>> >> label="Total footprint" description="Total memory footprint (the >>> table itself plus all of the entries)" /> >>> >>> >> /> >>> >>> >> description="How many items were added since last event (per >>> second)" /> >>> >> description="How many items were removed since last event (per >>> second)" /> >>> >>> This event was implemented for the following system tables: >>> >>> SymbolTable >>> StringTable >>> Placeholder Table >>> LoaderConstraints Table >>> ProtectionDomainCache Table >>> >>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in progress?) >>> >>> >>> Cheers >>> >> >> > From coleen.phillimore at oracle.com Thu Apr 4 18:27:06 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 4 Apr 2019 14:27:06 -0400 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: References: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> Message-ID: <811daec1-73c2-3e76-da16-7b9b82bdc3d1@oracle.com> +1 thanks, Coleen On 4/4/19 10:55 AM, Zhengyu Gu wrote: > > > On 4/4/19 10:42 AM, Doerr, Martin wrote: >> Hi Coleen and Zhengyu, >> >> thanks for your feedback. I've also replaced pointer comparison by >> numeric comparison to avoid undefined behavior. 
>> >> New webrev: >> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.01/ > > Looks good. > > Thanks, > > -Zhengyu > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-runtime-dev >> On Behalf Of >> coleen.phillimore at oracle.com >> Sent: Donnerstag, 4. April 2019 02:33 >> To: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid >> not performed for some addresses >> >> >> >> On 4/2/19 10:33 AM, Doerr, Martin wrote: >>> Hi Zhengyu, >>> >>> that would be fine, too. I'll put it there if other reviewers prefer >>> that, too. >> >> Yes, I prefer that too. >> Coleen >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Zhengyu Gu >>> Sent: Dienstag, 2. April 2019 16:01 >>> To: Doerr, Martin ; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid >>> not performed for some addresses >>> >>> Hi Martin, >>> >>> Would it be more proper to do the check in os::is_readable_range()? >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> On 4/2/19 9:05 AM, Doerr, Martin wrote: >>>> Hi, >>>> >>>> I'd like to fix a minor bug in Symbol::is_valid which can cause >>>> errors during error reporting: >>>> Address computation can overflow leading to skipped readability check. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8221833 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ >>>> >>>> Please review. >>>> >>>> Best regards, >>>> Martin >>>> >> From coleen.phillimore at oracle.com Thu Apr 4 18:40:31 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 4 Apr 2019 14:40:31 -0400 Subject: RFR (T) 8221992: Fix old method replacement in ResolvedMethodTable Message-ID: Summary: Use method get_new_method() which is used in other call sites. See bug for more details.? 
Ran hs-tier1-3 and redefinition tests locally, including the one added for this code: test/jdk/java/lang/instrument/RedefineAddDeleteMethod/MethodHandleDeletedMethod.java open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221992.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8221992 Thanks, Coleen From gerard.ziemski at oracle.com Thu Apr 4 19:52:53 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Thu, 4 Apr 2019 14:52:53 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CA649FE.9070904@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> Message-ID: Thank you Erik for clarifications. I have implemented all your suggestions, which you can find here http://cr.openjdk.java.net/~gziemski/8185525_rev2 I started Mach5 tier1-6 test to test the changes ... cheers On 4/4/19 1:16 PM, Erik Gahlin wrote: > On 2019-04-04 17:39, gerard ziemski wrote: >> hi Erik, >> >> >> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>> Hi Gerard, >>> >>> Here are some comments about the metadata (to make it consistent >>> with other events). >>> >>> The events should not be in the "Java Application" category since >>> they are JVM events. You could perhaps put them in "Java Virtual >>> Machine, Runtime, Tables". Some comments about the names and labels >>> of fields. >>> >>> - Label: Number of buckets => Bucket Count >>> - Label: Number of entries => Entry Count >>> - Label: Total footprint => Total Footprint >>> >>> Could you remove descriptions that are exactly the same as the label. >>> >>> - Label: Maximum bucket size => Maximum Bucket Size >>> - Label: Average bucket size => Average Bucket Size >>> - Label: Variance of bucket? 
size => Bucket Size Variance >>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>> - Label: Standard deviation of bucket size => Bucket Size Standard >>> Deviation" >>> >>> Instead of using the word "size", it may make more sense to use the >>> word "count" here as well, i.e "Average Bucket Count", or maybe I'm >>> missing something? Is there a difference? >>> >>> I wonder how useful standard deviation and variance is? If support >>> engineers are looking at a recording, or JMC adds a rule for the >>> events, what would a good or bad value be? Is it possible to use the >>> information for troubleshooting? >> >> While I'm working on all the above changes you suggested, we can >> discuss the standard devation and variance. >> >> I added them because they are part of the jcmd "VM.symboltable >> -verbose" command, so we are consistent. > OK >> >> Now, regarding how useful they are, I always understood them as a >> sign of imbalanced table distribution, and without a proper >> histogram, this is the best description of the histogram shape. In >> reality, however, I think that if they identify an issue, then we >> might have a very curious distribution (some sort of hash table >> attack), or we have an issue with our hash function for the >> particular usage case. >> >> Still, I'd personally elect to keep them. >> >> Let me ask you a different question though, Is it expensive to have 2 >> doubles as part of an event (5 events per second)? > Doubles can't be compressed so each value will take 8 bytes. I don't > think the precision of a double is needed, so you could change it into > a float and save a few bytes. > > Most user will not care about JVM internals and a lower rate than once > per second is probably sufficient for support engineers to spot that > something is wrong. > > The Thread Context Switch Rate event is emitted once every ten > seconds. I think the same rate could be used here. 
>
>> And if so, is there currently (or planned) granularity for
>> controlling not just which events to record, but also which attributes?
>>
> No.
>
> If overhead becomes an issue, it's usually better to emit all the
> information, but at a lower rate. That way, users can find out that
> the information exists, and increase the rate if a higher resolution
> is needed to solve their specific issue.
>
>>>
>>> - Name: addRate => insertionRate
>>> - Label: Rate of addition => Insertion Rate
>>> - Name: removeRate => removalRate
>>> - Label: Rate of removal => Removal Rate
>>
>> Will do.
>>
>>>
>>> I'm missing unit tests for the events. Could you please add in
>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests, i.e. the
>>> average not exceeding max, no negative values etc.
>>
>> Working on it, do we need a separate test for each event (table), or
>> will just one table suffice (ex. StringTable)?
> They are kind of similar, so I think one test file is sufficient, but
> we should sanity check data for all events.
>
> Thanks
> Erik
>
>>
>> Thank you for the feedback!
>>
>>
>> cheers
>>>
>>> Thanks!
>>> Erik
>>>
>>>> Hi all,
>>>>
>>>> Please review this feature, which adds tracing events for the
>>>> internal hash tables.
>>>>
>>>> The following attributes are implemented:
>>>>
>>>>
>>>>
>>>> >>> label="Total footprint" description="Total memory footprint (the
>>>> table itself plus all of the entries)" />
>>>>
>>>> >>> />
>>>>
>>>> >>> description="How many items were added since last event (per
>>>> second)" />
>>>> >>> description="How many items were removed since last event (per
>>>> second)" />
>>>>
>>>> This event was implemented for the following system tables:
>>>>
>>>> SymbolTable
>>>> StringTable
>>>> Placeholder Table
>>>> LoaderConstraints Table
>>>> ProtectionDomainCache Table
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/
>>>> Bug:
https://bugs.openjdk.java.net/browse/JDK-8185525 >>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>> progress?) >>>> >>>> >>>> Cheers From serguei.spitsyn at oracle.com Thu Apr 4 21:04:03 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 4 Apr 2019 14:04:03 -0700 Subject: RFR (T) 8221992: Fix old method replacement in ResolvedMethodTable In-Reply-To: References: Message-ID: Hi Coleen, It looks good to me. Thanks, Serguei On 4/4/19 11:40, coleen.phillimore at oracle.com wrote: > Summary: Use method get_new_method() which is used in other call sites. > > See bug for more details.? Ran hs-tier1-3 and redefinition tests > locally, including the one added for this code: > test/jdk/java/lang/instrument/RedefineAddDeleteMethod/MethodHandleDeletedMethod.java > > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8221992.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8221992 > > Thanks, > Coleen > > From erik.gahlin at oracle.com Thu Apr 4 21:11:57 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 4 Apr 2019 23:11:57 +0200 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> Message-ID: <5CA6731D.7020907@oracle.com> Thanks for fixing. A quick comments about the test. 
I think it can be simplified by using some of the test library
functionality, i.e.:

    public static void main(String[] args) throws Throwable {
        try (Recording recording = new Recording()) {
            recording.enable(EventNames.SymbolTableStatistics);
            recording.enable(EventNames.StringTableStatistics);
            recording.enable(EventNames.PlaceholderTableStatistics);
            recording.enable(EventNames.LoaderConstraintsTableStatistics);
            recording.enable(EventNames.ProtectionDomainCacheTableStatistics);
            recording.start();
            recording.stop();

            List<RecordedEvent> events = Events.fromRecording(recording);
            verifyTable(events, EventNames.SymbolTableStatistics);
            verifyTable(events, EventNames.StringTableStatistics);
            verifyTable(events, EventNames.PlaceholderTableStatistics);
            verifyTable(events, EventNames.LoaderConstraintsTableStatistics);
            verifyTable(events, EventNames.ProtectionDomainCacheTableStatistics);
        }
    }

    private static void verifyTable(List<RecordedEvent> allEvents, String eventName) throws Exception {
        List<RecordedEvent> eventsForTable = allEvents.stream()
                .filter(e -> e.getEventType().getName().equals(eventName))
                .collect(Collectors.toList());
        if (eventsForTable.isEmpty()) {
            throw new Exception("No events for " + eventName);
        }
        for (RecordedEvent event : eventsForTable) {
            Events.assertField(event, "bucketCount").atLeast(0L);
            long entryCount = Events.assertField(event, "entryCount").atLeast(0L).getValue();
            Events.assertField(event, "totalFootprint").atLeast(0L);
            long averageBucketCount = Events.assertField(event, "averageBucketCount").atLeast(0L).getValue();
            Events.assertField(event, "maximumBucketCount").atLeast(averageBucketCount);
            Events.assertField(event, "bucketCountVariance").atLeast(0.0f);
            Events.assertField(event, "bucketCountStandardDeviation").atLeast(0.0f);
            float insertionRate = Events.assertField(event, "insertionRate").atLeast(0.0f).getValue();
            float removalRate = Events.assertField(event, "removalRate").atLeast(0.0f).getValue();
            if ((insertionRate > 0.0f) && (insertionRate > removalRate)) {
                Asserts.assertGreaterThan(entryCount, 0L, "Entries marked as added, but no entries found for " + eventName);
            }
        }
    }

- It's nice to have the main method on top so you can easily see what
  the test is supposed to do.
- Changed (some) field names that used the previous naming style.
- Reduced the number of methods to make it easier to read.
- Reduced the number of calls to Events.fromRecording(...), as it will
  repeatedly dump a file to disk.
- Used Events.assertField(), which will provide a better error message if
  an assertion fails.
- Used EventType::getName instead of event.toString().contains(...).
- Added sanity checks for the standard deviation and variance fields.
- Wrapped Recording creation in try-with-resources to avoid a warning
  about a resource leak.
- Removed the threshold, as the events are periodic and don't use a threshold.
- Removed "Thread.sleep".
- The test now relies on events having period "everyChunk", which means
  at least two events per recording are guaranteed.

Could you explain how the string table test works, and why it needs
special handling?

I also missed changes to the file EventNames.java

(I haven't actually tried the code, but you get the idea)

Thanks
Erik

> Thank you Erik for clarifications.
>
> I have implemented all your suggestions, which you can find here
> http://cr.openjdk.java.net/~gziemski/8185525_rev2
>
> I started Mach5 tier1-6 test to test the changes ...
>
>
> cheers
>
> On 4/4/19 1:16 PM, Erik Gahlin wrote:
>> On 2019-04-04 17:39, gerard ziemski wrote:
>>> hi Erik,
>>>
>>>
>>> On 4/3/19 12:44 PM, Erik Gahlin wrote:
>>>> Hi Gerard,
>>>>
>>>> Here are some comments about the metadata (to make it consistent
>>>> with other events).
>>>>
>>>> The events should not be in the "Java Application" category since
>>>> they are JVM events. You could perhaps put them in "Java Virtual
>>>> Machine, Runtime, Tables". Some comments about the names and labels
>>>> of fields.
>>>> >>>> - Label: Number of buckets => Bucket Count >>>> - Label: Number of entries => Entry Count >>>> - Label: Total footprint => Total Footprint >>>> >>>> Could you remove descriptions that are exactly the same as the label. >>>> >>>> - Label: Maximum bucket size => Maximum Bucket Size >>>> - Label: Average bucket size => Average Bucket Size >>>> - Label: Variance of bucket size => Bucket Size Variance >>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>> - Label: Standard deviation of bucket size => Bucket Size Standard >>>> Deviation" >>>> >>>> Instead of using the word "size", it may make more sense to use the >>>> word "count" here as well, i.e "Average Bucket Count", or maybe I'm >>>> missing something? Is there a difference? >>>> >>>> I wonder how useful standard deviation and variance is? If support >>>> engineers are looking at a recording, or JMC adds a rule for the >>>> events, what would a good or bad value be? Is it possible to use >>>> the information for troubleshooting? >>> >>> While I'm working on all the above changes you suggested, we can >>> discuss the standard devation and variance. >>> >>> I added them because they are part of the jcmd "VM.symboltable >>> -verbose" command, so we are consistent. >> OK >>> >>> Now, regarding how useful they are, I always understood them as a >>> sign of imbalanced table distribution, and without a proper >>> histogram, this is the best description of the histogram shape. In >>> reality, however, I think that if they identify an issue, then we >>> might have a very curious distribution (some sort of hash table >>> attack), or we have an issue with our hash function for the >>> particular usage case. >>> >>> Still, I'd personally elect to keep them. >>> >>> Let me ask you a different question though, Is it expensive to have >>> 2 doubles as part of an event (5 events per second)? >> Doubles can't be compressed so each value will take 8 bytes. 
I don't >> think the precision of a double is needed, so you could change it >> into a float and save a few bytes. >> >> Most user will not care about JVM internals and a lower rate than >> once per second is probably sufficient for support engineers to spot >> that something is wrong. >> >> The Thread Context Switch Rate event is emitted once every ten >> seconds. I think the same rate could be used here. >> >>> And if so, is there currently (or planned) granularity for >>> controlling not just which events to record, but also which attributes? >>> >> No. >> >> If overhead becomes an issues, it's usually better to emit all the >> information, but at a lower rate. That way, users can find out that >> the information exists, and increase the rate if a higher resolution >> is needed to solve their specific issue. >> >>>> >>>> - Name: addRate => insertionRate >>>> - Label: Rate of addition => Insertation Rate >>>> - Name: removeRate => removalRate >>>> - Label: Rate of removal => Removal Rate >>> >>> Will do. >>> >>>> >>>> I'm missing unit tests for the events. Could you please add in >>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the >>>> average not exceeding max, no negative values etc. >>> >>> Working on it, do we need separate test per each event (table), or >>> just one table will suffice (ex. StringTable)? >> They are kind of similar, so I think one test file is sufficient, but >> we should sanity check data for all events. >> >> Thanks >> Erik >> >>> >>> Thank you for the feedback! >>> >>> >>> cheers >>>> >>>> Thanks! >>>> Erik >>>> >>>>> Hi all, >>>>> >>>>> Please review this feature, which adds tracing events for the >>>>> internal hash tables. 
>>>>> >>>>> The following attributes are implemented: >>>>> >>>>> >>>>> >>>>> >>>> label="Total footprint" description="Total memory footprint (the >>>>> table itself plus all of the entries)" /> >>>>> >>>>> >>>> label="Variance of bucket sizes" description="How far bucket >>>>> lengths are spread out from their average value" /> >>>>> >>>>> >>>> description="How many items were added since last event (per >>>>> second)" /> >>>>> >>>> description="How many items were removed since last event (per >>>>> second)" /> >>>>> >>>>> This event was implemented for the following system tables: >>>>> >>>>> SymbolTable >>>>> StringTable >>>>> Placeholder Table >>>>> LoaderConstraints Table >>>>> ProtectionDomainCache Table >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>> progress?) >>>>> >>>>> >>>>> Cheers From erik.gahlin at oracle.com Thu Apr 4 21:15:17 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Thu, 4 Apr 2019 23:15:17 +0200 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CA6731D.7020907@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <5CA6731D.7020907@oracle.com> Message-ID: <5CA673E5.4030306@oracle.com> Sorry, I saw the EventNames.java file now. Thanks Erik > Thanks for fixing. > > A quick comments about the test. 
> > I think it can be simplified by using some of the test library > functionality, i.e > > public static void main(String[] args) throws Throwable { > try (Recording recording = new Recording()) { > recording.enable(EventNames.SymbolTableStatistics); > recording.enable(EventNames.StringTableStatistics); > recording.enable(EventNames.PlaceholderTableStatistics); > recording.enable(EventNames.LoaderConstraintsTableStatistics); > recording.enable(EventNames.ProtectionDomainCacheTableStatistics); > recording.start(); > recording.stop(); > > List events = Events.fromRecording(recording); > verifyTable(events, EventNames.SymbolTableStatistics); > verifyTable(events, EventNames.StringTableStatistics); > verifyTable(events, EventNames.PlaceholderTableStatistics); > verifyTable(events, EventNames.LoaderConstraintsTableStatistics); > verifyTable(events, > EventNames.ProtectionDomainCacheTableStatistics); > } > } > > private static void verifyTable(List allEvents, > String eventName) throws Exception { > List eventsForTable = allEvents.stream() > .filter(e -> > e.getEventType().getName().equals(eventName)) > .collect(Collectors.toList()); > if (eventsForTable.isEmpty()) { > throw new Exception("No events for " + eventName); > } > for (RecordedEvent event : eventsForTable) { > Events.assertField(event, "bucketCount").atLeast(0L); > long entryCount = Events.assertField(event, > "entryCount").atLeast(0L).getValue(); > Events.assertField(event, "totalFootprint").atLeast(0L); > long averageBucketCount = Events.assertField(event, > "averageBucketCount").atLeast(0L).getValue(); > Events.assertField(event, > "maximumBucketCount").atLeast(averageBucketCount); > Events.assertField(event, "bucketCountVariance").atLeast(0.0f); > Events.assertField(event, > "bucketCountStandardDeviation").atLeast(0.0f); > float insertionRate = Events.assertField(event, > "insertionRate").atLeast(0.0f).getValue(); > float removalRate = Events.assertField(event, > "removalRate").atLeast(0.0f).getValue(); > 
if ((insertionRate > 0.0f) && (insertionRate > removalRate)) { > Asserts.assertGreaterThan(entryCount, 0L, "Entries marked as > added, but no entries found for " + eventName); > } > } > } > > - It's nice to have the main method on top so you can easily see what > the test is supposed to do. > - Changed (some) field names that used the previous naming style. > - Reduced the number of methods to make it easier to read > - Reduced number of calls to Events.fromRecording(...) as will > repeatedly dump a file to disk. > - Used Events.assertField() which will provide better error message if > an assertion fails, > - Used EventType::getName instead of event.toString() contains > - Added sanity checks for standard deviation and variance fields > - Wrapped Recording creation in try-with-resource to avoid warning > about resource leak > - Removed threshold as the events are periodic and don't use a threshold > - Removed "Thread.sleep" > - The test now relies on events having period "everyChunk" which means > at least two events per recording are guaranteed > > Could you explain how the string table test work, and why it needs > special handling? > > I also missed changes to the file EventNames.java > > (I haven't actually tried the code, but you get the idea) > > Thanks > Erik > >> Thank you Erik for clarifications. >> >> I have implemented all your suggestions, which you can find here >> http://cr.openjdk.java.net/~gziemski/8185525_rev2 >> >> I started Mach5 tier1-6 test to test the changes ... >> >> >> cheers >> >> On 4/4/19 1:16 PM, Erik Gahlin wrote: >>> On 2019-04-04 17:39, gerard ziemski wrote: >>>> hi Erik, >>>> >>>> >>>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>>> Hi Gerard, >>>>> >>>>> Here are some comments about the metadata (to make it consistent >>>>> with other events). >>>>> >>>>> The events should not be in the "Java Application" category since >>>>> they are JVM events. You could perhaps put them in "Java Virtual >>>>> Machine, Runtime, Tables". 
Some comments about the names and >>>>> labels of fields. >>>>> >>>>> - Label: Number of buckets => Bucket Count >>>>> - Label: Number of entries => Entry Count >>>>> - Label: Total footprint => Total Footprint >>>>> >>>>> Could you remove descriptions that are exactly the same as the label. >>>>> >>>>> - Label: Maximum bucket size => Maximum Bucket Size >>>>> - Label: Average bucket size => Average Bucket Size >>>>> - Label: Variance of bucket size => Bucket Size Variance >>>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>>> - Label: Standard deviation of bucket size => Bucket Size Standard >>>>> Deviation" >>>>> >>>>> Instead of using the word "size", it may make more sense to use >>>>> the word "count" here as well, i.e "Average Bucket Count", or >>>>> maybe I'm missing something? Is there a difference? >>>>> >>>>> I wonder how useful standard deviation and variance is? If support >>>>> engineers are looking at a recording, or JMC adds a rule for the >>>>> events, what would a good or bad value be? Is it possible to use >>>>> the information for troubleshooting? >>>> >>>> While I'm working on all the above changes you suggested, we can >>>> discuss the standard devation and variance. >>>> >>>> I added them because they are part of the jcmd "VM.symboltable >>>> -verbose" command, so we are consistent. >>> OK >>>> >>>> Now, regarding how useful they are, I always understood them as a >>>> sign of imbalanced table distribution, and without a proper >>>> histogram, this is the best description of the histogram shape. In >>>> reality, however, I think that if they identify an issue, then we >>>> might have a very curious distribution (some sort of hash table >>>> attack), or we have an issue with our hash function for the >>>> particular usage case. >>>> >>>> Still, I'd personally elect to keep them. >>>> >>>> Let me ask you a different question though, Is it expensive to have >>>> 2 doubles as part of an event (5 events per second)? 
>>> Doubles can't be compressed so each value will take 8 bytes. I don't >>> think the precision of a double is needed, so you could change it >>> into a float and save a few bytes. >>> >>> Most user will not care about JVM internals and a lower rate than >>> once per second is probably sufficient for support engineers to spot >>> that something is wrong. >>> >>> The Thread Context Switch Rate event is emitted once every ten >>> seconds. I think the same rate could be used here. >>> >>>> And if so, is there currently (or planned) granularity for >>>> controlling not just which events to record, but also which >>>> attributes? >>>> >>> No. >>> >>> If overhead becomes an issues, it's usually better to emit all the >>> information, but at a lower rate. That way, users can find out that >>> the information exists, and increase the rate if a higher resolution >>> is needed to solve their specific issue. >>> >>>>> >>>>> - Name: addRate => insertionRate >>>>> - Label: Rate of addition => Insertation Rate >>>>> - Name: removeRate => removalRate >>>>> - Label: Rate of removal => Removal Rate >>>> >>>> Will do. >>>> >>>>> >>>>> I'm missing unit tests for the events. Could you please add in >>>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the >>>>> average not exceeding max, no negative values etc. >>>> >>>> Working on it, do we need separate test per each event (table), or >>>> just one table will suffice (ex. StringTable)? >>> They are kind of similar, so I think one test file is sufficient, >>> but we should sanity check data for all events. >>> >>> Thanks >>> Erik >>> >>>> >>>> Thank you for the feedback! >>>> >>>> >>>> cheers >>>>> >>>>> Thanks! >>>>> Erik >>>>> >>>>>> Hi all, >>>>>> >>>>>> Please review this feature, which adds tracing events for the >>>>>> internal hash tables. 
>>>>>> >>>>>> The following attributes are implemented: >>>>>> >>>>>> >>>>>> >>>>>> >>>>> label="Total footprint" description="Total memory footprint (the >>>>>> table itself plus all of the entries)" /> >>>>>> >>>>>> >>>>> label="Variance of bucket sizes" description="How far bucket >>>>>> lengths are spread out from their average value" /> >>>>>> >>>>>> >>>>> description="How many items were added since last event (per >>>>>> second)" /> >>>>>> >>>>> description="How many items were removed since last event (per >>>>>> second)" /> >>>>>> >>>>>> This event was implemented for the following system tables: >>>>>> >>>>>> SymbolTable >>>>>> StringTable >>>>>> Placeholder Table >>>>>> LoaderConstraints Table >>>>>> ProtectionDomainCache Table >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>>> progress?) >>>>>> >>>>>> >>>>>> Cheers > From coleen.phillimore at oracle.com Thu Apr 4 21:34:32 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 4 Apr 2019 17:34:32 -0400 Subject: RFR (T) 8221992: Fix old method replacement in ResolvedMethodTable In-Reply-To: References: Message-ID: <7a4ad12c-3bc2-36c5-fe47-4585aa7cd40d@oracle.com> Thank you Serguei! Coleen On 4/4/19 5:04 PM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > It looks good to me. > > Thanks, > Serguei > > > On 4/4/19 11:40, coleen.phillimore at oracle.com wrote: >> Summary: Use method get_new_method() which is used in other call sites. >> >> See bug for more details.? 
Ran hs-tier1-3 and redefinition tests >> locally, including the one added for this code: >> test/jdk/java/lang/instrument/RedefineAddDeleteMethod/MethodHandleDeletedMethod.java >> >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8221992.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8221992 >> >> Thanks, >> Coleen >> >> > From david.holmes at oracle.com Fri Apr 5 00:23:15 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 10:23:15 +1000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi Goetz, On 4/04/2019 10:19 pm, Lindenmaier, Goetz wrote: > Hi David, > > I ran it through jdk-submit, no failures. > I now added reviewer information to the patch. > I updated the webrev in-place; nothing changed except > the reviewer information. You missed Coleen so I added her. > You said we need to sync on the push. Do you just > want to sponsor the change to make sure it works out? > Or do you want to announce when I should push? > > Feel free to just push it in case you want to... Changes pushed :) Thanks, David > Best regards, > Goetz. > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 4. April 2019 01:53 >> To: Lindenmaier, Goetz ; 'hotspot-runtime- >> dev at openjdk.java.net' >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in >> java-like Syntax. >> >> Looks good. >> >> I'm re-running through our test system. >> >> Thanks, >> David >> >> On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> here a new webrev: >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ >>> >>> I have removed the test with the bad class name. >>> I named the variables name_str/array_str now. >>> >>> I'll push it to jdk-submit for further testing. >>> >>> Best regards, >>> Goetz. 
>>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Mittwoch, 3. April 2019 00:42 >>>> To: Lindenmaier, Goetz ; 'hotspot-runtime- >>>> dev at openjdk.java.net' >>>> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in >>>> java-like Syntax. >>>> >>>> Two follow ups ... >>>> >>>> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: >>>>> Hi David, >>>>> >>>>>> Overall this looks good to me - a few minor nits/comments below. >>>>> thanks! >>>>> >>>>>> I've applied the patch and am running it through our internal build and >>>>>> test system (tiers 1-3 initially). >>>>>> >>>>>> I have a suspicion there will be other tests that need to be updated - >>>>>> possibly even JCK tests. Discovering those a-priori will be difficult >>>>>> (simply running all the tests would take an extremely long time). Will >>>>>> have a discussion about how best to handle those internally. >>>>> >>>>> I ran most JCK test without problem. They usually don't check messages. >>>>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except >>>>> for headful tests). >>>> >>>> Thanks for the additional testing info. I duplicated some of that but >>>> found no issues, other than a couple of closed tests. >>>> >>>>>> src/hotspot/share/oops/method.cpp >>>>>> Please put a blank line after each new method. >>>>> Fixed. >>>>> >>>>>> src/hotspot/share/oops/symbol.cpp >>>>>> >>>>>> + os->print("."); >>>>>> + } else { >>>>>> + os->print("%c", start[i]); >>>>>> >>>>>> Please use os->put(char c) for individual characters. >>>>> Fixed. >>>>> >>>>>> The "start" name would seem better as "buf" to me. >>>>> Hmm, buf to me is a local chunk of memory used temporarily. >>>>> What about array_sig, class_sig? >>>> >>>> Not really "sigs". >>>> >>>> str? Else just leave it. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>>> + } else if (start[i] == 'L') { >>>>>> + print_class(os, start+i+1, len-i-2); >>>>>> Can you insert a comment that help explains the -2: >>>>> Done. 
>>>>> >>>>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { >>>>>> space after for (2 occurrences) >>>>> Fixed. >>>>> >>>>>> >>>> >> test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMeth >>>>>> ods.java >>>>>> >>>>>> Not sure the special characters can be used directly in the sources. Can >>>>>> they not be put in as unicode escapes at all places? >>>>> I'll try what Ioi proposed. I'll post a new webrev including that. >>>>> >>>>> Best regards, >>>>> Goetz. >>>>> >>>>> >>>>>> >>>>>> --- >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ------- >>>>>> >>>>>> >>>>>> On 1/04/2019 12:32 pm, David Holmes wrote: >>>>>>> Hi Goetz, >>>>>>> >>>>>>> I'm looking at this ... >>>>>>> >>>>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Any interest in this change? >>>>>>> >>>>>>> I'm personally of two minds here because these VM generated >> exceptions >>>>>>> are not only delivered to Java source code. I'd like to know how other >>>>>>> language developers using the JVM runtime would view this. >>>>>>> >>>>>>> That aside if you're going to make a change like this then I think the >>>>>>> full signature string has to be quoted in some way to delineate it >>>>>>> within the larger message. >>>>>>> >>>>>>>> Should I split it to adapt the exceptions separately one-by-one to >>>>>>>> make the change smaller and simplify the review? >>>>>>> >>>>>>> I don't think that is necessary. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> I would propose to start out with AbstractMethodError only. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> ?? Goetz. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> From: Lindenmaier, Goetz >>>>>>>> Sent: Tuesday, March 26, 2019 1:06 PM >>>>>>>> To: hotspot-runtime-dev at openjdk.java.net >>>>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in >>>>>>>> java-like Syntax. >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> A row of exceptions are thrown from the hotspot runtime. 
>>>>>>>> They print methods with their JNI signatures. To increase
>>>>>>>> readability and resemblance to source code, this change proposes
>>>>>>>> to print them in a Java-like syntax.
>>>>>>>>
>>>>>>>> Some examples:
>>>>>>>> current method printouts:
>>>>>>>>
>>>>>>>> test.TeMe3_B.ma()V
>>>>>>>> test.TeMe3_B.ma(IZ[[BF)[[D
>>>>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;
>>>>>>>>
>>>>>>>> improved format:
>>>>>>>>
>>>>>>>> void test.TeMe3_B.ma()
>>>>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float)
>>>>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][])
>>>>>>>>
>>>>>>>> So far, Method::name_and_sig_as_C_string() is used to print
>>>>>>>> these messages.
>>>>>>>>
>>>>>>>> This change implements function Method::external_name() that prints
>>>>>>>> the better format.
>>>>>>>> external_name() is chosen according to Klass::external_name().
>>>>>>>>
>>>>>>>> Printing the better format requires parsing the signature
>>>>>>>> Symbol. This is implemented in
>>>>>>>> void Symbol::print_as_signature_external_return_type(outputStream *os);
>>>>>>>> void Symbol::print_as_signature_external_parameters(outputStream *os);
>>>>>>>> These method names are chosen according to
>>>>>>>> Symbol::as_class_external_name().
>>>>>>>>
>>>>>>>> See this partial webrev for the new functions:
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01-new_methods/
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, I changed a lot of exception messages to use the new format.
>>>>>>>> This required to adapt a row of tests. I added a test to check
>>>>>>>> the signature printing does not regress. For all these changes, see
>>>>>>>> the full webrev:
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/
>>>>>>>>
>>>>>>>> I hope I detected all places where method signatures are printed to
>>>>>>>> exception messages.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Goetz.
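For readers following the conversion described in the quoted RFR, here is a minimal Java sketch that turns a JNI-style method descriptor into the Java-like form shown in the examples. It is only an illustration of the parsing idea: the actual change is implemented in C++ inside HotSpot (Method::external_name() and the Symbol::print_as_signature_external_* functions), and the class `JniSignaturePrinter` and its methods below are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of HotSpot or the JDK. */
final class JniSignaturePrinter {

    /**
     * Parses one type descriptor starting at pos[0], advances pos[0] past it,
     * and returns the Java-like type name. Truncated input surfaces as a
     * StringIndexOutOfBoundsException from charAt.
     */
    private static String parseType(String sig, int[] pos) {
        int dimensions = 0;
        while (sig.charAt(pos[0]) == '[') {   // each '[' is one array dimension
            dimensions++;
            pos[0]++;
        }
        char c = sig.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                // Class type: "Lpkg/Name;" becomes "pkg.Name".
                int end = sig.indexOf(';', pos[0]);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated class name in: " + sig);
                }
                base = sig.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("Bad descriptor: " + sig);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dimensions; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    /** Formats "holder.name" plus a descriptor as "returnType holder.name(params)". */
    static String externalName(String holderAndName, String sig) {
        if (sig.isEmpty() || sig.charAt(0) != '(') {
            throw new IllegalArgumentException("Not a method descriptor: " + sig);
        }
        int[] pos = {1};
        List<String> params = new ArrayList<>();
        while (sig.charAt(pos[0]) != ')') {
            params.add(parseType(sig, pos));
        }
        pos[0]++;                              // skip ')'
        String returnType = parseType(sig, pos);
        return returnType + " " + holderAndName + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(externalName("test.TeMe3_B.ma", "()V"));
        System.out.println(externalName("test.TeMe3_B.ma", "(IZ[[BF)[[D"));
        System.out.println(externalName("test.TeMe3_B.ma", "([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;"));
    }
}
```

Run against the three descriptors from the RFR, the sketch reproduces the three "improved format" lines quoted above.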
>>>>>>>>
From david.holmes at oracle.com Fri Apr 5 01:28:13 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 5 Apr 2019 11:28:13 +1000
Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
In-Reply-To: <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com>
References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com>
Message-ID: <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com>

Hi Chris,

On 5/04/2019 1:48 am, Chris Plummer wrote:
> Hi David,
>
> On 4/4/19 12:14 AM, David Holmes wrote:
>> On 4/04/2019 4:35 pm, Chris Plummer wrote:
>>> On 4/3/19 11:23 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> I have concerns that this will hide some of the other bugs I've
>>>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs
>>>>> result in 1 or two frames appearing in the stacktrace that should
>>>>> be skipped. Notably NativeCallStack::NativeCallStack() and
>>>>> os::get_native_stack().
>>>>
>>>> The test still checks those are not present first:
>>>>
>>>>     // We should never see either of these frames because they are supposed to be skipped. */
>>>>     output.shouldNotContain("NativeCallStack::NativeCallStack");
>>>>     output.shouldNotContain("os::get_native_stack");
>>> Ah yes. I skimmed over the test looking for it but missed it.
>>>>
>>>>> Also, AllocateHeap() should normally not be in the stack trace, but
>>>>> the test has specifically allowed for it for windows and solaris
>>>>> slowdebug builds.
>>>>> Although these builds should have honored the
>>>>> ALWAYSINLINE directive, it was deemed acceptable that it was not in
>>>>> slowdebug builds. However, I would not want to allow AllocateHeap()
>>>>> to appear in a product build, and best not to see it in fastdebug
>>>>> either.
>>>>
>>>> This is a test of NMT detail not a test of whether a given compiler
>>>> chooses to inline something like AllocateHeap. I don't think it is
>>>> the job of this test to be checking for something specific to the
>>>> native compiler. The previous handling of AllocateHeap seemed to be
>>>> there simply because it was the only way to deal with an optional
>>>> frame - but now that's handled generically.
>>> Its appearance means you effectively only have 3 frames to identify
>>> callsites instead of 4.
>>
>> Both stacktraces in the old test had 4 elements and expected 4
>> matches. The current bug is that one of those (new_entry) could
>> actually be inlined as well, resulting in only 3 matches. So that is
>> what the revised test checks for: at least 3 matches. Often there will
>> be 4 matches.
> I think you misunderstood my "3 frames" comment. I was referring to how
> many frames NMT uses to identify the callsite. It wants to use 4, but if
> AllocateHeap() doesn't get inlined, it effectively is using 3. The test
> should detect when this happens so the NMT implementation can address
> the issue.

You're right, I don't understand this part as I don't know how/what NMT
detail is doing in this regard.

>>
>> Hmmm but now I'm wondering why this trace:
>>
>>     public static String stackTraceAllocateHeap =
>>         ".*AllocateHeap.*\n" +
>>         ".*ModuleEntryTable.*new_entry.*\n" +
>>
>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it getting
>> inlined already when AllocateHeap was not? Even so we still end up
>> with 4 frames matching normally.
> I noticed that last night also and scratched my head over it for a while
> and then went to bed.
The only explanation I could come up with is that > allocate_new_entry() is getting inlined, and as a result (due to being a > slowdebug build and doing minimal inlining) AllocateHeap() was not inlined. >> >>> If it does appear in a product build, a solution should be looked >>> into to get rid of it. If the port owner decides it can't get rid of >>> it (or is unwilling to), then an exception should be added to the >>> test like was done for solaris and windows slowdebug builds. >> >> Are we specifically trying to test the compiler's ability to inline >> that function and just happen to be using this test to verify that? >> Doesn't seem like a suitable place to do this - and why do we need to >> do it? The Visual Studio docs state: >> >> "You cannot force the compiler to inline a particular function, even >> with the __forceinline keyword." >> >> so ALWAYSINLINE is just a hint even in product builds and could change >> with any update to the compiler. >> >> For Solaris Studio it is again not guaranteed to inline - specifically >> -xinline only has an effect at ?xO3 or higher. Which likely explains >> why it is ignored in slowdebug. And there are other cases where it >> won't honour the ALWAYSINLINE. >> >> Even with gcc we seem to be misusing the attribute if we want to >> ensure inlining when not optimising: >> >> "GCC does not inline any functions when not optimizing unless you >> specify the ?always_inline? attribute for the function, like this: >> >> /* Prototype.? */ >> inline void foo (const char) __attribute__((always_inline));" >> >> and we don't write it that way. >> >> So if we're that concerned about release builds guaranteeing to inline >> AllocateHeap then I think we need something a bit more explicit than >> this test to determine that. > With respect to the 3 methods/functions we don't want to see in the > callsite stacktrace, NMT has made a number of assumptions on inlining. 
> One of the things the test is doing is making sure those assumptions are > correct. If incorrect, then you run into issues like I mentioned above > where callsite backtraces effectively only have 3 unique frames rather > than 4 (actually before some bug fixes it was often just 2 unique > frames). So I think it's appropriate to have a test to make sure we are > not seeing any of these 3 methods/functions. Okay I get the gist of that. Is there somewhere I can clearly see what this inlining assumptions are that NMT makes? Are they clearly documented? > Now the test also has made inlining assumptions beyond what NMT has > made, and that is really what this bug is about. In general I think your > fix is fine in the way it relaxes which frames are actually found, but > as Thomas points out, it suffers from not actually looking at a single > stacktrace, but just looking for the specified frames somewhere in the > output (and in the order specified.) You should probably address this. Right that was an error on my part. I thought the existing MULTILINE pattern matching with .* would also find non-sequential lines and so I was acting similarly. I will re-think this. Thanks, David > thanks, > > Chris >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> Given the changes you made to allow more flexibly in which frames >>>>> appear, I think you need to now also make sure the above 3 >>>>> mentioned frames are not present, except for allowing >>>>> AllocateHeap() in slowdebug builds. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>> >>>>>> The actual stack trace reported by NMT detail is affected by the >>>>>> inlining decisions of the native compiler, and on the type of >>>>>> build. 
So we define an "ideal" stacktrace and then allow for some >>>>>> frames to be missing based on empirical observations. So to date >>>>>> we have seen two frames that may or may not be inlined and so we >>>>>> allow for 2 non-matching entries. >>>>>> >>>>>> The special-casing of AllocateHeap is removed as now it is just an >>>>>> optional frame. >>>>>> >>>>>> Chris: does this maintain the "spirit" of the test as you intended? >>>>>> >>>>>> Zhengyu: can you test this on your system(s) please. >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>>>> >>> >>> > > From david.holmes at oracle.com Fri Apr 5 02:17:58 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 12:17:58 +1000 Subject: RFR (XS) 8221918: runtime/SharedArchiveFile/serviceability/ReplaceCriticalClasses.java fails: Shared archive not found In-Reply-To: References: <6ea6c6ec-547e-72a3-d41b-217371739837@redhat.com> Message-ID: On 4/04/2019 6:21 pm, Aleksey Shipilev wrote: > On 4/4/19 2:12 AM, David Holmes wrote: >> There's no error checking in the dump part - if something goes wrong we won't see anything to >> indicate what it was, and AFAICS we won't even notice the failure (till the second part of the test). > > Right. This should help: > > - CDSTestUtils.run(opts); > + CDSTestUtils.run(opts).assertNormalExit(""); Okay. > >> Aside: not sure why we can't just use: >> @run main/othervm -XX:SharedArchiveFile=... -Xshare:dump >> and let jtreg deal with error? > > We technically can, but then we would lose the ability to generate shared archive file name, and > would need to add the same line to the test subclasses, e.g ReplaceCriticalClassesForSubgraphs.java Okay. >> Please update copyright to "2018, 2019," > > Updated. > > New webrev: > http://cr.openjdk.java.net/~shade/8221918/webrev.03/ Looks fine. 
Thanks, David ----- > -Aleksey > From chris.plummer at oracle.com Fri Apr 5 04:17:29 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 4 Apr 2019 21:17:29 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> Message-ID: Hi David, On 4/4/19 6:28 PM, David Holmes wrote: > Hi Chris, > > On 5/04/2019 1:48 am, Chris Plummer wrote: >> Hi David, >> >> On 4/4/19 12:14 AM, David Holmes wrote: >>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> I have concerns that this will hide some of the other bugs I've >>>>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs >>>>>> result in 1 or two frames appearing in the stacktrace that should >>>>>> be skipped. Notably NativeCallStack::NativeCallStack() and >>>>>> os::get_native_stack(). >>>>> >>>>> The test still checks those are not present first: >>>>> >>>>> 73???????? // We should never see either of these frames because >>>>> they are supposed to be skipped. */ >>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>> >>>>>> Also, AllocateHeap() should normally not be in the stack trace, >>>>>> but the test has specifically allowed for it for windows and >>>>>> solaris slowdebug builds. 
Although these builds should have >>>>>> honored the ALWAYSINLINE directive, it was deemed acceptable that >>>>>> it was not in slowdebug builds. However, I would not want to >>>>>> allow AllocateHeap() to appear in a product build, and best not >>>>>> to see it in fastdebug either. >>>>> >>>>> This is a test of NMT detail not a test of whether a given >>>>> compiler chooses to inline something like AllocateHeap. I don't >>>>> think it is the job of this test to be checking for something >>>>> specific to the native compiler. The previous handling of >>>>> AllocateHeap seemed to be there simply because it was the only way >>>>> to deal with an optional frame - but now that's handled generically. >>>> Its appearance means you effectively only have 3 frames to >>>> identify callsites instead of 4. >>> >>> Both stacktraces in the old test had 4 elements and expected 4 >>> matches. The current bug is that one of those (new_entry) could >>> actually be inlined as well, resulting in only 3 matches. So that is >>> what the revised test checks for: at least 3 matches. Often there >>> will be 4 matches. >> I think you misunderstood my "3 frames" comment. I was referring to >> how many frames NMT uses to identify the callsite. It wants to use 4, >> but if AllocateHeap() doesn't get inlined, it effectively is using 3. >> The test should detect when this happens so the NMT implementation >> can address the issue. > > You're right I don't understand this part as I don't know how/what NMT > detail is doing in this regard. An NMT callsite is simply the 4 most recent frames (after some pruning) that led to the os::malloc() call. "4" is somewhat arbitrary as Thomas pointed out, and is controlled by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger means more refinement of the callsites (thus more callsites), but a clearer picture of what actually led to the os::malloc().
For example, with NMT_TrackingStackDepth == 4, if you have a() calls b() calls c() calls d() calls os::malloc(), and foo() and bar() both call a(), the NMT detail output will not distinguish between these two call paths to os::malloc(), and will consider both paths to be the same callsite. The 4 frames in the NMT detail output would always be a, b, c, and d. However, bump up NMT_TrackingStackDepth to 5 and now NMT will treat them as two separate callsites, one with foo() as the bottom frame and one with bar() as the bottom frame, and both with a, b, c, and d as the other 4 frames. So my point is if AllocateHeap() is not inlined, then every allocation that is the result of doing a "new" of any CHeapObj subtype will have AllocateHeap() in its callsite, which effectively lowers the callsite refinement by 1. > >>> >>> Hmmm but now I'm wondering why this trace: >>> >>> 50 public static String stackTraceAllocateHeap = >>> 51 ".*AllocateHeap.*\n" + >>> 52 ".*ModuleEntryTable.*new_entry.*\n" + >>> >>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it getting >>> inlined already when AllocateHeap was not? Even so we still end up >>> with 4 frames matching normally. >> I noticed that last night also and scratched my head over it for a >> while and then went to bed. The only explanation I could come up with >> is that allocate_new_entry() is getting inlined, and as a result (due >> to being a slowdebug build and doing minimal inlining) AllocateHeap() >> was not inlined. >>> >>>> If it does appear in a product build, a solution should be looked >>>> into to get rid of it. If the port owner decides it can't get rid >>>> of it (or is unwilling to), then an exception should be added to >>>> the test like was done for solaris and windows slowdebug builds. >>> >>> Are we specifically trying to test the compiler's ability to inline >>> that function and just happen to be using this test to verify that?
>>> Doesn't seem like a suitable place to do this - and why do we need >>> to do it? The Visual Studio docs state: >>> >>> "You cannot force the compiler to inline a particular function, even >>> with the __forceinline keyword." >>> >>> so ALWAYSINLINE is just a hint even in product builds and could >>> change with any update to the compiler. >>> >>> For Solaris Studio it is again not guaranteed to inline - >>> specifically -xinline only has an effect at -xO3 or higher. Which >>> likely explains why it is ignored in slowdebug. And there are other >>> cases where it won't honour the ALWAYSINLINE. >>> >>> Even with gcc we seem to be misusing the attribute if we want to >>> ensure inlining when not optimising: >>> >>> "GCC does not inline any functions when not optimizing unless you >>> specify the 'always_inline' attribute for the function, like this: >>> >>> /* Prototype.  */ >>> inline void foo (const char) __attribute__((always_inline));" >>> >>> and we don't write it that way. >>> >>> So if we're that concerned about release builds guaranteeing to >>> inline AllocateHeap then I think we need something a bit more >>> explicit than this test to determine that. >> With respect to the 3 methods/functions we don't want to see in the >> callsite stacktrace, NMT has made a number of assumptions on >> inlining. One of the things the test is doing is making sure those >> assumptions are correct. If incorrect, then you run into issues like >> I mentioned above where callsite backtraces effectively only have 3 >> unique frames rather than 4 (actually before some bug fixes it was >> often just 2 unique frames). So I think it's appropriate to have a >> test to make sure we are not seeing any of these 3 methods/functions. > > Okay I get the gist of that. Is there somewhere I can clearly see what > these inlining assumptions are that NMT makes? Are they clearly > documented? Not that I know of.
I discovered them while looking at the various bugs that led to NativeCallStack::NativeCallStack() and os::get_native_stack() (and sometimes both) being in the callsite. Reviewing the bugs I referred to will give you an idea of where to look. One good place to look is NativeCallStack::NativeCallStack(). Lots of special case code there that controls how many frames to skip based on the platform and whether optimized or not. Also some comments there to help you out. I did a lot of bug fixing in this method. Looking at this code also reminds me of a reason to have the test continue to check for all 4 specific frames. If the frame skipping code skips an extra frame, then the callsite will be missing a needed frame at the top. The way the test was written it would detect this. With your changes it will not. It would just revert to always matching on 3 frames instead of 4, and the frame skipping bug would go unnoticed. thanks, Chris > >> Now the test also has made inlining assumptions beyond what NMT has >> made, and that is really what this bug is about. In general I think >> your fix is fine in the way it relaxes which frames are actually >> found, but as Thomas points out, it suffers from not actually looking >> at a single stacktrace, but just looking for the specified frames >> somewhere in the output (and in the order specified.) You should >> probably address this. > > Right that was an error on my part. I thought the existing MULTILINE > pattern matching with .* would also find non-sequential lines and so I > was acting similarly. I will re-think this. > > Thanks, > David > >> thanks, >> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Given the changes you made to allow more flexibility in which frames >>>>>> appear, I think you need to now also make sure the above 3 >>>>>> mentioned frames are not present, except for allowing >>>>>> AllocateHeap() in slowdebug builds.
>>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>> >>>>>>> The actual stack trace reported by NMT detail is affected by the >>>>>>> inlining decisions of the native compiler, and on the type of >>>>>>> build. So we define an "ideal" stacktrace and then allow for >>>>>>> some frames to be missing based on empirical observations. So to >>>>>>> date we have seen two frames that may or may not be inlined and >>>>>>> so we allow for 2 non-matching entries. >>>>>>> >>>>>>> The special-casing of AllocateHeap is removed as now it is just >>>>>>> an optional frame. >>>>>>> >>>>>>> Chris: does this maintain the "spirit" of the test as you intended? >>>>>>> >>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Fri Apr 5 05:11:19 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 15:11:19 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> Message-ID: <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> Hi Chris, Thanks for the explanation about the frame counting from os::malloc - now I get it. But I don't understand your final comment: > Looking at this code also reminds me of a reason to have the test > continue to check for all 4 specific frames. If the frame skipping code > skips an extra frame, then the callsite will be missing a needed frame > at the top. The way the test was written it would detect this. 
With your > changes it will not. It would just revert to always matching on 3 frames > instead of 4, and the frame skipping bug would go unnoticed. How can I fix this bug if I have to check for 4 specific frames but one (or more) may be missing - i.e. how can I tell the difference between "Frame A was inlined" and "Frame A was skipped by mistake"? Thanks, David On 5/04/2019 2:17 pm, Chris Plummer wrote: > Hi David, > > On 4/4/19 6:28 PM, David Holmes wrote: >> Hi Chris, >> >> On 5/04/2019 1:48 am, Chris Plummer wrote: >>> Hi David, >>> >>> On 4/4/19 12:14 AM, David Holmes wrote: >>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> I have concerns that this will hide some of the other bugs I've >>>>>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs >>>>>>> result in 1 or two frames appearing in the stacktrace that should >>>>>>> be skipped. Notably NativeCallStack::NativeCallStack() and >>>>>>> os::get_native_stack(). >>>>>> >>>>>> The test still checks those are not present first: >>>>>> >>>>>> 73 // We should never see either of these frames because >>>>>> they are supposed to be skipped. */ >>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>> >>>>>>> Also, AllocateHeap() should normally not be in the stack trace, >>>>>>> but the test has specifically allowed for it for windows and >>>>>>> solaris slowdebug builds. Although these builds should have >>>>>>> honored the ALWAYSINLINE directive, it was deemed acceptable that >>>>>>> it was not in slowdebug builds. However, I would not want to >>>>>>> allow AllocateHeap() to appear in a product build, and best not >>>>>>> to see it in fastdebug either.
>>>>>> >>>>>> This is a test of NMT detail not a test of whether a given >>>>>> compiler chooses to inline something like AllocateHeap. I don't >>>>>> think it is the job of this test to be checking for something >>>>>> specific to the native compiler. The previous handling of >>>>>> AllocateHeap seemed to be there simply because it was the only way >>>>>> to deal with an optional frame - but now that's handled generically. >>>>> It's appearance means you effectively only have 3 frames to >>>>> identity callsites instead of 4. >>>> >>>> Both stacktraces in the old test had 4 elements and expected 4 >>>> matches. The current bug is that one of those (new_entry) could >>>> actually be inlined as well, resulting in only 3 matches. So that is >>>> what the revised test checks for: at least 3 matches. Often there >>>> will be 4 matches. >>> I think you misunderstood my "3 frames" comment. I was referring to >>> how many frames NMT uses to identify the callsite. It wants to use 4, >>> but if AllocateHeap() doesn't get inlined, it effectively is using 3. >>> The test should detect when this happens so the NMT implementation >>> can address the issue. >> >> You're right I don't understand this part as I don't know how/what NMT >> detail is doing in this regard. > > An NMT callsite is simply the 4 most recent frames (afters some pruning) > that led to the os:malloc() call. "4" is somewhat arbitrary as Thomas > pointed out, and is controlled by NMT_TrackingStackDepth. Making > NMT_TrackingStackDepth bigger means more refinement of the callsites > (thus more callsites), but a clearer picture of what actually led to the > os:malloc(). > > For example, with NMT_TrackingStackDepth == 4, if you have a() calls b() > calls c() calls d() calls os:malloc(), and foo() and bar() both call > a(), the NMT detail output will not distinguish between these two calls > paths to os:mallco(), and will consider both paths to be the same > callsite. 
The 4 frames in the NMT detail output would always be a, b, c, > and d. However, bump up NMT_TrackingStackDepth to 5 and now NMT will > treat them as two separate callsites, one with foo() as the bottom frame > and one with bar() as the bottom frame, and both with a, b, c, and d as > the other 4 frames. > > So my point is if AllocateHeap() is not inlined, then every allocation > that is the result of doing a "new" of any CHeapObj subtype will have > AllocateHeap() in its callsite, which effectively lowers they callsite > refinement by 1. > >> >>>> >>>> Hmmm but now I'm wondering why this trace: >>>> >>>> ? 50???? public static String stackTraceAllocateHeap = >>>> ? 51???????? ".*AllocateHeap.*\n" + >>>> ? 52???????? ".*ModuleEntryTable.*new_entry.*\n" + >>>> >>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it getting >>>> inlined already when AllocateHeap was not? Even so we still end up >>>> with 4 frames matching normally. >>> I noticed that last night also and scratch my head over it for a >>> while and then went to bed. The only explanation I could come up with >>> is that allocate_new_entry() is getting inlined, and as a result (due >>> to being a slowdebug build and doing minimal inlining) AllocateHeap() >>> was not inlined. >>>> >>>>> If it does appear in a product build, a solution should be looked >>>>> into to get rid of it. If the port owner decides it can't get rid >>>>> of it (or is unwilling to), then an exception should be added to >>>>> the test like was done for solaris and windows slowdebug builds. >>>> >>>> Are we specifically trying to test the compiler's ability to inline >>>> that function and just happen to be using this test to verify that? >>>> Doesn't seem like a suitable place to do this - and why do we need >>>> to do it? The Visual Studio docs state: >>>> >>>> "You cannot force the compiler to inline a particular function, even >>>> with the __forceinline keyword." 
>>>> >>>> so ALWAYSINLINE is just a hint even in product builds and could >>>> change with any update to the compiler. >>>> >>>> For Solaris Studio it is again not guaranteed to inline - >>>> specifically -xinline only has an effect at ?xO3 or higher. Which >>>> likely explains why it is ignored in slowdebug. And there are other >>>> cases where it won't honour the ALWAYSINLINE. >>>> >>>> Even with gcc we seem to be misusing the attribute if we want to >>>> ensure inlining when not optimising: >>>> >>>> "GCC does not inline any functions when not optimizing unless you >>>> specify the ?always_inline? attribute for the function, like this: >>>> >>>> /* Prototype.? */ >>>> inline void foo (const char) __attribute__((always_inline));" >>>> >>>> and we don't write it that way. >>>> >>>> So if we're that concerned about release builds guaranteeing to >>>> inline AllocateHeap then I think we need something a bit more >>>> explicit than this test to determine that. >>> With respect to the 3 methods/functions we don't want to see in the >>> callsite stacktrace, NMT has made a number of assumptions on >>> inlining. One of the things the test is doing is making sure those >>> assumptions are correct. If incorrect, then you run into issues like >>> I mentioned above where callsite backtraces effectively only have 3 >>> unique frames rather than 4 (actually before some bug fixes it was >>> often just 2 unique frames). So I think it's appropriate to have a >>> test to make sure we are not seeing any of these 3 methods/functions. >> >> Okay I get the gist of that. Is there somewhere I can clearly see what >> this inlining assumptions are that NMT makes? Are they clearly >> documented? > > Not that I know of. I discovered them while looking at the various bugs > that led to NativeCallStack::NativeCallStack() and > os::get_native_stack() (and sometimes both) being in the callsite. > Reviewing the bugs I referred to will give you an idea of where to look. 
> One good place to look at NativeCallStack::NativeCallStack(). Lots of > special case code there that controls how many frames to skip based on > on the platform and whether optimized or not. Also some comments there > to help you out. I did a lot of bug fixing in this method. > > Looking at this code also reminds me of a reason to have the test > continue to check for all 4 specific frames. If the frame skipping code > skips an extra frame, then the callsite will be missing a needed frame > at the top. The way the test was written it would detect this. With your > changes it will not. It would just revert to always matching on 3 frames > instead of 4, and the frame skipping bug would go unnoticed. > > thanks, > > Chris > >> >>> Now the test also has made inlining assumptions beyond what NMT has >>> made, and that is really what this bug is about. In general I think >>> your fix is fine in the way it relaxes which frames are actually >>> found, but as Thomas points out, it suffers from not actually looking >>> at a single stacktrace, but just looking for the specified frames >>> somewhere in the output (and in the order specified.) You should >>> probably address this. >> >> Right that was an error on my part. I thought the existing MULTILINE >> pattern matching with .* would also find non-sequential lines and so I >> was acting similarly. I will re-think this. >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Given the changes you made to allow more flexibly in which frames >>>>>>> appear, I think you need to now also make sure the above 3 >>>>>>> mentioned frames are not present, except for allowing >>>>>>> AllocateHeap() in slowdebug builds. 
>>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>> >>>>>>>> The actual stack trace reported by NMT detail is affected by the >>>>>>>> inlining decisions of the native compiler, and on the type of >>>>>>>> build. So we define an "ideal" stacktrace and then allow for >>>>>>>> some frames to be missing based on empirical observations. So to >>>>>>>> date we have seen two frames that may or may not be inlined and >>>>>>>> so we allow for 2 non-matching entries. >>>>>>>> >>>>>>>> The special-casing of AllocateHeap is removed as now it is just >>>>>>>> an optional frame. >>>>>>>> >>>>>>>> Chris: does this maintain the "spirit" of the test as you intended? >>>>>>>> >>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> > > From goetz.lindenmaier at sap.com Fri Apr 5 05:41:13 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 5 Apr 2019 05:41:13 +0000 Subject: [ping] RFR(L): 8221470: Print methods in exception messages in java-like Syntax. In-Reply-To: References: <30e62810-ea0b-9732-cc03-a0f3fde984b4@oracle.com> Message-ID: Hi David, Thanks for pushing the change, and adding Coleen! Best regards, Goetz. > -----Original Message----- > From: David Holmes > Sent: Friday, April 5, 2019 2:23 AM > To: Lindenmaier, Goetz ; 'hotspot-runtime- > dev at openjdk.java.net' > Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages in > java-like Syntax. > > Hi Goetz, > > On 4/04/2019 10:19 pm, Lindenmaier, Goetz wrote: > > Hi David, > > > > I ran it through jdk-submit, no failures. > > I now added reviewer information to the patch. > > I updated the webrev in-place; nothing changed except > > the reviewer information. > > You missed Coleen so I added her. 
> > > You said we need to sync on the push. Do you just > > want to sponsor the change to make sure it works out? > > Or do you want to announce when I should push? > > > > Feel free to just push it in case you want to... > > Changes pushed :) > > Thanks, > David > > > Best regards, > > Goetz. > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 4. April 2019 01:53 > >> To: Lindenmaier, Goetz ; 'hotspot- > runtime- > >> dev at openjdk.java.net' > >> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception messages > in > >> java-like Syntax. > >> > >> Looks good. > >> > >> I'm re-running through our test system. > >> > >> Thanks, > >> David > >> > >> On 4/04/2019 1:18 am, Lindenmaier, Goetz wrote: > >>> Hi, > >>> > >>> here a new webrev: > >>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/03/ > >>> > >>> I have removed the test with the bad class name. > >>> I named the variables name_str/array_str now. > >>> > >>> I'll push it to jdk-submit for further testing. > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>>> -----Original Message----- > >>>> From: David Holmes > >>>> Sent: Mittwoch, 3. April 2019 00:42 > >>>> To: Lindenmaier, Goetz ; 'hotspot- > runtime- > >>>> dev at openjdk.java.net' > >>>> Subject: Re: [ping] RFR(L): 8221470: Print methods in exception > messages in > >>>> java-like Syntax. > >>>> > >>>> Two follow ups ... > >>>> > >>>> On 2/04/2019 11:26 pm, Lindenmaier, Goetz wrote: > >>>>> Hi David, > >>>>> > >>>>>> Overall this looks good to me - a few minor nits/comments below. > >>>>> thanks! > >>>>> > >>>>>> I've applied the patch and am running it through our internal build > and > >>>>>> test system (tiers 1-3 initially). > >>>>>> > >>>>>> I have a suspicion there will be other tests that need to be updated - > >>>>>> possibly even JCK tests. Discovering those a-priori will be difficult > >>>>>> (simply running all the tests would take an extremely long time). 
Will > >>>>>> have a discussion about how best to handle those internally. > >>>>> > >>>>> I ran most JCK test without problem. They usually don't check > messages. > >>>>> I ran all hotspot, jdk, langtools, nashorn and jaxp test (except > >>>>> for headful tests). > >>>> > >>>> Thanks for the additional testing info. I duplicated some of that but > >>>> found no issues, other than a couple of closed tests. > >>>> > >>>>>> src/hotspot/share/oops/method.cpp > >>>>>> Please put a blank line after each new method. > >>>>> Fixed. > >>>>> > >>>>>> src/hotspot/share/oops/symbol.cpp > >>>>>> > >>>>>> + os->print("."); > >>>>>> + } else { > >>>>>> + os->print("%c", start[i]); > >>>>>> > >>>>>> Please use os->put(char c) for individual characters. > >>>>> Fixed. > >>>>> > >>>>>> The "start" name would seem better as "buf" to me. > >>>>> Hmm, buf to me is a local chunk of memory used temporarily. > >>>>> What about array_sig, class_sig? > >>>> > >>>> Not really "sigs". > >>>> > >>>> str? Else just leave it. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>>>>> + } else if (start[i] == 'L') { > >>>>>> + print_class(os, start+i+1, len-i-2); > >>>>>> Can you insert a comment that help explains the -2: > >>>>> Done. > >>>>> > >>>>>> + for(SignatureStream ss(this); !ss.is_done(); ss.next()) { > >>>>>> space after for (2 occurrences) > >>>>> Fixed. > >>>>> > >>>>>> > >>>> > >> > test/hotspot/jtreg/runtime/exceptionMsgs/methodPrinting/TestPrintingMe > th > >>>>>> ods.java > >>>>>> > >>>>>> Not sure the special characters can be used directly in the sources. > Can > >>>>>> they not be put in as unicode escapes at all places? > >>>>> I'll try what Ioi proposed. I'll post a new webrev including that. > >>>>> > >>>>> Best regards, > >>>>> Goetz. > >>>>> > >>>>> > >>>>>> > >>>>>> --- > >>>>>> > >>>>>> Thanks, > >>>>>> David > >>>>>> ------- > >>>>>> > >>>>>> > >>>>>> On 1/04/2019 12:32 pm, David Holmes wrote: > >>>>>>> Hi Goetz, > >>>>>>> > >>>>>>> I'm looking at this ... 
> >>>>>>> > >>>>>>> On 29/03/2019 8:26 pm, Lindenmaier, Goetz wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Any interest in this change? > >>>>>>> > >>>>>>> I'm personally of two minds here because these VM generated exceptions > >>>>>>> are not only delivered to Java source code. I'd like to know how other > >>>>>>> language developers using the JVM runtime would view this. > >>>>>>> > >>>>>>> That aside if you're going to make a change like this then I think the > >>>>>>> full signature string has to be quoted in some way to delineate it > >>>>>>> within the larger message. > >>>>>>> > >>>>>>>> Should I split it to adapt the exceptions separately one-by-one to > >>>>>>>> make the change smaller and simplify the review? > >>>>>>> > >>>>>>> I don't think that is necessary. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> David > >>>>>>> ----- > >>>>>>> > >>>>>>>> I would propose to start out with AbstractMethodError only. > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Goetz. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> From: Lindenmaier, Goetz > >>>>>>>> Sent: Tuesday, March 26, 2019 1:06 PM > >>>>>>>> To: hotspot-runtime-dev at openjdk.java.net > >>>>>>>> Subject: RFR(L): 8221470: Print methods in exception messages in > >>>>>>>> java-like Syntax. > >>>>>>>> > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> A row of exceptions are thrown from the hotspot runtime. > >>>>>>>> They print methods with their JNI signatures. To increase > >>>>>>>> readability and resemblance to source code, this change proposes > >>>>>>>> to print them in a Java-like syntax.
> >>>>>>>> > >>>>>>>> Some examples: > >>>>>>>> current method printouts: > >>>>>>>> > >>>>>>>> test.TeMe3_B.ma()V > >>>>>>>> test.TeMe3_B.ma(IZ[[BF)[[D > >>>>>>>> test.TeMe3_B.ma([[[Ljava/lang/Object;)[[Ltest/TeMe3_B; > >>>>>>>> > >>>>>>>> improved format: > >>>>>>>> > >>>>>>>> void test.TeMe3_B.ma() > >>>>>>>> double[][] test.TeMe3_B.ma(int, boolean, byte[][], float) > >>>>>>>> test.TeMe3_B[][] test.TeMe3_B.ma(java.lang.Object[][][]) > >>>>>>>> > >>>>>>>> So far, Method::name_and_sig_as_C_string() is used to print > >>>>>>>> these messages. > >>>>>>>> > >>>>>>>> This change implements function Method::external_name() that prints > >>>>>>>> the better format. > >>>>>>>> external_name() is chosen according to Klass::external_name(). > >>>>>>>> > >>>>>>>> Printing the better format requires parsing the signature > >>>>>>>> Symbol. This is implemented in > >>>>>>>> void Symbol::print_as_signature_external_return_type(outputStream *os); > >>>>>>>> void Symbol::print_as_signature_external_parameters(outputStream *os); > >>>>>>>> These method names are chosen according to > >>>>>>>> Symbol::as_class_external_name(). > >>>>>>>> > >>>>>>>> See this partial webrev for the new functions: > >>>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01-new_methods/ > >>>>>>>> > >>>>>>>> > >>>>>>>> Also, I changed a lot of exception messages to use the new format. > >>>>>>>> This required to adapt a row of tests. I added a test to check > >>>>>>>> the signature printing does not regress. For all these changes, see > >>>>>>>> the full webrev: > >>>>>>>> http://cr.openjdk.java.net/~goetz/wr19/8221470-exMsg-signature/01/ > >>>>>>>> > >>>>>>>> I hope I detected all places where method signatures are printed to > >>>>>>>> exception messages. > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Goetz.
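The mapping from JNI-style signatures to the Java-like format quoted above can be sketched in a few lines of Java. This is only an illustration of the transformation shown in the examples, not the HotSpot code under review (that is C++, in Method::external_name() and the two Symbol::print_as_signature_external_* functions). The class SignaturePrinter and its methods are names made up for this sketch, and it assumes a well-formed method descriptor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK or of the webrev.
class SignaturePrinter {

    // Parses one type starting at d[pos[0]], advances pos[0] past it and
    // returns its Java-like spelling, e.g. "[[B" becomes "byte[][]".
    private static String parseType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') {
            dims++;
            pos[0]++;
        }
        String base;
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                // Class names run up to the terminating ';' and use '/' separators.
                int semi = d.indexOf(';', pos[0]);
                if (semi < 0) {
                    throw new IllegalArgumentException("unterminated class name: " + d);
                }
                base = d.substring(pos[0], semi).replace('/', '.');
                pos[0] = semi + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("bad descriptor: " + d);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    // Builds "<return type> <holder>.<name>(<param>, <param>, ...)".
    static String externalName(String holder, String name, String descriptor) {
        if (!descriptor.startsWith("(")) {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        int[] pos = {1};
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(parseType(descriptor, pos));
        }
        pos[0]++; // skip ')'
        String returnType = parseType(descriptor, pos);
        return returnType + " " + holder + "." + name
                + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(externalName("test.TeMe3_B", "ma", "()V"));
        System.out.println(externalName("test.TeMe3_B", "ma", "(IZ[[BF)[[D"));
        System.out.println(externalName("test.TeMe3_B", "ma",
                "([[[Ljava/lang/Object;)[[Ltest/TeMe3_B;"));
    }
}
```

Keeping the return type in front of the qualified method name mirrors the "improved format" lines in the message; the array suffix is appended after the element type, so `[[B` reads as `byte[][]`.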
From chris.plummer at oracle.com Fri Apr 5 05:53:20 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 4 Apr 2019 22:53:20 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> Message-ID: <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> Hi David, For the callsite that this test is checking for, right now there appear to be 3 possible stacktraces: the "normal" one, the one that includes AllocateHeap() on solaris and windows slowdebug builds, and the one Zhengyu is now seeing on linux-x64. You would need to check for all 3, limiting the AllocateHeap() one to just being allowed on solaris and windows slowdebug as it is now. So basically this test needs to cover all (allowable) stacktraces that we've seen for this callsite, and be updated in the future as needed. Not ideal, but I don't see a better solution. It's similar to the situation described in JDK-8163899 which covered the fragility of the NMT frame skipping code. In the end it was decided it would be easier to just fix issues as they came up rather than engineer a solution that wasn't as fragile. I think this test falls in the same category. thanks, Chris On 4/4/19 10:11 PM, David Holmes wrote: > Hi Chris, > > Thanks for the explanation about the frame counting from os::malloc - > now I get it.
But I don't understand your final comment: > > > Looking at this code also reminds me of a reason to have the test > > continue to check for all 4 specific frames. If the frame skipping code > > skips an extra frame, then the callsite will be missing a needed frame > > at the top. The way the test was written it would detect this. With your > > changes it will not. It would just revert to always matching on 3 frames > > instead of 4, and the frame skipping bug would go unnoticed. > > How can I fix this bug if I have to check for 4 specific frames but > one (or more) may be missing - i.e. how can I tell the difference > between "Frame A was inlined" and "Frame A was skipped by mistake"? > > Thanks, > David > > > On 5/04/2019 2:17 pm, Chris Plummer wrote: >> Hi David, >> >> On 4/4/19 6:28 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>> Hi David, >>>> >>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> I have concerns that this will hide some of the other bugs I've >>>>>>>> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These >>>>>>>> bugs result in one or two frames appearing in the stacktrace that >>>>>>>> should be skipped. Notably NativeCallStack::NativeCallStack() >>>>>>>> and os::get_native_stack(). >>>>>>> >>>>>>> The test still checks those are not present first: >>>>>>> >>>>>>> // We should never see either of these frames because >>>>>>> they are supposed to be skipped. */ >>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>> output.shouldNotContain("os::get_native_stack"); >>>>>> Ah yes. I skimmed over the test looking for it but missed it.
>>>>>>> >>>>>>>> Also, AllocateHeap() should normally not be in the stack trace, >>>>>>>> but the test has specifically allowed for it for windows and >>>>>>>> solaris slowdebug builds. Although these builds should have >>>>>>>> honored the ALWAYSINLINE directive, it was deemed acceptable >>>>>>>> that it was not in slowdebug builds. However, I would not want >>>>>>>> to allow AllocateHeap() to appear in a product build, and best >>>>>>>> not to see it in fastdebug either. >>>>>>> >>>>>>> This is a test of NMT detail, not a test of whether a given >>>>>>> compiler chooses to inline something like AllocateHeap. I don't >>>>>>> think it is the job of this test to be checking for something >>>>>>> specific to the native compiler. The previous handling of >>>>>>> AllocateHeap seemed to be there simply because it was the only >>>>>>> way to deal with an optional frame - but now that's handled >>>>>>> generically. >>>>>> Its appearance means you effectively only have 3 frames to >>>>>> identify callsites instead of 4. >>>>> >>>>> Both stacktraces in the old test had 4 elements and expected 4 >>>>> matches. The current bug is that one of those (new_entry) could >>>>> actually be inlined as well, resulting in only 3 matches. So that >>>>> is what the revised test checks for: at least 3 matches. Often >>>>> there will be 4 matches. >>>> I think you misunderstood my "3 frames" comment. I was referring to >>>> how many frames NMT uses to identify the callsite. It wants to use >>>> 4, but if AllocateHeap() doesn't get inlined, it effectively is >>>> using 3. The test should detect when this happens so the NMT >>>> implementation can address the issue. >>> >>> You're right, I don't understand this part as I don't know how/what >>> NMT detail is doing in this regard. >> >> An NMT callsite is simply the 4 most recent frames (after some >> pruning) that led to the os::malloc() call. "4" is somewhat arbitrary >> as Thomas pointed out, and is controlled by NMT_TrackingStackDepth.
>> Making NMT_TrackingStackDepth bigger means more refinement of the >> callsites (thus more callsites), but a clearer picture of what >> actually led to the os::malloc(). >> >> For example, with NMT_TrackingStackDepth == 4, if a() calls b(), >> which calls c(), which calls d(), which calls os::malloc(), and foo() and bar() both >> call a(), the NMT detail output will not distinguish between these >> two call paths to os::malloc(), and will consider both paths to be >> the same callsite. The 4 frames in the NMT detail output would always >> be a, b, c, and d. However, bump up NMT_TrackingStackDepth to 5 and >> now NMT will treat them as two separate callsites, one with foo() as >> the bottom frame and one with bar() as the bottom frame, and both >> with a, b, c, and d as the other 4 frames. >> >> So my point is if AllocateHeap() is not inlined, then every >> allocation that is the result of doing a "new" of any CHeapObj >> subtype will have AllocateHeap() in its callsite, which effectively >> lowers the callsite refinement by 1. >> >>> >>>>> >>>>> Hmmm but now I'm wondering why this trace: >>>>> >>>>> public static String stackTraceAllocateHeap = >>>>> ".*AllocateHeap.*\n" + >>>>> ".*ModuleEntryTable.*new_entry.*\n" + >>>>> >>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>> getting inlined already when AllocateHeap was not? Even so we >>>>> still end up with 4 frames matching normally. >>>> I noticed that last night also and scratched my head over it for a >>>> while and then went to bed. The only explanation I could come up >>>> with is that allocate_new_entry() is getting inlined, and as a >>>> result (due to being a slowdebug build and doing minimal inlining) >>>> AllocateHeap() was not inlined. >>>>> >>>>>> If it does appear in a product build, a solution should be looked >>>>>> into to get rid of it.
If the port owner decides it can't get rid >>>>>> of it (or is unwilling to), then an exception should be added to >>>>>> the test as was done for solaris and windows slowdebug builds. >>>>> >>>>> Are we specifically trying to test the compiler's ability to >>>>> inline that function and just happen to be using this test to >>>>> verify that? Doesn't seem like a suitable place to do this - and >>>>> why do we need to do it? The Visual Studio docs state: >>>>> >>>>> "You cannot force the compiler to inline a particular function, >>>>> even with the __forceinline keyword." >>>>> >>>>> so ALWAYSINLINE is just a hint even in product builds and could >>>>> change with any update to the compiler. >>>>> >>>>> For Solaris Studio it is again not guaranteed to inline - >>>>> specifically -xinline only has an effect at -xO3 or higher. Which >>>>> likely explains why it is ignored in slowdebug. And there are >>>>> other cases where it won't honour the ALWAYSINLINE. >>>>> >>>>> Even with gcc we seem to be misusing the attribute if we want to >>>>> ensure inlining when not optimising: >>>>> >>>>> "GCC does not inline any functions when not optimizing unless you >>>>> specify the 'always_inline' attribute for the function, like this: >>>>> >>>>> /* Prototype. */ >>>>> inline void foo (const char) __attribute__((always_inline));" >>>>> >>>>> and we don't write it that way. >>>>> >>>>> So if we're that concerned about release builds guaranteeing to >>>>> inline AllocateHeap then I think we need something a bit more >>>>> explicit than this test to determine that. >>>> With respect to the 3 methods/functions we don't want to see in the >>>> callsite stacktrace, NMT has made a number of assumptions on >>>> inlining. One of the things the test is doing is making sure those >>>> assumptions are correct.
If incorrect, then you run into issues >>>> like I mentioned above where callsite backtraces effectively only >>>> have 3 unique frames rather than 4 (actually before some bug fixes >>>> it was often just 2 unique frames). So I think it's appropriate to >>>> have a test to make sure we are not seeing any of these 3 >>>> methods/functions. >>> >>> Okay I get the gist of that. Is there somewhere I can clearly see >>> what these inlining assumptions are that NMT makes? Are they clearly >>> documented? >> >> Not that I know of. I discovered them while looking at the various >> bugs that led to NativeCallStack::NativeCallStack() and >> os::get_native_stack() (and sometimes both) being in the callsite. >> Reviewing the bugs I referred to will give you an idea of where to >> look. One good place to look is NativeCallStack::NativeCallStack(). >> Lots of special case code there that controls how many frames to skip >> based on the platform and whether optimized or not. Also some >> comments there to help you out. I did a lot of bug fixing in this >> method. >> >> Looking at this code also reminds me of a reason to have the test >> continue to check for all 4 specific frames. If the frame skipping >> code skips an extra frame, then the callsite will be missing a needed >> frame at the top. The way the test was written it would detect this. >> With your changes it will not. It would just revert to always >> matching on 3 frames instead of 4, and the frame skipping bug would >> go unnoticed. >> >> thanks, >> >> Chris >> >>> >>>> Now the test also has made inlining assumptions beyond what NMT has >>>> made, and that is really what this bug is about. In general I think >>>> your fix is fine in the way it relaxes which frames are actually >>>> found, but as Thomas points out, it suffers from not actually >>>> looking at a single stacktrace, but just looking for the specified >>>> frames somewhere in the output (and in the order specified). You >>>> should probably address this.
>>> >>> Right that was an error on my part. I thought the existing MULTILINE >>> pattern matching with .* would also find non-sequential lines and so >>> I was acting similarly. I will re-think this. >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Given the changes you made to allow more flexibly in which >>>>>>>> frames appear, I think you need to now also make sure the above >>>>>>>> 3 mentioned frames are not present, except for allowing >>>>>>>> AllocateHeap() in slowdebug builds. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>> >>>>>>>>> The actual stack trace reported by NMT detail is affected by >>>>>>>>> the inlining decisions of the native compiler, and on the type >>>>>>>>> of build. So we define an "ideal" stacktrace and then allow >>>>>>>>> for some frames to be missing based on empirical observations. >>>>>>>>> So to date we have seen two frames that may or may not be >>>>>>>>> inlined and so we allow for 2 non-matching entries. >>>>>>>>> >>>>>>>>> The special-casing of AllocateHeap is removed as now it is >>>>>>>>> just an optional frame. >>>>>>>>> >>>>>>>>> Chris: does this maintain the "spirit" of the test as you >>>>>>>>> intended? >>>>>>>>> >>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Fri Apr 5 05:56:50 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 15:56:50 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> Message-ID: <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> Hi Chris, Okay I will simply check for the third alternative. Thanks, David On 5/04/2019 3:53 pm, Chris Plummer wrote: > Hi David, > > For the callsite that this test is checking for, right now there appear > to be 3 possible stacktraces: the "normal" one, the one that includes > AllocateHeap() on solaris and windows slowdebug builds, and the one > Zhengyu is now seeing on linux-x64. You would need to check for all 3, > limiting the AllocateHeap() one to just being allowed on solaris and > windows slowdebug as it is now. So basically this test needs to cover > all (allowable) stacktraces that we've seen for this callsite, and be > updated in the future as needed. Not ideal, but I don't see a better > solution. It's similar to the situation described in JDK-8163899 which > covered the fragility of the NMT frame skipping code. In the end it was > decided it would be easier to just deal fix issues as they came up > rather then engineer a solution that wasn't as fragile. I think this > test falls in the same category. 
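A minimal Java sketch of the approach described in the paragraph above: keep a list of allowed stack traces for the callsite and accept the output if any one of them matches on consecutive lines, which also addresses the concern raised in the thread about frames being found somewhere in the output rather than in a single stack trace. The class and method names are invented for this sketch and are not taken from CheckForProperDetailStackTrace.java. Only the frame patterns quoted in the thread (AllocateHeap, ModuleEntryTable new_entry, Hashtable allocate_new_entry) are real, and the sample output format is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the actual jtreg test code.
class StackTraceAlternatives {

    // Returns the index of the first alternative whose frame patterns match
    // consecutive lines of the output, or -1 if no alternative matches.
    static int firstMatching(String output, List<List<String>> alternatives) {
        String[] lines = output.split("\\R");
        for (int a = 0; a < alternatives.size(); a++) {
            if (matchesConsecutively(lines, alternatives.get(a))) {
                return a;
            }
        }
        return -1;
    }

    // True if framePatterns[0..n) match lines[start..start+n) for some start.
    static boolean matchesConsecutively(String[] lines, List<String> framePatterns) {
        for (int start = 0; start + framePatterns.size() <= lines.length; start++) {
            boolean ok = true;
            for (int k = 0; k < framePatterns.size() && ok; k++) {
                ok = Pattern.compile(framePatterns.get(k)).matcher(lines[start + k]).find();
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed output shape: one frame per line.
        String output = "[0x01] Hashtable::allocate_new_entry\n"
                      + "[0x02] ModuleEntryTable::new_entry\n";
        List<List<String>> allowed = Arrays.asList(
                Arrays.asList("AllocateHeap", "ModuleEntryTable.*new_entry"),
                Arrays.asList("Hashtable.*allocate_new_entry", "ModuleEntryTable.*new_entry"));
        System.out.println("matched alternative: " + firstMatching(output, allowed));
    }
}
```

Returning the index of the matching alternative lets a caller restrict some alternatives to particular builds, for example allowing the AllocateHeap() variant only on slowdebug, as the message suggests.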
> > thanks, > > Chris
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> > > From chris.plummer at oracle.com Fri Apr 5 06:01:31 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 4 Apr 2019 23:01:31 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> Message-ID: Thinking about this a bit more, there is still the potential for some confusion if this test fails again in the future due to the top frame missing. Is it missing because it got inlined or is it missing because the frame skipping code skipped an extra frame? Hopefully whoever deals with it doesn't just hastily add another valid stacktrace to the test but instead investigates to make sure the issue is indeed that the method got inlined. Chris On 4/4/19 10:56 PM, David Holmes wrote: > Hi Chris, > > Okay I will simply check for the third alternative. > > Thanks, > David > > On 5/04/2019 3:53 pm, Chris Plummer wrote: >> Hi David, >> >> For the callsite that this test is checking for, right now there >> appear to be 3 possible stacktraces: the "normal" one, the one that >> includes AllocateHeap() on solaris and windows slowdebug builds, and >> the one Zhengyu is now seeing on linux-x64. You would need to check >> for all 3, limiting the AllocateHeap() one to just being allowed on >> solaris and windows slowdebug as it is now. 
So basically this test >> needs to cover all (allowable) stacktraces that we've seen for this >> callsite, and be updated in the future as needed. Not ideal, but I >> don't see a better solution. It's similar to the situation described >> in JDK-8163899 which covered the fragility of the NMT frame skipping >> code. In the end it was decided it would be easier to just deal fix >> issues as they came up rather then engineer a solution that wasn't as >> fragile. I think this test falls in the same category. >> >> thanks, >> >> Chris >> >> On 4/4/19 10:11 PM, David Holmes wrote: >>> Hi Chris, >>> >>> Thanks for the explanation about the frame counting from os::malloc >>> - now I get it. But I don't understand your final comment: >>> >>> > Looking at this code also reminds me of a reason to have the test >>> > continue to check for all 4 specific frames. If the frame skipping >>> code >>> > skips an extra frame, then the callsite will be missing a needed >>> frame >>> > at the top. The way the test was written it would detect this. >>> With your >>> > changes it will not. It would just revert to always matching on 3 >>> frames >>> > instead of 4, and the frame skipping bug would go unnoticed. >>> >>> How can I fix this bug if I have to check for 4 specific frames but >>> one (or more) may be missing - i.e how can I tell the different >>> between "Frame A was inlined" and "Frame A was skipped by mistake" ?? 
>>> >>> Thanks, >>> David >>> >>> >>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> I have concerns that this will hide some of the other bugs >>>>>>>>>> I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. >>>>>>>>>> These bugs result in 1 or two frames appearing in the >>>>>>>>>> stacktrace that should be skipped. Notably >>>>>>>>>> NativeCallStack::NativeCallStack() and os::get_native_stack(). >>>>>>>>> >>>>>>>>> The test still checks those are not present first: >>>>>>>>> >>>>>>>>> // We should never see either of these frames >>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>> output.shouldNotContain("os::get_native_stack"); >>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>> >>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>> windows and solaris slowdebug builds. Although these builds >>>>>>>>>> should have honored the ALWAYSINLINE directive, it was deemed >>>>>>>>>> acceptable that it was not in slowdebug builds. However, I >>>>>>>>>> would not want to allow AllocateHeap() to appear in a product >>>>>>>>>> build, and best not to see it in fastdebug either. >>>>>>>>> >>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>> compiler chooses to inline something like AllocateHeap.
I >>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>> something specific to the native compiler. The previous >>>>>>>>> handling of AllocateHeap seemed to be there simply because it >>>>>>>>> was the only way to deal with an optional frame - but now >>>>>>>>> that's handled generically. >>>>>>>> Its appearance means you effectively only have 3 frames to >>>>>>>> identify callsites instead of 4. >>>>>>> >>>>>>> Both stacktraces in the old test had 4 elements and expected 4 >>>>>>> matches. The current bug is that one of those (new_entry) could >>>>>>> actually be inlined as well, resulting in only 3 matches. So >>>>>>> that is what the revised test checks for: at least 3 matches. >>>>>>> Often there will be 4 matches. >>>>>> I think you misunderstood my "3 frames" comment. I was referring >>>>>> to how many frames NMT uses to identify the callsite. It wants to >>>>>> use 4, but if AllocateHeap() doesn't get inlined, it effectively >>>>>> is using 3. The test should detect when this happens so the NMT >>>>>> implementation can address the issue. >>>>> >>>>> You're right I don't understand this part as I don't know how/what >>>>> NMT detail is doing in this regard. >>>> >>>> An NMT callsite is simply the 4 most recent frames (after some >>>> pruning) that led to the os::malloc() call. "4" is somewhat >>>> arbitrary as Thomas pointed out, and is controlled by >>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger means >>>> more refinement of the callsites (thus more callsites), but a >>>> clearer picture of what actually led to the os::malloc(). >>>> >>>> For example, with NMT_TrackingStackDepth == 4, if you have a() >>>> calls b() calls c() calls d() calls os::malloc(), and foo() and >>>> bar() both call a(), the NMT detail output will not distinguish >>>> between these two call paths to os::malloc(), and will consider >>>> both paths to be the same callsite.
The 4 frames in the NMT detail >>>> output would always be a, b, c, and d. However, bump up >>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two >>>> separate callsites, one with foo() as the bottom frame and one with >>>> bar() as the bottom frame, and both with a, b, c, and d as the >>>> other 4 frames. >>>> >>>> So my point is if AllocateHeap() is not inlined, then every >>>> allocation that is the result of doing a "new" of any CHeapObj >>>> subtype will have AllocateHeap() in its callsite, which effectively >>>> lowers the callsite refinement by 1. >>>> >>>>> >>>>>>> >>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>> >>>>>>> public static String stackTraceAllocateHeap = >>>>>>> ".*AllocateHeap.*\n" + >>>>>>> ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>> >>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>> getting inlined already when AllocateHeap was not? Even so we >>>>>>> still end up with 4 frames matching normally. >>>>>> I noticed that last night also and scratched my head over it for a >>>>>> while and then went to bed. The only explanation I could come up >>>>>> with is that allocate_new_entry() is getting inlined, and as a >>>>>> result (due to being a slowdebug build and doing minimal >>>>>> inlining) AllocateHeap() was not inlined. >>>>>>> >>>>>>>> If it does appear in a product build, a solution should be >>>>>>>> looked into to get rid of it. If the port owner decides it >>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>> should be added to the test like was done for solaris and >>>>>>>> windows slowdebug builds. >>>>>>> >>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>> inline that function and just happen to be using this test to >>>>>>> verify that? Doesn't seem like a suitable place to do this - and >>>>>>> why do we need to do it?
The Visual Studio docs state: >>>>>>> >>>>>>> "You cannot force the compiler to inline a particular function, >>>>>>> even with the __forceinline keyword." >>>>>>> >>>>>>> so ALWAYSINLINE is just a hint even in product builds and could >>>>>>> change with any update to the compiler. >>>>>>> >>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>> specifically -xinline only has an effect at -xO3 or higher. >>>>>>> Which likely explains why it is ignored in slowdebug. And there >>>>>>> are other cases where it won't honour the ALWAYSINLINE. >>>>>>> >>>>>>> Even with gcc we seem to be misusing the attribute if we want to >>>>>>> ensure inlining when not optimising: >>>>>>> >>>>>>> "GCC does not inline any functions when not optimizing unless >>>>>>> you specify the 'always_inline' attribute for the function, like >>>>>>> this: >>>>>>> >>>>>>> /* Prototype. */ >>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>> >>>>>>> and we don't write it that way. >>>>>>> >>>>>>> So if we're that concerned about release builds guaranteeing to >>>>>>> inline AllocateHeap then I think we need something a bit more >>>>>>> explicit than this test to determine that. >>>>>> With respect to the 3 methods/functions we don't want to see in >>>>>> the callsite stacktrace, NMT has made a number of assumptions on >>>>>> inlining. One of the things the test is doing is making sure >>>>>> those assumptions are correct. If incorrect, then you run into >>>>>> issues like I mentioned above where callsite backtraces >>>>>> effectively only have 3 unique frames rather than 4 (actually >>>>>> before some bug fixes it was often just 2 unique frames). So I >>>>>> think it's appropriate to have a test to make sure we are not >>>>>> seeing any of these 3 methods/functions. >>>>> >>>>> Okay I get the gist of that. Is there somewhere I can clearly see >>>>> what these inlining assumptions are that NMT makes? Are they >>>>> clearly documented?
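The callsite-depth behaviour Chris describes in this message (a, b, c, d reached via foo() or bar(), and the cost of a non-inlined AllocateHeap() frame) can be modelled with a small sketch. This is not HotSpot code: the class and method names below are hypothetical, stacks are plain lists of frame names with the most recent frame first, and the "depth" argument only stands in for NMT_TrackingStackDepth.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of how a fixed tracking depth groups stacks into callsites.
final class CallsiteDepthSketch {

    // A callsite key is the `depth` most recent frames (index 0 = most recent).
    static List<String> callsite(List<String> stack, int depth) {
        return stack.subList(0, Math.min(depth, stack.size()));
    }

    // Counts how many distinct callsite keys the given stacks produce.
    static int distinctCallsites(List<List<String>> stacks, int depth) {
        Set<List<String>> keys = new HashSet<>();
        for (List<String> stack : stacks) {
            keys.add(callsite(stack, depth));
        }
        return keys.size();
    }

    // Models a frame that was not inlined and therefore shows up on top.
    static List<String> withTopFrame(String frame, List<String> stack) {
        List<String> out = new ArrayList<>();
        out.add(frame);
        out.addAll(stack);
        return out;
    }

    public static void main(String[] args) {
        List<String> viaFoo = Arrays.asList("d", "c", "b", "a", "foo");
        List<String> viaBar = Arrays.asList("d", "c", "b", "a", "bar");
        List<List<String>> paths = Arrays.asList(viaFoo, viaBar);
        System.out.println(distinctCallsites(paths, 4)); // prints 1
        System.out.println(distinctCallsites(paths, 5)); // prints 2

        // Two paths that differ only in their fourth frame.
        List<String> p1 = Arrays.asList("d", "c", "b", "a1");
        List<String> p2 = Arrays.asList("d", "c", "b", "a2");
        System.out.println(distinctCallsites(Arrays.asList(p1, p2), 4)); // prints 2

        // An extra top frame pushes the distinguishing frame out of the window.
        List<List<String>> shifted = Arrays.asList(
                withTopFrame("AllocateHeap", p1),
                withTopFrame("AllocateHeap", p2));
        System.out.println(distinctCallsites(shifted, 4)); // prints 1
    }
}
```

The last case is the "effectively 3 frames instead of 4" situation: the extra top frame is identical for every allocation, so it consumes one slot of the key without adding any information.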
>>>> >>>> Not that I know of. I discovered them while looking at the various >>>> bugs that led to NativeCallStack::NativeCallStack() and >>>> os::get_native_stack() (and sometimes both) being in the callsite. >>>> Reviewing the bugs I referred to will give you an idea of where to >>>> look. One good place to look is NativeCallStack::NativeCallStack(). >>>> Lots of special case code there that controls how many frames to >>>> skip based on the platform and whether optimized or not. Also >>>> some comments there to help you out. I did a lot of bug fixing in >>>> this method. >>>> >>>> Looking at this code also reminds me of a reason to have the test >>>> continue to check for all 4 specific frames. If the frame skipping >>>> code skips an extra frame, then the callsite will be missing a >>>> needed frame at the top. The way the test was written it would >>>> detect this. With your changes it will not. It would just revert to >>>> always matching on 3 frames instead of 4, and the frame skipping >>>> bug would go unnoticed. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>>> >>>>>> Now the test also has made inlining assumptions beyond what NMT >>>>>> has made, and that is really what this bug is about. In general I >>>>>> think your fix is fine in the way it relaxes which frames are >>>>>> actually found, but as Thomas points out, it suffers from not >>>>>> actually looking at a single stacktrace, but just looking for the >>>>>> specified frames somewhere in the output (and in the order >>>>>> specified.) You should probably address this. >>>>> >>>>> Right that was an error on my part. I thought the existing >>>>> MULTILINE pattern matching with .* would also find non-sequential >>>>> lines and so I was acting similarly. I will re-think this.
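The regex point in the paragraph above can be checked with a minimal sketch. It assumes patterns shaped like the ones quoted from the test (each frame written as ".*name.*\n"); the class name and the sample frame lines are made up for illustration. In java.util.regex, "." does not match a line terminator unless DOTALL is set, and MULTILINE only changes how ^ and $ behave, so such a pattern matches only when the frames sit on consecutive lines.

```java
import java.util.regex.Pattern;

// Hypothetical demo: a pattern built from ".*frame.*\n" pieces only matches
// frames that appear on adjacent lines of the output.
final class ConsecutiveFrames {

    static final Pattern EXPECTED = Pattern.compile(
            ".*Hashtable.*allocate_new_entry.*\n" +
            ".*ModuleEntryTable.*new_entry.*\n",
            Pattern.MULTILINE);

    static boolean matches(String output) {
        return EXPECTED.matcher(output).find();
    }

    public static void main(String[] args) {
        // Made-up frame lines, adjacent in the output.
        String adjacent =
                "[0x1] Hashtable::allocate_new_entry+0x10\n" +
                "[0x2] ModuleEntryTable::new_entry+0x20\n";
        // Same two frames with an unrelated frame in between.
        String separated =
                "[0x1] Hashtable::allocate_new_entry+0x10\n" +
                "[0x3] SomeOther::frame+0x30\n" +
                "[0x2] ModuleEntryTable::new_entry+0x20\n";

        System.out.println(matches(adjacent));  // prints true
        System.out.println(matches(separated)); // prints false
    }
}
```

So a single multi-line pattern already ties the frames to one contiguous stack trace, whereas searching for each frame separately would accept frames scattered across different traces in the output.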
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Given the changes you made to allow more flexibility in which >>>>>>>>>> frames appear, I think you need to now also make sure the >>>>>>>>>> above 3 mentioned frames are not present, except for allowing >>>>>>>>>> AllocateHeap() in slowdebug builds. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>> >>>>>>>>>>> The actual stack trace reported by NMT detail is affected by >>>>>>>>>>> the inlining decisions of the native compiler, and by the >>>>>>>>>>> type of build. So we define an "ideal" stacktrace and then >>>>>>>>>>> allow for some frames to be missing based on empirical >>>>>>>>>>> observations. So to date we have seen two frames that may or >>>>>>>>>>> may not be inlined and so we allow for 2 non-matching entries. >>>>>>>>>>> >>>>>>>>>>> The special-casing of AllocateHeap is removed as now it is >>>>>>>>>>> just an optional frame. >>>>>>>>>>> >>>>>>>>>>> Chris: does this maintain the "spirit" of the test as you >>>>>>>>>>> intended? >>>>>>>>>>> >>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Fri Apr 5 07:04:12 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 17:04:12 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> Message-ID: <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> Hi Chris, Updated webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ Checks for alternate stack now. Added lots of comments and misc fixups. Zhengyu: please re-test (I can't test any slowdebug except linux-x64). Thanks, David On 5/04/2019 4:01 pm, Chris Plummer wrote: > Thinking about this a bit more, there is still the potential for some > confusion if this test fails again in the future due to the top frame > missing. Is it missing because it got inlined or is it missing because > the frame skipping code skipped an extra frame? Hopefully whoever deals > with it doesn't just hastily add another valid stacktrace to the test > but instead investigates to make sure the issue is indeed that the > method got inlined. > > Chris
From martin.doerr at sap.com Fri Apr 5 07:23:13 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 5 Apr 2019 07:23:13 +0000 Subject: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses In-Reply-To: <811daec1-73c2-3e76-da16-7b9b82bdc3d1@oracle.com> References: <9a392c4c-2620-99b1-79b5-7bffa8a4774f@oracle.com> <811daec1-73c2-3e76-da16-7b9b82bdc3d1@oracle.com> Message-ID: Thank you for the reviews! Martin -----Original Message----- From: coleen.phillimore at oracle.com Sent: Donnerstag, 4. April 2019 20:27 To: Zhengyu Gu ; Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid not performed for some addresses +1 thanks, Coleen On 4/4/19 10:55 AM, Zhengyu Gu wrote: > > > On 4/4/19 10:42 AM, Doerr, Martin wrote: >> Hi Coleen and Zhengyu, >> >> thanks for your feedback. I've also replaced pointer comparison by >> numeric comparison to avoid undefined behavior. >> >> New webrev: >> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.01/ > > Looks good. > > Thanks, > > -Zhengyu > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-runtime-dev >> On Behalf Of >> coleen.phillimore at oracle.com >> Sent: Donnerstag, 4. April 2019 02:33 >> To: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid >> not performed for some addresses >> >> >> >> On 4/2/19 10:33 AM, Doerr, Martin wrote: >>> Hi Zhengyu, >>> >>> that would be fine, too. I'll put it there if other reviewers prefer >>> that, too. >> >> Yes, I prefer that too. >> Coleen >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Zhengyu Gu >>> Sent: Dienstag, 2.
April 2019 16:01 >>> To: Doerr, Martin ; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(XS): 8221833: Readability check in Symbol::is_valid >>> not performed for some addresses >>> >>> Hi Martin, >>> >>> Would it be more proper to do the check in os::is_readable_range()? >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> On 4/2/19 9:05 AM, Doerr, Martin wrote: >>>> Hi, >>>> >>>> I'd like to fix a minor bug in Symbol::is_valid which can cause >>>> errors during error reporting: >>>> Address computation can overflow leading to skipped readability check. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8221833 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~mdoerr/8221833_valid_symbol/webrev.00/ >>>> >>>> Please review. >>>> >>>> Best regards, >>>> Martin >>>> >> From robin.westberg at oracle.com Fri Apr 5 08:05:28 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Fri, 5 Apr 2019 10:05:28 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows Message-ID: Hi all, Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. 
Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ Testing: tier1 Best regards, Robin From david.holmes at oracle.com Fri Apr 5 08:49:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 18:49:54 +1000 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: Message-ID: Hi Robin, On 5/04/2019 6:05 pm, Robin Westberg wrote: > Hi all, > > Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. Thanks, David ----- > Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 > Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ > Testing: tier1 > > Best regards, > Robin > From robin.westberg at oracle.com Fri Apr 5 09:53:45 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Fri, 5 Apr 2019 11:53:45 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: Message-ID: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> Hi David, Thanks for taking a look! 
> On 5 Apr 2019, at 10:49, David Holmes wrote: > > Hi Robin, > > On 5/04/2019 6:05 pm, Robin Westberg wrote: >> Hi all, >> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. > > Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. Best regards, Robin > > Thanks, > David > ----- > >> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. 
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >> Testing: tier1 >> Best regards, >> Robin From thomas.stuefe at gmail.com Fri Apr 5 10:06:49 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 5 Apr 2019 12:06:49 +0200 Subject: RFR(s): 8222015: Small VM.metaspace improvements Message-ID: Hi all, may I have please a review for this collection of small improvements to the VM.metaspace diagnostic command? - it clearly marks now classes whose metadata reside in cds - it shows the number of classes loaded, incl. those from cds, in the overviews too. Issue: https://bugs.openjdk.java.net/browse/JDK-8222015 cr: http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.00/webrev/ Example output: http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-by-spacetype.txt http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders.txt http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders-showclasses.txt (scroll down -> cds classes in are now marked with 's') Thank you, Thomas From david.holmes at oracle.com Fri Apr 5 10:10:07 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Apr 2019 20:10:07 +1000 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> Message-ID: <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> On 5/04/2019 7:53 pm, Robin Westberg wrote: > Hi David, > > Thanks for taking a look! > >> On 5 Apr 2019, at 10:49, David Holmes wrote: >> >> Hi Robin, >> >> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>> Hi all, >>> Please review the following change that modifies the way safepointing waits for other threads to stop. 
As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >> >> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. > > Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). > > That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) Thanks, David > > Best regards, > Robin > >> >> Thanks, >> David >> ----- >> >>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. 
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>> Testing: tier1 >>> Best regards, >>> Robin > From robin.westberg at oracle.com Fri Apr 5 11:54:16 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Fri, 5 Apr 2019 13:54:16 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> Message-ID: Hi David, > On 5 Apr 2019, at 12:10, David Holmes wrote: > > On 5/04/2019 7:53 pm, Robin Westberg wrote: >> Hi David, >> Thanks for taking a look! >>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>> >>> Hi Robin, >>> >>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>> Hi all, >>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>> >>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). 
>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. > > I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. > It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms. Best regards, Robin > > Thanks, > David > >> Best regards, >> Robin >>> >>> Thanks, >>> David >>> ----- >>> >>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux.
>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>> Testing: tier1 >>>> Best regards, >>>> Robin From robbin.ehn at oracle.com Fri Apr 5 12:07:03 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 5 Apr 2019 14:07:03 +0200 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> Message-ID: <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> Hi Dan, (Martin there is question for you last in this email) After first pass I did not find any real issues. Considering what you had to work with, it looks good! #1 There are some assert which are redundant (to me at least) like: src/hotspot/share/runtime/objectMonitor.cpp L445 if (!dmw->is_marked() && dmw->hash() == 0) { // This dmw is neutral and has not yet started the restoration // protocol so we mark a copy of the dmw to begin the protocol. markOop marked_dmw = dmw->set_marked(); assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, marked_dmw->is_marked(), marked_dmw->hash()); That assert is basically a test that set_marked worked? L505 if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { assert(_succ != Self, "invariant"); assert(_owner == Self, "invariant"); Assert on _owner checks that our cmpxchg is not broken? I think it's easier to read the code if some on the most obvious asserts are removed. Maybe comments instead. #2 Not your doing but I think we should remove TRAPS/Thread * Self and use JavaThread* instead. E.g. 
so we can change: void ObjectMonitor::EnterI(TRAPS) { Thread * const Self = THREAD; assert(Self->is_Java_thread(), "invariant"); assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); to: void ObjectMonitor::EnterI(JavaThread* Self) { assert(Self->thread_state() == _thread_blocked, "invariant"); #3 src/hotspot/share/runtime/objectMonitor.inline.hpp 164 inline void ObjectMonitor::inc_ref_count() { 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc 166 // backend on PPC does not yet conform to these requirements. Therefore 167 // the increment is simulated with a load phi; cas phi + 1; loop. 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors 169 // is not safe. I think was fixed with: 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics You should get a leading sync and trailing one with the default conservative model and thus get proper memory ordering. Martin, I'm I correct? Thanks, Robbin On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: > Greetings, > > Welcome to the OpenJDK review thread for my port of Carsten's work on: > > JDK-8153224 Monitor deflation prolong safepoints > https://bugs.openjdk.java.net/browse/JDK-8153224 > > Here's a link to the OpenJDK wiki that describes my port: > > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > > Here's the webrev URL: > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ > > Here's a link to Carsten's original webrev: > > http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > > Earlier versions of this patch have been through several rounds of > preliminary review. Many thanks to Carsten, Coleen, Robbin, and > Roman for their preliminary code review comments. A very special > thanks to Robbin and Roman for building and testing the patch in > their own environments (including specJBB2015).
> > This version of the patch has been thru Mach5 tier[1-8] testing on > Oracle's usual set of platforms. Earlier versions have been run > through my stress kit on my Linux-X64 and Solaris-X64 servers > (product, fastdebug, slowdebug). Earlier versions have run Kitchensink > for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug > and slowdebug). Earlier versions have run my monitor inflation stress > tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, > fastdebug and slowdebug). > > All of the testing done on earlier versions will be redone on the > latest version of the patch. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > > P.S. > One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java > is currently failing in -Xcomp mode on Win* only. I've been trying > to characterize/analyze this failure for more than a week now. At > this point I'm convinced that Async Monitor Deflation is aggravating > an existing bug. However, I plan to have a better handle on that > failure before these bits are pushed to the jdk/jdk repo. From coleen.phillimore at oracle.com Fri Apr 5 12:35:04 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 5 Apr 2019 08:35:04 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> Message-ID: Hi Gerard, This is somewhat of a first pass review. I like the change a lot. I have a couple of suggestions. http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/utilities/statistics.hpp.html Can you rename this file to tableStatistics.cpp/hpp, because "statistics" is too general and the class is called TableStatistics?
http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html Is there any way to parameterize these functions and/or add them to TableStatistics? Also, when Stefan is done with the ResolvedMethodTable, you can add that too in a separate RFE https://bugs.openjdk.java.net/browse/JDK-8221393 Thanks, Coleen On 4/4/19 3:52 PM, gerard ziemski wrote: > Thank you Erik for clarifications. > > I have implemented all your suggestions, which you can find here > http://cr.openjdk.java.net/~gziemski/8185525_rev2 > > I started Mach5 tier1-6 test to test the changes ... > > > cheers > > On 4/4/19 1:16 PM, Erik Gahlin wrote: >> On 2019-04-04 17:39, gerard ziemski wrote: >>> hi Erik, >>> >>> >>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>> Hi Gerard, >>>> >>>> Here are some comments about the metadata (to make it consistent >>>> with other events). >>>> >>>> The events should not be in the "Java Application" category since >>>> they are JVM events. You could perhaps put them in "Java Virtual >>>> Machine, Runtime, Tables". Some comments about the names and labels >>>> of fields. >>>> >>>> - Label: Number of buckets => Bucket Count >>>> - Label: Number of entries => Entry Count >>>> - Label: Total footprint => Total Footprint >>>> >>>> Could you remove descriptions that are exactly the same as the label. >>>> >>>> - Label: Maximum bucket size => Maximum Bucket Size >>>> - Label: Average bucket size => Average Bucket Size >>>> - Label: Variance of bucket size => Bucket Size Variance >>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>> - Label: Standard deviation of bucket size => Bucket Size Standard >>>> Deviation >>>> >>>> Instead of using the word "size", it may make more sense to use the >>>> word "count" here as well, i.e. "Average Bucket Count", or maybe I'm >>>> missing something? Is there a difference? >>>> >>>> I wonder how useful standard deviation and variance are?
If support >>>> engineers are looking at a recording, or JMC adds a rule for the >>>> events, what would a good or bad value be? Is it possible to use >>>> the information for troubleshooting? >>> >>> While I'm working on all the above changes you suggested, we can >>> discuss the standard deviation and variance. >>> >>> I added them because they are part of the jcmd "VM.symboltable >>> -verbose" command, so we are consistent. >> OK >>> >>> Now, regarding how useful they are, I always understood them as a >>> sign of imbalanced table distribution, and without a proper >>> histogram, this is the best description of the histogram shape. In >>> reality, however, I think that if they identify an issue, then we >>> might have a very curious distribution (some sort of hash table >>> attack), or we have an issue with our hash function for the >>> particular usage case. >>> >>> Still, I'd personally elect to keep them. >>> >>> Let me ask you a different question though: is it expensive to have >>> 2 doubles as part of an event (5 events per second)? >> Doubles can't be compressed so each value will take 8 bytes. I don't >> think the precision of a double is needed, so you could change it >> into a float and save a few bytes. >> >> Most users will not care about JVM internals and a lower rate than >> once per second is probably sufficient for support engineers to spot >> that something is wrong. >> >> The Thread Context Switch Rate event is emitted once every ten >> seconds. I think the same rate could be used here. >> >>> And if so, is there currently (or planned) granularity for >>> controlling not just which events to record, but also which attributes? >>> >> No. >> >> If overhead becomes an issue, it's usually better to emit all the >> information, but at a lower rate. That way, users can find out that >> the information exists, and increase the rate if a higher resolution >> is needed to solve their specific issue.
>> >>>> >>>> - Name: addRate => insertionRate >>>> - Label: Rate of addition => Insertion Rate >>>> - Name: removeRate => removalRate >>>> - Label: Rate of removal => Removal Rate >>> >>> Will do. >>> >>>> >>>> I'm missing unit tests for the events. Could you please add them in >>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests, i.e. the >>>> average not exceeding max, no negative values etc. >>> >>> Working on it, do we need separate test per each event (table), or >>> just one table will suffice (ex. StringTable)? >> They are kind of similar, so I think one test file is sufficient, but >> we should sanity check data for all events. >> >> Thanks >> Erik >> >>> >>> Thank you for the feedback! >>> >>> >>> cheers >>>> >>>> Thanks! >>>> Erik >>>> >>>>> Hi all, >>>>> >>>>> Please review this feature, which adds tracing events for the >>>>> internal hash tables. >>>>> >>>>> The following attributes are implemented: >>>>> >>>>> >>>>> >>>>> >>>> label="Total footprint" description="Total memory footprint (the >>>>> table itself plus all of the entries)" /> >>>>> >>>>> >>>> label="Variance of bucket sizes" description="How far bucket >>>>> lengths are spread out from their average value" /> >>>>> >>>>> >>>> description="How many items were added since last event (per >>>>> second)" /> >>>>> >>>> description="How many items were removed since last event (per >>>>> second)" /> >>>>> >>>>> This event was implemented for the following system tables: >>>>> >>>>> SymbolTable >>>>> StringTable >>>>> Placeholder Table >>>>> LoaderConstraints Table >>>>> ProtectionDomainCache Table >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>> progress...)
>>>>> >>>>> >>>>> Cheers From martin.doerr at sap.com Fri Apr 5 12:37:18 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 5 Apr 2019 12:37:18 +0000 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> Message-ID: Hi everybody, > I think was fixed with: > 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics > You should get a leading sync and trailing one with the default conservative > model and thus get proper memory ordering. > Martin, I'm I correct? Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter). I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced. Best regards, Martin -----Original Message----- From: Robbin Ehn Sent: Freitag, 5. April 2019 14:07 To: daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; Carsten Varming ; Roman Kennke ; Doerr, Martin Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints Hi Dan, (Martin there is question for you last in this email) After first pass I did not find any real issues. Considering what you had to work with, it looks good! #1 There are some assert which are redundant (to me at least) like: src/hotspot/share/runtime/objectMonitor.cpp L445 if (!dmw->is_marked() && dmw->hash() == 0) { // This dmw is neutral and has not yet started the restoration // protocol so we mark a copy of the dmw to begin the protocol. markOop marked_dmw = dmw->set_marked(); assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, marked_dmw->is_marked(), marked_dmw->hash()); That assert is basically a test that set_marked worked? 
L505 if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { assert(_succ != Self, "invariant"); assert(_owner == Self, "invariant"); Assert on _owner checks that our cmpxchg is not broken? I think it's easier to read the code if some on the most obvious asserts are removed. Maybe comments instead. #2 Not your doing but I think we should remove TRAPS/Thread * Self and use JavaThread* instead. E.g. so we can change: void ObjectMonitor::EnterI(TRAPS) { Thread * const Self = THREAD; assert(Self->is_Java_thread(), "invariant"); assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); to: void ObjectMonitor::EnterI(JavaThread* Self) { assert(Self->thread_state() == _thread_blocked, "invariant"); #3 src/hotspot/share/runtime/objectMonitor.inline.hpp 164 inline void ObjectMonitor::inc_ref_count() { 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc 166 // backend on PPC does not yet conform to these requirements. Therefore 167 // the increment is simulated with a load phi; cas phi + 1; loop. 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors 169 // is not safe. I think was fixed with: 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics You should get a leading sync and trailing one with the default conservative model and thus get proper memory ordering. Martin, I'm I correct? Thanks, Robbin On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: > Greetings, > > Welcome to the OpenJDK review thread for my port of Carsten's work on: > > JDK-8153224 Monitor deflation prolong safepoints >
https://bugs.openjdk.java.net/browse/JDK-8153224 > > Here's a link to the OpenJDK wiki that describes my port: > > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > > Here's the webrev URL: > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ > > Here's a link to Carsten's original webrev: > > http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > > Earlier versions of this patch have been through several rounds of > preliminary review. Many thanks to Carsten, Coleen, Robbin, and > Roman for their preliminary code review comments. A very special > thanks to Robbin and Roman for building and testing the patch in > their own environments (including specJBB2015). > > This version of the patch has been thru Mach5 tier[1-8] testing on > Oracle's usual set of platforms. Earlier versions have been run > through my stress kit on my Linux-X64 and Solaris-X64 servers > (product, fastdebug, slowdebug).Earlier versions have run Kitchensink > for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug > and slowdebug). Earlier versions have run my monitor inflation stress > tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, > fastdebug and slowdebug). > > All of the testing done on earlier versions will be redone on the > latest version of the patch. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > > P.S. > One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java > is currently failing in -Xcomp mode on Win* only. I've been trying > to characterize/analyze this failure for more than a week now. At > this point I'm convinced that Async Monitor Deflation is aggravating > an existing bug. However, I plan to have a better handle on that > failure before these bits are pushed to the jdk/jdk repo. 
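Regarding point #3 in the message above: the workaround described in the quoted source comment (load, then CAS value + 1, then loop) and the direct sequentially consistent increment that could replace it can be sketched in standard C++. This is only an illustration under assumptions: RefCounted, inc_ref_count_cas_loop and inc_ref_count_direct are hypothetical names, and std::atomic with memory_order_seq_cst stands in for HotSpot's Atomic::inc; it is not the code from the webrev, and HotSpot's conservative ordering is not claimed to be identical to seq_cst.

```cpp
#include <atomic>
#include <cassert>   // only needed by callers that want to assert on the results
#include <cstdint>

// Hypothetical stand-in for an object carrying a reference count.
// This is NOT HotSpot's ObjectMonitor.
struct RefCounted {
    std::atomic<int32_t> ref_count{0};
};

// Workaround style from the quoted comment: load the current value,
// try to CAS in value + 1, and retry until the CAS succeeds.
inline int32_t inc_ref_count_cas_loop(RefCounted& m) {
    int32_t old_value = m.ref_count.load(std::memory_order_seq_cst);
    while (!m.ref_count.compare_exchange_weak(old_value, old_value + 1,
                                              std::memory_order_seq_cst)) {
        // On failure compare_exchange_weak stores the freshly observed
        // value into old_value, so the next attempt uses up-to-date data.
    }
    return old_value + 1;
}

// Direct style: a single sequentially consistent read-modify-write.
inline int32_t inc_ref_count_direct(RefCounted& m) {
    return m.ref_count.fetch_add(1, std::memory_order_seq_cst) + 1;
}
```

Both functions return the incremented value; the second is what remains once the atomic increment itself provides the required ordering, which is what the thread says JDK-8202080 delivers for PPC.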
From zgu at redhat.com Fri Apr 5 13:05:28 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 5 Apr 2019 09:05:28 -0400 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> Message-ID: Hi David, On 4/5/19 3:04 AM, David Holmes wrote: > Hi Chris, > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ > > Checks for alternate stack now. Added lots of comments and misc fixups. > > Zhengyu: please re-test (I can't test any slowdebug except linux-x64). It passed all three configurations (release, fastdebug and slowdebug) Thanks, -Zhengyu > > Thanks, > David > > On 5/04/2019 4:01 pm, Chris Plummer wrote: >> Thinking about this a bit more, there is still the potential for some >> confusion if this test fails again in the future due to the top frame >> missing. Is it missing because it got inlined or is it missing because >> the frame skipping code skipped an extra frame? Hopefully whoever >> deals with it doesn't just hastily add another valid stacktrace to the >> test but instead investigates to make sure the issue is indeed that >> the method got inlined. >> >> Chris >> >> On 4/4/19 10:56 PM, David Holmes wrote: >>> Hi Chris, >>> >>> Okay I will simply check for the third alternative. 
>>> >>> Thanks, >>> David >>> >>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> For the callsite that this test is checking for, right now there >>>> appear to be 3 possible stacktraces: the "normal" one, the one that >>>> includes AllocateHeap() on solaris and windows slowdebug builds, and >>>> the one Zhengyu is now seeing on linux-x64. You would need to check >>>> for all 3, limiting the AllocateHeap() one to just being allowed on >>>> solaris and windows slowdebug as it is now. So basically this test >>>> needs to cover all (allowable) stacktraces that we've seen for this >>>> callsite, and be updated in the future as needed. Not ideal, but I >>>> don't see a better solution. It's similar to the situation described >>>> in JDK-8163899 which covered the fragility of the NMT frame skipping >>>> code. In the end it was decided it would be easier to just fix >>>> issues as they came up rather than engineer a solution that wasn't >>>> as fragile. I think this test falls in the same category. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Thanks for the explanation about the frame counting from os::malloc >>>>> - now I get it. But I don't understand your final comment: >>>>> >>>>> > Looking at this code also reminds me of a reason to have the test >>>>> > continue to check for all 4 specific frames. If the frame >>>>> skipping code >>>>> > skips an extra frame, then the callsite will be missing a needed >>>>> frame >>>>> > at the top. The way the test was written it would detect this. >>>>> With your >>>>> > changes it will not. It would just revert to always matching on 3 >>>>> frames >>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>> >>>>> How can I fix this bug if I have to check for 4 specific frames but >>>>> one (or more) may be missing - i.e. how can I tell the difference >>>>> between "Frame A was inlined" and "Frame A was skipped by mistake"?
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> I have concerns that this will hide some of the other bugs >>>>>>>>>>>> I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. >>>>>>>>>>>> These bugs result in 1 or two frames appearing in the >>>>>>>>>>>> stacktrace that should be skipped. Notably >>>>>>>>>>>> NativeCallStack::NativeCallStack() and os::get_native_stack(). >>>>>>>>>>> >>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>> >>>>>>>>>>> 73 // We should never see either of these frames >>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>>>> >>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>>>> windows and solaris slowdebug builds. Although these builds >>>>>>>>>>>> should have honored the ALWAYSINLINE directive, it was >>>>>>>>>>>> deemed acceptable that it was not in slowdebug builds. >>>>>>>>>>>> However, I would not want to allow AllocateHeap() to appear >>>>>>>>>>>> in a product build, and best not to see it in fastdebug either. >>>>>>>>>>> >>>>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>>>> compiler chooses to inline something like AllocateHeap.
I >>>>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>>>> something specific to the native compiler. The previous >>>>>>>>>>> handling of AllocateHeap seemed to be there simply because it >>>>>>>>>>> was the only way to deal with an optional frame - but now >>>>>>>>>>> that's handled generically. >>>>>>>>>> Its appearance means you effectively only have 3 frames to >>>>>>>>>> identify callsites instead of 4. >>>>>>>>> >>>>>>>>> Both stacktraces in the old test had 4 elements and expected 4 >>>>>>>>> matches. The current bug is that one of those (new_entry) could >>>>>>>>> actually be inlined as well, resulting in only 3 matches. So >>>>>>>>> that is what the revised test checks for: at least 3 matches. >>>>>>>>> Often there will be 4 matches. >>>>>>>> I think you misunderstood my "3 frames" comment. I was referring >>>>>>>> to how many frames NMT uses to identify the callsite. It wants >>>>>>>> to use 4, but if AllocateHeap() doesn't get inlined, it >>>>>>>> effectively is using 3. The test should detect when this happens >>>>>>>> so the NMT implementation can address the issue. >>>>>>> >>>>>>> You're right I don't understand this part as I don't know >>>>>>> how/what NMT detail is doing in this regard. >>>>>> >>>>>> An NMT callsite is simply the 4 most recent frames (after some >>>>>> pruning) that led to the os::malloc() call. "4" is somewhat >>>>>> arbitrary as Thomas pointed out, and is controlled by >>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger means >>>>>> more refinement of the callsites (thus more callsites), but a >>>>>> clearer picture of what actually led to the os::malloc(). >>>>>> >>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a() >>>>>> calls b() calls c() calls d() calls os::malloc(), and foo() and >>>>>> bar() both call a(), the NMT detail output will not distinguish >>>>>> between these two call paths to os::malloc(), and will consider >>>>>> both paths to be the same callsite.
The 4 frames in the NMT detail >>>>>> output would always be a, b, c, and d. However, bump up >>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two >>>>>> separate callsites, one with foo() as the bottom frame and one >>>>>> with bar() as the bottom frame, and both with a, b, c, and d as >>>>>> the other 4 frames. >>>>>> >>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>> allocation that is the result of doing a "new" of any CHeapObj >>>>>> subtype will have AllocateHeap() in its callsite, which >>>>>> effectively lowers the callsite refinement by 1. >>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>> >>>>>>>>> 50 public static String stackTraceAllocateHeap = >>>>>>>>> 51 ".*AllocateHeap.*\n" + >>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>> >>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>>>> getting inlined already when AllocateHeap was not? Even so we >>>>>>>>> still end up with 4 frames matching normally. >>>>>>>> I noticed that last night also and scratched my head over it for a >>>>>>>> while and then went to bed. The only explanation I could come up >>>>>>>> with is that allocate_new_entry() is getting inlined, and as a >>>>>>>> result (due to being a slowdebug build and doing minimal >>>>>>>> inlining) AllocateHeap() was not inlined. >>>>>>>>> >>>>>>>>>> If it does appear in a product build, a solution should be >>>>>>>>>> looked into to get rid of it. If the port owner decides it >>>>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>>>> should be added to the test like was done for solaris and >>>>>>>>>> windows slowdebug builds. >>>>>>>>> >>>>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>>>> inline that function and just happen to be using this test to >>>>>>>>> verify that? Doesn't seem like a suitable place to do this - >>>>>>>>> and why do we need to do it?
>>>>>>>>> The Visual Studio docs state:
>>>>>>>>>
>>>>>>>>> "You cannot force the compiler to inline a particular function,
>>>>>>>>> even with the __forceinline keyword."
>>>>>>>>>
>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and could
>>>>>>>>> change with any update to the compiler.
>>>>>>>>>
>>>>>>>>> For Solaris Studio it is again not guaranteed to inline -
>>>>>>>>> specifically -xinline only has an effect at -xO3 or higher.
>>>>>>>>> Which likely explains why it is ignored in slowdebug. And there
>>>>>>>>> are other cases where it won't honour the ALWAYSINLINE.
>>>>>>>>>
>>>>>>>>> Even with gcc we seem to be misusing the attribute if we want
>>>>>>>>> to ensure inlining when not optimising:
>>>>>>>>>
>>>>>>>>> "GCC does not inline any functions when not optimizing unless
>>>>>>>>> you specify the 'always_inline' attribute for the function,
>>>>>>>>> like this:
>>>>>>>>>
>>>>>>>>> /* Prototype.  */
>>>>>>>>> inline void foo (const char) __attribute__((always_inline));"
>>>>>>>>>
>>>>>>>>> and we don't write it that way.
>>>>>>>>>
>>>>>>>>> So if we're that concerned about release builds guaranteeing to
>>>>>>>>> inline AllocateHeap then I think we need something a bit more
>>>>>>>>> explicit than this test to determine that.
>>>>>>>> With respect to the 3 methods/functions we don't want to see in
>>>>>>>> the callsite stacktrace, NMT has made a number of assumptions on
>>>>>>>> inlining. One of the things the test is doing is making sure
>>>>>>>> those assumptions are correct. If incorrect, then you run into
>>>>>>>> issues like I mentioned above where callsite backtraces
>>>>>>>> effectively only have 3 unique frames rather than 4 (actually
>>>>>>>> before some bug fixes it was often just 2 unique frames). So I
>>>>>>>> think it's appropriate to have a test to make sure we are not
>>>>>>>> seeing any of these 3 methods/functions.
>>>>>>>
>>>>>>> Okay I get the gist of that.
>>>>>>> Is there somewhere I can clearly see
>>>>>>> what these inlining assumptions are that NMT makes? Are they
>>>>>>> clearly documented?
>>>>>>
>>>>>> Not that I know of. I discovered them while looking at the various
>>>>>> bugs that led to NativeCallStack::NativeCallStack() and
>>>>>> os::get_native_stack() (and sometimes both) being in the callsite.
>>>>>> Reviewing the bugs I referred to will give you an idea of where to
>>>>>> look. One good place to look is
>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code
>>>>>> there that controls how many frames to skip based on the
>>>>>> platform and whether optimized or not. Also some comments there to
>>>>>> help you out. I did a lot of bug fixing in this method.
>>>>>>
>>>>>> Looking at this code also reminds me of a reason to have the test
>>>>>> continue to check for all 4 specific frames. If the frame skipping
>>>>>> code skips an extra frame, then the callsite will be missing a
>>>>>> needed frame at the top. The way the test was written it would
>>>>>> detect this. With your changes it will not. It would just revert
>>>>>> to always matching on 3 frames instead of 4, and the frame
>>>>>> skipping bug would go unnoticed.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>>
>>>>>>>> Now the test also has made inlining assumptions beyond what NMT
>>>>>>>> has made, and that is really what this bug is about. In general
>>>>>>>> I think your fix is fine in the way it relaxes which frames are
>>>>>>>> actually found, but as Thomas points out, it suffers from not
>>>>>>>> actually looking at a single stacktrace, but just looking for
>>>>>>>> the specified frames somewhere in the output (and in the order
>>>>>>>> specified.) You should probably address this.
>>>>>>>
>>>>>>> Right, that was an error on my part. I thought the existing
>>>>>>> MULTILINE pattern matching with .* would also find non-sequential
>>>>>>> lines and so I was acting similarly. I will re-think this.
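
The point about `.*` and line boundaries can be checked with a minimal sketch. This is not code from the test under review: the class name, method name, and frame strings below are made up for illustration. It relies only on the documented `java.util.regex` behaviour that `.` does not match a line terminator unless `DOTALL` is set (`MULTILINE` changes only `^` and `$`), so a pattern built from `.*frame.*\n` pieces can match only frames that sit on consecutive lines.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class ConsecutiveFrameMatch {
    // Two-frame pattern written in the same style as the test's stack-trace strings.
    // MULTILINE is kept to mirror the discussion; it does not let '.' cross lines.
    static final Pattern FRAMES = Pattern.compile(
        ".*Hashtable.*allocate_new_entry.*\n" +
        ".*ModuleEntryTable.*new_entry.*\n",
        Pattern.MULTILINE);

    // True only if the two frames appear on adjacent lines of the output.
    static boolean containsAdjacentFrames(String output) {
        return FRAMES.matcher(output).find();
    }

    public static void main(String[] args) {
        // Made-up frame lines; real NMT detail output looks different.
        String adjacent =
            "[0x01] Hashtable::allocate_new_entry(unsigned int)+0x10\n" +
            "[0x02] ModuleEntryTable::new_entry(unsigned int)+0x20\n";
        String separated =
            "[0x01] Hashtable::allocate_new_entry(unsigned int)+0x10\n" +
            "[0x03] SomeOther::frame()+0x30\n" +
            "[0x02] ModuleEntryTable::new_entry(unsigned int)+0x20\n";

        System.out.println(containsAdjacentFrames(adjacent));   // prints "true"
        System.out.println(containsAdjacentFrames(separated));  // prints "false"
    }
}
```

A check that searches for each frame independently would accept the second input as well, which is the weakness pointed out in the review.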
>>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Given the changes you made to allow more flexibly in which >>>>>>>>>>>> frames appear, I think you need to now also make sure the >>>>>>>>>>>> above 3 mentioned frames are not present, except for >>>>>>>>>>>> allowing AllocateHeap() in slowdebug builds. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>> >>>>>>>>>>>>> The actual stack trace reported by NMT detail is affected >>>>>>>>>>>>> by the inlining decisions of the native compiler, and on >>>>>>>>>>>>> the type of build. So we define an "ideal" stacktrace and >>>>>>>>>>>>> then allow for some frames to be missing based on empirical >>>>>>>>>>>>> observations. So to date we have seen two frames that may >>>>>>>>>>>>> or may not be inlined and so we allow for 2 non-matching >>>>>>>>>>>>> entries. >>>>>>>>>>>>> >>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it is >>>>>>>>>>>>> just an optional frame. >>>>>>>>>>>>> >>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as you >>>>>>>>>>>>> intended? >>>>>>>>>>>>> >>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From daniel.daugherty at oracle.com Fri Apr 5 15:11:37 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 5 Apr 2019 11:11:37 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> Message-ID: <6063b200-9c98-fdef-3fbf-6a84c5dd0422@oracle.com> Greetings, I presented the Async Monitor Deflation OpenJDK Wiki at this week's Runtime Design Review meeting (in the Runtime staff time slot)... Karen Kinnear kindly took notes for the meeting and I've attached them here... I've also attached my replies... Dan On 3/24/19 9:57 AM, Daniel D. Daugherty wrote: > Greetings, > > Welcome to the OpenJDK review thread for my port of Carsten's work on: > > ??? JDK-8153224 Monitor deflation prolong safepoints > ??? https://bugs.openjdk.java.net/browse/JDK-8153224 > > Here's a link to the OpenJDK wiki that describes my port: > > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > > Here's the webrev URL: > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ > > Here's a link to Carsten's original webrev: > > http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > > Earlier versions of this patch have been through several rounds of > preliminary review. Many thanks to Carsten, Coleen, Robbin, and > Roman for their preliminary code review comments. A very special > thanks to Robbin and Roman for building and testing the patch in > their own environments (including specJBB2015). > > This version of the patch has been thru Mach5 tier[1-8] testing on > Oracle's usual set of platforms. Earlier versions have been run > through my stress kit on my Linux-X64 and Solaris-X64 servers > (product, fastdebug, slowdebug).Earlier versions have run Kitchensink > for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug > and slowdebug). Earlier versions have run my monitor inflation stress > tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, > fastdebug and slowdebug). 
> > All of the testing done on earlier versions will be redone on the > latest version of the patch. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > > P.S. > One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java > is currently failing in -Xcomp mode on Win* only. I've been trying > to characterize/analyze this failure for more than a week now. At > this point I'm convinced that Async Monitor Deflation is aggravating > an existing bug. However, I plan to have a better handle on that > failure before these bits are pushed to the jdk/jdk repo. > -------------- next part -------------- An embedded message was scrubbed... From: Karen Kinnear Subject: Re: Async Monitor Deflation design review (2019.04.02) Date: Wed, 3 Apr 2019 17:12:56 -0400 Size: 6762 URL: -------------- next part -------------- An embedded message was scrubbed... From: "Daniel D. Daugherty" Subject: Re: Async Monitor Deflation design review (2019.04.02) Date: Fri, 5 Apr 2019 09:48:24 -0400 Size: 33855 URL: From gerard.ziemski at oracle.com Fri Apr 5 15:30:23 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Fri, 5 Apr 2019 10:30:23 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> Message-ID: <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> hi Coleen, Thank you for the review. My comments are inline below: On 4/5/19 7:35 AM, coleen.phillimore at oracle.com wrote: > > Hi Gerard,?? This is somewhat of a first pass review. > > I like the change a? lot.? I have a couple of suggestions. > > http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/utilities/statistics.hpp.html > > > Can you rename this file tableStatistics.cpp/hpp because "statistics" > is too general and the class is called TableStatistics. 
I deliberately named the file "statistics.hpp", because I assume we will be adding more JFR events in the future, and this file could hold all the related code, which for now just comprises of table statistics as you pointed out. > > http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html > > > Is there anyway to parameterize these functions and/or add them to > TableStatistics? I didn't want to add JFR dependency to TableStatistics. I'm unsure what I can do more here, and whether it deserves the effort - TableStatistics basically serves as a struct for passing event attributes around, but I'm open to suggestions. > > Also, when Stefan is done with the ResolvedMethodTable, you can add > that too in a separate RFE > https://bugs.openjdk.java.net/browse/JDK-8221393 Thank you, I linked them. > > Thanks, > Coleen > > > On 4/4/19 3:52 PM, gerard ziemski wrote: >> Thank you Erik for clarifications. >> >> I have implemented all your suggestions, which you can find here >> http://cr.openjdk.java.net/~gziemski/8185525_rev2 >> >> I started Mach5 tier1-6 test to test the changes ... >> >> >> cheers >> >> On 4/4/19 1:16 PM, Erik Gahlin wrote: >>> On 2019-04-04 17:39, gerard ziemski wrote: >>>> hi Erik, >>>> >>>> >>>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>>> Hi Gerard, >>>>> >>>>> Here are some comments about the metadata (to make it consistent >>>>> with other events). >>>>> >>>>> The events should not be in the "Java Application" category since >>>>> they are JVM events. You could perhaps put them in "Java Virtual >>>>> Machine, Runtime, Tables". Some comments about the names and >>>>> labels of fields. >>>>> >>>>> - Label: Number of buckets => Bucket Count >>>>> - Label: Number of entries => Entry Count >>>>> - Label: Total footprint => Total Footprint >>>>> >>>>> Could you remove descriptions that are exactly the same as the label. 
>>>>>
>>>>> - Label: Maximum bucket size => Maximum Bucket Size
>>>>> - Label: Average bucket size => Average Bucket Size
>>>>> - Label: Variance of bucket size => Bucket Size Variance
>>>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation
>>>>> - Label: Standard deviation of bucket size => Bucket Size Standard
>>>>> Deviation
>>>>>
>>>>> Instead of using the word "size", it may make more sense to use
>>>>> the word "count" here as well, i.e. "Average Bucket Count", or
>>>>> maybe I'm missing something? Is there a difference?
>>>>>
>>>>> I wonder how useful standard deviation and variance are? If support
>>>>> engineers are looking at a recording, or JMC adds a rule for the
>>>>> events, what would a good or bad value be? Is it possible to use
>>>>> the information for troubleshooting?
>>>>
>>>> While I'm working on all the above changes you suggested, we can
>>>> discuss the standard deviation and variance.
>>>>
>>>> I added them because they are part of the jcmd "VM.symboltable
>>>> -verbose" command, so we are consistent.
>>> OK
>>>>
>>>> Now, regarding how useful they are, I always understood them as a
>>>> sign of imbalanced table distribution, and without a proper
>>>> histogram, this is the best description of the histogram shape. In
>>>> reality, however, I think that if they identify an issue, then we
>>>> might have a very curious distribution (some sort of hash table
>>>> attack), or we have an issue with our hash function for the
>>>> particular usage case.
>>>>
>>>> Still, I'd personally elect to keep them.
>>>>
>>>> Let me ask you a different question though: is it expensive to have
>>>> 2 doubles as part of an event (5 events per second)?
>>> Doubles can't be compressed so each value will take 8 bytes. I don't
>>> think the precision of a double is needed, so you could change it
>>> into a float and save a few bytes.
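
To make the variance and standard deviation discussion concrete, here is a minimal sketch of how such figures could be derived from per-bucket lengths. It is not the HotSpot implementation: the class and method names are hypothetical, and the use of population variance (dividing by the number of buckets) is an assumption of this sketch.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class BucketStats {
    // Returns {mean, variance, standard deviation} of the given bucket lengths.
    static double[] summarize(int[] bucketLengths) {
        if (bucketLengths.length == 0) {
            throw new IllegalArgumentException("at least one bucket is required");
        }
        double sum = 0.0;
        for (int length : bucketLengths) {
            sum += length;
        }
        double mean = sum / bucketLengths.length;

        double squaredDiffs = 0.0;
        for (int length : bucketLengths) {
            double diff = length - mean;
            squaredDiffs += diff * diff;
        }
        // Population variance: divide by the bucket count (assumption of this sketch).
        double variance = squaredDiffs / bucketLengths.length;
        return new double[] { mean, variance, Math.sqrt(variance) };
    }

    public static void main(String[] args) {
        int[] balanced = { 2, 2, 2, 2 };  // 8 entries spread evenly over 4 buckets
        int[] skewed   = { 8, 0, 0, 0 };  // the same 8 entries piled into one bucket
        for (int[] table : new int[][] { balanced, skewed }) {
            double[] s = summarize(table);
            System.out.printf(Locale.ROOT,
                "mean=%.2f variance=%.2f stddev=%.2f%n", s[0], s[1], s[2]);
        }
    }
}
```

Both tables have the same average bucket length (2.00), but the variance is 0.00 for the balanced table and 12.00 for the skewed one. That difference is the imbalance signal described above, and it is invisible in the average alone.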
>>>
>>> Most users will not care about JVM internals and a lower rate than
>>> once per second is probably sufficient for support engineers to spot
>>> that something is wrong.
>>>
>>> The Thread Context Switch Rate event is emitted once every ten
>>> seconds. I think the same rate could be used here.
>>>
>>>> And if so, is there currently (or planned) granularity for
>>>> controlling not just which events to record, but also which
>>>> attributes?
>>>>
>>> No.
>>>
>>> If overhead becomes an issue, it's usually better to emit all the
>>> information, but at a lower rate. That way, users can find out that
>>> the information exists, and increase the rate if a higher resolution
>>> is needed to solve their specific issue.
>>>
>>>>>
>>>>> - Name: addRate => insertionRate
>>>>> - Label: Rate of addition => Insertion Rate
>>>>> - Name: removeRate => removalRate
>>>>> - Label: Rate of removal => Removal Rate
>>>>
>>>> Will do.
>>>>
>>>>>
>>>>> I'm missing unit tests for the events. Could you please add them in
>>>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests, i.e. the
>>>>> average not exceeding max, no negative values etc.
>>>>
>>>> Working on it, do we need a separate test for each event (table), or
>>>> will just one table suffice (e.g. StringTable)?
>>> They are kind of similar, so I think one test file is sufficient,
>>> but we should sanity check data for all events.
>>>
>>> Thanks
>>> Erik
>>>
>>>>
>>>> Thank you for the feedback!
>>>>
>>>>
>>>> cheers
>>>>>
>>>>> Thanks!
>>>>> Erik
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this feature, which adds tracing events for the
>>>>>> internal hash tables.
>>>>>> >>>>>> The following attributes are implemented: >>>>>> >>>>>> >>>>>> >>>>>> >>>>> label="Total footprint" description="Total memory footprint (the >>>>>> table itself plus all of the entries)" /> >>>>>> >>>>>> >>>>> label="Variance of bucket sizes" description="How far bucket >>>>>> lengths are spread out from their average value" /> >>>>>> >>>>>> >>>>> description="How many items were added since last event (per >>>>>> second)" /> >>>>>> >>>>> description="How many items were removed since last event (per >>>>>> second)" /> >>>>>> >>>>>> This event was implemented for the following system tables: >>>>>> >>>>>> SymbolTable >>>>>> StringTable >>>>>> Placeholder Table >>>>>> LoaderConstraints Table >>>>>> ProtectionDomainCache Table >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>>> progress?) >>>>>> >>>>>> >>>>>> Cheers > > From karen.kinnear at oracle.com Fri Apr 5 15:40:19 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 5 Apr 2019 11:40:19 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <6063b200-9c98-fdef-3fbf-6a84c5dd0422@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <6063b200-9c98-fdef-3fbf-6a84c5dd0422@oracle.com> Message-ID: Many thanks Dan for your replies: Additional Tests I had attached to the notes: (I did not include Inflate.java since you have the better Inflate2.java which used to be attached to one of the bug reports). thanks, Karen -------------- next part -------------- > On Apr 5, 2019, at 11:11 AM, Daniel D. Daugherty wrote: > > Greetings, > > I presented the Async Monitor Deflation OpenJDK Wiki at this week's > Runtime Design Review meeting (in the Runtime staff time slot)... > > Karen Kinnear kindly took notes for the meeting and I've attached > them here... I've also attached my replies... 
> > Dan > > > On 3/24/19 9:57 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> Welcome to the OpenJDK review thread for my port of Carsten's work on: >> >> JDK-8153224 Monitor deflation prolong safepoints >> https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> Here's a link to the OpenJDK wiki that describes my port: >> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >> >> Here's the webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >> >> Here's a link to Carsten's original webrev: >> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >> >> Earlier versions of this patch have been through several rounds of >> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >> Roman for their preliminary code review comments. A very special >> thanks to Robbin and Roman for building and testing the patch in >> their own environments (including specJBB2015). >> >> This version of the patch has been thru Mach5 tier[1-8] testing on >> Oracle's usual set of platforms. Earlier versions have been run >> through my stress kit on my Linux-X64 and Solaris-X64 servers >> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >> and slowdebug). Earlier versions have run my monitor inflation stress >> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >> fastdebug and slowdebug). >> >> All of the testing done on earlier versions will be redone on the >> latest version of the patch. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> >> P.S. >> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >> is currently failing in -Xcomp mode on Win* only. I've been trying >> to characterize/analyze this failure for more than a week now. At >> this point I'm convinced that Async Monitor Deflation is aggravating >> an existing bug. 
>> However, I plan to have a better handle on that
>> failure before these bits are pushed to the jdk/jdk repo.
>>
>
>
From gerard.ziemski at oracle.com Fri Apr 5 15:42:27 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Fri, 5 Apr 2019 10:42:27 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CA6731D.7020907@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <5CA6731D.7020907@oracle.com> Message-ID: <55d79be9-1e33-613f-36e5-c710f618bb52@oracle.com>

Thank you Erik for more suggestions.

I have implemented them, which you can find here http://cr.openjdk.java.net/~gziemski/8185525_rev3

I will start Mach5 tier1-6 test to test the changes shortly

On 4/4/19 4:11 PM, Erik Gahlin wrote:
> Thanks for fixing.
>
> A few quick comments about the test.
>
> I think it can be simplified by using some of the test library
> functionality, i.e.
>
>   public static void main(String[] args) throws Throwable {
>     try (Recording recording = new Recording()) {
>       recording.enable(EventNames.SymbolTableStatistics);
>       recording.enable(EventNames.StringTableStatistics);
>       recording.enable(EventNames.PlaceholderTableStatistics);
>       recording.enable(EventNames.LoaderConstraintsTableStatistics);
>       recording.enable(EventNames.ProtectionDomainCacheTableStatistics);
>       recording.start();
>       recording.stop();
>
>       List<RecordedEvent> events = Events.fromRecording(recording);
>       verifyTable(events, EventNames.SymbolTableStatistics);
>       verifyTable(events, EventNames.StringTableStatistics);
>       verifyTable(events, EventNames.PlaceholderTableStatistics);
>       verifyTable(events, EventNames.LoaderConstraintsTableStatistics);
>       verifyTable(events, EventNames.ProtectionDomainCacheTableStatistics);
>     }
>   }
>
>   private static void verifyTable(List<RecordedEvent> allEvents, String eventName) throws Exception {
>     List<RecordedEvent> eventsForTable = allEvents.stream()
>         .filter(e -> e.getEventType().getName().equals(eventName))
>         .collect(Collectors.toList());
>     if (eventsForTable.isEmpty()) {
>       throw new Exception("No events for " + eventName);
>     }
>     for (RecordedEvent event : eventsForTable) {
>       Events.assertField(event, "bucketCount").atLeast(0L);
>       long entryCount = Events.assertField(event, "entryCount").atLeast(0L).getValue();
>       Events.assertField(event, "totalFootprint").atLeast(0L);
>       long averageBucketCount = Events.assertField(event, "averageBucketCount").atLeast(0L).getValue();
>       Events.assertField(event, "maximumBucketCount").atLeast(averageBucketCount);
>       Events.assertField(event, "bucketCountVariance").atLeast(0.0f);
>       Events.assertField(event, "bucketCountStandardDeviation").atLeast(0.0f);
>       float insertionRate = Events.assertField(event, "insertionRate").atLeast(0.0f).getValue();
>       float removalRate = Events.assertField(event, "removalRate").atLeast(0.0f).getValue();
>       if ((insertionRate > 0.0f) && (insertionRate > removalRate)) {
>         Asserts.assertGreaterThan(entryCount, 0L, "Entries marked as added, but no entries found for " + eventName);
>       }
>     }
>   }
>
> - It's nice to have the main method on top so you can easily see what
>   the test is supposed to do.
> - Changed (some) field names that used the previous naming style.
> - Reduced the number of methods to make it easier to read.
> - Reduced the number of calls to Events.fromRecording(...), as it will
>   repeatedly dump a file to disk.
> - Used Events.assertField() which will provide better error message if > an assertion fails, > - Used EventType::getName instead of event.toString() contains > - Added sanity checks for standard deviation and variance fields > - Wrapped Recording creation in try-with-resource to avoid warning > about resource leak > - Removed threshold as the events are periodic and don't use a threshold > - Removed "Thread.sleep" > - The test now relies on events having period "everyChunk" which means > at least two events per recording are guaranteed > > Could you explain how the string table test work, and why it needs > special handling? Some tables might have zero entries, so I just wanted to be clear that for StringTable (which will always have some entries) we do something that will make it grow. I was planning on doing more elaborate tests, but I don't think it's necessary. > > I also missed changes to the file EventNames.java > > (I haven't actually tried the code, but you get the idea) I fixed it up just a bit, and it works nice, thanks! > > Thanks > Erik > >> Thank you Erik for clarifications. >> >> I have implemented all your suggestions, which you can find here >> http://cr.openjdk.java.net/~gziemski/8185525_rev2 >> >> I started Mach5 tier1-6 test to test the changes ... >> >> >> cheers >> >> On 4/4/19 1:16 PM, Erik Gahlin wrote: >>> On 2019-04-04 17:39, gerard ziemski wrote: >>>> hi Erik, >>>> >>>> >>>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>>> Hi Gerard, >>>>> >>>>> Here are some comments about the metadata (to make it consistent >>>>> with other events). >>>>> >>>>> The events should not be in the "Java Application" category since >>>>> they are JVM events. You could perhaps put them in "Java Virtual >>>>> Machine, Runtime, Tables". Some comments about the names and >>>>> labels of fields. 
>>>>> >>>>> - Label: Number of buckets => Bucket Count >>>>> - Label: Number of entries => Entry Count >>>>> - Label: Total footprint => Total Footprint >>>>> >>>>> Could you remove descriptions that are exactly the same as the label. >>>>> >>>>> - Label: Maximum bucket size => Maximum Bucket Size >>>>> - Label: Average bucket size => Average Bucket Size >>>>> - Label: Variance of bucket? size => Bucket Size Variance >>>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>>> - Label: Standard deviation of bucket size => Bucket Size Standard >>>>> Deviation" >>>>> >>>>> Instead of using the word "size", it may make more sense to use >>>>> the word "count" here as well, i.e "Average Bucket Count", or >>>>> maybe I'm missing something? Is there a difference? >>>>> >>>>> I wonder how useful standard deviation and variance is? If support >>>>> engineers are looking at a recording, or JMC adds a rule for the >>>>> events, what would a good or bad value be? Is it possible to use >>>>> the information for troubleshooting? >>>> >>>> While I'm working on all the above changes you suggested, we can >>>> discuss the standard devation and variance. >>>> >>>> I added them because they are part of the jcmd "VM.symboltable >>>> -verbose" command, so we are consistent. >>> OK >>>> >>>> Now, regarding how useful they are, I always understood them as a >>>> sign of imbalanced table distribution, and without a proper >>>> histogram, this is the best description of the histogram shape. In >>>> reality, however, I think that if they identify an issue, then we >>>> might have a very curious distribution (some sort of hash table >>>> attack), or we have an issue with our hash function for the >>>> particular usage case. >>>> >>>> Still, I'd personally elect to keep them. >>>> >>>> Let me ask you a different question though, Is it expensive to have >>>> 2 doubles as part of an event (5 events per second)? >>> Doubles can't be compressed so each value will take 8 bytes. 
I don't >>> think the precision of a double is needed, so you could change it >>> into a float and save a few bytes. >>> >>> Most user will not care about JVM internals and a lower rate than >>> once per second is probably sufficient for support engineers to spot >>> that something is wrong. >>> >>> The Thread Context Switch Rate event is emitted once every ten >>> seconds. I think the same rate could be used here. >>> >>>> And if so, is there currently (or planned) granularity for >>>> controlling not just which events to record, but also which >>>> attributes? >>>> >>> No. >>> >>> If overhead becomes an issues, it's usually better to emit all the >>> information, but at a lower rate.? That way, users can find out that >>> the information exists, and increase the rate if a higher resolution >>> is needed to solve their specific issue. >>> >>>>> >>>>> - Name: addRate => insertionRate >>>>> - Label: Rate of addition =>? Insertation Rate >>>>> - Name: removeRate => removalRate >>>>> - Label: Rate of removal => Removal Rate >>>> >>>> Will do. >>>> >>>>> >>>>> I'm missing unit tests for the events. Could you please add in >>>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the >>>>> average not exceeding max, no negative values etc. >>>> >>>> Working on it, do we need separate test per each event (table), or >>>> just one table will suffice (ex. StringTable)? >>> They are kind of similar, so I think one test file is sufficient, >>> but we should sanity check data for all events. >>> >>> Thanks >>> Erik >>> >>>> >>>> Thank you for the feedback! >>>> >>>> >>>> cheers >>>>> >>>>> Thanks! >>>>> Erik >>>>> >>>>>> Hi all, >>>>>> >>>>>> Please review this feature, which adds tracing events for the >>>>>> internal hash tables. 
>>>>>> >>>>>> The following attributes are implemented: >>>>>> >>>>>> >>>>>> >>>>>> >>>>> label="Total footprint" description="Total memory footprint (the >>>>>> table itself plus all of the entries)" /> >>>>>> >>>>>> >>>>> label="Variance of bucket sizes" description="How far bucket >>>>>> lengths are spread out from their average value" /> >>>>>> >>>>>> >>>>> description="How many items were added since last event (per >>>>>> second)" /> >>>>>> >>>>> description="How many items were removed since last event (per >>>>>> second)" /> >>>>>> >>>>>> This event was implemented for the following system tables: >>>>>> >>>>>> SymbolTable >>>>>> StringTable >>>>>> Placeholder Table >>>>>> LoaderConstraints Table >>>>>> ProtectionDomainCache Table >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>>> progress?) >>>>>> >>>>>> >>>>>> Cheers > > From daniel.daugherty at oracle.com Fri Apr 5 15:44:31 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 5 Apr 2019 11:44:31 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <1015f61b-71a3-3c81-09a2-159d16c0b24b@oracle.com> <4e262c9a-6a21-b213-27f9-d8e59c27ba84@oracle.com> <0f4ab494-57d1-d202-5ffa-f2416031c5ff@oracle.com> Message-ID: On 4/3/19 3:04 AM, Robbin Ehn wrote: > Hi Dan, Carsten, > >>>> However, moving deflate_idle_monitors() from the safepoint cleanup >>>> phase >>>> to before the actual garbage collection can wait until we do the >>>> work to >>>> decouple triggering of monitor deflation to be independent of the the >>>> safepoint cleanup phase. >>>> > > If I got the context correct here: > All cleanups are done before the VM op is execute. 
> It is the last thing we do in SS::begin(), so anything added to
> SS::do_cleanup_tasks/ParallelSPCleanupTask is done first in any
> safepoint.

Agreed.

src/hotspot/share/runtime/safepoint.cpp:

void SafepointSynchronize::begin() {
  // We do the safepoint cleanup first since a GC related safepoint
  // needs cleanup to be completed before running the GC op.
  EventSafepointCleanup cleanup_event;
  do_cleanup_tasks();
  post_safepoint_cleanup_event(cleanup_event, _safepoint_counter);

  post_safepoint_begin_event(begin_event, _safepoint_counter, nof_threads,
                             _current_jni_active_count);
  SafepointTracing::cleanup();
}

And do_cleanup_tasks() is where we do the deflate idle monitors work.

Dan

>
> /Robbin
>
>>>
>>> SGTM.
>>>
>>>> Speaking of optimizations, it sure would be nice if little changes to
>>>>> java threads could be combined and performed on the way out of the
>>>>> safepoint in one go instead of having lots of iterations of the
>>>>> thread list in various places. Some people have thousands of
>>>>> threads and each traversal of the thread list hurts.
>>>>>
>>>>>
>>>>> Do you have a specific example in mind?
>>>>>
>>>>
>>>> No concrete example for a public mailing list. :(. But do notice that
>>>> independent tasks that require traversals of the thread list are
>>>> already fused in ParallelSPCleanupThreadClosure.
>>>>
>>>> If you made deflate_thread_local_monitors
>>>> set jt->omShouldDeflateIdleMonitors to true, then you wouldn't need to
>>>> iterate over all java threads in do_safepoint_work.
>>>>
>>>>
>>>> I think I see what you mean... so when
>>>> ParallelSPCleanupThreadClosure::
>>>> do_thread() calls deflate_thread_local_monitors():
>>>>
>>>> 2250   if (AsyncDeflateIdleMonitors) {
>>>> 2251     // Nothing to do when idle ObjectMonitors are deflated using a
>>>> 2252     // JavaThread unless a special cleanup has been requested.
>>>>
>>>> Replace L2251-2 with:
>>>>          // Mark the JavaThread for idle monitor cleanup unless a
>>>>          // special cleanup has been requested.
>>>> 2253     if (!is_cleanup_requested()) {
>>>>
>>>> Add these three lines:
>>>>            if (thread->omInUseCount > 0) {
>>>>              // This JavaThread is using monitors so mark it.
>>>>              thread->omShouldDeflateIdleMonitors = true;
>>>>            }
>>>> 2254       return;
>>>> 2255     }
>>>>
>>>> That will allow this block to go away:
>>>>
>>>> 1695   // Request deflation of per-thread idle monitors by each JavaThread:
>>>> 1696   for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) {
>>>> 1697     if (jt->omInUseCount > 0) {
>>>> 1698       // This JavaThread is using monitors so check it.
>>>> 1699       jt->omShouldDeflateIdleMonitors = true;
>>>> 1700     }
>>>> 1701   }
>>>>
>>>> Please let me know if I understand what you meant...
>>>>
>>>
>>> This is exactly what I meant.
>>>
>>>
>>> Good. This will be in the next round of code review.
>>>
>>
>> Nice.
>>
>> Carsten
>>

From daniel.daugherty at oracle.com Fri Apr 5 15:54:01 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 5 Apr 2019 11:54:01 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> Message-ID: <6b4195d3-ad6c-c598-3e9f-d35621f68bb5@oracle.com>

On 4/5/19 8:07 AM, Robbin Ehn wrote:
> Hi Dan,
>
> (Martin, there is a question for you at the end of this email)
>
> After first pass I did not find any real issues.

Thanks for taking another pass thru the webrev...

> Considering what you had to work with, it looks good!

The base Java monitor code is showing its age... :-)

> #1
> There are some asserts which are redundant (to me at least) like:
> src/hotspot/share/runtime/objectMonitor.cpp
> L445
if (!dmw->is_marked() && dmw->hash() == 0) {
>     // This dmw is neutral and has not yet started the restoration
>     // protocol so we mark a copy of the dmw to begin the protocol.
>     markOop marked_dmw = dmw->set_marked();
>     assert(marked_dmw->is_marked() && marked_dmw->hash() == 0,
>            "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT,
>            marked_dmw->is_marked(), marked_dmw->hash());
>
> That assert is basically a test that set_marked worked?

Yeah... that's a little paranoid... will take care of that.

> L505
>     if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) ==
> DEFLATER_MARKER) {
>       assert(_succ != Self, "invariant");
>       assert(_owner == Self, "invariant");
>
> Assert on _owner checks that our cmpxchg is not broken?

Also a little paranoid... will take care of that...

> I think it's easier to read the code if some on the most obvious
> asserts are removed. Maybe comments instead.

I'll take a pass at the end of the CR1 round and look at each of the
new asserts. If they seem too paranoid, I'll drop them.

> #2
> Not your doing but I think we should remove TRAPS/Thread * Self and
> use JavaThread* instead.
> E.g. so we can change:
> void ObjectMonitor::EnterI(TRAPS) {
>   Thread * const Self = THREAD;
>   assert(Self->is_Java_thread(), "invariant");
>   assert(((JavaThread *) Self)->thread_state() == _thread_blocked,
> "invariant");
>
> to:
>
> void ObjectMonitor::EnterI(JavaThread* Self) {
>   assert(Self->thread_state() == _thread_blocked, "invariant");

I'd rather not make that change as part of this project. I'm likely
to do another cleanup subtask related to the _count field discussion
from the design review. I could see looking at TRAPS then...

> #3
> src/hotspot/share/runtime/objectMonitor.inline.hpp
>  164 inline void ObjectMonitor::inc_ref_count() {
>  165   // The increment needs to be MO_SEQ_CST. At the moment, the
> Atomic::inc
>  166   // backend on PPC does not yet conform to these requirements.
> Therefore > ?167?? // the increment is simulated with a load phi; cas phi + 1; loop. > ?168?? // Without this MO_SEQ_CST Atomic::inc simulation, > AsyncDeflateIdleMonitors > ?169?? // is not safe. > > I think was fixed with: > 8202080: Introduce ordering semantics for Atomic::add/inc and other > RMW atomics > You should get a leading sync and trailing one with the default > conservative model and thus get proper memory ordering. > Martin, I'm I correct? So what code are you saying we can switch to for this project? Dan > > Thanks, Robbin > > On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> Welcome to the OpenJDK review thread for my port of Carsten's work on: >> >> ???? JDK-8153224 Monitor deflation prolong safepoints >> ???? https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> Here's a link to the OpenJDK wiki that describes my port: >> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >> >> Here's the webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >> >> Here's a link to Carsten's original webrev: >> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >> >> Earlier versions of this patch have been through several rounds of >> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >> Roman for their preliminary code review comments. A very special >> thanks to Robbin and Roman for building and testing the patch in >> their own environments (including specJBB2015). >> >> This version of the patch has been thru Mach5 tier[1-8] testing on >> Oracle's usual set of platforms. Earlier versions have been run >> through my stress kit on my Linux-X64 and Solaris-X64 servers >> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >> and slowdebug). 
Earlier versions have run my monitor inflation stress >> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >> fastdebug and slowdebug). >> >> All of the testing done on earlier versions will be redone on the >> latest version of the patch. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> >> P.S. >> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >> is currently failing in -Xcomp mode on Win* only. I've been trying >> to characterize/analyze this failure for more than a week now. At >> this point I'm convinced that Async Monitor Deflation is aggravating >> an existing bug. However, I plan to have a better handle on that >> failure before these bits are pushed to the jdk/jdk repo. From daniel.daugherty at oracle.com Fri Apr 5 16:01:48 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 5 Apr 2019 12:01:48 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> Message-ID: On 4/5/19 8:37 AM, Doerr, Martin wrote: > Hi everybody, > >> I think was fixed with: >> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >> You should get a leading sync and trailing one with the default conservative >> model and thus get proper memory ordering. >> Martin, I'm I correct? > Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter). > I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced. Okay so we need a new bug to update these two Thread-SMR functions: src/hotspot/share/runtime/threadSMR.cpp: void ThreadsList::dec_nested_handle_cnt() { ? // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec ? // backend on PPC does not yet conform to these requirements. Therefore ? 
  // the decrement is simulated with an Atomic::sub(1, &addr).
  // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism
  // is not generally safe to use.
  Atomic::sub(1, &_nested_handle_cnt);
}

void ThreadsList::inc_nested_handle_cnt() {
  // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
  // backend on PPC does not yet conform to these requirements. Therefore
  // the increment is simulated with a load phi; cas phi + 1; loop.
  // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism
  // is not generally safe to use.
  intx sample = OrderAccess::load_acquire(&_nested_handle_cnt);
  for (;;) {
    if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) {
      return;
    } else {
      sample = OrderAccess::load_acquire(&_nested_handle_cnt);
    }
  }
}

I'll file a new bug, loop in Robbin, Erik O and Martin, and make
sure we're all in agreement. Once we decide what Thread-SMR's
functions look like, I'll adapt my Async Monitor Deflation
functions...

Dan

>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Friday, 5 April 2019 14:07
> To: daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; Carsten Varming ; Roman Kennke ; Doerr, Martin
> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>
> Hi Dan,
>
> (Martin there is question for you last in this email)
>
> After first pass I did not find any real issues.
> Considering what you had to work with, it looks good!
>
> #1
> There are some assert which are redundant (to me at least) like:
> src/hotspot/share/runtime/objectMonitor.cpp
> L445
>   if (!dmw->is_marked() && dmw->hash() == 0) {
>     // This dmw is neutral and has not yet started the restoration
>     // protocol so we mark a copy of the dmw to begin the protocol.
> markOop marked_dmw = dmw->set_marked(); > assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, > "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, > marked_dmw->is_marked(), marked_dmw->hash()); > > That assert is basically a test that set_marked worked? > > L505 > if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { > assert(_succ != Self, "invariant"); > assert(_owner == Self, "invariant"); > > Assert on _owner checks that our cmpxchg is not broken? > > I think it's easier to read the code if some on the most obvious asserts are > removed. Maybe comments instead. > > #2 > Not your doing but I think we should remove TRAPS/Thread * Self and use > JavaThread* instead. > E.g. so we can change: > void ObjectMonitor::EnterI(TRAPS) { > Thread * const Self = THREAD; > assert(Self->is_Java_thread(), "invariant"); > assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); > > to: > > void ObjectMonitor::EnterI(JavaThread* Self) { > assert(Self->thread_state() == _thread_blocked, "invariant"); > > #3 > src/hotspot/share/runtime/objectMonitor.inline.hpp > 164 inline void ObjectMonitor::inc_ref_count() { > 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc > 166 // backend on PPC does not yet conform to these requirements. Therefore > 167 // the increment is simulated with a load phi; cas phi + 1; loop. > 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors > 169 // is not safe. > > I think was fixed with: > 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics > You should get a leading sync and trailing one with the default conservative > model and thus get proper memory ordering. > Martin, I'm I correct? > > Thanks, Robbin > > On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> Welcome to the OpenJDK review thread for my port of Carsten's work on: >> >> ??? JDK-8153224 Monitor deflation prolong safepoints >> ??? 
https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> Here's a link to the OpenJDK wiki that describes my port: >> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >> >> Here's the webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >> >> Here's a link to Carsten's original webrev: >> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >> >> Earlier versions of this patch have been through several rounds of >> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >> Roman for their preliminary code review comments. A very special >> thanks to Robbin and Roman for building and testing the patch in >> their own environments (including specJBB2015). >> >> This version of the patch has been thru Mach5 tier[1-8] testing on >> Oracle's usual set of platforms. Earlier versions have been run >> through my stress kit on my Linux-X64 and Solaris-X64 servers >> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >> and slowdebug). Earlier versions have run my monitor inflation stress >> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >> fastdebug and slowdebug). >> >> All of the testing done on earlier versions will be redone on the >> latest version of the patch. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> >> P.S. >> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >> is currently failing in -Xcomp mode on Win* only. I've been trying >> to characterize/analyze this failure for more than a week now. At >> this point I'm convinced that Async Monitor Deflation is aggravating >> an existing bug. However, I plan to have a better handle on that >> failure before these bits are pushed to the jdk/jdk repo. From daniel.daugherty at oracle.com Fri Apr 5 16:10:15 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 5 Apr 2019 12:10:15 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> Message-ID: <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Filed: ??? JDK-8222034 Thread-SMR functions should be updated to remove work around ??? https://bugs.openjdk.java.net/browse/JDK-8222034 Martin and Robbin, please check it out and make sure that I captured things correctly... Dan On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: > On 4/5/19 8:37 AM, Doerr, Martin wrote: >> Hi everybody, >> >>> I think was fixed with: >>> 8202080: Introduce ordering semantics for Atomic::add/inc and other >>> RMW atomics >>> You should get a leading sync and trailing one with the default >>> conservative >>> model and thus get proper memory ordering. >>> Martin, I'm I correct? >> Exactly. Thanks for pointing this out. PPC uses the strongest >> possible ordering semantics with memory_order_conservative (default >> parameter). >> I've seen that comment about PPC in "void >> ThreadsList::inc_nested_handle_cnt()". This function could get replaced. > > Okay so we need a new bug to update these two Thread-SMR functions: > > src/hotspot/share/runtime/threadSMR.cpp: > > void ThreadsList::dec_nested_handle_cnt() { > ? // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec > ? // backend on PPC does not yet conform to these requirements. Therefore > ? // the decrement is simulated with an Atomic::sub(1, &addr). > ? // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR > mechanism > ? // is not generally safe to use. > ? Atomic::sub(1, &_nested_handle_cnt); > } > > void ThreadsList::inc_nested_handle_cnt() { > ? // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc > ? // backend on PPC does not yet conform to these requirements. Therefore > ? // the increment is simulated with a load phi; cas phi + 1; loop. 
> ? // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR > mechanism > ? // is not generally safe to use. > ? intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); > ? for (;;) { > ??? if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == > sample) { > ????? return; > ??? } else { > ????? sample = OrderAccess::load_acquire(&_nested_handle_cnt); > ??? } > ? } > } > > I'll file a new bug, loop in Robbin, Erik O and Martin, and make > sure we're all in agreement. Once we decide that Thread-SMR's > functions look like, I'll adapt my Async Monitor Deflation > functions... > > Dan > > >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Robbin Ehn >> Sent: Freitag, 5. April 2019 14:07 >> To: daniel.daugherty at oracle.com; >> hotspot-runtime-dev at openjdk.java.net; Carsten Varming >> ; Roman Kennke ; Doerr, Martin >> >> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >> >> Hi Dan, >> >> (Martin there is question for you last in this email) >> >> After first pass I did not find any real issues. >> Considering what you had to work with, it looks good! >> >> #1 >> There are some assert which are redundant (to me at least) like: >> src/hotspot/share/runtime/objectMonitor.cpp >> L445 >> ??? if (!dmw->is_marked() && dmw->hash() == 0) { >> ????? // This dmw is neutral and has not yet started the restoration >> ????? // protocol so we mark a copy of the dmw to begin the protocol. >> ????? markOop marked_dmw = dmw->set_marked(); >> ????? assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >> ???????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >> ???????????? marked_dmw->is_marked(), marked_dmw->hash()); >> >> That assert is basically a test that set_marked worked? >> >> L505 >> ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >> DEFLATER_MARKER) { >> ??????? assert(_succ != Self, "invariant"); >> ??????? 
assert(_owner == Self, "invariant"); >> >> Assert on _owner checks that our cmpxchg is not broken? >> >> I think it's easier to read the code if some on the most obvious >> asserts are >> removed. Maybe comments instead. >> >> #2 >> Not your doing but I think we should remove TRAPS/Thread * Self and use >> JavaThread* instead. >> E.g. so we can change: >> void ObjectMonitor::EnterI(TRAPS) { >> ??? Thread * const Self = THREAD; >> ??? assert(Self->is_Java_thread(), "invariant"); >> ??? assert(((JavaThread *) Self)->thread_state() == _thread_blocked, >> "invariant"); >> >> to: >> >> void ObjectMonitor::EnterI(JavaThread* Self) { >> ??? assert(Self->thread_state() == _thread_blocked, "invariant"); >> >> #3 >> src/hotspot/share/runtime/objectMonitor.inline.hpp >> ?? 164 inline void ObjectMonitor::inc_ref_count() { >> ?? 165?? // The increment needs to be MO_SEQ_CST. At the moment, the >> Atomic::inc >> ?? 166?? // backend on PPC does not yet conform to these >> requirements. Therefore >> ?? 167?? // the increment is simulated with a load phi; cas phi + 1; >> loop. >> ?? 168?? // Without this MO_SEQ_CST Atomic::inc simulation, >> AsyncDeflateIdleMonitors >> ?? 169?? // is not safe. >> >> I think was fixed with: >> 8202080: Introduce ordering semantics for Atomic::add/inc and other >> RMW atomics >> You should get a leading sync and trailing one with the default >> conservative >> model and thus get proper memory ordering. >> Martin, I'm I correct? >> >> Thanks, Robbin >> >> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>> >>> ? ??? JDK-8153224 Monitor deflation prolong safepoints >>> ? ??? 
https://bugs.openjdk.java.net/browse/JDK-8153224 >>> >>> Here's a link to the OpenJDK wiki that describes my port: >>> >>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>> >>> Here's the webrev URL: >>> >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>> >>> Here's a link to Carsten's original webrev: >>> >>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>> >>> Earlier versions of this patch have been through several rounds of >>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>> Roman for their preliminary code review comments. A very special >>> thanks to Robbin and Roman for building and testing the patch in >>> their own environments (including specJBB2015). >>> >>> This version of the patch has been thru Mach5 tier[1-8] testing on >>> Oracle's usual set of platforms. Earlier versions have been run >>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>> and slowdebug). Earlier versions have run my monitor inflation stress >>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>> fastdebug and slowdebug). >>> >>> All of the testing done on earlier versions will be redone on the >>> latest version of the patch. >>> >>> Thanks, in advance, for any questions, comments or suggestions. >>> >>> Dan >>> >>> P.S. >>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>> is currently failing in -Xcomp mode on Win* only. I've been trying >>> to characterize/analyze this failure for more than a week now. At >>> this point I'm convinced that Async Monitor Deflation is aggravating >>> an existing bug. However, I plan to have a better handle on that >>> failure before these bits are pushed to the jdk/jdk repo. 
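
For readers following the ordering discussion above, here is a minimal sketch of the two increment styles being compared: the "load; cas + 1; loop" work-around and a single read-modify-write increment. It uses standard C++ `std::atomic` as an analogy only; it is not HotSpot's `Atomic`/`OrderAccess` API, and `memory_order_seq_cst` is merely assumed here as the closest standard counterpart to the ordering the thread asks for. The function names `inc_with_cas_loop` and `inc_with_fetch_add` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Work-around style: sample the counter, then try to CAS in sample + 1.
// On failure, compare_exchange_weak refreshes 'sample' with the value it
// observed, so the loop simply retries with the new value.
inline void inc_with_cas_loop(std::atomic<std::intptr_t>& cnt) {
  std::intptr_t sample = cnt.load(std::memory_order_acquire);
  while (!cnt.compare_exchange_weak(sample, sample + 1,
                                    std::memory_order_seq_cst,
                                    std::memory_order_acquire)) {
    // Nothing to do here: 'sample' already holds the latest observed value.
  }
}

// Direct style: one atomic read-modify-write with sequentially
// consistent ordering.
inline void inc_with_fetch_add(std::atomic<std::intptr_t>& cnt) {
  cnt.fetch_add(1, std::memory_order_seq_cst);
}
```

Both functions leave the counter exactly one higher per call; the difference under discussion is only whether the platform's plain increment provides strong enough ordering to make the loop unnecessary.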
>
>

From chris.plummer at oracle.com Fri Apr 5 17:09:47 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 5 Apr 2019 10:09:47 -0700
Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
In-Reply-To: <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com>
References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com>
Message-ID: <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com>

Hi David,

Why was the JVM_DefineModule frame left off of stackTraceAlternate?

Since you've added the following:

 103         if (!okToHaveAllocateHeap) {
 104             output.shouldNotContain("AllocateHeap");
 105         }

You can simplify the following:

 123         if (okToHaveAllocateHeap) {
 124             expectedStackTrace = stackTraceAllocateHeap;
 125             if (stackTraceMatches(expectedStackTrace, output)) {
 126                 return;
 127             }
 128         } else {

There is no need for the okToHaveAllocateHeap check here anymore. Just
check all 3 allowed stacktraces until one passes. This is a slight
improvement in flexibility in that it would no longer require the
slowdebug builds to match stackTraceAllocateHeap. They could match any
of the 3. You could then put all 3 allowed stacktraces in an array and
check them in a loop if you wish.

The following is no longer correct:

 140         throw new RuntimeException("Expected stack trace missing from output: " + expectedStackTrace);

In your current approach, expectedStackTrace is just the last stacktrace we tried.
Since we may try more than one, maybe all the ones that failed to match should be listed (or none listed if just too messy). thanks, Chris On 4/5/19 12:04 AM, David Holmes wrote: > Hi Chris, > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ > > Checks for alternate stack now. Added lots of comments and misc fixups. > > Zhengyu: please re-test (I can't test any slowdebug except linux-x64). > > Thanks, > David > > On 5/04/2019 4:01 pm, Chris Plummer wrote: >> Thinking about this a bit more, there is still the potential for some >> confusion if this test fails again in the future due to the top frame >> missing. Is it missing because it got inlined or is it missing >> because the frame skipping code skipped an extra frame? Hopefully >> whoever deals with it doesn't just hastily add another valid >> stacktrace to the test but instead investigates to make sure the >> issue is indeed that the method got inlined. >> >> Chris >> >> On 4/4/19 10:56 PM, David Holmes wrote: >>> Hi Chris, >>> >>> Okay I will simply check for the third alternative. >>> >>> Thanks, >>> David >>> >>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> For the callsite that this test is checking for, right now there >>>> appear to be 3 possible stacktraces: the "normal" one, the one that >>>> includes AllocateHeap() on solaris and windows slowdebug builds, >>>> and the one Zhengyu is now seeing on linux-x64. You would need to >>>> check for all 3, limiting the AllocateHeap() one to just being >>>> allowed on solaris and windows slowdebug as it is now. So basically >>>> this test needs to cover all (allowable) stacktraces that we've >>>> seen for this callsite, and be updated in the future as needed. Not >>>> ideal, but I don't see a better solution. It's similar to the >>>> situation described in JDK-8163899 which covered the fragility of >>>> the NMT frame skipping code. 
In the end it was decided it would be >>>> easier to just deal fix issues as they came up rather then engineer >>>> a solution that wasn't as fragile. I think this test falls in the >>>> same category. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Thanks for the explanation about the frame counting from >>>>> os::malloc - now I get it. But I don't understand your final comment: >>>>> >>>>> > Looking at this code also reminds me of a reason to have the test >>>>> > continue to check for all 4 specific frames. If the frame >>>>> skipping code >>>>> > skips an extra frame, then the callsite will be missing a needed >>>>> frame >>>>> > at the top. The way the test was written it would detect this. >>>>> With your >>>>> > changes it will not. It would just revert to always matching on >>>>> 3 frames >>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>> >>>>> How can I fix this bug if I have to check for 4 specific frames >>>>> but one (or more) may be missing - i.e how can I tell the >>>>> different between "Frame A was inlined" and "Frame A was skipped >>>>> by mistake" ?? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> I have concerns that this will hide some of the other bugs >>>>>>>>>>>> I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. >>>>>>>>>>>> These bugs result in 1 or two frames appearing in the >>>>>>>>>>>> stacktrace that should be skipped. 
Notably >>>>>>>>>>>> NativeCallStack::NativeCallStack() and os::get_native_stack(). >>>>>>>>>>> >>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>> >>>>>>>>>>> 73???????? // We should never see either of these frames >>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>>>> >>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>>>> windows and solaris slowdebug builds. Although these builds >>>>>>>>>>>> should have honored the ALWAYSINLINE directive, it was >>>>>>>>>>>> deemed acceptable that it was not in slowdebug builds. >>>>>>>>>>>> However, I would not want to allow AllocateHeap() to appear >>>>>>>>>>>> in a product build, and best not to see it in fastdebug >>>>>>>>>>>> either. >>>>>>>>>>> >>>>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I >>>>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>>>> something specific to the native compiler. The previous >>>>>>>>>>> handling of AllocateHeap seemed to be there simply because >>>>>>>>>>> it was the only way to deal with an optional frame - but now >>>>>>>>>>> that's handled generically. >>>>>>>>>> It's appearance means you effectively only have 3 frames to >>>>>>>>>> identity callsites instead of 4. >>>>>>>>> >>>>>>>>> Both stacktraces in the old test had 4 elements and expected 4 >>>>>>>>> matches. The current bug is that one of those (new_entry) >>>>>>>>> could actually be inlined as well, resulting in only 3 >>>>>>>>> matches. So that is what the revised test checks for: at least >>>>>>>>> 3 matches. Often there will be 4 matches. 
>>>>>>>> I think you misunderstood my "3 frames" comment. I was
>>>>>>>> referring to how many frames NMT uses to identify the callsite.
>>>>>>>> It wants to use 4, but if AllocateHeap() doesn't get inlined,
>>>>>>>> it effectively is using 3. The test should detect when this
>>>>>>>> happens so the NMT implementation can address the issue.
>>>>>>>
>>>>>>> You're right I don't understand this part as I don't know
>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>
>>>>>> An NMT callsite is simply the 4 most recent frames (after some
>>>>>> pruning) that led to the os::malloc() call. "4" is somewhat
>>>>>> arbitrary as Thomas pointed out, and is controlled by
>>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger
>>>>>> means more refinement of the callsites (thus more callsites), but
>>>>>> a clearer picture of what actually led to the os::malloc().
>>>>>>
>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a()
>>>>>> calls b() calls c() calls d() calls os::malloc(), and foo() and
>>>>>> bar() both call a(), the NMT detail output will not distinguish
>>>>>> between these two call paths to os::malloc(), and will consider
>>>>>> both paths to be the same callsite. The 4 frames in the NMT
>>>>>> detail output would always be a, b, c, and d. However, bump up
>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two
>>>>>> separate callsites, one with foo() as the bottom frame and one
>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as
>>>>>> the other 4 frames.
>>>>>>
>>>>>> So my point is if AllocateHeap() is not inlined, then every
>>>>>> allocation that is the result of doing a "new" of any CHeapObj
>>>>>> subtype will have AllocateHeap() in its callsite, which
>>>>>> effectively lowers the callsite refinement by 1.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Hmmm but now I'm wondering why this trace:
>>>>>>>>>
>>>>>>>>>   50     public static String stackTraceAllocateHeap =
>>>>>>>>>   51
".*AllocateHeap.*\n" + >>>>>>>>> ? 52???????? ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>> >>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>>>> getting inlined already when AllocateHeap was not? Even so we >>>>>>>>> still end up with 4 frames matching normally. >>>>>>>> I noticed that last night also and scratch my head over it for >>>>>>>> a while and then went to bed. The only explanation I could come >>>>>>>> up with is that allocate_new_entry() is getting inlined, and as >>>>>>>> a result (due to being a slowdebug build and doing minimal >>>>>>>> inlining) AllocateHeap() was not inlined. >>>>>>>>> >>>>>>>>>> If it does appear in a product build, a solution should be >>>>>>>>>> looked into to get rid of it. If the port owner decides it >>>>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>>>> should be added to the test like was done for solaris and >>>>>>>>>> windows slowdebug builds. >>>>>>>>> >>>>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>>>> inline that function and just happen to be using this test to >>>>>>>>> verify that? Doesn't seem like a suitable place to do this - >>>>>>>>> and why do we need to do it? The Visual Studio docs state: >>>>>>>>> >>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>> >>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>> could change with any update to the compiler. >>>>>>>>> >>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>> specifically -xinline only has an effect at ?xO3 or higher. >>>>>>>>> Which likely explains why it is ignored in slowdebug. And >>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE. 
>>>>>>>>> >>>>>>>>> Even with gcc we seem to be misusing the attribute if we want >>>>>>>>> to ensure inlining when not optimising: >>>>>>>>> >>>>>>>>> "GCC does not inline any functions when not optimizing unless >>>>>>>>> you specify the ?always_inline? attribute for the function, >>>>>>>>> like this: >>>>>>>>> >>>>>>>>> /* Prototype.? */ >>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>> >>>>>>>>> and we don't write it that way. >>>>>>>>> >>>>>>>>> So if we're that concerned about release builds guaranteeing >>>>>>>>> to inline AllocateHeap then I think we need something a bit >>>>>>>>> more explicit than this test to determine that. >>>>>>>> With respect to the 3 methods/functions we don't want to see in >>>>>>>> the callsite stacktrace, NMT has made a number of assumptions >>>>>>>> on inlining. One of the things the test is doing is making sure >>>>>>>> those assumptions are correct. If incorrect, then you run into >>>>>>>> issues like I mentioned above where callsite backtraces >>>>>>>> effectively only have 3 unique frames rather than 4 (actually >>>>>>>> before some bug fixes it was often just 2 unique frames). So I >>>>>>>> think it's appropriate to have a test to make sure we are not >>>>>>>> seeing any of these 3 methods/functions. >>>>>>> >>>>>>> Okay I get the gist of that. Is there somewhere I can clearly >>>>>>> see what this inlining assumptions are that NMT makes? Are they >>>>>>> clearly documented? >>>>>> >>>>>> Not that I know of. I discovered them while looking at the >>>>>> various bugs that led to NativeCallStack::NativeCallStack() and >>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>> callsite. Reviewing the bugs I referred to will give you an idea >>>>>> of where to look. One good place to look at >>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>> there that controls how many frames to skip based on on the >>>>>> platform and whether optimized or not. 
Also some comments there >>>>>> to help you out. I did a lot of bug fixing in this method. >>>>>> >>>>>> Looking at this code also reminds me of a reason to have the test >>>>>> continue to check for all 4 specific frames. If the frame >>>>>> skipping code skips an extra frame, then the callsite will be >>>>>> missing a needed frame at the top. The way the test was written >>>>>> it would detect this. With your changes it will not. It would >>>>>> just revert to always matching on 3 frames instead of 4, and the >>>>>> frame skipping bug would go unnoticed. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>>> >>>>>>>> Now the test also has made inlining assumptions beyond what NMT >>>>>>>> has made, and that is really what this bug is about. In general >>>>>>>> I think your fix is fine in the way it relaxes which frames are >>>>>>>> actually found, but as Thomas points out, it suffers from not >>>>>>>> actually looking at a single stacktrace, but just looking for >>>>>>>> the specified frames somewhere in the output (and in the order >>>>>>>> specified.) You should probably address this. >>>>>>> >>>>>>> Right that was an error on my part. I thought the existing >>>>>>> MULTILINE pattern matching with .* would also find >>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>> re-think this. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Given the changes you made to allow more flexibly in which >>>>>>>>>>>> frames appear, I think you need to now also make sure the >>>>>>>>>>>> above 3 mentioned frames are not present, except for >>>>>>>>>>>> allowing AllocateHeap() in slowdebug builds. 
>>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>> >>>>>>>>>>>>> The actual stack trace reported by NMT detail is affected >>>>>>>>>>>>> by the inlining decisions of the native compiler, and on >>>>>>>>>>>>> the type of build. So we define an "ideal" stacktrace and >>>>>>>>>>>>> then allow for some frames to be missing based on >>>>>>>>>>>>> empirical observations. So to date we have seen two frames >>>>>>>>>>>>> that may or may not be inlined and so we allow for 2 >>>>>>>>>>>>> non-matching entries. >>>>>>>>>>>>> >>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it is >>>>>>>>>>>>> just an optional frame. >>>>>>>>>>>>> >>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as you >>>>>>>>>>>>> intended? >>>>>>>>>>>>> >>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From karen.kinnear at oracle.com Fri Apr 5 20:59:21 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 5 Apr 2019 16:59:21 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Message-ID: Dan, Some more minor comments from reading the code: 1. Could you add comments to markOop.hpp about the use in the displaced_mark_word of is_marked to prevent any users of is_marked here from needing to have that information saved/restored? 2. 
In objectMonitor.hpp in is_busy you clarify the difference in use between _count (which I think you may be changing to _contended) and _ref_count. Could you possibly also comment where you declare them?

3. clear_using_JT: would it make sense to have an assertion that _owner is either null or DEFLATER_MARKER?

4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count with comments that caller ensured _count <= 0
In ReenterI: guarantee 0 <= _count, with comment not _count < 0
Am I missing something subtle here or should they be the same guarantees?

5. I could use a little help with allocation state transitions, e.g. in deflate_monitor_list_using_JT you see is_new with object set so you mark it as old so the next deflation will check it - why do you set it to old here rather than in inflate once we set values?

6. Could you get rid of the new gotos?

7. On the updated wiki for the hash race example:
Racing Threads: "T-hash is about to inc the ref_count field"
actually - T-hash just did - ref_count == 1 - so maybe change middle values

8. There is an old comment in FastHashCode that
// WARNING:
// The displaced header is strictly immutable.
// It can NOT be changed in ANY cases.
I presume that only applies to the displaced header for a stack lock - could you possibly update that while you are in the code?

Also in FastHashCode
// The only update to the header in the monitor (outside GC)
// is install the hash code. If someone add new usage of
// displaced header, please update this code
Can you update that comment as well? I know you've already updated the code logic.

So I walked the logic for the hashcode interactions - I didn't find any holes. Thank you for walking most of it in email/wiki. In particular, inflate does the save_om_ptr dance to inc_ref_count, so this code above will be called while preventing async deflation.

9. install_displaced_markword_in_object
What happens if the cas_set_mark fails?
I get that today this handles the race with enter and deflate_monitor_using_JT. If we remove the call from enter, is the expectation that we've blocked all others who did not set is_marked themselves?
If we remove the call from enter would it make sense to ensure that the cas_set_mark succeeds here?

10. Is there any benefit in a bit of stress testing with something like a temporary flag that deflates in mAlloc each time it is called?

Looking forward to the performance runs as well as the latency numbers.

thanks,
Karen

> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty wrote:
>
> Filed:
>
> JDK-8222034 Thread-SMR functions should be updated to remove work around
> https://bugs.openjdk.java.net/browse/JDK-8222034
>
> Martin and Robbin, please check it out and make sure that I captured
> things correctly...
>
> Dan
>
>
>
> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote:
>> On 4/5/19 8:37 AM, Doerr, Martin wrote:
>>> Hi everybody,
>>>
>>>> I think was fixed with:
>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics
>>>> You should get a leading sync and trailing one with the default conservative
>>>> model and thus get proper memory ordering.
>>>> Martin, I'm I correct?
>>> Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter).
>>> I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced.
>>
>> Okay so we need a new bug to update these two Thread-SMR functions:
>>
>> src/hotspot/share/runtime/threadSMR.cpp:
>>
>> void ThreadsList::dec_nested_handle_cnt() {
>>   // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec
>>   // backend on PPC does not yet conform to these requirements. Therefore
>>   // the decrement is simulated with an Atomic::sub(1, &addr).
>>   // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism
>>   // is not generally safe to use.
>> Atomic::sub(1, &_nested_handle_cnt); >> } >> >> void ThreadsList::inc_nested_handle_cnt() { >> // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >> // backend on PPC does not yet conform to these requirements. Therefore >> // the increment is simulated with a load phi; cas phi + 1; loop. >> // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism >> // is not generally safe to use. >> intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >> for (;;) { >> if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) { >> return; >> } else { >> sample = OrderAccess::load_acquire(&_nested_handle_cnt); >> } >> } >> } >> >> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >> sure we're all in agreement. Once we decide that Thread-SMR's >> functions look like, I'll adapt my Async Monitor Deflation >> functions... >> >> Dan >> >> >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: Robbin Ehn >>> Sent: Freitag, 5. April 2019 14:07 >>> To: daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; Carsten Varming ; Roman Kennke ; Doerr, Martin >>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>> >>> Hi Dan, >>> >>> (Martin there is question for you last in this email) >>> >>> After first pass I did not find any real issues. >>> Considering what you had to work with, it looks good! >>> >>> #1 >>> There are some assert which are redundant (to me at least) like: >>> src/hotspot/share/runtime/objectMonitor.cpp >>> L445 >>> if (!dmw->is_marked() && dmw->hash() == 0) { >>> // This dmw is neutral and has not yet started the restoration >>> // protocol so we mark a copy of the dmw to begin the protocol. 
>>> markOop marked_dmw = dmw->set_marked(); >>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>> "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>> marked_dmw->is_marked(), marked_dmw->hash()); >>> >>> That assert is basically a test that set_marked worked? >>> >>> L505 >>> if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>> assert(_succ != Self, "invariant"); >>> assert(_owner == Self, "invariant"); >>> >>> Assert on _owner checks that our cmpxchg is not broken? >>> >>> I think it's easier to read the code if some on the most obvious asserts are >>> removed. Maybe comments instead. >>> >>> #2 >>> Not your doing but I think we should remove TRAPS/Thread * Self and use >>> JavaThread* instead. >>> E.g. so we can change: >>> void ObjectMonitor::EnterI(TRAPS) { >>> Thread * const Self = THREAD; >>> assert(Self->is_Java_thread(), "invariant"); >>> assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); >>> >>> to: >>> >>> void ObjectMonitor::EnterI(JavaThread* Self) { >>> assert(Self->thread_state() == _thread_blocked, "invariant"); >>> >>> #3 >>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>> 164 inline void ObjectMonitor::inc_ref_count() { >>> 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>> 166 // backend on PPC does not yet conform to these requirements. Therefore >>> 167 // the increment is simulated with a load phi; cas phi + 1; loop. >>> 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors >>> 169 // is not safe. >>> >>> I think was fixed with: >>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>> You should get a leading sync and trailing one with the default conservative >>> model and thus get proper memory ordering. >>> Martin, I'm I correct? >>> >>> Thanks, Robbin >>> >>> On 3/24/19 2:57 PM, Daniel D. 
Daugherty wrote: >>>> Greetings, >>>> >>>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>>> >>>> JDK-8153224 Monitor deflation prolong safepoints >>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>> >>>> Here's a link to the OpenJDK wiki that describes my port: >>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>> >>>> Here's the webrev URL: >>>> >>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>> >>>> Here's a link to Carsten's original webrev: >>>> >>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>> >>>> Earlier versions of this patch have been through several rounds of >>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>> Roman for their preliminary code review comments. A very special >>>> thanks to Robbin and Roman for building and testing the patch in >>>> their own environments (including specJBB2015). >>>> >>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>> Oracle's usual set of platforms. Earlier versions have been run >>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>> and slowdebug). Earlier versions have run my monitor inflation stress >>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>> fastdebug and slowdebug). >>>> >>>> All of the testing done on earlier versions will be redone on the >>>> latest version of the patch. >>>> >>>> Thanks, in advance, for any questions, comments or suggestions. >>>> >>>> Dan >>>> >>>> P.S. >>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>> to characterize/analyze this failure for more than a week now. 
At
>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>> an existing bug. However, I plan to have a better handle on that
>>>> failure before these bits are pushed to the jdk/jdk repo.
>>
>>
>

From varming at gmail.com Sat Apr 6 03:01:01 2019
From: varming at gmail.com (Carsten Varming)
Date: Fri, 5 Apr 2019 23:01:01 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Dear Karen,

Please see inline answers.

On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear wrote:

>
> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 <
> _count
> with comments that caller ensured _count <= 0
> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
> Am I missing something subtle here or should they be the same guarantees?
>

In ::enter _count is incremented when the thread is trying to acquire the monitor and decremented after the monitor has been acquired. The 0 < _count assertion sits between those two points in the code. A thread acquiring a monitor and then calling wait will increment _count and then decrement _count as part of acquiring the monitor, thus _count can be 0 by the time the thread calls wait and when ReenterI is called.

9. install_displaced_markword_in_object
> What happens if the cas_set_mark fails?
> I get that today this handles the race with enter and
> deflate_monitor_using_JT. If we remove
> the call from enter, is the expectation that we've blocked all others who
> did not set is_marked themselves?
> If we remove the call from enter would it make sense to ensure that the
> cas_set_mark succeeds here?
>

I designed my original patch such that no thread would ever wait for the deflating thread to finish deflating a monitor.
If you remove install_displaced_markword_in_object from enter, then the entering thread can end up busy waiting by continuously reading the monitor pointer from the object mark word and then realizing that the monitor is being deflated and it should retry by going back to reading the object mark word. This bad behavior is completely avoided by calling install_displaced_markword_in_object. In my original patch no thread would ever wait for a deflating thread to finish. This property got lost in FastHashCode as that function evolved since I wrote my patch, but I think this property is worth preserving where possible. It might even be worth looking at FastHashCode to see if we can re-establish this property. I hope this helps. Best, Carsten > On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty < > daniel.daugherty at oracle.com> wrote: > > Filed: > > JDK-8222034 Thread-SMR functions should be updated to remove work > around > https://bugs.openjdk.java.net/browse/JDK-8222034 > > Martin and Robbin, please check it out and make sure that I captured > things correctly... > > Dan > > > > On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: > > On 4/5/19 8:37 AM, Doerr, Martin wrote: > > Hi everybody, > > I think was fixed with: > 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW > atomics > You should get a leading sync and trailing one with the default > conservative > model and thus get proper memory ordering. > Martin, I'm I correct? > > Exactly. Thanks for pointing this out. PPC uses the strongest possible > ordering semantics with memory_order_conservative (default parameter). > I've seen that comment about PPC in "void > ThreadsList::inc_nested_handle_cnt()". This function could get replaced. > > > Okay so we need a new bug to update these two Thread-SMR functions: > > src/hotspot/share/runtime/threadSMR.cpp: > > void ThreadsList::dec_nested_handle_cnt() { > // The decrement needs to be MO_ACQ_REL. 
At the moment, the Atomic::dec > // backend on PPC does not yet conform to these requirements. Therefore > // the decrement is simulated with an Atomic::sub(1, &addr). > // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR > mechanism > // is not generally safe to use. > Atomic::sub(1, &_nested_handle_cnt); > } > > void ThreadsList::inc_nested_handle_cnt() { > // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc > // backend on PPC does not yet conform to these requirements. Therefore > // the increment is simulated with a load phi; cas phi + 1; loop. > // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR > mechanism > // is not generally safe to use. > intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); > for (;;) { > if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == > sample) { > return; > } else { > sample = OrderAccess::load_acquire(&_nested_handle_cnt); > } > } > } > > I'll file a new bug, loop in Robbin, Erik O and Martin, and make > sure we're all in agreement. Once we decide that Thread-SMR's > functions look like, I'll adapt my Async Monitor Deflation > functions... > > Dan > > > > Best regards, > Martin > > > -----Original Message----- > From: Robbin Ehn > Sent: Freitag, 5. April 2019 14:07 > To: daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; > Carsten Varming ; Roman Kennke ; > Doerr, Martin > Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints > > Hi Dan, > > (Martin there is question for you last in this email) > > After first pass I did not find any real issues. > Considering what you had to work with, it looks good! > > #1 > There are some assert which are redundant (to me at least) like: > src/hotspot/share/runtime/objectMonitor.cpp > L445 > if (!dmw->is_marked() && dmw->hash() == 0) { > // This dmw is neutral and has not yet started the restoration > // protocol so we mark a copy of the dmw to begin the protocol. 
> markOop marked_dmw = dmw->set_marked(); > assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, > "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, > marked_dmw->is_marked(), marked_dmw->hash()); > > That assert is basically a test that set_marked worked? > > L505 > if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == > DEFLATER_MARKER) { > assert(_succ != Self, "invariant"); > assert(_owner == Self, "invariant"); > > Assert on _owner checks that our cmpxchg is not broken? > > I think it's easier to read the code if some on the most obvious asserts > are > removed. Maybe comments instead. > > #2 > Not your doing but I think we should remove TRAPS/Thread * Self and use > JavaThread* instead. > E.g. so we can change: > void ObjectMonitor::EnterI(TRAPS) { > Thread * const Self = THREAD; > assert(Self->is_Java_thread(), "invariant"); > assert(((JavaThread *) Self)->thread_state() == _thread_blocked, > "invariant"); > > to: > > void ObjectMonitor::EnterI(JavaThread* Self) { > assert(Self->thread_state() == _thread_blocked, "invariant"); > > #3 > src/hotspot/share/runtime/objectMonitor.inline.hpp > 164 inline void ObjectMonitor::inc_ref_count() { > 165 // The increment needs to be MO_SEQ_CST. At the moment, the > Atomic::inc > 166 // backend on PPC does not yet conform to these requirements. > Therefore > 167 // the increment is simulated with a load phi; cas phi + 1; loop. > 168 // Without this MO_SEQ_CST Atomic::inc simulation, > AsyncDeflateIdleMonitors > 169 // is not safe. > > I think was fixed with: > 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW > atomics > You should get a leading sync and trailing one with the default > conservative > model and thus get proper memory ordering. > Martin, I'm I correct? > > Thanks, Robbin > > On 3/24/19 2:57 PM, Daniel D. 
Daugherty wrote: > > Greetings, > > Welcome to the OpenJDK review thread for my port of Carsten's work on: > > JDK-8153224 Monitor deflation prolong safepoints > https://bugs.openjdk.java.net/browse/JDK-8153224 > > Here's a link to the OpenJDK wiki that describes my port: > > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > > Here's the webrev URL: > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ > > Here's a link to Carsten's original webrev: > > http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ > > Earlier versions of this patch have been through several rounds of > preliminary review. Many thanks to Carsten, Coleen, Robbin, and > Roman for their preliminary code review comments. A very special > thanks to Robbin and Roman for building and testing the patch in > their own environments (including specJBB2015). > > This version of the patch has been thru Mach5 tier[1-8] testing on > Oracle's usual set of platforms. Earlier versions have been run > through my stress kit on my Linux-X64 and Solaris-X64 servers > (product, fastdebug, slowdebug).Earlier versions have run Kitchensink > for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug > and slowdebug). Earlier versions have run my monitor inflation stress > tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, > fastdebug and slowdebug). > > All of the testing done on earlier versions will be redone on the > latest version of the patch. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > > P.S. > One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java > is currently failing in -Xcomp mode on Win* only. I've been trying > to characterize/analyze this failure for more than a week now. At > this point I'm convinced that Async Monitor Deflation is aggravating > an existing bug. However, I plan to have a better handle on that > failure before these bits are pushed to the jdk/jdk repo. 
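
[Editorial sketch, not part of the original thread.] The quoted inc_nested_handle_cnt() simulates an increment with "a load phi; cas phi + 1; loop", and the review suggests that a plain atomic read-modify-write is sufficient once the backend provides the required ordering. The following is a minimal illustration of those two shapes, assuming std::atomic from standard C++ in place of HotSpot's Atomic and OrderAccess classes. The type NestedHandleCounter and its member names are hypothetical, and the std::memory_order values only approximate the MO_SEQ_CST and MO_ACQ_REL requirements named in the comments; they are not a statement about what HotSpot's memory_order_conservative does.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a counter such as _nested_handle_cnt.
// This is NOT HotSpot code: it uses std::atomic, not HotSpot's Atomic class.
struct NestedHandleCounter {
  std::atomic<std::intptr_t> count{0};

  // Increment written as "load; cas value + 1; loop".
  void inc_with_cas_loop() {
    std::intptr_t sample = count.load(std::memory_order_acquire);
    // On failure, compare_exchange_weak stores the current value into
    // 'sample', so each retry works with freshly observed data.
    while (!count.compare_exchange_weak(sample, sample + 1,
                                        std::memory_order_seq_cst,
                                        std::memory_order_acquire)) {
      // Retry until the CAS succeeds.
    }
  }

  // The same increment expressed as one read-modify-write operation.
  void inc_with_fetch_add() {
    count.fetch_add(1, std::memory_order_seq_cst);
  }

  // Decrement as one acquire-release read-modify-write operation.
  void dec() {
    count.fetch_sub(1, std::memory_order_acq_rel);
  }
};
```

Both increment variants leave the counter in the same state; the loop form only matters when the single-operation form cannot be trusted to give the needed ordering, which is the situation the PPC comments describe.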
> > > > > > From david.holmes at oracle.com Sat Apr 6 04:13:24 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 6 Apr 2019 14:13:24 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> Message-ID: <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> Hi Chris, On 6/04/2019 3:09 am, Chris Plummer wrote: > Hi David, > > Why was the JVM_DefineModule frame left off of stackTraceAlternate? ?? That isn't part of any of the existing stacktraces. > Since you've added the following: > > ?103???????? if (!okToHaveAllocateHeap) { > ?104???????????? output.shouldNotContain("AllocateHeap"); > ?105???????? } I didn't add that - see old code line 80. > You can simplify the following: > > ?123???????? if (okToHaveAllocateHeap) { > ?124???????????? expectedStackTrace = stackTraceAllocateHeap; > ?125???????????? if (stackTraceMatches(expectedStackTrace, output)) { > ?126???????????????? return; > ?127???????????? } > ?128???????? } else { > > The is no need for the okToHaveAllocateHeap check here anymore. Just > check all 3 allowed stacktraces until one passes. This is a slight > improvement in flexibility in that it would no longer require the > slowdebug builds to match stackTraceAllocateHeap. They could match any > of the 3. 
You could then put all 3 allowed stacktraces in an array and
> check them in a loop if you wish.

The only change I have made (which might be obscured by the structure) is that if stackTraceDefault fails to match I then try stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged. By the same argument you made, I think it best to only expect the AllocateHeap stack on those slowdebug platforms, so that we can notice when something changes - again I've made no change in this regard.

> The following is no longer correct:
>
>  140         throw new RuntimeException("Expected stack trace missing
> from output: " + expectedStackTrace);
>
> In your current approach, expectedStackTrace is just the last stacktrace
> we tried. Since we may try more than one, maybe all the ones that failed
> to match should be listed (or none listed if just too messy).

It reports the last failing stacktrace, out of a possible two. Perhaps I can print both ... you want something in the jtr file so that it can be triaged without having to go and look up the test code.

Thanks,
David

> thanks,
>
> Chris
>
> On 4/5/19 12:04 AM, David Holmes wrote:
>> Hi Chris,
>>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/
>>
>> Checks for alternate stack now. Added lots of comments and misc fixups.
>>
>> Zhengyu: please re-test (I can't test any slowdebug except linux-x64).
>>
>> Thanks,
>> David
>>
>> On 5/04/2019 4:01 pm, Chris Plummer wrote:
>>> Thinking about this a bit more, there is still the potential for some
>>> confusion if this test fails again in the future due to the top frame
>>> missing. Is it missing because it got inlined or is it missing
>>> because the frame skipping code skipped an extra frame? Hopefully
>>> whoever deals with it doesn't just hastily add another valid
>>> stacktrace to the test but instead investigates to make sure the
>>> issue is indeed that the method got inlined.
>>> >>> Chris >>> >>> On 4/4/19 10:56 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> Okay I will simply check for the third alternative. >>>> >>>> Thanks, >>>> David >>>> >>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> For the callsite that this test is checking for, right now there >>>>> appear to be 3 possible stacktraces: the "normal" one, the one that >>>>> includes AllocateHeap() on solaris and windows slowdebug builds, >>>>> and the one Zhengyu is now seeing on linux-x64. You would need to >>>>> check for all 3, limiting the AllocateHeap() one to just being >>>>> allowed on solaris and windows slowdebug as it is now. So basically >>>>> this test needs to cover all (allowable) stacktraces that we've >>>>> seen for this callsite, and be updated in the future as needed. Not >>>>> ideal, but I don't see a better solution. It's similar to the >>>>> situation described in JDK-8163899 which covered the fragility of >>>>> the NMT frame skipping code. In the end it was decided it would be >>>>> easier to just deal fix issues as they came up rather then engineer >>>>> a solution that wasn't as fragile. I think this test falls in the >>>>> same category. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thanks for the explanation about the frame counting from >>>>>> os::malloc - now I get it. But I don't understand your final comment: >>>>>> >>>>>> > Looking at this code also reminds me of a reason to have the test >>>>>> > continue to check for all 4 specific frames. If the frame >>>>>> skipping code >>>>>> > skips an extra frame, then the callsite will be missing a needed >>>>>> frame >>>>>> > at the top. The way the test was written it would detect this. >>>>>> With your >>>>>> > changes it will not. It would just revert to always matching on >>>>>> 3 frames >>>>>> > instead of 4, and the frame skipping bug would go unnoticed. 
>>>>>> >>>>>> How can I fix this bug if I have to check for 4 specific frames >>>>>> but one (or more) may be missing - i.e how can I tell the >>>>>> different between "Frame A was inlined" and "Frame A was skipped >>>>>> by mistake" ?? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> I have concerns that this will hide some of the other bugs >>>>>>>>>>>>> I've mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. >>>>>>>>>>>>> These bugs result in 1 or two frames appearing in the >>>>>>>>>>>>> stacktrace that should be skipped. Notably >>>>>>>>>>>>> NativeCallStack::NativeCallStack() and os::get_native_stack(). >>>>>>>>>>>> >>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>> >>>>>>>>>>>> 73???????? // We should never see either of these frames >>>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>>>>> >>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>>>>> windows and solaris slowdebug builds. Although these builds >>>>>>>>>>>>> should have honored the ALWAYSINLINE directive, it was >>>>>>>>>>>>> deemed acceptable that it was not in slowdebug builds. 
>>>>>>>>>>>>> However, I would not want to allow AllocateHeap() to appear >>>>>>>>>>>>> in a product build, and best not to see it in fastdebug >>>>>>>>>>>>> either. >>>>>>>>>>>> >>>>>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I >>>>>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>>>>> something specific to the native compiler. The previous >>>>>>>>>>>> handling of AllocateHeap seemed to be there simply because >>>>>>>>>>>> it was the only way to deal with an optional frame - but now >>>>>>>>>>>> that's handled generically. >>>>>>>>>>> It's appearance means you effectively only have 3 frames to >>>>>>>>>>> identity callsites instead of 4. >>>>>>>>>> >>>>>>>>>> Both stacktraces in the old test had 4 elements and expected 4 >>>>>>>>>> matches. The current bug is that one of those (new_entry) >>>>>>>>>> could actually be inlined as well, resulting in only 3 >>>>>>>>>> matches. So that is what the revised test checks for: at least >>>>>>>>>> 3 matches. Often there will be 4 matches. >>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>> referring to how many frames NMT uses to identify the callsite. >>>>>>>>> It wants to use 4, but if AllocateHeap() doesn't get inlined, >>>>>>>>> it effectively is using 3. The test should detect when this >>>>>>>>> happens so the NMT implementation can address the issue. >>>>>>>> >>>>>>>> You're right I don't understand this part as I don't know >>>>>>>> how/what NMT detail is doing in this regard. >>>>>>> >>>>>>> An NMT callsite is simply the 4 most recent frames (afters some >>>>>>> pruning) that led to the os:malloc() call. "4" is somewhat >>>>>>> arbitrary as Thomas pointed out, and is controlled by >>>>>>> NMT_TrackingStackDepth. 
Making NMT_TrackingStackDepth bigger >>>>>>> means more refinement of the callsites (thus more callsites), but >>>>>>> a clearer picture of what actually led to the os:malloc(). >>>>>>> >>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a() >>>>>>> calls b() calls c() calls d() calls os:malloc(), and foo() and >>>>>>> bar() both call a(), the NMT detail output will not distinguish >>>>>>> between these two calls paths to os:mallco(), and will consider >>>>>>> both paths to be the same callsite. The 4 frames in the NMT >>>>>>> detail output would always be a, b, c, and d. However, bump up >>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two >>>>>>> separate callsites, one with foo() as the bottom frame and one >>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as >>>>>>> the other 4 frames. >>>>>>> >>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>> allocation that is the result of doing a "new" of any CHeapObj >>>>>>> subtype will have AllocateHeap() in its callsite, which >>>>>>> effectively lowers they callsite refinement by 1. >>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>> >>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>> ? 52???????? ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>> >>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>>>>> getting inlined already when AllocateHeap was not? Even so we >>>>>>>>>> still end up with 4 frames matching normally. >>>>>>>>> I noticed that last night also and scratch my head over it for >>>>>>>>> a while and then went to bed. The only explanation I could come >>>>>>>>> up with is that allocate_new_entry() is getting inlined, and as >>>>>>>>> a result (due to being a slowdebug build and doing minimal >>>>>>>>> inlining) AllocateHeap() was not inlined. 
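The NMT_TrackingStackDepth example in the quoted explanation above (a() calls b() calls c() calls d() calls os::malloc(), with foo() and bar() both calling a()) can be sketched as a toy model. This is not HotSpot code: the class name, the callsite() helper and the string frame names are assumptions made purely for illustration, and "depth" only stands in for NMT_TrackingStackDepth.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only; none of these names exist in HotSpot.
public class CallsiteDepthSketch {

    // frames: most recent caller first, with os::malloc() and the skipped
    // NMT frames already pruned. depth plays the role of
    // NMT_TrackingStackDepth.
    static List<String> callsite(List<String> frames, int depth) {
        return frames.subList(0, Math.min(depth, frames.size()));
    }

    public static void main(String[] args) {
        List<String> viaFoo = Arrays.asList("d", "c", "b", "a", "foo");
        List<String> viaBar = Arrays.asList("d", "c", "b", "a", "bar");

        // Depth 4: both call paths collapse into the same callsite.
        System.out.println(callsite(viaFoo, 4).equals(callsite(viaBar, 4)));

        // Depth 5: foo and bar now yield two distinct callsites.
        System.out.println(callsite(viaFoo, 5).equals(callsite(viaBar, 5)));

        // An AllocateHeap frame that was not inlined occupies one of the
        // four slots, so a() no longer takes part in the callsite.
        List<String> notInlined =
            Arrays.asList("AllocateHeap", "d", "c", "b", "a", "foo");
        System.out.println(callsite(notInlined, 4));
    }
}
```

The last case is the "effectively only 3 frames" concern: the slot taken by AllocateHeap is identical for every CHeapObj allocation and so adds nothing to distinguishing callsites.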
>>>>>>>>>> >>>>>>>>>>> If it does appear in a product build, a solution should be >>>>>>>>>>> looked into to get rid of it. If the port owner decides it >>>>>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>>>>> should be added to the test like was done for solaris and >>>>>>>>>>> windows slowdebug builds. >>>>>>>>>> >>>>>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>>>>> inline that function and just happen to be using this test to >>>>>>>>>> verify that? Doesn't seem like a suitable place to do this - >>>>>>>>>> and why do we need to do it? The Visual Studio docs state: >>>>>>>>>> >>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>> >>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>> could change with any update to the compiler. >>>>>>>>>> >>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>> specifically -xinline only has an effect at ?xO3 or higher. >>>>>>>>>> Which likely explains why it is ignored in slowdebug. And >>>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE. >>>>>>>>>> >>>>>>>>>> Even with gcc we seem to be misusing the attribute if we want >>>>>>>>>> to ensure inlining when not optimising: >>>>>>>>>> >>>>>>>>>> "GCC does not inline any functions when not optimizing unless >>>>>>>>>> you specify the ?always_inline? attribute for the function, >>>>>>>>>> like this: >>>>>>>>>> >>>>>>>>>> /* Prototype.? */ >>>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>>> >>>>>>>>>> and we don't write it that way. >>>>>>>>>> >>>>>>>>>> So if we're that concerned about release builds guaranteeing >>>>>>>>>> to inline AllocateHeap then I think we need something a bit >>>>>>>>>> more explicit than this test to determine that. 
>>>>>>>>> With respect to the 3 methods/functions we don't want to see in >>>>>>>>> the callsite stacktrace, NMT has made a number of assumptions >>>>>>>>> on inlining. One of the things the test is doing is making sure >>>>>>>>> those assumptions are correct. If incorrect, then you run into >>>>>>>>> issues like I mentioned above where callsite backtraces >>>>>>>>> effectively only have 3 unique frames rather than 4 (actually >>>>>>>>> before some bug fixes it was often just 2 unique frames). So I >>>>>>>>> think it's appropriate to have a test to make sure we are not >>>>>>>>> seeing any of these 3 methods/functions. >>>>>>>> >>>>>>>> Okay I get the gist of that. Is there somewhere I can clearly >>>>>>>> see what this inlining assumptions are that NMT makes? Are they >>>>>>>> clearly documented? >>>>>>> >>>>>>> Not that I know of. I discovered them while looking at the >>>>>>> various bugs that led to NativeCallStack::NativeCallStack() and >>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>> callsite. Reviewing the bugs I referred to will give you an idea >>>>>>> of where to look. One good place to look at >>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>>> there that controls how many frames to skip based on on the >>>>>>> platform and whether optimized or not. Also some comments there >>>>>>> to help you out. I did a lot of bug fixing in this method. >>>>>>> >>>>>>> Looking at this code also reminds me of a reason to have the test >>>>>>> continue to check for all 4 specific frames. If the frame >>>>>>> skipping code skips an extra frame, then the callsite will be >>>>>>> missing a needed frame at the top. The way the test was written >>>>>>> it would detect this. With your changes it will not. It would >>>>>>> just revert to always matching on 3 frames instead of 4, and the >>>>>>> frame skipping bug would go unnoticed. 
>>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>>> >>>>>>>>> Now the test also has made inlining assumptions beyond what NMT >>>>>>>>> has made, and that is really what this bug is about. In general >>>>>>>>> I think your fix is fine in the way it relaxes which frames are >>>>>>>>> actually found, but as Thomas points out, it suffers from not >>>>>>>>> actually looking at a single stacktrace, but just looking for >>>>>>>>> the specified frames somewhere in the output (and in the order >>>>>>>>> specified.) You should probably address this. >>>>>>>> >>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>> re-think this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> Given the changes you made to allow more flexibly in which >>>>>>>>>>>>> frames appear, I think you need to now also make sure the >>>>>>>>>>>>> above 3 mentioned frames are not present, except for >>>>>>>>>>>>> allowing AllocateHeap() in slowdebug builds. >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> The actual stack trace reported by NMT detail is affected >>>>>>>>>>>>>> by the inlining decisions of the native compiler, and on >>>>>>>>>>>>>> the type of build. So we define an "ideal" stacktrace and >>>>>>>>>>>>>> then allow for some frames to be missing based on >>>>>>>>>>>>>> empirical observations. 
So to date we have seen two frames >>>>>>>>>>>>>> that may or may not be inlined and so we allow for 2 >>>>>>>>>>>>>> non-matching entries. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it is >>>>>>>>>>>>>> just an optional frame. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as you >>>>>>>>>>>>>> intended? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> > > From chris.plummer at oracle.com Sat Apr 6 06:24:09 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 5 Apr 2019 23:24:09 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> Message-ID: <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> On 4/5/19 9:13 PM, David Holmes wrote: > Hi Chris, > > On 6/04/2019 3:09 am, Chris Plummer wrote: >> Hi David, >> >> Why was the JVM_DefineModule frame left off of stackTraceAlternate? > > ?? That isn't part of any of the existing stacktraces. 
See the following comment from Zhengyu in the CR:

https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865

>
>> Since you've added the following:
>>
>>   103         if (!okToHaveAllocateHeap) {
>>   104             output.shouldNotContain("AllocateHeap");
>>   105         }
>
> I didn't add that - see old code line 80.

Ok, but my comment below still applies since this check is in place.

>
>> You can simplify the following:
>>
>>   123         if (okToHaveAllocateHeap) {
>>   124             expectedStackTrace = stackTraceAllocateHeap;
>>   125             if (stackTraceMatches(expectedStackTrace, output)) {
>>   126                 return;
>>   127             }
>>   128         } else {
>>
>> There is no need for the okToHaveAllocateHeap check here anymore. Just
>> check all 3 allowed stacktraces until one passes. This is a slight
>> improvement in flexibility in that it would no longer require the
>> slowdebug builds to match stackTraceAllocateHeap. They could match
>> any of the 3. You could then put all 3 allowed stacktraces in an
>> array and check them in a loop if you wish.
>
> The only change I have made (which might be obscured by the structure)
> is that if stackTraceDefault fails to match I then try
> stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged.
>
> By the same argument you made I think it best to only expect the
> AllocateHeap stack on those slowdebug platforms, so that we can notice
> when something changes - again I've made no change in this regard.

Since line 104 already verified that AllocateHeap does not appear except
possibly in slowdebug builds, it is harmless to check all builds against
the stacktrace that includes AllocateHeap. Also, if a slowdebug platform
were to change to no longer include AllocateHeap, checking it against the
other two stacktraces would allow the test to continue to pass without
modification.
For these two reasons I was suggesting just always check all 3 stacktraces until one passes. It would simplify the logic some. > >> The following is no longer correct: >> >> ??140???????? throw new RuntimeException("Expected stack trace >> missing from output: " + expectedStackTrace); >> >> In your current approach, expectedStackTrace is just the last >> stacktrace we tried. Since we may try more than one, maybe all the >> ones that failed to match should be listed (or none listed if just >> too messy). > > It reports the last failing stacktrace, out of a possible two. Perhaps > I can print both ... you want something in the jtr file so that it can > be triaged without having to go and look up the test code. Yeah, just pointing out that only printing one stacktrace might lead the .jtr reader down the wrong path. thanks, Chris > > Thanks, > David > >> thanks, >> >> Chris >> >> On 4/5/19 12:04 AM, David Holmes wrote: >>> Hi Chris, >>> >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>> >>> Checks for alternate stack now. Added lots of comments and misc fixups. >>> >>> Zhengyu: please re-test (I can't test any slowdebug except linux-x64). >>> >>> Thanks, >>> David >>> >>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>> Thinking about this a bit more, there is still the potential for >>>> some confusion if this test fails again in the future due to the >>>> top frame missing. Is it missing because it got inlined or is it >>>> missing because the frame skipping code skipped an extra frame? >>>> Hopefully whoever deals with it doesn't just hastily add another >>>> valid stacktrace to the test but instead investigates to make sure >>>> the issue is indeed that the method got inlined. >>>> >>>> Chris >>>> >>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Okay I will simply check for the third alternative. 
>>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> For the callsite that this test is checking for, right now there >>>>>> appear to be 3 possible stacktraces: the "normal" one, the one >>>>>> that includes AllocateHeap() on solaris and windows slowdebug >>>>>> builds, and the one Zhengyu is now seeing on linux-x64. You would >>>>>> need to check for all 3, limiting the AllocateHeap() one to just >>>>>> being allowed on solaris and windows slowdebug as it is now. So >>>>>> basically this test needs to cover all (allowable) stacktraces >>>>>> that we've seen for this callsite, and be updated in the future >>>>>> as needed. Not ideal, but I don't see a better solution. It's >>>>>> similar to the situation described in JDK-8163899 which covered >>>>>> the fragility of the NMT frame skipping code. In the end it was >>>>>> decided it would be easier to just deal fix issues as they came >>>>>> up rather then engineer a solution that wasn't as fragile. I >>>>>> think this test falls in the same category. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for the explanation about the frame counting from >>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>> comment: >>>>>>> >>>>>>> > Looking at this code also reminds me of a reason to have the test >>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>> skipping code >>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>> needed frame >>>>>>> > at the top. The way the test was written it would detect this. >>>>>>> With your >>>>>>> > changes it will not. It would just revert to always matching >>>>>>> on 3 frames >>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. 
>>>>>>> >>>>>>> How can I fix this bug if I have to check for 4 specific frames >>>>>>> but one (or more) may be missing - i.e how can I tell the >>>>>>> different between "Frame A was inlined" and "Frame A was skipped >>>>>>> by mistake" ?? >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>> JDK-8133740. These bugs result in 1 or two frames >>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>> >>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>> >>>>>>>>>>>>> 73???????? // We should never see either of these frames >>>>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>>>> 74 >>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>>>>>> >>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>>>>>> windows and solaris slowdebug builds. 
Although these >>>>>>>>>>>>>> builds should have honored the ALWAYSINLINE directive, it >>>>>>>>>>>>>> was deemed acceptable that it was not in slowdebug >>>>>>>>>>>>>> builds. However, I would not want to allow AllocateHeap() >>>>>>>>>>>>>> to appear in a product build, and best not to see it in >>>>>>>>>>>>>> fastdebug either. >>>>>>>>>>>>> >>>>>>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I >>>>>>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>>>>>> something specific to the native compiler. The previous >>>>>>>>>>>>> handling of AllocateHeap seemed to be there simply because >>>>>>>>>>>>> it was the only way to deal with an optional frame - but >>>>>>>>>>>>> now that's handled generically. >>>>>>>>>>>> It's appearance means you effectively only have 3 frames to >>>>>>>>>>>> identity callsites instead of 4. >>>>>>>>>>> >>>>>>>>>>> Both stacktraces in the old test had 4 elements and expected >>>>>>>>>>> 4 matches. The current bug is that one of those (new_entry) >>>>>>>>>>> could actually be inlined as well, resulting in only 3 >>>>>>>>>>> matches. So that is what the revised test checks for: at >>>>>>>>>>> least 3 matches. Often there will be 4 matches. >>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't >>>>>>>>>> get inlined, it effectively is using 3. The test should >>>>>>>>>> detect when this happens so the NMT implementation can >>>>>>>>>> address the issue. >>>>>>>>> >>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>> >>>>>>>> An NMT callsite is simply the 4 most recent frames (afters some >>>>>>>> pruning) that led to the os:malloc() call. 
"4" is somewhat >>>>>>>> arbitrary as Thomas pointed out, and is controlled by >>>>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger >>>>>>>> means more refinement of the callsites (thus more callsites), >>>>>>>> but a clearer picture of what actually led to the os:malloc(). >>>>>>>> >>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a() >>>>>>>> calls b() calls c() calls d() calls os:malloc(), and foo() and >>>>>>>> bar() both call a(), the NMT detail output will not distinguish >>>>>>>> between these two calls paths to os:mallco(), and will consider >>>>>>>> both paths to be the same callsite. The 4 frames in the NMT >>>>>>>> detail output would always be a, b, c, and d. However, bump up >>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two >>>>>>>> separate callsites, one with foo() as the bottom frame and one >>>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as >>>>>>>> the other 4 frames. >>>>>>>> >>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>> allocation that is the result of doing a "new" of any CHeapObj >>>>>>>> subtype will have AllocateHeap() in its callsite, which >>>>>>>> effectively lowers they callsite refinement by 1. >>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>> >>>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>>> ? 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>> >>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>>>>>> getting inlined already when AllocateHeap was not? Even so >>>>>>>>>>> we still end up with 4 frames matching normally. >>>>>>>>>> I noticed that last night also and scratch my head over it >>>>>>>>>> for a while and then went to bed. 
The only explanation I >>>>>>>>>> could come up with is that allocate_new_entry() is getting >>>>>>>>>> inlined, and as a result (due to being a slowdebug build and >>>>>>>>>> doing minimal inlining) AllocateHeap() was not inlined. >>>>>>>>>>> >>>>>>>>>>>> If it does appear in a product build, a solution should be >>>>>>>>>>>> looked into to get rid of it. If the port owner decides it >>>>>>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>>>>>> should be added to the test like was done for solaris and >>>>>>>>>>>> windows slowdebug builds. >>>>>>>>>>> >>>>>>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>>>>>> inline that function and just happen to be using this test >>>>>>>>>>> to verify that? Doesn't seem like a suitable place to do >>>>>>>>>>> this - and why do we need to do it? The Visual Studio docs >>>>>>>>>>> state: >>>>>>>>>>> >>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>> >>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>>> could change with any update to the compiler. >>>>>>>>>>> >>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>>> specifically -xinline only has an effect at ?xO3 or higher. >>>>>>>>>>> Which likely explains why it is ignored in slowdebug. And >>>>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE. >>>>>>>>>>> >>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>> >>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>> unless you specify the ?always_inline? attribute for the >>>>>>>>>>> function, like this: >>>>>>>>>>> >>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>>>> >>>>>>>>>>> and we don't write it that way. 
>>>>>>>>>>> >>>>>>>>>>> So if we're that concerned about release builds guaranteeing >>>>>>>>>>> to inline AllocateHeap then I think we need something a bit >>>>>>>>>>> more explicit than this test to determine that. >>>>>>>>>> With respect to the 3 methods/functions we don't want to see >>>>>>>>>> in the callsite stacktrace, NMT has made a number of >>>>>>>>>> assumptions on inlining. One of the things the test is doing >>>>>>>>>> is making sure those assumptions are correct. If incorrect, >>>>>>>>>> then you run into issues like I mentioned above where >>>>>>>>>> callsite backtraces effectively only have 3 unique frames >>>>>>>>>> rather than 4 (actually before some bug fixes it was often >>>>>>>>>> just 2 unique frames). So I think it's appropriate to have a >>>>>>>>>> test to make sure we are not seeing any of these 3 >>>>>>>>>> methods/functions. >>>>>>>>> >>>>>>>>> Okay I get the gist of that. Is there somewhere I can clearly >>>>>>>>> see what this inlining assumptions are that NMT makes? Are >>>>>>>>> they clearly documented? >>>>>>>> >>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() and >>>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>>> callsite. Reviewing the bugs I referred to will give you an >>>>>>>> idea of where to look. One good place to look at >>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>>>> there that controls how many frames to skip based on on the >>>>>>>> platform and whether optimized or not. Also some comments there >>>>>>>> to help you out. I did a lot of bug fixing in this method. >>>>>>>> >>>>>>>> Looking at this code also reminds me of a reason to have the >>>>>>>> test continue to check for all 4 specific frames. If the frame >>>>>>>> skipping code skips an extra frame, then the callsite will be >>>>>>>> missing a needed frame at the top. The way the test was written >>>>>>>> it would detect this. 
With your changes it will not. It would >>>>>>>> just revert to always matching on 3 frames instead of 4, and >>>>>>>> the frame skipping bug would go unnoticed. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>>> >>>>>>>>>> Now the test also has made inlining assumptions beyond what >>>>>>>>>> NMT has made, and that is really what this bug is about. In >>>>>>>>>> general I think your fix is fine in the way it relaxes which >>>>>>>>>> frames are actually found, but as Thomas points out, it >>>>>>>>>> suffers from not actually looking at a single stacktrace, but >>>>>>>>>> just looking for the specified frames somewhere in the output >>>>>>>>>> (and in the order specified.) You should probably address this. >>>>>>>>> >>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>> re-think this. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>> which frames appear, I think you need to now also make >>>>>>>>>>>>>> sure the above 3 mentioned frames are not present, except >>>>>>>>>>>>>> for allowing AllocateHeap() in slowdebug builds. 
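The MULTILINE point in the quoted exchange above rests on a detail of java.util.regex: "." does not match a line terminator unless DOTALL is set, and MULTILINE only changes how ^ and $ behave. A pattern built from ".*frame.*\n" pieces therefore matches only consecutive lines. The sketch below assumes a shortened two-frame pattern in the same shape as the ones quoted from the test; the class name, helper and sample output lines are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustrative only; the sample output is not real NMT detail output.
public class ConsecutiveFrameMatchSketch {

    // Two frames in the same ".*name.*\n" shape as the test's patterns.
    static final String TRACE =
        ".*Hashtable.*allocate_new_entry.*\n" +
        ".*ModuleEntryTable.*new_entry.*\n";

    static boolean found(String output, int flags) {
        return Pattern.compile(TRACE, flags).matcher(output).find();
    }

    public static void main(String[] args) {
        String consecutive =
            "[0x1] Hashtable::allocate_new_entry+0x10\n" +
            "[0x2] ModuleEntryTable::new_entry+0x20\n";
        String separated =
            "[0x1] Hashtable::allocate_new_entry+0x10\n" +
            "[0x9] SomeOther::frame+0x05\n" +
            "[0x2] ModuleEntryTable::new_entry+0x20\n";

        // Adjacent lines match under MULTILINE.
        System.out.println(found(consecutive, Pattern.MULTILINE));
        // An intervening line defeats the match: ".*" stops at "\n".
        System.out.println(found(separated, Pattern.MULTILINE));
        // Only DOTALL lets ".*" run across line boundaries.
        System.out.println(found(separated, Pattern.DOTALL));
    }
}
```

Under these assumptions, matching frame by frame anywhere in the output behaves like the DOTALL case, which is the weakness Thomas pointed out.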
>>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to be >>>>>>>>>>>>>>> missing based on empirical observations. So to date we >>>>>>>>>>>>>>> have seen two frames that may or may not be inlined and >>>>>>>>>>>>>>> so we allow for 2 non-matching entries. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it >>>>>>>>>>>>>>> is just an optional frame. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as >>>>>>>>>>>>>>> you intended? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Sun Apr 7 06:06:51 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 7 Apr 2019 16:06:51 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> Message-ID: On 6/04/2019 4:24 pm, Chris Plummer wrote: > On 4/5/19 9:13 PM, David Holmes wrote: >> Hi Chris, >> >> On 6/04/2019 3:09 am, Chris Plummer wrote: >>> Hi David, >>> >>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >> >> ?? That isn't part of any of the existing stacktraces. > See the following comment from Zhengyu in the CR: > > https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 That comment simply includes a fragment of a stack which happens to include JVM_DefineModule and makes no further mention of it. I don't recall anyone saying that we should now be including that frame in the check. Do you want the test extended to also check for that frame? 
>>
>>> Since you've added the following:
>>>
>>>   103         if (!okToHaveAllocateHeap) {
>>>   104             output.shouldNotContain("AllocateHeap");
>>>   105         }
>>
>> I didn't add that - see old code line 80.
> Ok, but my comment below still applies since this check is in place.
>>
>>> You can simplify the following:
>>>
>>>   123         if (okToHaveAllocateHeap) {
>>>   124             expectedStackTrace = stackTraceAllocateHeap;
>>>   125             if (stackTraceMatches(expectedStackTrace, output)) {
>>>   126                 return;
>>>   127             }
>>>   128         } else {
>>>
>>> There is no need for the okToHaveAllocateHeap check here anymore. Just
>>> check all 3 allowed stacktraces until one passes. This is a slight
>>> improvement in flexibility in that it would no longer require the
>>> slowdebug builds to match stackTraceAllocateHeap. They could match
>>> any of the 3. You could then put all 3 allowed stacktraces in an
>>> array and check them in a loop if you wish.
>>
>> The only change I have made (which might be obscured by the structure)
>> is that if stackTraceDefault fails to match I then try
>> stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged.
>>
>> By the same argument you made I think it best to only expect the
>> AllocateHeap stack on those slowdebug platforms, so that we can notice
>> when something changes - again I've made no change in this regard.
> Since line 104 already verified that AllocateHeap does not appear except
> possibly in slowdebug builds, it is harmless to check all builds against
> the stacktrace that includes AllocateHeap.

"Harmless" but a waste of time checking for a stack that we know can't
match. The current version was at your suggestion:

"You would need to check for all 3, limiting the AllocateHeap() one to
just being allowed on solaris and windows slowdebug as it is now."
Checking all three returns to my original version (modulo not removing the check for the AllocateHeap frame, and fixing the matching logic). > Also, if a slowdebug platform > were to change to no longer include AllocateHeap, checking it against > the other two stacktraces would allow the test to continue to pass > without modification. This is counter to your earlier argument that we should be using this test to specifically check for such changes in compiler behaviour and update the platform-specific guards accordingly. If you allow it to go either way then we would never remove the guard even when it was no longer needed on any platform. > For these two reasons I was suggesting just always > check all 3 stacktraces until one passes. It would simplify the logic some. I'd need to change a number of other things to make the main logic simpler (i.e. loop over all three stacks) but the error reporting part will be more awkward. And Thomas already complained about the number of times we scan the entire process output doing this matching, so this would make it worse - unless I completely change the way we do the matching, which then introduces more complexity and more likelihood of introducing new bugs. Let me know how you want to proceed. Thanks, David ----- >> >>> The following is no longer correct: >>> >>> 140 throw new RuntimeException("Expected stack trace >>> missing from output: " + expectedStackTrace); >>> >>> In your current approach, expectedStackTrace is just the last >>> stacktrace we tried. Since we may try more than one, maybe all the >>> ones that failed to match should be listed (or none listed if just >>> too messy). >> >> It reports the last failing stacktrace, out of a possible two. Perhaps >> I can print both ... you want something in the jtr file so that it can >> be triaged without having to go and look up the test code. > Yeah, just pointing out that only printing one stacktrace might lead the > .jtr reader down the wrong path.
> > thanks, > > Chris >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>> >>> On 4/5/19 12:04 AM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> Updated webrev: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>> >>>> Checks for alternate stack now. Added lots of comments and misc fixups. >>>> >>>> Zhengyu: please re-test (I can't test any slowdebug except linux-x64). >>>> >>>> Thanks, >>>> David >>>> >>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>> Thinking about this a bit more, there is still the potential for >>>>> some confusion if this test fails again in the future due to the >>>>> top frame missing. Is it missing because it got inlined or is it >>>>> missing because the frame skipping code skipped an extra frame? >>>>> Hopefully whoever deals with it doesn't just hastily add another >>>>> valid stacktrace to the test but instead investigates to make sure >>>>> the issue is indeed that the method got inlined. >>>>> >>>>> Chris >>>>> >>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Okay I will simply check for the third alternative. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> For the callsite that this test is checking for, right now there >>>>>>> appear to be 3 possible stacktraces: the "normal" one, the one >>>>>>> that includes AllocateHeap() on solaris and windows slowdebug >>>>>>> builds, and the one Zhengyu is now seeing on linux-x64. You would >>>>>>> need to check for all 3, limiting the AllocateHeap() one to just >>>>>>> being allowed on solaris and windows slowdebug as it is now. So >>>>>>> basically this test needs to cover all (allowable) stacktraces >>>>>>> that we've seen for this callsite, and be updated in the future >>>>>>> as needed. Not ideal, but I don't see a better solution. It's >>>>>>> similar to the situation described in JDK-8163899 which covered >>>>>>> the fragility of the NMT frame skipping code. 
In the end it was >>>>>>> decided it would be easier to just fix issues as they came >>>>>>> up rather than engineer a solution that wasn't as fragile. I >>>>>>> think this test falls in the same category. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>>> comment: >>>>>>>> >>>>>>>> > Looking at this code also reminds me of a reason to have the test >>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>> skipping code >>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>> needed frame >>>>>>>> > at the top. The way the test was written it would detect this. >>>>>>>> With your >>>>>>>> > changes it will not. It would just revert to always matching >>>>>>>> on 3 frames >>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>> >>>>>>>> How can I fix this bug if I have to check for 4 specific frames >>>>>>>> but one (or more) may be missing - i.e. how can I tell the >>>>>>>> difference between "Frame A was inlined" and "Frame A was skipped >>>>>>>> by mistake" ?? 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>> JDK-8133740. These bugs result in one or two frames >>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 73 // We should never see either of these frames >>>>>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>>>>> 74 >>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed it. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the stack >>>>>>>>>>>>>>> trace, but the test has specifically allowed for it for >>>>>>>>>>>>>>> windows and solaris slowdebug builds. Although these >>>>>>>>>>>>>>> builds should have honored the ALWAYSINLINE directive, it >>>>>>>>>>>>>>> was deemed acceptable that it was not in slowdebug >>>>>>>>>>>>>>> builds. However, I would not want to allow AllocateHeap() >>>>>>>>>>>>>>> to appear in a product build, and best not to see it in >>>>>>>>>>>>>>> fastdebug either. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a given >>>>>>>>>>>>>> compiler chooses to inline something like AllocateHeap. I >>>>>>>>>>>>>> don't think it is the job of this test to be checking for >>>>>>>>>>>>>> something specific to the native compiler. The previous >>>>>>>>>>>>>> handling of AllocateHeap seemed to be there simply because >>>>>>>>>>>>>> it was the only way to deal with an optional frame - but >>>>>>>>>>>>>> now that's handled generically. >>>>>>>>>>>>> Its appearance means you effectively only have 3 frames to >>>>>>>>>>>>> identify callsites instead of 4. >>>>>>>>>>>> >>>>>>>>>>>> Both stacktraces in the old test had 4 elements and expected >>>>>>>>>>>> 4 matches. The current bug is that one of those (new_entry) >>>>>>>>>>>> could actually be inlined as well, resulting in only 3 >>>>>>>>>>>> matches. So that is what the revised test checks for: at >>>>>>>>>>>> least 3 matches. Often there will be 4 matches. >>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't >>>>>>>>>>> get inlined, it effectively is using 3. The test should >>>>>>>>>>> detect when this happens so the NMT implementation can >>>>>>>>>>> address the issue. >>>>>>>>>> >>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>> >>>>>>>>> An NMT callsite is simply the 4 most recent frames (after some >>>>>>>>> pruning) that led to the os::malloc() call. "4" is somewhat >>>>>>>>> arbitrary as Thomas pointed out, and is controlled by >>>>>>>>> NMT_TrackingStackDepth. Making NMT_TrackingStackDepth bigger >>>>>>>>> means more refinement of the callsites (thus more callsites), >>>>>>>>> but a clearer picture of what actually led to the os::malloc().
>>>>>>>>> >>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have a() >>>>>>>>> calls b() calls c() calls d() calls os::malloc(), and foo() and >>>>>>>>> bar() both call a(), the NMT detail output will not distinguish >>>>>>>>> between these two call paths to os::malloc(), and will consider >>>>>>>>> both paths to be the same callsite. The 4 frames in the NMT >>>>>>>>> detail output would always be a, b, c, and d. However, bump up >>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as two >>>>>>>>> separate callsites, one with foo() as the bottom frame and one >>>>>>>>> with bar() as the bottom frame, and both with a, b, c, and d as >>>>>>>>> the other 4 frames. >>>>>>>>> >>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>> allocation that is the result of doing a "new" of any CHeapObj >>>>>>>>> subtype will have AllocateHeap() in its callsite, which >>>>>>>>> effectively lowers the callsite refinement by 1. >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>> >>>>>>>>>>>> 50 public static String stackTraceAllocateHeap = >>>>>>>>>>>> 51 ".*AllocateHeap.*\n" + >>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>> >>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was it >>>>>>>>>>>> getting inlined already when AllocateHeap was not? Even so >>>>>>>>>>>> we still end up with 4 frames matching normally. >>>>>>>>>>> I noticed that last night also and scratched my head over it >>>>>>>>>>> for a while and then went to bed. The only explanation I >>>>>>>>>>> could come up with is that allocate_new_entry() is getting >>>>>>>>>>> inlined, and as a result (due to being a slowdebug build and >>>>>>>>>>> doing minimal inlining) AllocateHeap() was not inlined. >>>>>>>>>>>> >>>>>>>>>>>>> If it does appear in a product build, a solution should be >>>>>>>>>>>>> looked into to get rid of it.
If the port owner decides it >>>>>>>>>>>>> can't get rid of it (or is unwilling to), then an exception >>>>>>>>>>>>> should be added to the test as was done for solaris and >>>>>>>>>>>>> windows slowdebug builds. >>>>>>>>>>>> >>>>>>>>>>>> Are we specifically trying to test the compiler's ability to >>>>>>>>>>>> inline that function and just happen to be using this test >>>>>>>>>>>> to verify that? Doesn't seem like a suitable place to do >>>>>>>>>>>> this - and why do we need to do it? The Visual Studio docs >>>>>>>>>>>> state: >>>>>>>>>>>> >>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>> >>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>>>> could change with any update to the compiler. >>>>>>>>>>>> >>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>>>> specifically -xinline only has an effect at -xO3 or higher. >>>>>>>>>>>> Which likely explains why it is ignored in slowdebug. And >>>>>>>>>>>> there are other cases where it won't honour the ALWAYSINLINE. >>>>>>>>>>>> >>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>> >>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>> unless you specify the 'always_inline' attribute for the >>>>>>>>>>>> function, like this: >>>>>>>>>>>> >>>>>>>>>>>> /* Prototype. */ >>>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>>>>> >>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>> >>>>>>>>>>>> So if we're that concerned about release builds guaranteeing >>>>>>>>>>>> to inline AllocateHeap then I think we need something a bit >>>>>>>>>>>> more explicit than this test to determine that.
>>>>>>>>>>> With respect to the 3 methods/functions we don't want to see >>>>>>>>>>> in the callsite stacktrace, NMT has made a number of >>>>>>>>>>> assumptions on inlining. One of the things the test is doing >>>>>>>>>>> is making sure those assumptions are correct. If incorrect, >>>>>>>>>>> then you run into issues like I mentioned above where >>>>>>>>>>> callsite backtraces effectively only have 3 unique frames >>>>>>>>>>> rather than 4 (actually before some bug fixes it was often >>>>>>>>>>> just 2 unique frames). So I think it's appropriate to have a >>>>>>>>>>> test to make sure we are not seeing any of these 3 >>>>>>>>>>> methods/functions. >>>>>>>>>> >>>>>>>>>> Okay I get the gist of that. Is there somewhere I can clearly >>>>>>>>>> see what these inlining assumptions are that NMT makes? Are >>>>>>>>>> they clearly documented? >>>>>>>>> >>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() or >>>>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>>>> callsite. Reviewing the bugs I referred to will give you an >>>>>>>>> idea of where to look. One good place to look is >>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>>>>> there that controls how many frames to skip based on the >>>>>>>>> platform and whether optimized or not. Also some comments there >>>>>>>>> to help you out. I did a lot of bug fixing in this method. >>>>>>>>> >>>>>>>>> Looking at this code also reminds me of a reason to have the >>>>>>>>> test continue to check for all 4 specific frames. If the frame >>>>>>>>> skipping code skips an extra frame, then the callsite will be >>>>>>>>> missing a needed frame at the top. The way the test was written >>>>>>>>> it would detect this. With your changes it will not. It would >>>>>>>>> just revert to always matching on 3 frames instead of 4, and >>>>>>>>> the frame skipping bug would go unnoticed.
>>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Now the test also has made inlining assumptions beyond what >>>>>>>>>>> NMT has made, and that is really what this bug is about. In >>>>>>>>>>> general I think your fix is fine in the way it relaxes which >>>>>>>>>>> frames are actually found, but as Thomas points out, it >>>>>>>>>>> suffers from not actually looking at a single stacktrace, but >>>>>>>>>>> just looking for the specified frames somewhere in the output >>>>>>>>>>> (and in the order specified.) You should probably address this. >>>>>>>>>> >>>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>> re-think this. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Given the changes you made to allow more flexibility in >>>>>>>>>>>>>>> which frames appear, I think you need to now also make >>>>>>>>>>>>>>> sure the 3 frames mentioned above are not present, except >>>>>>>>>>>>>>> for allowing AllocateHeap() in slowdebug builds. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>> compiler, and by the type of build.
So we define an >>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to be >>>>>>>>>>>>>>>> missing based on empirical observations. So to date we >>>>>>>>>>>>>>>> have seen two frames that may or may not be inlined and >>>>>>>>>>>>>>>> so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now it >>>>>>>>>>>>>>>> is just an optional frame. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as >>>>>>>>>>>>>>>> you intended? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> > > From chris.plummer at oracle.com Sun Apr 7 06:51:56 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Sat, 6 Apr 2019 23:51:56 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> Message-ID: <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> Hi David, On 4/6/19 11:06 PM, David Holmes wrote: > On 6/04/2019 4:24 pm, Chris Plummer wrote: >> On 4/5/19 9:13 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 6/04/2019 3:09 am, Chris 
Plummer wrote: >>>> Hi David, >>>> >>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>> >>> ?? That isn't part of any of the existing stacktraces. >> See the following comment from Zhengyu in the CR: >> >> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 > > > That comment simply includes a fragment of a stack which happens to > include JVM_DefineModule and makes no further mention of it. I don't > recall anyone saying that we should now be including that frame in the > check. > > Do you want the test extended to also check for that frame? Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule is the additional frame that now appears in the detail output for that call chain. So yes, the test should include it. If the inlining of ModuleEntryTable::new_entry() had always happened, then the test would originally have checked for the stacktrace as it appears in the CR comment. > >>> >>>> Since you've added the following: >>>> >>>> 103 if (!okToHaveAllocateHeap) { >>>> 104 output.shouldNotContain("AllocateHeap"); >>>> 105 } >>> >>> I didn't add that - see old code line 80. >> Ok, but my comment below still applies since this check is in place. >>> >>>> You can simplify the following: >>>> >>>> 123 if (okToHaveAllocateHeap) { >>>> 124 expectedStackTrace = stackTraceAllocateHeap; >>>> 125 if (stackTraceMatches(expectedStackTrace, output)) { >>>> 126 return; >>>> 127 } >>>> 128 } else { >>>> >>>> There is no need for the okToHaveAllocateHeap check here anymore. >>>> Just check all 3 allowed stacktraces until one passes. This is a >>>> slight improvement in flexibility in that it would no longer >>>> require the slowdebug builds to match stackTraceAllocateHeap. They >>>> could match any of the 3.
You could then put all 3 allowed >>>> stacktraces in an array and check them in a loop if you wish. >>> >>> The only change I have made (which might be obscured by the >>> structure) is that if stackTraceDefault fails to match I then try >>> stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged. >>> >>> By the same argument you made I think it best to only expect the >>> AllocateHeap stack on those slowdebug platforms, so that we can >>> notice when something changes - again I've made no change in this >>> regard. >> Since line 104 already verified that AllocateHeap does not appear >> except possibly in slowdebug builds, it is harmless to check all >> builds against the stacktrace that includes AllocateHeap. > > "Harmless" but a waste of time checking for a stack that we know can't > match. The current version was at your suggestion: > > "You would need to check for all 3, limiting the AllocateHeap() one to > just being allowed on solaris and windows slowdebug as it is now." That was before I realized there was already an explicit check for AllocateHeap() to not be allowed except in slowdebug builds. Once I realized that, it occurred to me that checking for all 3 stacktraces in a loop would simplify the logic. > > Checking all three returns to my original version (modulo not removing > the check for the AllocateHeap frame, and fixing the matching logic). Your original version checked for a large number of permutations that included any 3 of 5 specified frames, rather than checking for any of 3 specific stacktraces (of 4 frames each). > >> Also, if a slowdebug platform were to change to no longer include >> AllocateHeap, checking it against the other two stacktraces would >> allow the test to continue to pass without modification. > > This is counter to your earlier argument that we should be using this > test to specifically check for such changes in compiler behaviour and > update the platform-specific guards accordingly.
If you allow it to go > either way then we would never remove the guard even when it was no > longer needed on any platform. But this is one compiler inlining behavior change that is ok. If AllocateHeap() suddenly starts being inlined by slowdebug builds, that is actually a good thing, and we would end up modifying the test to allow it. So why not allow it now? > >> For these two reasons I was suggesting just always check all 3 >> stacktraces until one passes. It would simplify the logic some. > > I'd need to change a number of other things to make the main logic > simpler (i.e. loop over all three stacks) but the error reporting part > will be more awkward. And Thomas already complained about the number > of times we scan the entire process output doing this matching, so > this would make it worse - unless I completely change the way we do > the matching, which then introduces more complexity and more > likelihood of introducing new bugs. > > Let me know how you want to proceed. The loop idea was just to make the code simpler. If you feel it will slow things down unacceptably, then I'm fine with the logic as-is in v2, but you need to add JVM_DefineModule to the new stacktrace. thanks, Chris > > Thanks, > David > ----- > >>> >>>> The following is no longer correct: >>>> >>>> 140 throw new RuntimeException("Expected stack trace >>>> missing from output: " + expectedStackTrace); >>>> >>>> In your current approach, expectedStackTrace is just the last >>>> stacktrace we tried. Since we may try more than one, maybe all the >>>> ones that failed to match should be listed (or none listed if just >>>> too messy). >>> >>> It reports the last failing stacktrace, out of a possible two. >>> Perhaps I can print both ... you want something in the jtr file so >>> that it can be triaged without having to go and look up the test code. >> Yeah, just pointing out that only printing one stacktrace might lead >> the .jtr reader down the wrong path.
>> >> thanks, >> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> Updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>> >>>>> Checks for alternate stack now. Added lots of comments and misc >>>>> fixups. >>>>> >>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>> linux-x64). >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>> Thinking about this a bit more, there is still the potential for >>>>>> some confusion if this test fails again in the future due to the >>>>>> top frame missing. Is it missing because it got inlined or is it >>>>>> missing because the frame skipping code skipped an extra frame? >>>>>> Hopefully whoever deals with it doesn't just hastily add another >>>>>> valid stacktrace to the test but instead investigates to make >>>>>> sure the issue is indeed that the method got inlined. >>>>>> >>>>>> Chris >>>>>> >>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Okay I will simply check for the third alternative. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> For the callsite that this test is checking for, right now >>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>> AllocateHeap() one to just being allowed on solaris and windows >>>>>>>> slowdebug as it is now. So basically this test needs to cover >>>>>>>> all (allowable) stacktraces that we've seen for this callsite, >>>>>>>> and be updated in the future as needed. Not ideal, but I don't >>>>>>>> see a better solution. 
It's similar to the situation described >>>>>>>> in JDK-8163899 which covered the fragility of the NMT frame >>>>>>>> skipping code. In the end it was decided it would be easier to >>>>>>>> just fix issues as they came up rather than engineer a >>>>>>>> solution that wasn't as fragile. I think this test falls in the >>>>>>>> same category. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>>>> comment: >>>>>>>>> >>>>>>>>> > Looking at this code also reminds me of a reason to have the >>>>>>>>> test >>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>> skipping code >>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>> needed frame >>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>> this. With your >>>>>>>>> > changes it will not. It would just revert to always matching >>>>>>>>> on 3 frames >>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>> >>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>> frames but one (or more) may be missing - i.e. how can I tell >>>>>>>>> the difference between "Frame A was inlined" and "Frame A was >>>>>>>>> skipped by mistake" ?? 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>>> JDK-8133740. These bugs result in one or two frames >>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 73 // We should never see either of these frames >>>>>>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed >>>>>>>>>>>>>> it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed for >>>>>>>>>>>>>>>> it for windows and solaris slowdebug builds. Although >>>>>>>>>>>>>>>> these builds should have honored the ALWAYSINLINE >>>>>>>>>>>>>>>> directive, it was deemed acceptable that it was not in >>>>>>>>>>>>>>>> slowdebug builds.
However, I would not want to allow >>>>>>>>>>>>>>>> AllocateHeap() to appear in a product build, and best >>>>>>>>>>>>>>>> not to see it in fastdebug either. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test >>>>>>>>>>>>>>> to be checking for something specific to the native >>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed >>>>>>>>>>>>>>> to be there simply because it was the only way to deal >>>>>>>>>>>>>>> with an optional frame - but now that's handled >>>>>>>>>>>>>>> generically. >>>>>>>>>>>>>> Its appearance means you effectively only have 3 frames >>>>>>>>>>>>>> to identify callsites instead of 4. >>>>>>>>>>>>> >>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>> expected 4 matches. The current bug is that one of those >>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting >>>>>>>>>>>>> in only 3 matches. So that is what the revised test checks >>>>>>>>>>>>> for: at least 3 matches. Often there will be 4 matches. >>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't >>>>>>>>>>>> get inlined, it effectively is using 3. The test should >>>>>>>>>>>> detect when this happens so the NMT implementation can >>>>>>>>>>>> address the issue. >>>>>>>>>>> >>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>> >>>>>>>>>> An NMT callsite is simply the 4 most recent frames (after >>>>>>>>>> some pruning) that led to the os::malloc() call. "4" is >>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled >>>>>>>>>> by NMT_TrackingStackDepth.
Making NMT_TrackingStackDepth >>>>>>>>>> bigger means more refinement of the callsites (thus more >>>>>>>>>> callsites), but a clearer picture of what actually led to the >>>>>>>>>> os::malloc(). >>>>>>>>>> >>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>> a() calls b() calls c() calls d() calls os::malloc(), and >>>>>>>>>> foo() and bar() both call a(), the NMT detail output will not >>>>>>>>>> distinguish between these two call paths to os::malloc(), and >>>>>>>>>> will consider both paths to be the same callsite. The 4 >>>>>>>>>> frames in the NMT detail output would always be a, b, c, and >>>>>>>>>> d. However, bump up NMT_TrackingStackDepth to 5 and now NMT >>>>>>>>>> will treat them as two separate callsites, one with foo() as >>>>>>>>>> the bottom frame and one with bar() as the bottom frame, and >>>>>>>>>> both with a, b, c, and d as the other 4 frames. >>>>>>>>>> >>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>>> allocation that is the result of doing a "new" of any >>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite, >>>>>>>>>> which effectively lowers the callsite refinement by 1. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>> >>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap = >>>>>>>>>>>>> 51 ".*AllocateHeap.*\n" + >>>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>> >>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was >>>>>>>>>>>>> it getting inlined already when AllocateHeap was not? Even >>>>>>>>>>>>> so we still end up with 4 frames matching normally. >>>>>>>>>>>> I noticed that last night also and scratched my head over it >>>>>>>>>>>> for a while and then went to bed. 
The only explanation I >>>>>>>>>>>> could come up with is that allocate_new_entry() is getting >>>>>>>>>>>> inlined, and as a result (due to being a slowdebug build >>>>>>>>>>>> and doing minimal inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>> >>>>>>>>>>>>>> If it does appear in a product build, a solution should >>>>>>>>>>>>>> be looked into to get rid of it. If the port owner >>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to), then >>>>>>>>>>>>>> an exception should be added to the test like was done >>>>>>>>>>>>>> for solaris and windows slowdebug builds. >>>>>>>>>>>>> >>>>>>>>>>>>> Are we specifically trying to test the compiler's ability >>>>>>>>>>>>> to inline that function and just happen to be using this >>>>>>>>>>>>> test to verify that? Doesn't seem like a suitable place to >>>>>>>>>>>>> do this - and why do we need to do it? The Visual Studio >>>>>>>>>>>>> docs state: >>>>>>>>>>>>> >>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>> >>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>>>>> could change with any update to the compiler. >>>>>>>>>>>>> >>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>>>>> specifically -xinline only has an effect at -xO3 or >>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>> slowdebug. And there are other cases where it won't honour >>>>>>>>>>>>> the ALWAYSINLINE. >>>>>>>>>>>>> >>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>>> >>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>> unless you specify the 'always_inline' attribute for the >>>>>>>>>>>>> function, like this: >>>>>>>>>>>>> >>>>>>>>>>>>> /* Prototype. 
*/ >>>>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>>>>>> >>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>> >>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need >>>>>>>>>>>>> something a bit more explicit than this test to determine >>>>>>>>>>>>> that. >>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>> incorrect, then you run into issues like I mentioned above >>>>>>>>>>>> where callsite backtraces effectively only have 3 unique >>>>>>>>>>>> frames rather than 4 (actually before some bug fixes it was >>>>>>>>>>>> often just 2 unique frames). So I think it's appropriate to >>>>>>>>>>>> have a test to make sure we are not seeing any of these 3 >>>>>>>>>>>> methods/functions. >>>>>>>>>>> >>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>> clearly see what these inlining assumptions are that NMT >>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>> >>>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() >>>>>>>>>> and os::get_native_stack() (and sometimes both) being in the >>>>>>>>>> callsite. Reviewing the bugs I referred to will give you an >>>>>>>>>> idea of where to look. One good place to look is >>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>>>>>> there that controls how many frames to skip based on the >>>>>>>>>> platform and whether optimized or not. Also some comments >>>>>>>>>> there to help you out. I did a lot of bug fixing in this method. 
>>>>>>>>>> >>>>>>>>>> Looking at this code also reminds me of a reason to have the >>>>>>>>>> test continue to check for all 4 specific frames. If the >>>>>>>>>> frame skipping code skips an extra frame, then the callsite >>>>>>>>>> will be missing a needed frame at the top. The way the test >>>>>>>>>> was written it would detect this. With your changes it will >>>>>>>>>> not. It would just revert to always matching on 3 frames >>>>>>>>>> instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Now the test also has made inlining assumptions beyond what >>>>>>>>>>>> NMT has made, and that is really what this bug is about. In >>>>>>>>>>>> general I think your fix is fine in the way it relaxes >>>>>>>>>>>> which frames are actually found, but as Thomas points out, >>>>>>>>>>>> it suffers from not actually looking at a single >>>>>>>>>>>> stacktrace, but just looking for the specified frames >>>>>>>>>>>> somewhere in the output (and in the order specified.) You >>>>>>>>>>>> should probably address this. >>>>>>>>>>> >>>>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>>> re-think this. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Given the changes you made to allow more flexibility in >>>>>>>>>>>>>>>> which frames appear, I think you need to now also make >>>>>>>>>>>>>>>> sure the above 3 mentioned frames are not present, >>>>>>>>>>>>>>>> except for allowing AllocateHeap() in slowdebug builds. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to >>>>>>>>>>>>>>>>> be missing based on empirical observations. So to date >>>>>>>>>>>>>>>>> we have seen two frames that may or may not be inlined >>>>>>>>>>>>>>>>> and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now >>>>>>>>>>>>>>>>> it is just an optional frame. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as >>>>>>>>>>>>>>>>> you intended? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Sun Apr 7 07:10:52 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 7 Apr 2019 17:10:52 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> Message-ID: <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> Hi Chris, On 7/04/2019 4:51 pm, Chris Plummer wrote: > Hi David, > > On 4/6/19 11:06 PM, David Holmes wrote: >> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>> On 4/5/19 9:13 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>> >>>> ?? That isn't part of any of the existing stacktraces. 
>>> See the following comment from Zhengyu in the CR: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >> >> >> >> That comment simply includes a fragment of a stack which happens to >> include JVM_DefineModule and makes no further mention of it. I don't >> recall anyone saying that we should now be including that frame in the >> check. >> >> Do you want the test extended to also check for that frame? > Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule is > the additional frame that now appears in the detail output for that call > chain. So yes, the test should include it. If the inlining of > ModuleEntryTable::new_entry() had always happened, then the test would > originally have checked for the stacktrace as it appears in the CR comment. I see - to be clear you want to always check for 4 frames, so the additional frame is only checked for the alternate stack. >>>> >>>>> Since you've added the following: >>>>> >>>>> 103 if (!okToHaveAllocateHeap) { >>>>> 104 output.shouldNotContain("AllocateHeap"); >>>>> 105 } >>>> >>>> I didn't add that - see old code line 80. >>> Ok, but my comment below still applies since this check is in place. >>>> >>>>> You can simplify the following: >>>>> >>>>> 123 if (okToHaveAllocateHeap) { >>>>> 124 expectedStackTrace = stackTraceAllocateHeap; >>>>> 125 if (stackTraceMatches(expectedStackTrace, output)) { >>>>> 126 return; >>>>> 127 } >>>>> 128 } else { >>>>> >>>>> There is no need for the okToHaveAllocateHeap check here anymore. >>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>> slight improvement in flexibility in that it would no longer >>>>> require the slowdebug builds to match stackTraceAllocateHeap. They >>>>> could match any of the 3. 
You could then put all 3 allowed >>>>> stacktraces in an array and check them in a loop if you wish. >>>> >>>> The only change I have made (which might be obscured by the >>>> structure) is that if stackTraceDefault fails to match I then try >>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is unchanged. >>>> >>>> By the same argument you made I think it best to only expect the >>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>> notice when something changes - again I've made no change in this >>>> regard. >>> Since line 104 already verified that AllocateHeap does not appear >>> except possibly in slowdebug builds, it is harmless to check all >>> builds against the stacktrace that includes AllocateHeap. >> >> "Harmless" but a waste of time checking for a stack that we know can't >> match. The current version was at your suggestion: >> >> "You would need to check for all 3, limiting the AllocateHeap() one to >> just being allowed on solaris and windows slowdebug as it is now." > That was before I realized there was already an explicit check for > AllocateHeap() to not be allowed except for slowdebug ones. Once I > realized that, it occurred to me that checking for all 3 stacktraces in > a loop would simplify the logic. >> >> Checking all three returns to my original version (modulo not removing >> the check for the AllocateHeap frame, and fixing the matching logic). > Your original version checked for a large number of permutations that > included any 3 of 5 specified frames, not checks for any of 3 specific > stacktraces (of 4 frames each). That was never the intent and what I was referring to when I said "and fixing the matching logic". >> >>> Also, if a slowdebug platform were to change to no longer include >>> AllocateHeap, checking it against the other two stacktraces would >>> allow the test to continue to pass without modification. 
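As a rough illustration of the "check all allowed stacktraces in a loop" idea discussed above, here is a minimal Java sketch. The class name, the helper names and the frameA..frameE patterns are hypothetical placeholders, not the contents of CheckForProperDetailStackTrace.java, and the real test's stacks and matching logic may well differ. The sketch also lists every pattern that was tried, so a failure message does not point the reader at only one of them.

```java
import java.util.List;
import java.util.regex.Pattern;

public class AllowedStackTraces {

    /** Returns the index of the first allowed trace found in the output, or -1 if none matches. */
    static int firstMatch(List<String> allowedTraces, String output) {
        for (int i = 0; i < allowedTraces.size(); i++) {
            // By default '.' does not match a line terminator, so a pattern of the
            // form ".*x.*\n.*y.*\n" only matches frames on consecutive lines.
            if (Pattern.compile(allowedTraces.get(i)).matcher(output).find()) {
                return i;
            }
        }
        return -1;
    }

    /** Throws, listing every trace that was tried, if no allowed trace matches. */
    static void check(List<String> allowedTraces, String output) {
        if (firstMatch(allowedTraces, output) < 0) {
            throw new RuntimeException("Expected stack trace missing from output; tried:\n"
                    + String.join("---\n", allowedTraces));
        }
    }

    public static void main(String[] args) {
        // Hypothetical frame names, for illustration only.
        List<String> allowed = List.of(
                ".*frameA.*\n.*frameB.*\n.*frameC.*\n.*frameD.*\n",   // "ideal" trace
                ".*frameA.*\n.*frameC.*\n.*frameD.*\n.*frameE.*\n");  // frameB inlined
        String output = "[0x1] frameA+0x10\n[0x2] frameC+0x4\n[0x3] frameD+0x8\n[0x4] frameE+0x2\n";
        check(allowed, output);
        System.out.println("matched allowed trace #" + firstMatch(allowed, output));
    }
}
```

Each pattern is scanned against the whole output in turn, so this keeps the cost concern raised in the thread: with N allowed traces the output is searched up to N times.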
>> >> This is counter to your earlier argument that we should be using this >> test to specifically check for such changes in compiler behaviour and >> update the platform specific guards accordingly. If you allow it to go >> either way then we would never remove the guard even when it was no >> longer needed on any platform. > But this is one compiler inlining behavior change that is ok. If > AllocateHeap() suddenly starts being inlined by slowdebug builds, that > is actually a good thing, and we would end up modifying the test to > allow it. So why not allow it now? >> >>> For these two reasons I was suggesting just always check all 3 >>> stacktraces until one passes. It would simplify the logic some. >> >> I'd need to change a number of other things to make the main logic >> simpler (ie loop over all three stacks) but the error reporting part >> will be more awkward. And Thomas already complained about the number >> of times we scan the entire process output doing this matching, so >> this would make it worse - unless I completely change the way we do >> the matching, which then introduces more complexity and more >> likelihood of introducing new bugs. >> >> Let me know how you want to proceed. > > The loop idea was just to make the code simpler. If you feel it will > slow things down unacceptably, then I'm fine with the logic as-is in v2, > but you need to add JVM_DefineModule to the new stacktrace. Okay I intend to add the missing 4th frame, and print both potential stacks on failure, but otherwise leave at V2. Thanks, David ----- > thanks, > > Chris > >> >> Thanks, >> David >> ----- >> >>>> >>>>> The following is no longer correct: >>>>> >>>>> 140 throw new RuntimeException("Expected stack trace >>>>> missing from output: " + expectedStackTrace); >>>>> >>>>> In your current approach, expectedStackTrace is just the last >>>>> stacktrace we tried. 
Since we may try more than one, maybe all the >>>>> ones that failed to match should be listed (or none listed if just >>>>> too messy). >>>> >>>> It reports the last failing stacktrace, out of a possible two. >>>> Perhaps I can print both ... you want something in the jtr file so >>>> that it can be triaged without having to go and look up the test code. >>> Yeah, just pointing out that only printing one stacktrace might lead >>> the .jtr reader down the wrong path. >>> >>> thanks, >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>> >>>>>> Checks for alternate stack now. Added lots of comments and misc >>>>>> fixups. >>>>>> >>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>> linux-x64). >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>> Thinking about this a bit more, there is still the potential for >>>>>>> some confusion if this test fails again in the future due to the >>>>>>> top frame missing. Is it missing because it got inlined or is it >>>>>>> missing because the frame skipping code skipped an extra frame? >>>>>>> Hopefully whoever deals with it doesn't just hastily add another >>>>>>> valid stacktrace to the test but instead investigates to make >>>>>>> sure the issue is indeed that the method got inlined. >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Okay I will simply check for the third alternative. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>> AllocateHeap() one to just being allowed on solaris and windows >>>>>>>>> slowdebug as it is now. So basically this test needs to cover >>>>>>>>> all (allowable) stacktraces that we've seen for this callsite, >>>>>>>>> and be updated in the future as needed. Not ideal, but I don't >>>>>>>>> see a better solution. It's similar to the situation described >>>>>>>>> in JDK-8163899 which covered the fragility of the NMT frame >>>>>>>>> skipping code. In the end it was decided it would be easier to >>>>>>>>> just fix issues as they came up rather than engineer a >>>>>>>>> solution that wasn't as fragile. I think this test falls in the >>>>>>>>> same category. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>>>>> comment: >>>>>>>>>> >>>>>>>>>> > Looking at this code also reminds me of a reason to have the >>>>>>>>>> test >>>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>>> skipping code >>>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>>> needed frame >>>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>>> this. With your >>>>>>>>>> > changes it will not. 
It would just revert to always matching >>>>>>>>>> on 3 frames >>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>> >>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>> frames but one (or more) may be missing - i.e. how can I tell >>>>>>>>>> the difference between "Frame A was inlined" and "Frame A was >>>>>>>>>> skipped by mistake" ?? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>>>> JDK-8133740. These bugs result in one or two frames >>>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 73 // We should never see either of these frames >>>>>>>>>>>>>>>> because they are supposed to be skipped. */ >>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but missed >>>>>>>>>>>>>>> it. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed for >>>>>>>>>>>>>>>>> it for windows and solaris slowdebug builds. Although >>>>>>>>>>>>>>>>> these builds should have honored the ALWAYSINLINE >>>>>>>>>>>>>>>>> directive, it was deemed acceptable that it was not in >>>>>>>>>>>>>>>>> slowdebug builds. However, I would not want to allow >>>>>>>>>>>>>>>>> AllocateHeap() to appear in a product build, and best >>>>>>>>>>>>>>>>> not to see it in fastdebug either. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test >>>>>>>>>>>>>>>> to be checking for something specific to the native >>>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed >>>>>>>>>>>>>>>> to be there simply because it was the only way to deal >>>>>>>>>>>>>>>> with an optional frame - but now that's handled >>>>>>>>>>>>>>>> generically. >>>>>>>>>>>>>>> Its appearance means you effectively only have 3 frames >>>>>>>>>>>>>>> to identify callsites instead of 4. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>> expected 4 matches. The current bug is that one of those >>>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting >>>>>>>>>>>>>> in only 3 matches. So that is what the revised test checks >>>>>>>>>>>>>> for: at least 3 matches. Often there will be 4 matches. >>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't >>>>>>>>>>>>> get inlined, it effectively is using 3. The test should >>>>>>>>>>>>> detect when this happens so the NMT implementation can >>>>>>>>>>>>> address the issue. 
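To make the "wants to use 4, but effectively is using 3" point concrete, here is a small Java sketch. It is not NMT code: it simply assumes a callsite is keyed by the DEPTH most recent frames, with DEPTH standing in for NMT_TrackingStackDepth, and every frame name other than AllocateHeap is made up for illustration.

```java
import java.util.List;

public class CallsiteDepthSketch {
    // Stand-in for NMT_TrackingStackDepth (assumption: callsite = DEPTH most recent frames).
    static final int DEPTH = 4;

    /** Returns the callsite key: the DEPTH most recent frames, most recent first. */
    static List<String> callsite(List<String> frames) {
        return frames.subList(0, Math.min(DEPTH, frames.size()));
    }

    public static void main(String[] args) {
        // AllocateHeap inlined: the fourth frame still tells the two callers apart.
        List<String> inlinedX = List.of("d", "c", "b", "callerX", "main");
        List<String> inlinedY = List.of("d", "c", "b", "callerY", "main");
        System.out.println(callsite(inlinedX).equals(callsite(inlinedY))); // false

        // AllocateHeap not inlined: it uses up one slot, so both paths share a key.
        List<String> keptX = List.of("AllocateHeap", "d", "c", "b", "callerX", "main");
        List<String> keptY = List.of("AllocateHeap", "d", "c", "b", "callerY", "main");
        System.out.println(callsite(keptX).equals(callsite(keptY))); // true
    }
}
```

In this toy model the extra frame carries no information, since it is the same on every path, which is why its presence behaves like reducing the tracking depth by one.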
>>>>>>>>>>>> >>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>>> >>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (after >>>>>>>>>>> some pruning) that led to the os::malloc() call. "4" is >>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled >>>>>>>>>>> by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth >>>>>>>>>>> bigger means more refinement of the callsites (thus more >>>>>>>>>>> callsites), but a clearer picture of what actually led to the >>>>>>>>>>> os::malloc(). >>>>>>>>>>> >>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>> a() calls b() calls c() calls d() calls os::malloc(), and >>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will not >>>>>>>>>>> distinguish between these two call paths to os::malloc(), and >>>>>>>>>>> will consider both paths to be the same callsite. The 4 >>>>>>>>>>> frames in the NMT detail output would always be a, b, c, and >>>>>>>>>>> d. However, bump up NMT_TrackingStackDepth to 5 and now NMT >>>>>>>>>>> will treat them as two separate callsites, one with foo() as >>>>>>>>>>> the bottom frame and one with bar() as the bottom frame, and >>>>>>>>>>> both with a, b, c, and d as the other 4 frames. >>>>>>>>>>> >>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>>>> allocation that is the result of doing a "new" of any >>>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite, >>>>>>>>>>> which effectively lowers the callsite refinement by 1. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap = >>>>>>>>>>>>>> 51 ".*AllocateHeap.*\n" + >>>>>>>>>>>>>> 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>> >>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? 
Was >>>>>>>>>>>>>> it getting inlined already when AllocateHeap was not? Even >>>>>>>>>>>>>> so we still end up with 4 frames matching normally. >>>>>>>>>>>>> I noticed that last night also and scratched my head over it >>>>>>>>>>>>> for a while and then went to bed. The only explanation I >>>>>>>>>>>>> could come up with is that allocate_new_entry() is getting >>>>>>>>>>>>> inlined, and as a result (due to being a slowdebug build >>>>>>>>>>>>> and doing minimal inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> If it does appear in a product build, a solution should >>>>>>>>>>>>>>> be looked into to get rid of it. If the port owner >>>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to), then >>>>>>>>>>>>>>> an exception should be added to the test like was done >>>>>>>>>>>>>>> for solaris and windows slowdebug builds. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Are we specifically trying to test the compiler's ability >>>>>>>>>>>>>> to inline that function and just happen to be using this >>>>>>>>>>>>>> test to verify that? Doesn't seem like a suitable place to >>>>>>>>>>>>>> do this - and why do we need to do it? The Visual Studio >>>>>>>>>>>>>> docs state: >>>>>>>>>>>>>> >>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>> >>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>>>>>> could change with any update to the compiler. >>>>>>>>>>>>>> >>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>>>>>> specifically -xinline only has an effect at -xO3 or >>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>> slowdebug. And there are other cases where it won't honour >>>>>>>>>>>>>> the ALWAYSINLINE. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>>>> >>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>> unless you specify the 'always_inline' attribute for the >>>>>>>>>>>>>> function, like this: >>>>>>>>>>>>>> >>>>>>>>>>>>>> /* Prototype. */ >>>>>>>>>>>>>> inline void foo (const char) __attribute__((always_inline));" >>>>>>>>>>>>>> >>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need >>>>>>>>>>>>>> something a bit more explicit than this test to determine >>>>>>>>>>>>>> that. >>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>> incorrect, then you run into issues like I mentioned above >>>>>>>>>>>>> where callsite backtraces effectively only have 3 unique >>>>>>>>>>>>> frames rather than 4 (actually before some bug fixes it was >>>>>>>>>>>>> often just 2 unique frames). So I think it's appropriate to >>>>>>>>>>>>> have a test to make sure we are not seeing any of these 3 >>>>>>>>>>>>> methods/functions. >>>>>>>>>>>> >>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>> clearly see what these inlining assumptions are that NMT >>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>> >>>>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() >>>>>>>>>>> and os::get_native_stack() (and sometimes both) being in the >>>>>>>>>>> callsite. Reviewing the bugs I referred to will give you an >>>>>>>>>>> idea of where to look. 
One good place to look is >>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case code >>>>>>>>>>> there that controls how many frames to skip based on the >>>>>>>>>>> platform and whether optimized or not. Also some comments >>>>>>>>>>> there to help you out. I did a lot of bug fixing in this method. >>>>>>>>>>> >>>>>>>>>>> Looking at this code also reminds me of a reason to have the >>>>>>>>>>> test continue to check for all 4 specific frames. If the >>>>>>>>>>> frame skipping code skips an extra frame, then the callsite >>>>>>>>>>> will be missing a needed frame at the top. The way the test >>>>>>>>>>> was written it would detect this. With your changes it will >>>>>>>>>>> not. It would just revert to always matching on 3 frames >>>>>>>>>>> instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Now the test also has made inlining assumptions beyond what >>>>>>>>>>>>> NMT has made, and that is really what this bug is about. In >>>>>>>>>>>>> general I think your fix is fine in the way it relaxes >>>>>>>>>>>>> which frames are actually found, but as Thomas points out, >>>>>>>>>>>>> it suffers from not actually looking at a single >>>>>>>>>>>>> stacktrace, but just looking for the specified frames >>>>>>>>>>>>> somewhere in the output (and in the order specified.) You >>>>>>>>>>>>> should probably address this. >>>>>>>>>>>> >>>>>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>>>> re-think this. 
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibility in >>>>>>>>>>>>>>>>> which frames appear, I think you need to now also make >>>>>>>>>>>>>>>>> sure the above 3 mentioned frames are not present, >>>>>>>>>>>>>>>>> except for allowing AllocateHeap() in slowdebug builds. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to >>>>>>>>>>>>>>>>>> be missing based on empirical observations. So to date >>>>>>>>>>>>>>>>>> we have seen two frames that may or may not be inlined >>>>>>>>>>>>>>>>>> and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now >>>>>>>>>>>>>>>>>> it is just an optional frame. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as >>>>>>>>>>>>>>>>>> you intended? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> > > From thomas.stuefe at gmail.com Sun Apr 7 07:17:32 2019 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Sun, 7 Apr 2019 09:17:32 +0200 Subject: RFR(T): 8221738: ErrorFile option does not handle pre-existing error files of the same name Message-ID: Hi all, May I please have reviews for this small fix: bug: https://bugs.openjdk.java.net/browse/JDK-8221738 cr: http://cr.openjdk.java.net/~stuefe/webrevs/8221738-errorfile-option-does-not-handle-pre-existing-error-files-of-the-same-name/webrev.00/webrev/ Fixes a long-standing issue where -XX:ErrorFile=<file> will only work if <file> does not exist yet. If it does, the error file silently falls back to "/hs_err_pid...". For more detailed discussions, please see the bug and the associated CSR.
The fix now causes the error file to be overwritten Thanks, Thomas From chris.plummer at oracle.com Sun Apr 7 07:21:40 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 7 Apr 2019 00:21:40 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> Message-ID: On 4/7/19 12:10 AM, David Holmes wrote: > Hi Chris, > > On 7/04/2019 4:51 pm, Chris Plummer wrote: >> Hi David, >> >> On 4/6/19 11:06 PM, David Holmes wrote: >>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>> >>>>> ?? That isn't part of any of the existing stacktraces. 
>>>> See the following comment from Zhengyu in the CR: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>> >>> >>> >>> >>> That comment simply includes a fragment of a stack which happens to >>> include JVM_DefineModule and makes no further mention of it. I don't >>> recall anyone saying that we should now be including that frame in >>> the check. >>> >>> Do you want the test extended to also check for that frame? >> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule >> is the additional frame that now appears in the detail output for >> that call chain. So yes, the test should include it. If the inlining >> of ModuleEntryTable::new_entry() had always happened, then the test >> would originally have checked for the stacktrace as it appears in the >> CR comment. > > I see - to be clear you want to always check for 4 frames, so the > additional frame is only checked for the alternate stack. Yes. > >>>>> >>>>>> Since you've added the following: >>>>>> >>>>>> ??103???????? if (!okToHaveAllocateHeap) { >>>>>> ??104 output.shouldNotContain("AllocateHeap"); >>>>>> ??105???????? } >>>>> >>>>> I didn't add that - see old code line 80. >>>> Ok, but my comment below still applies since this check is in place. >>>>> >>>>>> You can simplify the following: >>>>>> >>>>>> ??123???????? if (okToHaveAllocateHeap) { >>>>>> ??124???????????? expectedStackTrace = stackTraceAllocateHeap; >>>>>> ??125???????????? if (stackTraceMatches(expectedStackTrace, >>>>>> output)) { >>>>>> ??126???????????????? return; >>>>>> ??127???????????? } >>>>>> ??128???????? } else { >>>>>> >>>>>> The is no need for the okToHaveAllocateHeap check here anymore. >>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>> slight improvement in flexibility in that it would no longer >>>>>> require the slowdebug builds to match stackTraceAllocateHeap. 
>>>>>> They could match any of the 3. You could then put all 3 allowed >>>>>> stacktraces in an array and check them in a loop if you wish. >>>>> >>>>> The only change I have made (which might be obscured by the >>>>> structure) is that if stackTraceDefault fails to match I then try >>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>> unchanged. >>>>> >>>>> By the same argument you made I think it best to only expect the >>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>> notice when something changes - again I've mode no change in this >>>>> regard. >>>> Since line 104 already verified that AllocateHeap does not appear >>>> except possibly in slow debug heaps, it is harmless to check all >>>> builds against the stacktrace that includes AllocateHeap. >>> >>> "Harmless" but a waste of time checking for a stack that we know >>> can't match. The current version was at your suggestion: >>> >>> "You would need to check for all 3, limiting the AllocateHeap() one >>> to just being allowed on solaris and windows slowdebug as it is now." >> That was before I realized there was already an explicit check for >> AllocateHeap() to not be allowed except for slowdebug ones. Once I >> realized that, it occurred to me that checking for all 3 stacktraces >> in a loop would simplify the logic. >>> >>> Checking all three returns to my original version (modulo not >>> removing the check for the AllocateHeap frame, and fixing the >>> matching logic). >> Your original version checked for a large number of permutations that >> included any 3 of 5 specified frames, not checks for any of 3 >> specific stacktraces (of 4 frames each). > > That was never the intent and what I was referring to when I said "and > fixing the matching logic". > >>> >>>> Also, if a slowdebug platform were to change to no longer include >>>> AllocateHeap, checking it against the other two stacktraces would >>>> allow the test to continue to pass without modification. 
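[Editorial sketch, not part of the original thread.] The "put all allowed stacktraces in an array and check them in a loop" idea discussed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the webrev: the class name, method names, frame patterns, and sample output lines are all hypothetical. Each candidate trace is a list of per-frame regexes that must match consecutive output lines, so frames scattered across different stack traces do not count as a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; names and sample data are illustrative assumptions only.
class StackTraceMatchSketch {

    /** True if each frame regex matches one of a run of consecutive lines, in order. */
    static boolean matchesConsecutively(List<String> lines, List<String> frameRegexes) {
        List<Pattern> frames = new ArrayList<>();
        for (String regex : frameRegexes) {
            frames.add(Pattern.compile(regex));
        }
        // Try every possible starting line for the first frame.
        for (int start = 0; start + frames.size() <= lines.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < frames.size() && allMatch; i++) {
                allMatch = frames.get(i).matcher(lines.get(start + i)).find();
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    /** Index of the first candidate trace that matches, or -1 if none does. */
    static int firstMatching(List<String> lines, List<List<String>> candidates) {
        for (int c = 0; c < candidates.size(); c++) {
            if (matchesConsecutively(lines, candidates.get(c))) {
                return c;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up output lines; real NMT detail output looks different.
        List<String> output = List.of(
            "[0x01] Hashtable::allocate_new_entry(unsigned int)+0x10",
            "[0x02] unrelated_frame()+0x20",
            "[0x03] ModuleEntryTable::new_entry(unsigned int)+0x30",
            "[0x04] Hashtable::allocate_new_entry(unsigned int)+0x10",
            "[0x05] ModuleEntryTable::new_entry(unsigned int)+0x30",
            "[0x06] some_caller()+0x40");

        // Shortened two-frame candidates, purely for illustration.
        List<List<String>> candidates = List.of(
            List.of("AllocateHeap", "ModuleEntryTable.*new_entry"),
            List.of("Hashtable.*allocate_new_entry", "ModuleEntryTable.*new_entry"));

        System.out.println("first matching candidate: " + firstMatching(output, candidates));
    }
}
```

On a failure, a test built this way could print every candidate it tried rather than only the last one, which addresses the triage concern raised later in the thread. The trade-off noted in the thread still applies: each candidate costs another pass over the output lines.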
>>> >>> This is counter to your earlier argument that we should be using >>> this test to specifically check for such changes in compiler >>> behaviour and update the platform specific guards accordingly. If >>> you allow it to go either way then we would never remove the guard >>> even when it was no longer needed on any platform. >> But this is one compiler inlining behavior change that is ok. If >> AllocateHeap() suddenly starts being inlined by slowdebug builds, >> that is actually a good thing, and we would end up modifying the test >> to allow it. So why not allow it now? >>> >>>> For these two reasons I was suggesting just always check all 3 >>>> stacktraces until one passes. It would simplify the logic some. >>> >>> I'd need to change a number of other things make the main logic >>> simpler (ie loop over all three stacks) but the error reporting part >>> will be more awkward. And Thomas already complained about the number >>> of times we scan the entire process output doing this matching, so >>> this would make it worse - unless I completely change the way we do >>> the matching, which then introduces more complexity and more >>> likelihood of introducing new bugs. >>> >>> Let me know how you want to proceed. >> >> The loop idea was just to make the code simpler. If you feel it will >> slow things down unacceptably, then I'm fine with the logic as-is in >> v2, but you need to add JVM_DefineModule to the new stacktrace. > > Okay I intend to add the missing 4th frame, and print both potential > stacks on failure, but otherwise leave at V2. Sounds good. Chris > > Thanks, > David > ----- > >> thanks, >> >> Chris >> >>> >>> Thanks, >>> David >>> ----- >>> >>>>> >>>>>> The following is no longer correct: >>>>>> >>>>>> ??140???????? throw new RuntimeException("Expected stack trace >>>>>> missing from output: " + expectedStackTrace); >>>>>> >>>>>> In your current approach, expectedStackTrace is just the last >>>>>> stacktrace we tried. 
Since we may try more than one, maybe all >>>>>> the ones that failed to match should be listed (or none listed if >>>>>> just too messy). >>>>> >>>>> It reports the last failing stacktrace, out of a possible two. >>>>> Perhaps I can print both ... you want something in the jtr file so >>>>> that it can be triaged without having to go and look up the test >>>>> code. >>>> Yeah, just pointing out that only printing one stacktrace might >>>> lead the .jtr reader down the wrong path. >>>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>> >>>>>>> Checks for alternate stack now. Added lots of comments and misc >>>>>>> fixups. >>>>>>> >>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>> linux-x64). >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>> Thinking about this a bit more, there is still the potential >>>>>>>> for some confusion if this test fails again in the future due >>>>>>>> to the top frame missing. Is it missing because it got inlined >>>>>>>> or is it missing because the frame skipping code skipped an >>>>>>>> extra frame? Hopefully whoever deals with it doesn't just >>>>>>>> hastily add another valid stacktrace to the test but instead >>>>>>>> investigates to make sure the issue is indeed that the method >>>>>>>> got inlined. >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Okay I will simply check for the third alternative. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>> to cover all (allowable) stacktraces that we've seen for this >>>>>>>>>> callsite, and be updated in the future as needed. Not ideal, >>>>>>>>>> but I don't see a better solution. It's similar to the >>>>>>>>>> situation described in JDK-8163899 which covered the >>>>>>>>>> fragility of the NMT frame skipping code. In the end it was >>>>>>>>>> decided it would be easier to just fix issues as they >>>>>>>>>> came up rather than engineer a solution that wasn't as >>>>>>>>>> fragile. I think this test falls in the same category. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>>>>>> comment: >>>>>>>>>>> >>>>>>>>>>> > Looking at this code also reminds me of a reason to have >>>>>>>>>>> the test >>>>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>>>> skipping code >>>>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>>>> needed frame >>>>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>>>> this. With your >>>>>>>>>>> > changes it will not.
It would just revert to always >>>>>>>>>>> matching on 3 frames >>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>> >>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>> frames but one (or more) may be missing - i.e how can I tell >>>>>>>>>>> the different between "Frame A was inlined" and "Frame A was >>>>>>>>>>> skipped by mistake" ?? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>>>>> JDK-8133740. These bugs result in 1 or two frames >>>>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 73???????? // We should never see either of these >>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>> missed it. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed >>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds. >>>>>>>>>>>>>>>>>> Although these builds should have honored the >>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable that >>>>>>>>>>>>>>>>>> it was not in slowdebug builds. However, I would not >>>>>>>>>>>>>>>>>> want to allow AllocateHeap() to appear in a product >>>>>>>>>>>>>>>>>> build, and best not to see it in fastdebug either. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test >>>>>>>>>>>>>>>>> to be checking for something specific to the native >>>>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed >>>>>>>>>>>>>>>>> to be there simply because it was the only way to deal >>>>>>>>>>>>>>>>> with an optional frame - but now that's handled >>>>>>>>>>>>>>>>> generically. >>>>>>>>>>>>>>>> It's appearance means you effectively only have 3 >>>>>>>>>>>>>>>> frames to identity callsites instead of 4. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of those >>>>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting >>>>>>>>>>>>>>> in only 3 matches. So that is what the revised test >>>>>>>>>>>>>>> checks for: at least 3 matches. Often there will be 4 >>>>>>>>>>>>>>> matches. >>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() >>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. 
The test >>>>>>>>>>>>>> should detect when this happens so the NMT implementation >>>>>>>>>>>>>> can address the issue. >>>>>>>>>>>>> >>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>>>> >>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (after >>>>>>>>>>>> some pruning) that led to the os::malloc() call. "4" is >>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled >>>>>>>>>>>> by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth >>>>>>>>>>>> bigger means more refinement of the callsites (thus more >>>>>>>>>>>> callsites), but a clearer picture of what actually led to >>>>>>>>>>>> the os::malloc(). >>>>>>>>>>>> >>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>> a() calls b() calls c() calls d() calls os::malloc(), and >>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>> not distinguish between these two call paths to >>>>>>>>>>>> os::malloc(), and will consider both paths to be the same >>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would >>>>>>>>>>>> always be a, b, c, and d. However, bump up >>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as >>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame >>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, b, >>>>>>>>>>>> c, and d as the other 4 frames. >>>>>>>>>>>> >>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>>>>> allocation that is the result of doing a "new" of any >>>>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite, >>>>>>>>>>>> which effectively lowers the callsite refinement by 1. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 50 public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>> 51
".*AllocateHeap.*\n" + >>>>>>>>>>>>>>> ? 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was >>>>>>>>>>>>>>> it getting inlined already when AllocateHeap was not? >>>>>>>>>>>>>>> Even so we still end up with 4 frames matching normally. >>>>>>>>>>>>>> I noticed that last night also and scratch my head over >>>>>>>>>>>>>> it for a while and then went to bed. The only explanation >>>>>>>>>>>>>> I could come up with is that allocate_new_entry() is >>>>>>>>>>>>>> getting inlined, and as a result (due to being a >>>>>>>>>>>>>> slowdebug build and doing minimal inlining) >>>>>>>>>>>>>> AllocateHeap() was not inlined. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If it does appear in a product build, a solution should >>>>>>>>>>>>>>>> be looked into to get rid of it. If the port owner >>>>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to), >>>>>>>>>>>>>>>> then an exception should be added to the test like was >>>>>>>>>>>>>>>> done for solaris and windows slowdebug builds. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Are we specifically trying to test the compiler's >>>>>>>>>>>>>>> ability to inline that function and just happen to be >>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a >>>>>>>>>>>>>>> suitable place to do this - and why do we need to do it? >>>>>>>>>>>>>>> The Visual Studio docs state: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds >>>>>>>>>>>>>>> and could change with any update to the compiler. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline >>>>>>>>>>>>>>> - specifically -xinline only has an effect at ?xO3 or >>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>> slowdebug. 
And there are other cases where it won't >>>>>>>>>>>>>>> honour the ALWAYSINLINE. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>> unless you specify the ?always_inline? attribute for the >>>>>>>>>>>>>>> function, like this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need >>>>>>>>>>>>>>> something a bit more explicit than this test to >>>>>>>>>>>>>>> determine that. >>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned >>>>>>>>>>>>>> above where callsite backtraces effectively only have 3 >>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug >>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think it's >>>>>>>>>>>>>> appropriate to have a test to make sure we are not seeing >>>>>>>>>>>>>> any of these 3 methods/functions. >>>>>>>>>>>>> >>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT >>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>> >>>>>>>>>>>> Not that I know of. 
I discovered them while looking at the >>>>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() >>>>>>>>>>>> and os::get_native_stack() (and sometimes both) being in >>>>>>>>>>>> the callsite. Reviewing the bugs I referred to will give >>>>>>>>>>>> you an idea of where to look. One good place to look at >>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>> code there that controls how many frames to skip based on >>>>>>>>>>>> on the platform and whether optimized or not. Also some >>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing >>>>>>>>>>>> in this method. >>>>>>>>>>>> >>>>>>>>>>>> Looking at this code also reminds me of a reason to have >>>>>>>>>>>> the test continue to check for all 4 specific frames. If >>>>>>>>>>>> the frame skipping code skips an extra frame, then the >>>>>>>>>>>> callsite will be missing a needed frame at the top. The way >>>>>>>>>>>> the test was written it would detect this. With your >>>>>>>>>>>> changes it will not. It would just revert to always >>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping >>>>>>>>>>>> bug would go unnoticed. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>> >>>>>>>>>>>>> Right that was an error on my part. 
I thought the existing >>>>>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>>>>> re-think this. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also >>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not >>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in >>>>>>>>>>>>>>>>>> slowdebug builds. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to >>>>>>>>>>>>>>>>>>> be missing based on empirical observations. So to >>>>>>>>>>>>>>>>>>> date we have seen two frames that may or may not be >>>>>>>>>>>>>>>>>>> inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now >>>>>>>>>>>>>>>>>>> it is just an optional frame. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test >>>>>>>>>>>>>>>>>>> as you intended? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From yasuenag at gmail.com Sun Apr 7 12:10:03 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sun, 7 Apr 2019 21:10:03 +0900 Subject: RFR(m): 8220762: Rework EventLog message string handling and provide VM.events command In-Reply-To: References: Message-ID: <4f177217-ad73-2070-4459-f23497d4d34c@gmail.com> Hi Thomas, Thank you for handling this issue. I have some comments to your change. - StringFifoBuffer::wrap() seems not to be used. Can you remove it? - We will see "Command executed successfully" on console when we run VM.events dcmd with unexpected log name (e.g. jcmd VM.events name=aaa). Can you show some error message? (It relates to JDK-8165869) - Can you add jtreg test(s) for VM.events? Thanks, Yasumasa On 2019/03/26 1:39, Thomas St?fe wrote: > Polite Ping... > > All submit tests passed. > > ..Thomas > > On Sun, Mar 17, 2019 at 9:26 AM Thomas St?fe > wrote: > >> Dear all, >> >> may I please have thoughts and reviews for this medium-ish change: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8220762 >> CR: >> http://cr.openjdk.java.net/~stuefe/webrevs/8220762--rewrite-eventlog-system-to-reduce-footprint-and-to-not-truncate-output/webrev.00/webrev/ >> >> The Eventlog system keeps records which mostly consist of string messages. >> These records are fixed-sized which is not such a good fit for an event >> system. >> >> When writing the occasional large message, we get truncations when hitting >> the max string size. 
In the past, the solution to that had been increasing >> the buffer size (see https://bugs.openjdk.java.net/browse/JDK-8204551), >> but that is a bit unsatisfying since we waste much of that memory for >> records whose messages are usually shorter. >> >> This patch changes the event log system and replaces the fixed-length >> string array with a var-length fifo buffer for strings. The advantage is >> that we use exactly as much space as we need for that particular message - >> short messages won't waste a whole fixed-length record text - while >> avoiding truncation on long messages. >> >> This is done by introducing a new var-length string fifo buffer. An event >> log record now just holds a reference to a string in this buffer; so an >> EventLog now consists of two Fifos: the traditional EventLogRecord-Fifo and >> an associated variable-length-string-Fifo. >> >> -- >> >> I also took the liberty to cleanup and simplify the coding a bit: >> >> The old EventLog system made it possible to define ones own event log >> records, being templatized. However, that feature was - with one tiny >> exception - never used. Almost all child classes of EventLog just use the >> EventLog system in its pure form : a record consisting of [timestamp, >> thread info, message text]. >> >> This also makes sense since a lot of the information we write to the log >> is volatile - it exists when we log but may not exist anymore when we print >> the log - and if one were to store it in the record, one would have to take >> care to keep it alive somehow or copy it. In practice, it is usually too >> much hassle with low benefits, so it has been easier to just printf() those >> information to the record right away and be done with it. >> >> The only exception from this was the GC event log, which kept a boolean >> flag in the event log message designating whether that record was logged >> "before" or "after" a GC. 
But since the sole purpose of this flag was to >> print "before" or "after" at the printing point, it could have done so >> right away when logging. >> >> So I decided to scratch the templatization here and go with a much simpler >> system, which is easier to read and maintain. What do you think? >> >> -- >> >> Finally, I added a new diagnostic command to jcmd, called "VM.events", to >> print out the event log. Before, we could only get this via VM.info; I >> think it is worth a separate command, especially since this does not add >> much code. >> >> >> >> Cheers, Thomas >> >> From david.holmes at oracle.com Sun Apr 7 23:40:29 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 09:40:29 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> Message-ID: <495c2471-b034-77ed-0150-10f525a5384a@oracle.com> Hi Thomas, My apologies, I did not mean to ignore your input here. Thanks for taking a look and pointing out the scanning error in my original proposal. Hopefully you are okay with the simpler approach that Chris has advocated. Thanks, David On 4/04/2019 5:12 pm, Thomas St?fe wrote: > Hi David, Chris, > > I think this is an improvement and goes in the right direction. Those > hard-wired inline guesses always made me twitch a bit. > > The patch looks fine to me in its current form, since it is already an > improvement. So the following remarks are "optional": > > - Since all we want to do is to test that NMT detail printing works, we > do not have to use one of the malloc paths; I have the feeling the mmap > paths are more "inline stable" since they usually end up in one of the > ReservedSpace child class constructors which do not get inlined. > > Like this: > > ? 
74 [0x0000000706400000 - 0x0000000800000000] reserved 4091904KB for > Java Heap from > ? 75? ? ?[0x00007f9b514cff07] > ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, > char*, unsigned long, unsigned long, bool)+0xb7 > ? 76? ? ?[0x00007f9b514d08d8] > ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned > long, bool)+0x5f8 > ? 77? ? ?[0x00007f9b514d0f3a] > ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, bool, > char const*) [clone .part.29]+0x9a > ? 78? ? ?[0x00007f9b51450331] Universe::reserve_heap(unsigned long, > unsigned long)+0xe1 > > Or this: > > ?256 [0x00007f9b308c5000 - 0x00007f9b3f8c5000] reserved 245760KB for > Code from > ?257? ? ?[0x00007f9b514cad02] > ReservedCodeSpace::ReservedCodeSpace(unsigned long, unsigned long, > bool)+0xa2 > ?258? ? ?[0x00007f9b505bcfb7] CodeCache::reserve_heap_memory(unsigned > long)+0xe7 > ?259? ? ?[0x00007f9b505bd75b] CodeCache::initialize_heaps()+0x2db > ?260? ? ?[0x00007f9b505bde45] CodeCache::initialize()+0x1b5 > > will be stacks you always will see. > > - I do not like scanning the whole output for each single stack frame. > The test may give false positives. I would like it more if we were to > read the file line by line, and when the first pattern line matches, > check that subsequent lines match too. This is how we do call stack > matching at SAP for similar tests. > This is also more efficient since you do not re-scan the whole output > each time. > > In general: > > NMT is really very useful. We could think about > increasing?NMT_TrackingStackDepth, since 4 is obviously not a lot. 6 or > 8 would be better. I do not believe the memory footprint increase would > be significant, but of course we would have to measure. > > Thanks! 
Thomas > > On Thu, Apr 4, 2019 at 8:36 AM Chris Plummer > wrote: > > On 4/3/19 11:23 PM, David Holmes wrote: > > Hi Chris, > > > > On 4/04/2019 4:12 pm, Chris Plummer wrote: > >> Hi David, > >> > >> I have concerns that this will hide some of the other bugs I've > >> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs > >> result in 1 or two frames appearing in the stacktrace that > should be > >> skipped. Notably NativeCallStack::NativeCallStack() and > >> os::get_native_stack(). > > > > The test still checks those are not present first: > > > > 73???????? // We should never see either of these frames because > they > > are supposed to be skipped. */ > > 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); > > 75???????? output.shouldNotContain("os::get_native_stack"); > Ah yes. I skimmed over the test looking for it but missed it. > > > >> Also, AllocateHeap() should normally not be in the stack trace, but > >> the test has specifically allowed for it for windows and solaris > >> slowdebug builds. Although these builds should have honored the > >> ALWAYSINLINE directive, it was deemed acceptable that it was not in > >> slowdebug builds. However, I would not want to allow AllocateHeap() > >> to appear in a product build, and best not to see it in fastdebug > >> either. > > > > This is a test of NMT detail not a test of whether a given compiler > > chooses to inline something like AllocateHeap. I don't think it > is the > > job of this test to be checking for something specific to the native > > compiler. The previous handling of AllocateHeap seemed to be there > > simply because it was the only way to deal with an optional frame - > > but now that's handled generically. > It's appearance means you effectively only have 3 frames to identity > callsites instead of 4. If it does appear in a product build, a > solution > should be looked into to get rid of it. 
If the port owner decides it > can't get rid of it (or is unwilling to), then an exception should be > added to the test like was done for solaris and windows slowdebug > builds. > > thanks, > > Chris > > > > Thanks, > > David > > > >> Given the changes you made to allow more flexibly in which frames > >> appear, I think you need to now also make sure the above 3 > mentioned > >> frames are not present, except for allowing AllocateHeap() in > >> slowdebug builds. > >> > >> thanks, > >> > >> Chris > >> > >> On 4/3/19 10:53 PM, David Holmes wrote: > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 > >>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ > >>> > >>> The actual stack trace reported by NMT detail is affected by the > >>> inlining decisions of the native compiler, and on the type of > build. > >>> So we define an "ideal" stacktrace and then allow for some > frames to > >>> be missing based on empirical observations. So to date we have > seen > >>> two frames that may or may not be inlined and so we allow for 2 > >>> non-matching entries. > >>> > >>> The special-casing of AllocateHeap is removed as now it is just an > >>> optional frame. > >>> > >>> Chris: does this maintain the "spirit" of the test as you intended? > >>> > >>> Zhengyu: can you test this on your system(s) please. 
> >>> > >>> Thanks, > >>> David > >> > >> > > From david.holmes at oracle.com Sun Apr 7 23:38:22 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 09:38:22 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> Message-ID: <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> Okay here is updated webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/ with the incremental change shown in: http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/incr_v2.patch I added the extra frame for the alternate stack and now print the stack we are looking for as part of the normal test output rather than trying to include it in the exception message when we fail. Zhengyu: can you please test this again on your platforms. Again I can't test the alternate stack matching as I have no systems where it fails (nor can I test slowedebug other than Linux). 
Thanks, David ----- On 7/04/2019 5:10 pm, David Holmes wrote: > Hi Chris, > > On 7/04/2019 4:51 pm, Chris Plummer wrote: >> Hi David, >> >> On 4/6/19 11:06 PM, David Holmes wrote: >>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>> >>>>> ?? That isn't part of any of the existing stacktraces. >>>> See the following comment from Zhengyu in the CR: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>> >>> >>> >>> >>> That comment simply includes a fragment of a stack which happens to >>> include JVM_DefineModule and makes no further mention of it. I don't >>> recall anyone saying that we should now be including that frame in >>> the check. >>> >>> Do you want the test extended to also check for that frame? >> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule is >> the additional frame that now appears in the detail output for that >> call chain. So yes, the test should include it. If the inlining of >> ModuleEntryTable::new_entry() had always happened, then the test would >> originally have checked for the stacktrace as it appears in the CR >> comment. > > I see - to be clear you want to always check for 4 frames, so the > additional frame is only checked for the alternate stack. > >>>>> >>>>>> Since you've added the following: >>>>>> >>>>>> ??103???????? if (!okToHaveAllocateHeap) { >>>>>> ??104???????????? output.shouldNotContain("AllocateHeap"); >>>>>> ??105???????? } >>>>> >>>>> I didn't add that - see old code line 80. >>>> Ok, but my comment below still applies since this check is in place. >>>>> >>>>>> You can simplify the following: >>>>>> >>>>>> ??123???????? if (okToHaveAllocateHeap) { >>>>>> ??124???????????? 
expectedStackTrace = stackTraceAllocateHeap; >>>>>> ??125???????????? if (stackTraceMatches(expectedStackTrace, >>>>>> output)) { >>>>>> ??126???????????????? return; >>>>>> ??127???????????? } >>>>>> ??128???????? } else { >>>>>> >>>>>> The is no need for the okToHaveAllocateHeap check here anymore. >>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>> slight improvement in flexibility in that it would no longer >>>>>> require the slowdebug builds to match stackTraceAllocateHeap. They >>>>>> could match any of the 3. You could then put all 3 allowed >>>>>> stacktraces in an array and check them in a loop if you wish. >>>>> >>>>> The only change I have made (which might be obscured by the >>>>> structure) is that if stackTraceDefault fails to match I then try >>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>> unchanged. >>>>> >>>>> By the same argument you made I think it best to only expect the >>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>> notice when something changes - again I've mode no change in this >>>>> regard. >>>> Since line 104 already verified that AllocateHeap does not appear >>>> except possibly in slow debug heaps, it is harmless to check all >>>> builds against the stacktrace that includes AllocateHeap. >>> >>> "Harmless" but a waste of time checking for a stack that we know >>> can't match. The current version was at your suggestion: >>> >>> "You would need to check for all 3, limiting the AllocateHeap() one >>> to just being allowed on solaris and windows slowdebug as it is now." >> That was before I realized there was already an explicit check for >> AllocateHeap() to not be allowed except for slowdebug ones. Once I >> realized that, it occurred to me that checking for all 3 stacktraces >> in a loop would simplify the logic. 
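For illustration only, the "check all allowed stacktraces in a loop" idea discussed above could look roughly like the following. This is a minimal sketch and not the code in the webrev: it assumes each allowed stack trace is kept as a list of per-frame regexes and that the frames of one stack trace appear on consecutive lines of the NMT detail output. The class and method names (StackMatchSketch, containsStack, containsAnyStack) and the two-frame sample stacks are hypothetical; only the frame names are taken from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of CheckForProperDetailStackTrace.java.
final class StackMatchSketch {

    // True if every frame pattern matches, in order, on consecutive lines of the output.
    static boolean containsStack(String output, List<String> framePatterns) {
        if (framePatterns.isEmpty()) {
            return false; // nothing to look for
        }
        List<Pattern> patterns = new ArrayList<>();
        for (String p : framePatterns) {
            patterns.add(Pattern.compile(p));
        }
        String[] lines = output.split("\\R");
        // Try each line as a possible first frame, then require the following lines to match too.
        for (int start = 0; start + patterns.size() <= lines.length; start++) {
            boolean allMatch = true;
            for (int i = 0; i < patterns.size() && allMatch; i++) {
                allMatch = patterns.get(i).matcher(lines[start + i]).find();
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    // True as soon as any one of the allowed stack traces is found.
    static boolean containsAnyStack(String output, List<List<String>> allowedStacks) {
        for (List<String> stack : allowedStacks) {
            if (containsStack(output, stack)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Shortened, made-up stacks; the real test checks 4 frames per stack.
        List<List<String>> allowed = Arrays.asList(
            Arrays.asList("Hashtable.*allocate_new_entry", "ModuleEntryTable.*new_entry"),
            Arrays.asList("AllocateHeap", "ModuleEntryTable.*new_entry"));
        String output = "[0x01] AllocateHeap(unsigned long)+0x10\n"
                      + "[0x02] ModuleEntryTable::new_entry(unsigned int)+0x20\n";
        System.out.println("matched: " + containsAnyStack(output, allowed));
    }
}
```

Matching on consecutive lines also avoids the false positives that come from finding each frame somewhere in the whole output, and on failure all candidate stacks could be printed rather than only the last one tried.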
>>> >>> Checking all three returns to my original version (modulo not >>> removing the check for the AllocateHeap frame, and fixing the >>> matching logic). >> Your original version checked for a large number of permutations that >> included any 3 of 5 specified frames, not checks for any of 3 specific >> stacktraces (of 4 frames each). > > That was never the intent and what I was referring to when I said "and > fixing the matching logic". > >>> >>>> Also, if a slowdebug platform were to change to no longer include >>>> AllocateHeap, checking it against the other two stacktraces would >>>> allow the test to continue to pass without modification. >>> >>> This is counter to your earlier argument that we should be using this >>> test to specifically check for such changes in compiler behaviour and >>> update the platform specific guards accordingly. If you allow it to >>> go either way then we would never remove the guard even when it was >>> no longer needed on any platform. >> But this is one compiler inlining behavior change that is ok. If >> AllocateHeap() suddenly starts being inlined by slowdebug builds, that >> is actually a good thing, and we would end up modifying the test to >> allow it. So why not allow it now? >>> >>>> For these two reasons I was suggesting just always check all 3 >>>> stacktraces until one passes. It would simplify the logic some. >>> >>> I'd need to change a number of other things make the main logic >>> simpler (ie loop over all three stacks) but the error reporting part >>> will be more awkward. And Thomas already complained about the number >>> of times we scan the entire process output doing this matching, so >>> this would make it worse - unless I completely change the way we do >>> the matching, which then introduces more complexity and more >>> likelihood of introducing new bugs. >>> >>> Let me know how you want to proceed. >> >> The loop idea was just to make the code simpler. 
If you feel it will >> slow things down unacceptably, then I'm fine with the logic as-is in >> v2, but you need to add JVM_DefineModule to the new stacktrace. > > Okay I intend to add the missing 4th frame, and print both potential > stacks on failure, but otherwise leave at V2. > > Thanks, > David > ----- > >> thanks, >> >> Chris >> >>> >>> Thanks, >>> David >>> ----- >>> >>>>> >>>>>> The following is no longer correct: >>>>>> >>>>>> ??140???????? throw new RuntimeException("Expected stack trace >>>>>> missing from output: " + expectedStackTrace); >>>>>> >>>>>> In your current approach, expectedStackTrace is just the last >>>>>> stacktrace we tried. Since we may try more than one, maybe all the >>>>>> ones that failed to match should be listed (or none listed if just >>>>>> too messy). >>>>> >>>>> It reports the last failing stacktrace, out of a possible two. >>>>> Perhaps I can print both ... you want something in the jtr file so >>>>> that it can be triaged without having to go and look up the test code. >>>> Yeah, just pointing out that only printing one stacktrace might lead >>>> the .jtr reader down the wrong path. >>>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>> >>>>>>> Checks for alternate stack now. Added lots of comments and misc >>>>>>> fixups. >>>>>>> >>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>> linux-x64). >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>> Thinking about this a bit more, there is still the potential for >>>>>>>> some confusion if this test fails again in the future due to the >>>>>>>> top frame missing. 
Is it missing because it got inlined or is it >>>>>>>> missing because the frame skipping code skipped an extra frame? >>>>>>>> Hopefully whoever deals with it doesn't just hastily add another >>>>>>>> valid stacktrace to the test but instead investigates to make >>>>>>>> sure the issue is indeed that the method got inlined. >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Okay I will simply check for the third alternative. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>> to cover all (allowable) stacktraces that we've seen for this >>>>>>>>>> callsite, and be updated in the future as needed. Not ideal, >>>>>>>>>> but I don't see a better solution. It's similar to the >>>>>>>>>> situation described in JDK-8163899 which covered the fragility >>>>>>>>>> of the NMT frame skipping code. In the end it was decided it >>>>>>>>>> would be easier to just deal fix issues as they came up rather >>>>>>>>>> then engineer a solution that wasn't as fragile. I think this >>>>>>>>>> test falls in the same category. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>> os::malloc - now I get it. 
But I don't understand your final >>>>>>>>>>> comment: >>>>>>>>>>> >>>>>>>>>>> > Looking at this code also reminds me of a reason to have >>>>>>>>>>> the test >>>>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>>>> skipping code >>>>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>>>> needed frame >>>>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>>>> this. With your >>>>>>>>>>> > changes it will not. It would just revert to always >>>>>>>>>>> matching on 3 frames >>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>> >>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>> frames but one (or more) may be missing - i.e how can I tell >>>>>>>>>>> the different between "Frame A was inlined" and "Frame A was >>>>>>>>>>> skipped by mistake" ?? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>>>>> JDK-8133740. These bugs result in 1 or two frames >>>>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>>>>> os::get_native_stack(). 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 73???????? // We should never see either of these >>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>> missed it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed for >>>>>>>>>>>>>>>>>> it for windows and solaris slowdebug builds. Although >>>>>>>>>>>>>>>>>> these builds should have honored the ALWAYSINLINE >>>>>>>>>>>>>>>>>> directive, it was deemed acceptable that it was not in >>>>>>>>>>>>>>>>>> slowdebug builds. However, I would not want to allow >>>>>>>>>>>>>>>>>> AllocateHeap() to appear in a product build, and best >>>>>>>>>>>>>>>>>> not to see it in fastdebug either. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test >>>>>>>>>>>>>>>>> to be checking for something specific to the native >>>>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed >>>>>>>>>>>>>>>>> to be there simply because it was the only way to deal >>>>>>>>>>>>>>>>> with an optional frame - but now that's handled >>>>>>>>>>>>>>>>> generically. >>>>>>>>>>>>>>>> It's appearance means you effectively only have 3 frames >>>>>>>>>>>>>>>> to identity callsites instead of 4. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>> expected 4 matches. 
The current bug is that one of those >>>>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting >>>>>>>>>>>>>>> in only 3 matches. So that is what the revised test >>>>>>>>>>>>>>> checks for: at least 3 matches. Often there will be 4 >>>>>>>>>>>>>>> matches. >>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() doesn't >>>>>>>>>>>>>> get inlined, it effectively is using 3. The test should >>>>>>>>>>>>>> detect when this happens so the NMT implementation can >>>>>>>>>>>>>> address the issue. >>>>>>>>>>>>> >>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>>>> >>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters >>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is >>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled >>>>>>>>>>>> by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth >>>>>>>>>>>> bigger means more refinement of the callsites (thus more >>>>>>>>>>>> callsites), but a clearer picture of what actually led to >>>>>>>>>>>> the os:malloc(). >>>>>>>>>>>> >>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and >>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>> not distinguish between these two calls paths to >>>>>>>>>>>> os:mallco(), and will consider both paths to be the same >>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would always >>>>>>>>>>>> be a, b, c, and d. 
However, bump up NMT_TrackingStackDepth >>>>>>>>>>>> to 5 and now NMT will treat them as two separate callsites, >>>>>>>>>>>> one with foo() as the bottom frame and one with bar() as the >>>>>>>>>>>> bottom frame, and both with a, b, c, and d as the other 4 >>>>>>>>>>>> frames. >>>>>>>>>>>> >>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>>>>> allocation that is the result of doing a "new" of any >>>>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite, >>>>>>>>>>>> which effectively lowers they callsite refinement by 1. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>>>>>>> ? 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was >>>>>>>>>>>>>>> it getting inlined already when AllocateHeap was not? >>>>>>>>>>>>>>> Even so we still end up with 4 frames matching normally. >>>>>>>>>>>>>> I noticed that last night also and scratch my head over it >>>>>>>>>>>>>> for a while and then went to bed. The only explanation I >>>>>>>>>>>>>> could come up with is that allocate_new_entry() is getting >>>>>>>>>>>>>> inlined, and as a result (due to being a slowdebug build >>>>>>>>>>>>>> and doing minimal inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If it does appear in a product build, a solution should >>>>>>>>>>>>>>>> be looked into to get rid of it. If the port owner >>>>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to), >>>>>>>>>>>>>>>> then an exception should be added to the test like was >>>>>>>>>>>>>>>> done for solaris and windows slowdebug builds. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Are we specifically trying to test the compiler's ability >>>>>>>>>>>>>>> to inline that function and just happen to be using this >>>>>>>>>>>>>>> test to verify that? Doesn't seem like a suitable place >>>>>>>>>>>>>>> to do this - and why do we need to do it? The Visual >>>>>>>>>>>>>>> Studio docs state: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds and >>>>>>>>>>>>>>> could change with any update to the compiler. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline - >>>>>>>>>>>>>>> specifically -xinline only has an effect at ?xO3 or >>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>> slowdebug. And there are other cases where it won't >>>>>>>>>>>>>>> honour the ALWAYSINLINE. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>> unless you specify the ?always_inline? attribute for the >>>>>>>>>>>>>>> function, like this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need >>>>>>>>>>>>>>> something a bit more explicit than this test to determine >>>>>>>>>>>>>>> that. >>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>> assumptions on inlining. 
One of the things the test is >>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned above >>>>>>>>>>>>>> where callsite backtraces effectively only have 3 unique >>>>>>>>>>>>>> frames rather than 4 (actually before some bug fixes it >>>>>>>>>>>>>> was often just 2 unique frames). So I think it's >>>>>>>>>>>>>> appropriate to have a test to make sure we are not seeing >>>>>>>>>>>>>> any of these 3 methods/functions. >>>>>>>>>>>>> >>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT >>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>> >>>>>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() >>>>>>>>>>>> and os::get_native_stack() (and sometimes both) being in the >>>>>>>>>>>> callsite. Reviewing the bugs I referred to will give you an >>>>>>>>>>>> idea of where to look. One good place to look at >>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>> code there that controls how many frames to skip based on on >>>>>>>>>>>> the platform and whether optimized or not. Also some >>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing in >>>>>>>>>>>> this method. >>>>>>>>>>>> >>>>>>>>>>>> Looking at this code also reminds me of a reason to have the >>>>>>>>>>>> test continue to check for all 4 specific frames. If the >>>>>>>>>>>> frame skipping code skips an extra frame, then the callsite >>>>>>>>>>>> will be missing a needed frame at the top. The way the test >>>>>>>>>>>> was written it would detect this. With your changes it will >>>>>>>>>>>> not. It would just revert to always matching on 3 frames >>>>>>>>>>>> instead of 4, and the frame skipping bug would go unnoticed. 
>>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>> >>>>>>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>>>>> re-think this. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also make >>>>>>>>>>>>>>>>>> sure the above 3 mentioned frames are not present, >>>>>>>>>>>>>>>>>> except for allowing AllocateHeap() in slowdebug builds. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to >>>>>>>>>>>>>>>>>>> be missing based on empirical observations. So to >>>>>>>>>>>>>>>>>>> date we have seen two frames that may or may not be >>>>>>>>>>>>>>>>>>> inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now >>>>>>>>>>>>>>>>>>> it is just an optional frame. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test as >>>>>>>>>>>>>>>>>>> you intended? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> >> >> From david.holmes at oracle.com Mon Apr 8 01:49:28 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 11:49:28 +1000 Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8221584 webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/ I'm really just sponsoring this fix as the problem was diagnozed by Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the contributors and I'm already one Reviewer. There's a missing loadstore barrier between extracting the ParkEvent from an ObjectWaiter node, and setting the node's TState to allow the the entering thread to proceed. It seems our recent update to gcc 8.2 resulted in the compiler reordering those two actions, meaning that the Objectwaiter pointer could now be pointing into a stack location with random contents. That might manifest as a SEGV or we may treat random memory as a pthread_mutex_t and get an EINVAL (or potentially other errors) on pthread_mutex_lock. Testing: mach5 tiers 1-3 (sanity - the added barrier can't break anything) Thanks, David From jianglizhou at google.com Mon Apr 8 03:30:45 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Sun, 7 Apr 2019 20:30:45 -0700 Subject: RFR(s): 8222015: Small VM.metaspace improvements In-Reply-To: References: Message-ID: Hi Thomas, This seems good to me. I have a few minor suggestions below, but please feel free to keep your existing code without changing. 
- For consistency with the existing code and VM.metaspace output, it might be worth renaming _num_classes_cds to _num_classes_shared, and _num_classes_cds_by_spacetype to _num_classes_shared_by_spacetype.

- src/hotspot/share/memory/metaspace/printCLDMetaspaceInfoClosure.cpp

You could replace the following MetaspaceShared::is_in_shared_metaspace(k) call with k->is_shared() if 'k' is guaranteed to be a valid Klass.

    void do_klass(Klass* k) {
      _num_classes ++;
      if (MetaspaceShared::is_in_shared_metaspace(k)) {
        _num_classes_cds ++;
      }
    }

- src/hotspot/share/memory/metaspace/printMetaspaceInfoKlassClosure.cpp

    // Print a 's' for shared classes
    _out->put(MetaspaceShared::is_in_shared_metaspace(k) ? 's': ' ');

Same suggestion as the above.

Thanks and regards,
Jiangli

On Fri, Apr 5, 2019 at 3:07 AM Thomas Stüfe wrote:
> Hi all,
>
> may I have please a review for this collection of small improvements to the
> VM.metaspace diagnostic command?
>
> - it clearly marks now classes whose metadata reside in cds
> - it shows the number of classes loaded, incl. those from cds, in the
> overviews too.
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8222015 > cr: > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.00/webrev/ > > Example output: > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-by-spacetype.txt > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders.txt > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders-showclasses.txt > (scroll > down -> cds classes in are now marked with 's') > > Thank you, > > Thomas > From nick.gasson at arm.com Mon Apr 8 03:32:56 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 8 Apr 2019 11:32:56 +0800 Subject: [aarch64-port-dev ] RFR (XS): 8221529: [TESTBUG] Docker tests use old/deprecated image on AArch64 In-Reply-To: References: <5C9CFF6F.3060804@oracle.com> <5CA4E622.4080305@oracle.com> <5CA4E78B.9030705@oracle.com> <25e5eafb-2859-d270-18b4-a910b5eb2302@redhat.com> <862a6ce7-0544-a4a0-5736-08b27089c876@arm.com> Message-ID: <47803bbe-8fca-97a4-4fad-e9feb586496f@arm.com> Thanks Andrew, Misha, and Severin for your reviews! http://hg.openjdk.java.net/jdk/jdk/rev/40658cb7f47a Nick On 04/04/2019 18:41, Andrew Haley wrote: > On 4/4/19 10:59 AM, Nick Gasson wrote: >> Hi Andrew, >> >> > >> > Nick will have to explain what it's supposed to do, and why. >> > >> >> By default on all non-x86 Linux platforms the Docker tests are supposed >> to use the official Ubuntu "latest" (=18.04) image from Docker Hub. But >> for AArch64 the image used is "aarch64/ubuntu" which according to [1] is >> deprecated in favour of "arm64v8/ubuntu" and hasn't been updated since >> 16.04: >> >> "The aarch64 organization is deprecated in favor of the more-specific >> arm64v8 organization, as per >> https://github.com/docker-library/official-images#architectures-other-than-amd64. >> Please adjust your usages accordingly." 
>> >> Practically, this causes problems if your JDK image is linked against a >> recent glibc: the Docker tests will fail with symbol resolution errors >> when these binaries are run in the Ubuntu 16.04 container. >> >> [1] https://hub.docker.com/r/aarch64/ubuntu > > The patch is OK. > From chris.plummer at oracle.com Mon Apr 8 04:33:28 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 7 Apr 2019 21:33:28 -0700 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> Message-ID: <95a8f862-1dfd-7769-2b85-0e47f444e411@oracle.com> Hi David, Looks good. thanks, Chris On 4/7/19 4:38 PM, David Holmes wrote: > Okay here is updated webrev: > > http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/ > > with the incremental change shown in: > > http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/incr_v2.patch > > I added the extra frame for the alternate stack and now print the > stack we are looking for as part of the normal test output rather than > trying to include it in the exception message when we fail. 
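The reporting approach described above (print each stack being looked for as part of the normal test output, and keep the exception message generic) could look roughly like this in Java. This is a hedged sketch only, not the code in the webrev: `CandidateStackCheck`, `firstMatch`, and `check` are hypothetical names, and the candidate patterns and sample output in `main` are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

final class CandidateStackCheck {
    /** Returns the index of the first candidate pattern found in the output, or -1. */
    static int firstMatch(List<String> candidates, String output) {
        for (int i = 0; i < candidates.size(); i++) {
            // Log the stack being looked for, so a failure can be triaged from the log alone.
            System.out.println("Looking for a stack matching:\n" + candidates.get(i));
            if (Pattern.compile(candidates.get(i)).matcher(output).find()) {
                return i;
            }
        }
        return -1;
    }

    /** Throws with a generic message; the candidates tried are already in the log. */
    static void check(List<String> candidates, String output) {
        if (firstMatch(candidates, output) < 0) {
            throw new RuntimeException("Expected stack trace missing from output");
        }
    }

    public static void main(String[] args) {
        // Hypothetical candidates: a default stack and an alternate one.
        List<String> candidates = List.of(
            ".*Hashtable.*allocate_new_entry.*\n.*ModuleEntryTable.*new_entry.*\n",
            ".*Hashtable.*allocate_new_entry.*\n.*JVM_DefineModule.*\n");
        String output = "[1] Hashtable::allocate_new_entry\n[2] JVM_DefineModule\n";
        check(candidates, output);
        System.out.println("matched");
    }
}
```

Each candidate scans the whole output once, so the cost grows with the number of allowed stacks, which is the concern raised elsewhere in this thread about repeated scans.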
> > Zhengyu: can you please test this again on your platforms. Again I > can't test the alternate stack matching as I have no systems where it > fails (nor can I test slowedebug other than Linux). > > Thanks, > David > ----- > > On 7/04/2019 5:10 pm, David Holmes wrote: >> Hi Chris, >> >> On 7/04/2019 4:51 pm, Chris Plummer wrote: >>> Hi David, >>> >>> On 4/6/19 11:06 PM, David Holmes wrote: >>>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>>> >>>>>> ?? That isn't part of any of the existing stacktraces. >>>>> See the following comment from Zhengyu in the CR: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>>> >>>> >>>> >>>> >>>> >>>> That comment simply includes a fragment of a stack which happens to >>>> include JVM_DefineModule and makes no further mention of it. I >>>> don't recall anyone saying that we should now be including that >>>> frame in the check. >>>> >>>> Do you want the test extended to also check for that frame? >>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule >>> is the additional frame that now appears in the detail output for >>> that call chain. So yes, the test should include it. If the inlining >>> of ModuleEntryTable::new_entry() had always happened, then the test >>> would originally have checked for the stacktrace as it appears in >>> the CR comment. >> >> I see - to be clear you want to always check for 4 frames, so the >> additional frame is only checked for the alternate stack. >> >>>>>> >>>>>>> Since you've added the following: >>>>>>> >>>>>>> ??103???????? if (!okToHaveAllocateHeap) { >>>>>>> ??104 output.shouldNotContain("AllocateHeap"); >>>>>>> ??105???????? 
} >>>>>> >>>>>> I didn't add that - see old code line 80. >>>>> Ok, but my comment below still applies since this check is in place. >>>>>> >>>>>>> You can simplify the following: >>>>>>> >>>>>>>   123         if (okToHaveAllocateHeap) { >>>>>>>   124             expectedStackTrace = stackTraceAllocateHeap; >>>>>>>   125             if (stackTraceMatches(expectedStackTrace, output)) { >>>>>>>   126                 return; >>>>>>>   127             } >>>>>>>   128         } else { >>>>>>> >>>>>>> There is no need for the okToHaveAllocateHeap check here anymore. >>>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>>> slight improvement in flexibility in that it would no longer >>>>>>> require the slowdebug builds to match stackTraceAllocateHeap. >>>>>>> They could match any of the 3. You could then put all 3 allowed >>>>>>> stacktraces in an array and check them in a loop if you wish. >>>>>> >>>>>> The only change I have made (which might be obscured by the >>>>>> structure) is that if stackTraceDefault fails to match I then try >>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>>> unchanged. >>>>>> >>>>>> By the same argument you made I think it best to only expect the >>>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>>> notice when something changes - again I've made no change in this >>>>>> regard. >>>>> Since line 104 already verified that AllocateHeap does not appear >>>>> except possibly in slowdebug builds, it is harmless to check all >>>>> builds against the stacktrace that includes AllocateHeap. >>>> >>>> "Harmless" but a waste of time checking for a stack that we know >>>> can't match. The current version was at your suggestion: >>>> >>>> "You would need to check for all 3, limiting the AllocateHeap() one >>>> to just being allowed on solaris and windows slowdebug as it is now."
>>> That was before I realized there was already an explicit check for >>> AllocateHeap() to not be allowed except for slowdebug ones. Once I >>> realized that, it occurred to me that checking for all 3 stacktraces >>> in a loop would simplify the logic. >>>> >>>> Checking all three returns to my original version (modulo not >>>> removing the check for the AllocateHeap frame, and fixing the >>>> matching logic). >>> Your original version checked for a large number of permutations >>> that included any 3 of 5 specified frames, not checks for any of 3 >>> specific stacktraces (of 4 frames each). >> >> That was never the intent and what I was referring to when I said >> "and fixing the matching logic". >> >>>> >>>>> Also, if a slowdebug platform were to change to no longer include >>>>> AllocateHeap, checking it against the other two stacktraces would >>>>> allow the test to continue to pass without modification. >>>> >>>> This is counter to your earlier argument that we should be using >>>> this test to specifically check for such changes in compiler >>>> behaviour and update the platform specific guards accordingly. If >>>> you allow it to go either way then we would never remove the guard >>>> even when it was no longer needed on any platform. >>> But this is one compiler inlining behavior change that is ok. If >>> AllocateHeap() suddenly starts being inlined by slowdebug builds, >>> that is actually a good thing, and we would end up modifying the >>> test to allow it. So why not allow it now? >>>> >>>>> For these two reasons I was suggesting just always check all 3 >>>>> stacktraces until one passes. It would simplify the logic some. >>>> >>>> I'd need to change a number of other things make the main logic >>>> simpler (ie loop over all three stacks) but the error reporting >>>> part will be more awkward. 
And Thomas already complained about the >>>> number of times we scan the entire process output doing this >>>> matching, so this would make it worse - unless I completely change >>>> the way we do the matching, which then introduces more complexity >>>> and more likelihood of introducing new bugs. >>>> >>>> Let me know how you want to proceed. >>> >>> The loop idea was just to make the code simpler. If you feel it will >>> slow things down unacceptably, then I'm fine with the logic as-is in >>> v2, but you need to add JVM_DefineModule to the new stacktrace. >> >> Okay I intend to add the missing 4th frame, and print both potential >> stacks on failure, but otherwise leave at V2. >> >> Thanks, >> David >> ----- >> >>> thanks, >>> >>> Chris >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>>> >>>>>>> The following is no longer correct: >>>>>>> >>>>>>> ??140???????? throw new RuntimeException("Expected stack trace >>>>>>> missing from output: " + expectedStackTrace); >>>>>>> >>>>>>> In your current approach, expectedStackTrace is just the last >>>>>>> stacktrace we tried. Since we may try more than one, maybe all >>>>>>> the ones that failed to match should be listed (or none listed >>>>>>> if just too messy). >>>>>> >>>>>> It reports the last failing stacktrace, out of a possible two. >>>>>> Perhaps I can print both ... you want something in the jtr file >>>>>> so that it can be triaged without having to go and look up the >>>>>> test code. >>>>> Yeah, just pointing out that only printing one stacktrace might >>>>> lead the .jtr reader down the wrong path. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>>> >>>>>>>> Checks for alternate stack now. Added lots of comments and misc >>>>>>>> fixups. 
>>>>>>>> >>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>>> linux-x64). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>>> Thinking about this a bit more, there is still the potential >>>>>>>>> for some confusion if this test fails again in the future due >>>>>>>>> to the top frame missing. Is it missing because it got inlined >>>>>>>>> or is it missing because the frame skipping code skipped an >>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just >>>>>>>>> hastily add another valid stacktrace to the test but instead >>>>>>>>> investigates to make sure the issue is indeed that the method >>>>>>>>> got inlined. >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Okay I will simply check for the third alternative. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for >>>>>>>>>>> this callsite, and be updated in the future as needed. Not >>>>>>>>>>> ideal, but I don't see a better solution. It's similar to >>>>>>>>>>> the situation described in JDK-8163899 which covered the >>>>>>>>>>> fragility of the NMT frame skipping code. 
In the end it was >>>>>>>>>>> decided it would be easier to just fix issues as they >>>>>>>>>>> came up rather than engineer a solution that wasn't as >>>>>>>>>>> fragile. I think this test falls in the same category. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>>> os::malloc - now I get it. But I don't understand your >>>>>>>>>>>> final comment: >>>>>>>>>>>> >>>>>>>>>>>> > Looking at this code also reminds me of a reason to have the test >>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame skipping code >>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a needed frame >>>>>>>>>>>> > at the top. The way the test was written it would detect this. With your >>>>>>>>>>>> > changes it will not. It would just revert to always matching on 3 frames >>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>>> >>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>>> frames but one (or more) may be missing - i.e. how can I >>>>>>>>>>>> tell the difference between "Frame A was inlined" and "Frame >>>>>>>>>>>> A was skipped by mistake" ??
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the >>>>>>>>>>>>>>>>>>> other bugs I've mentioned: JDK-8133749, JDK-8133747, >>>>>>>>>>>>>>>>>>> and JDK-8133740. These bugs result in 1 or two >>>>>>>>>>>>>>>>>>> frames appearing in the stacktrace that should be >>>>>>>>>>>>>>>>>>> skipped. Notably NativeCallStack::NativeCallStack() >>>>>>>>>>>>>>>>>>> and os::get_native_stack(). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 73???????? // We should never see either of these >>>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>>> missed it. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed >>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds. 
>>>>>>>>>>>>>>>>>>> Although these builds should have honored the >>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable >>>>>>>>>>>>>>>>>>> that it was not in slowdebug builds. However, I >>>>>>>>>>>>>>>>>>> would not want to allow AllocateHeap() to appear in >>>>>>>>>>>>>>>>>>> a product build, and best not to see it in fastdebug >>>>>>>>>>>>>>>>>>> either. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this >>>>>>>>>>>>>>>>>> test to be checking for something specific to the >>>>>>>>>>>>>>>>>> native compiler. The previous handling of >>>>>>>>>>>>>>>>>> AllocateHeap seemed to be there simply because it was >>>>>>>>>>>>>>>>>> the only way to deal with an optional frame - but now >>>>>>>>>>>>>>>>>> that's handled generically. >>>>>>>>>>>>>>>>> Its appearance means you effectively only have 3 >>>>>>>>>>>>>>>>> frames to identify callsites instead of 4. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of >>>>>>>>>>>>>>>> those (new_entry) could actually be inlined as well, >>>>>>>>>>>>>>>> resulting in only 3 matches. So that is what the >>>>>>>>>>>>>>>> revised test checks for: at least 3 matches. Often >>>>>>>>>>>>>>>> there will be 4 matches. >>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() >>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test >>>>>>>>>>>>>>> should detect when this happens so the NMT >>>>>>>>>>>>>>> implementation can address the issue. >>>>>>>>>>>>>> >>>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>>>>>>> >>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (after >>>>>>>>>>>>> some pruning) that led to the os::malloc() call. "4" is >>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is >>>>>>>>>>>>> controlled by NMT_TrackingStackDepth. Making >>>>>>>>>>>>> NMT_TrackingStackDepth bigger means more refinement of the >>>>>>>>>>>>> callsites (thus more callsites), but a clearer picture of >>>>>>>>>>>>> what actually led to the os::malloc(). >>>>>>>>>>>>> >>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>>> a() calls b() calls c() calls d() calls os::malloc(), and >>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>>> not distinguish between these two call paths to >>>>>>>>>>>>> os::malloc(), and will consider both paths to be the same >>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would >>>>>>>>>>>>> always be a, b, c, and d. However, bump up >>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as >>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame >>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, >>>>>>>>>>>>> b, c, and d as the other 4 frames. >>>>>>>>>>>>> >>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then >>>>>>>>>>>>> every allocation that is the result of doing a "new" of >>>>>>>>>>>>> any CHeapObj subtype will have AllocateHeap() in its >>>>>>>>>>>>> callsite, which effectively lowers the callsite >>>>>>>>>>>>> refinement by 1. >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>>> ".*AllocateHeap.*\n" + >>>>>>>>>>>>>>>> ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"?
>>>>>>>>>>>>>>>> Was it getting inlined already when AllocateHeap was >>>>>>>>>>>>>>>> not? Even so we still end up with 4 frames matching >>>>>>>>>>>>>>>> normally. >>>>>>>>>>>>>>> I noticed that last night also and scratched my head over >>>>>>>>>>>>>>> it for a while and then went to bed. The only >>>>>>>>>>>>>>> explanation I could come up with is that >>>>>>>>>>>>>>> allocate_new_entry() is getting inlined, and as a result >>>>>>>>>>>>>>> (due to being a slowdebug build and doing minimal >>>>>>>>>>>>>>> inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If it does appear in a product build, a solution >>>>>>>>>>>>>>>>> should be looked into to get rid of it. If the port >>>>>>>>>>>>>>>>> owner decides it can't get rid of it (or is unwilling >>>>>>>>>>>>>>>>> to), then an exception should be added to the test >>>>>>>>>>>>>>>>> like was done for solaris and windows slowdebug builds. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's >>>>>>>>>>>>>>>> ability to inline that function and just happen to be >>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a >>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do >>>>>>>>>>>>>>>> it? The Visual Studio docs state: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds >>>>>>>>>>>>>>>> and could change with any update to the compiler. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline >>>>>>>>>>>>>>>> - specifically -xinline only has an effect at -xO3 or >>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>>> slowdebug. And there are other cases where it won't >>>>>>>>>>>>>>>> honour the ALWAYSINLINE.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if >>>>>>>>>>>>>>>> we want to ensure inlining when not optimising: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>>> unless you specify the 'always_inline' attribute for >>>>>>>>>>>>>>>> the function, like this: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /* Prototype.  */ >>>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we >>>>>>>>>>>>>>>> need something a bit more explicit than this test to >>>>>>>>>>>>>>>> determine that. >>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned >>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3 >>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug >>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think >>>>>>>>>>>>>>> it's appropriate to have a test to make sure we are not >>>>>>>>>>>>>>> seeing any of these 3 methods/functions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>>> clearly see what these inlining assumptions are that NMT >>>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>>> >>>>>>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>>>>>> various bugs that led to >>>>>>>>>>>>> NativeCallStack::NativeCallStack() and >>>>>>>>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>>>>>>>> callsite.
Reviewing the bugs I referred to will give you >>>>>>>>>>>>> an idea of where to look. One good place to look at >>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>>> code there that controls how many frames to skip based on >>>>>>>>>>>>> on the platform and whether optimized or not. Also some >>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing >>>>>>>>>>>>> in this method. >>>>>>>>>>>>> >>>>>>>>>>>>> Looking at this code also reminds me of a reason to have >>>>>>>>>>>>> the test continue to check for all 4 specific frames. If >>>>>>>>>>>>> the frame skipping code skips an extra frame, then the >>>>>>>>>>>>> callsite will be missing a needed frame at the top. The >>>>>>>>>>>>> way the test was written it would detect this. With your >>>>>>>>>>>>> changes it will not. It would just revert to always >>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping >>>>>>>>>>>>> bug would go unnoticed. >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Right that was an error on my part. I thought the >>>>>>>>>>>>>> existing MULTILINE pattern matching with .* would also >>>>>>>>>>>>>> find non-sequential lines and so I was acting similarly. >>>>>>>>>>>>>> I will re-think this. 
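The distinction discussed above, between frames that match on adjacent lines of a single stack trace and frames that merely appear in order somewhere in the output, can be sketched in Java. This is an illustrative sketch, not the code from the webrev: the class name, method names, and sample output are hypothetical, and it assumes the output uses "\n" line endings. It relies on the fact that '.' in java.util.regex does not match line terminators unless DOTALL is set, so each ".*frame.*\n" piece stays on one line.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FrameMatchSketch {
    /** True only if the frame patterns match on adjacent lines, in the given order. */
    static boolean matchesConsecutively(List<String> frames, String output) {
        StringBuilder regex = new StringBuilder();
        for (String frame : frames) {
            // '.' does not cross line boundaries, so this piece covers exactly one line.
            regex.append(".*").append(frame).append(".*\\n");
        }
        return Pattern.compile(regex.toString()).matcher(output).find();
    }

    /** Looser check: frames appear in order, but unrelated lines may sit between them. */
    static boolean matchesInOrder(List<String> frames, String output) {
        int from = 0;
        for (String frame : frames) {
            Matcher m = Pattern.compile(frame).matcher(output);
            if (!m.find(from)) {
                return false;
            }
            from = m.end();
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> frames = List.of("Hashtable.*allocate_new_entry", "ModuleEntryTable.*new_entry");
        // Made-up output: the two frames are separated by an unrelated line.
        String separated = "[0x01] Hashtable::allocate_new_entry\n"
                         + "[0x09] some_other_frame\n"
                         + "[0x02] ModuleEntryTable::new_entry\n";
        System.out.println("consecutive: " + matchesConsecutively(frames, separated));
        System.out.println("in order:    " + matchesInOrder(frames, separated));
    }
}
```

The looser check accepts output in which the frames come from different stack traces, which is the weakness pointed out in the review.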
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also >>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not >>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in >>>>>>>>>>>>>>>>>>> slowdebug builds. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames >>>>>>>>>>>>>>>>>>>> to be missing based on empirical observations. So >>>>>>>>>>>>>>>>>>>> to date we have seen two frames that may or may not >>>>>>>>>>>>>>>>>>>> be inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as >>>>>>>>>>>>>>>>>>>> now it is just an optional frame. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test >>>>>>>>>>>>>>>>>>>> as you intended? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> David From stefan.karlsson at oracle.com Mon Apr 8 05:07:42 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 8 Apr 2019 07:07:42 +0200 Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event In-Reply-To: References: Message-ID: <25f9d30f-9878-bbdd-5fc2-2d6bebb71775@oracle.com> Looks good! StefanK On 2019-04-08 03:49, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8221584 > webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/ > > I'm really just sponsoring this fix as the problem was diagnosed by > Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the > contributors and I'm already one Reviewer. > > There's a missing loadstore barrier between extracting the ParkEvent > from an ObjectWaiter node, and setting the node's TState to allow the > entering thread to proceed. It seems our recent update to gcc 8.2 > resulted in the compiler reordering those two actions, meaning that > the ObjectWaiter pointer could now be pointing into a stack location > with random contents. That might manifest as a SEGV or we may treat > random memory as a pthread_mutex_t and get an EINVAL (or potentially > other errors) on pthread_mutex_lock.
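The ordering requirement described above (read the event out of the node first, and only then publish the state change that lets the waiter proceed) can be illustrated with a small Java analogue. This is a sketch under stated assumptions, not the HotSpot fix: `Event` and `WaiterNode` are hypothetical stand-ins for ParkEvent and ObjectWaiter, and `VarHandle.releaseFence()` is used here because it is documented to keep loads and stores before the fence from being reordered with stores after it, which covers the load-then-store case.

```java
import java.lang.invoke.VarHandle;

final class WakeOrderingSketch {
    /** Hypothetical stand-in for a park event; not HotSpot's ParkEvent. */
    static final class Event {
        boolean signalled;
        void unpark() { signalled = true; }
    }

    /** Hypothetical stand-in for a waiter node owned by the waiting thread. */
    static final class WaiterNode {
        static final int WAITING = 0;
        static final int RUNNABLE = 1;
        Event event = new Event();
        int state = WAITING;
    }

    static Event wake(WaiterNode node) {
        Event ev = node.event;            // 1. load the event while the node is still valid
        VarHandle.releaseFence();         // 2. the load above may not sink below the store that follows
        node.state = WaiterNode.RUNNABLE; // 3. after this store the waiter may discard the node
        ev.unpark();                      // 4. only the local copy is used from here on
        return ev;
    }

    public static void main(String[] args) {
        WaiterNode node = new WaiterNode();
        Event ev = wake(node);
        System.out.println("state=" + node.state + " signalled=" + ev.signalled);
    }
}
```

Without step 2, a compiler would be free to perform the load in step 1 after the store in step 3, at which point the node may no longer be valid; that is the failure mode the message describes.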
> > Testing: mach5 tiers 1-3 (sanity - the added barrier can't break > anything) > > Thanks, > David From david.holmes at oracle.com Mon Apr 8 05:27:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 15:27:16 +1000 Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event In-Reply-To: <25f9d30f-9878-bbdd-5fc2-2d6bebb71775@oracle.com> References: <25f9d30f-9878-bbdd-5fc2-2d6bebb71775@oracle.com> Message-ID: <0835af6d-e4e5-a92c-430e-0ae68fdb851d@oracle.com> On 8/04/2019 3:07 pm, Stefan Karlsson wrote: > Looks good! Thanks - do you want to be co-contributor or a reviewer? :) David > StefanK > > On 2019-04-08 03:49, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221584 >> webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/ >> >> I'm really just sponsoring this fix as the problem was diagnosed by >> Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the >> contributors and I'm already one Reviewer. >> >> There's a missing loadstore barrier between extracting the ParkEvent >> from an ObjectWaiter node, and setting the node's TState to allow the >> entering thread to proceed. It seems our recent update to gcc 8.2 >> resulted in the compiler reordering those two actions, meaning that >> the ObjectWaiter pointer could now be pointing into a stack location >> with random contents. That might manifest as a SEGV or we may treat >> random memory as a pthread_mutex_t and get an EINVAL (or potentially >> other errors) on pthread_mutex_lock.
>> >> Testing: mach5 tiers 1-3 (sanity - the added barrier can't break >> anything) >> >> Thanks, >> David > From david.holmes at oracle.com Mon Apr 8 06:03:32 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 16:03:32 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <95a8f862-1dfd-7769-2b85-0e47f444e411@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> <95a8f862-1dfd-7769-2b85-0e47f444e411@oracle.com> Message-ID: <9b518aa5-3607-fc95-c3f1-d869bc129fb2@oracle.com> Thanks Chris! David On 8/04/2019 2:33 pm, Chris Plummer wrote: > Hi David, > > Looks good. > > thanks, > > Chris > > On 4/7/19 4:38 PM, David Holmes wrote: >> Okay here is updated webrev: >> >> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/ >> >> with the incremental change shown in: >> >> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v3/incr_v2.patch >> >> I added the extra frame for the alternate stack and now print the >> stack we are looking for as part of the normal test output rather than >> trying to include it in the exception message when we fail. >> >> Zhengyu: can you please test this again on your platforms. 
Again I >> can't test the alternate stack matching as I have no systems where it >> fails (nor can I test slowedebug other than Linux). >> >> Thanks, >> David >> ----- >> >> On 7/04/2019 5:10 pm, David Holmes wrote: >>> Hi Chris, >>> >>> On 7/04/2019 4:51 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> On 4/6/19 11:06 PM, David Holmes wrote: >>>>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>>>> >>>>>>> ?? That isn't part of any of the existing stacktraces. >>>>>> See the following comment from Zhengyu in the CR: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> That comment simply includes a fragment of a stack which happens to >>>>> include JVM_DefineModule and makes no further mention of it. I >>>>> don't recall anyone saying that we should now be including that >>>>> frame in the check. >>>>> >>>>> Do you want the test extended to also check for that frame? >>>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule >>>> is the additional frame that now appears in the detail output for >>>> that call chain. So yes, the test should include it. If the inlining >>>> of ModuleEntryTable::new_entry() had always happened, then the test >>>> would originally have checked for the stacktrace as it appears in >>>> the CR comment. >>> >>> I see - to be clear you want to always check for 4 frames, so the >>> additional frame is only checked for the alternate stack. >>> >>>>>>> >>>>>>>> Since you've added the following: >>>>>>>> >>>>>>>> ??103???????? if (!okToHaveAllocateHeap) { >>>>>>>> ??104 output.shouldNotContain("AllocateHeap"); >>>>>>>> ??105???????? 
} >>>>>>> >>>>>>> I didn't add that - see old code line 80. >>>>>> Ok, but my comment below still applies since this check is in place. >>>>>>> >>>>>>>> You can simplify the following: >>>>>>>> >>>>>>>> ??123???????? if (okToHaveAllocateHeap) { >>>>>>>> ??124???????????? expectedStackTrace = stackTraceAllocateHeap; >>>>>>>> ??125???????????? if (stackTraceMatches(expectedStackTrace, >>>>>>>> output)) { >>>>>>>> ??126???????????????? return; >>>>>>>> ??127???????????? } >>>>>>>> ??128???????? } else { >>>>>>>> >>>>>>>> The is no need for the okToHaveAllocateHeap check here anymore. >>>>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>>>> slight improvement in flexibility in that it would no longer >>>>>>>> require the slowdebug builds to match stackTraceAllocateHeap. >>>>>>>> They could match any of the 3. You could then put all 3 allowed >>>>>>>> stacktraces in an array and check them in a loop if you wish. >>>>>>> >>>>>>> The only change I have made (which might be obscured by the >>>>>>> structure) is that if stackTraceDefault fails to match I then try >>>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>>>> unchanged. >>>>>>> >>>>>>> By the same argument you made I think it best to only expect the >>>>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>>>> notice when something changes - again I've mode no change in this >>>>>>> regard. >>>>>> Since line 104 already verified that AllocateHeap does not appear >>>>>> except possibly in slow debug heaps, it is harmless to check all >>>>>> builds against the stacktrace that includes AllocateHeap. >>>>> >>>>> "Harmless" but a waste of time checking for a stack that we know >>>>> can't match. The current version was at your suggestion: >>>>> >>>>> "You would need to check for all 3, limiting the AllocateHeap() one >>>>> to just being allowed on solaris and windows slowdebug as it is now." 
>>>> That was before I realized there was already an explicit check for >>>> AllocateHeap() to not be allowed except for slowdebug ones. Once I >>>> realized that, it occurred to me that checking for all 3 stacktraces >>>> in a loop would simplify the logic. >>>>> >>>>> Checking all three returns to my original version (modulo not >>>>> removing the check for the AllocateHeap frame, and fixing the >>>>> matching logic). >>>> Your original version checked for a large number of permutations >>>> that included any 3 of 5 specified frames, not checks for any of 3 >>>> specific stacktraces (of 4 frames each). >>> >>> That was never the intent and what I was referring to when I said >>> "and fixing the matching logic". >>> >>>>> >>>>>> Also, if a slowdebug platform were to change to no longer include >>>>>> AllocateHeap, checking it against the other two stacktraces would >>>>>> allow the test to continue to pass without modification. >>>>> >>>>> This is counter to your earlier argument that we should be using >>>>> this test to specifically check for such changes in compiler >>>>> behaviour and update the platform specific guards accordingly. If >>>>> you allow it to go either way then we would never remove the guard >>>>> even when it was no longer needed on any platform. >>>> But this is one compiler inlining behavior change that is ok. If >>>> AllocateHeap() suddenly starts being inlined by slowdebug builds, >>>> that is actually a good thing, and we would end up modifying the >>>> test to allow it. So why not allow it now? >>>>> >>>>>> For these two reasons I was suggesting just always check all 3 >>>>>> stacktraces until one passes. It would simplify the logic some. >>>>> >>>>> I'd need to change a number of other things make the main logic >>>>> simpler (ie loop over all three stacks) but the error reporting >>>>> part will be more awkward. 
And Thomas already complained about the >>>>> number of times we scan the entire process output doing this >>>>> matching, so this would make it worse - unless I completely change >>>>> the way we do the matching, which then introduces more complexity >>>>> and more likelihood of introducing new bugs. >>>>> >>>>> Let me know how you want to proceed. >>>> >>>> The loop idea was just to make the code simpler. If you feel it will >>>> slow things down unacceptably, then I'm fine with the logic as-is in >>>> v2, but you need to add JVM_DefineModule to the new stacktrace. >>> >>> Okay I intend to add the missing 4th frame, and print both potential >>> stacks on failure, but otherwise leave at V2. >>> >>> Thanks, >>> David >>> ----- >>> >>>> thanks, >>>> >>>> Chris >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>>> >>>>>>>> The following is no longer correct: >>>>>>>> >>>>>>>> ??140???????? throw new RuntimeException("Expected stack trace >>>>>>>> missing from output: " + expectedStackTrace); >>>>>>>> >>>>>>>> In your current approach, expectedStackTrace is just the last >>>>>>>> stacktrace we tried. Since we may try more than one, maybe all >>>>>>>> the ones that failed to match should be listed (or none listed >>>>>>>> if just too messy). >>>>>>> >>>>>>> It reports the last failing stacktrace, out of a possible two. >>>>>>> Perhaps I can print both ... you want something in the jtr file >>>>>>> so that it can be triaged without having to go and look up the >>>>>>> test code. >>>>>> Yeah, just pointing out that only printing one stacktrace might >>>>>> lead the .jtr reader down the wrong path. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>>>> >>>>>>>>> Checks for alternate stack now. 
Added lots of comments and misc >>>>>>>>> fixups. >>>>>>>>> >>>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>>>> linux-x64). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>>>> Thinking about this a bit more, there is still the potential >>>>>>>>>> for some confusion if this test fails again in the future due >>>>>>>>>> to the top frame missing. Is it missing because it got inlined >>>>>>>>>> or is it missing because the frame skipping code skipped an >>>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just >>>>>>>>>> hastily add another valid stacktrace to the test but instead >>>>>>>>>> investigates to make sure the issue is indeed that the method >>>>>>>>>> got inlined. >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> Okay I will simply check for the third alternative. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for >>>>>>>>>>>> this callsite, and be updated in the future as needed. Not >>>>>>>>>>>> ideal, but I don't see a better solution. It's similar to >>>>>>>>>>>> the situation described in JDK-8163899 which covered the >>>>>>>>>>>> fragility of the NMT frame skipping code. 
In the end it was >>>>>>>>>>>> decided it would be easier to just fix issues as they >>>>>>>>>>>> came up rather than engineer a solution that wasn't as >>>>>>>>>>>> fragile. I think this test falls in the same category. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>>>> os::malloc - now I get it. But I don't understand your >>>>>>>>>>>>> final comment: >>>>>>>>>>>>> >>>>>>>>>>>>> > Looking at this code also reminds me of a reason to have >>>>>>>>>>>>> the test >>>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>>>>>> skipping code >>>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>>>>>> needed frame >>>>>>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>>>>>> this. With your >>>>>>>>>>>>> > changes it will not. It would just revert to always >>>>>>>>>>>>> matching on 3 frames >>>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>>>> >>>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>>>> frames but one (or more) may be missing - i.e. how can I >>>>>>>>>>>>> tell the difference between "Frame A was inlined" and "Frame >>>>>>>>>>>>> A was skipped by mistake" ?? 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the >>>>>>>>>>>>>>>>>>>> other bugs I've mentioned: JDK-8133749, JDK-8133747, >>>>>>>>>>>>>>>>>>>> and JDK-8133740. These bugs result in 1 or two >>>>>>>>>>>>>>>>>>>> frames appearing in the stacktrace that should be >>>>>>>>>>>>>>>>>>>> skipped. Notably NativeCallStack::NativeCallStack() >>>>>>>>>>>>>>>>>>>> and os::get_native_stack(). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 73???????? // We should never see either of these >>>>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>>>> missed it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed >>>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds. 
>>>>>>>>>>>>>>>>>>>> Although these builds should have honored the >>>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable >>>>>>>>>>>>>>>>>>>> that it was not in slowdebug builds. However, I >>>>>>>>>>>>>>>>>>>> would not want to allow AllocateHeap() to appear in >>>>>>>>>>>>>>>>>>>> a product build, and best not to see it in fastdebug >>>>>>>>>>>>>>>>>>>> either. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this >>>>>>>>>>>>>>>>>>> test to be checking for something specific to the >>>>>>>>>>>>>>>>>>> native compiler. The previous handling of >>>>>>>>>>>>>>>>>>> AllocateHeap seemed to be there simply because it was >>>>>>>>>>>>>>>>>>> the only way to deal with an optional frame - but now >>>>>>>>>>>>>>>>>>> that's handled generically. >>>>>>>>>>>>>>>>>> Its appearance means you effectively only have 3 >>>>>>>>>>>>>>>>>> frames to identify callsites instead of 4. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of >>>>>>>>>>>>>>>>> those (new_entry) could actually be inlined as well, >>>>>>>>>>>>>>>>> resulting in only 3 matches. So that is what the >>>>>>>>>>>>>>>>> revised test checks for: at least 3 matches. Often >>>>>>>>>>>>>>>>> there will be 4 matches. >>>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() >>>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test >>>>>>>>>>>>>>>> should detect when this happens so the NMT >>>>>>>>>>>>>>>> implementation can address the issue. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters >>>>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is >>>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is >>>>>>>>>>>>>> controlled by NMT_TrackingStackDepth. Making >>>>>>>>>>>>>> NMT_TrackingStackDepth bigger means more refinement of the >>>>>>>>>>>>>> callsites (thus more callsites), but a clearer picture of >>>>>>>>>>>>>> what actually led to the os:malloc(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and >>>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>>>> not distinguish between these two calls paths to >>>>>>>>>>>>>> os:mallco(), and will consider both paths to be the same >>>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would >>>>>>>>>>>>>> always be a, b, c, and d. However, bump up >>>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as >>>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame >>>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, >>>>>>>>>>>>>> b, c, and d as the other 4 frames. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then >>>>>>>>>>>>>> every allocation that is the result of doing a "new" of >>>>>>>>>>>>>> any CHeapObj subtype will have AllocateHeap() in its >>>>>>>>>>>>>> callsite, which effectively lowers they callsite >>>>>>>>>>>>>> refinement by 1. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>>>>>>>>> ? 
52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? >>>>>>>>>>>>>>>>> Was it getting inlined already when AllocateHeap was >>>>>>>>>>>>>>>>> not? Even so we still end up with 4 frames matching >>>>>>>>>>>>>>>>> normally. >>>>>>>>>>>>>>>> I noticed that last night also and scratch my head over >>>>>>>>>>>>>>>> it for a while and then went to bed. The only >>>>>>>>>>>>>>>> explanation I could come up with is that >>>>>>>>>>>>>>>> allocate_new_entry() is getting inlined, and as a result >>>>>>>>>>>>>>>> (due to being a slowdebug build and doing minimal >>>>>>>>>>>>>>>> inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> If it does appear in a product build, a solution >>>>>>>>>>>>>>>>>> should be looked into to get rid of it. If the port >>>>>>>>>>>>>>>>>> owner decides it can't get rid of it (or is unwilling >>>>>>>>>>>>>>>>>> to), then an exception should be added to the test >>>>>>>>>>>>>>>>>> like was done for solaris and windows slowdebug builds. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's >>>>>>>>>>>>>>>>> ability to inline that function and just happen to be >>>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a >>>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do >>>>>>>>>>>>>>>>> it? The Visual Studio docs state: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds >>>>>>>>>>>>>>>>> and could change with any update to the compiler. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline >>>>>>>>>>>>>>>>> - specifically -xinline only has an effect at ?xO3 or >>>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>>>> slowdebug. 
And there are other cases where it won't >>>>>>>>>>>>>>>>> honour the ALWAYSINLINE. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if >>>>>>>>>>>>>>>>> we want to ensure inlining when not optimising: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>>>> unless you specify the ?always_inline? attribute for >>>>>>>>>>>>>>>>> the function, like this: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we >>>>>>>>>>>>>>>>> need something a bit more explicit than this test to >>>>>>>>>>>>>>>>> determine that. >>>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned >>>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3 >>>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug >>>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think >>>>>>>>>>>>>>>> it's appropriate to have a test to make sure we are not >>>>>>>>>>>>>>>> seeing any of these 3 methods/functions. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT >>>>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Not that I know of. 
I discovered them while looking at the >>>>>>>>>>>>>> various bugs that led to >>>>>>>>>>>>>> NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>>>>>>>>> callsite. Reviewing the bugs I referred to will give you >>>>>>>>>>>>>> an idea of where to look. One good place to look at >>>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>>>> code there that controls how many frames to skip based on >>>>>>>>>>>>>> on the platform and whether optimized or not. Also some >>>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing >>>>>>>>>>>>>> in this method. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Looking at this code also reminds me of a reason to have >>>>>>>>>>>>>> the test continue to check for all 4 specific frames. If >>>>>>>>>>>>>> the frame skipping code skips an extra frame, then the >>>>>>>>>>>>>> callsite will be missing a needed frame at the top. The >>>>>>>>>>>>>> way the test was written it would detect this. With your >>>>>>>>>>>>>> changes it will not. It would just revert to always >>>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping >>>>>>>>>>>>>> bug would go unnoticed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Right that was an error on my part. 
I thought the >>>>>>>>>>>>>>> existing MULTILINE pattern matching with .* would also >>>>>>>>>>>>>>> find non-sequential lines and so I was acting similarly. >>>>>>>>>>>>>>> I will re-think this. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also >>>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not >>>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in >>>>>>>>>>>>>>>>>>>> slowdebug builds. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames >>>>>>>>>>>>>>>>>>>>> to be missing based on empirical observations. So >>>>>>>>>>>>>>>>>>>>> to date we have seen two frames that may or may not >>>>>>>>>>>>>>>>>>>>> be inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as >>>>>>>>>>>>>>>>>>>>> now it is just an optional frame. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test >>>>>>>>>>>>>>>>>>>>> as you intended? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> > > From robbin.ehn at oracle.com Mon Apr 8 07:13:31 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 8 Apr 2019 09:13:31 +0200 Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event In-Reply-To: <0835af6d-e4e5-a92c-430e-0ae68fdb851d@oracle.com> References: <25f9d30f-9878-bbdd-5fc2-2d6bebb71775@oracle.com> <0835af6d-e4e5-a92c-430e-0ae68fdb851d@oracle.com> Message-ID: On 4/8/19 7:27 AM, David Holmes wrote: > On 8/04/2019 3:07 pm, Stefan Karlsson wrote: >> Looks good! +1 Thanks, Robbin (I can be reviewer) > > Thanks - do you want to be co-contributor or a reviewer?? :) > > David > >> StefanK >> >> On 2019-04-08 03:49, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221584 >>> webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/ >>> >>> I'm really just sponsoring this fix as the problem was diagnozed by Robbin >>> Ehn and Stefan Karlsson - thanks guys! :) So they are the contributors and >>> I'm already one Reviewer. >>> >>> There's a missing loadstore barrier between extracting the ParkEvent from an >>> ObjectWaiter node, and setting the node's TState to allow the the entering >>> thread to proceed. It seems our recent update to gcc 8.2 resulted in the >>> compiler reordering those two actions, meaning that the Objectwaiter pointer >>> could now be pointing into a stack location with random contents. 
That might >>> manifest as a SEGV or we may treat random memory as a pthread_mutex_t and get >>> an EINVAL (or potentially other errors) on pthread_mutex_lock. >>> >>> Testing: mach5 tiers 1-3 (sanity - the added barrier can't break anything) >>> >>> Thanks, >>> David >> From david.holmes at oracle.com Mon Apr 8 07:48:38 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 17:48:38 +1000 Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event In-Reply-To: References: <25f9d30f-9878-bbdd-5fc2-2d6bebb71775@oracle.com> <0835af6d-e4e5-a92c-430e-0ae68fdb851d@oracle.com> Message-ID: <6189f901-1ffd-6b73-9369-0f8bb1fc3575@oracle.com> Thanks Robbin. I guess between the three of us we have this covered one way or another. :) David On 8/04/2019 5:13 pm, Robbin Ehn wrote: > On 4/8/19 7:27 AM, David Holmes wrote: >> On 8/04/2019 3:07 pm, Stefan Karlsson wrote: >>> Looks good! > > +1 > > Thanks, Robbin (I can be reviewer) > >> >> Thanks - do you want to be co-contributor or a reviewer? :) >> >> David >> >>> StefanK >>> >>> On 2019-04-08 03:49, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221584 >>>> webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/ >>>> >>>> I'm really just sponsoring this fix as the problem was diagnosed by >>>> Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the >>>> contributors and I'm already one Reviewer. >>>> >>>> There's a missing loadstore barrier between extracting the ParkEvent >>>> from an ObjectWaiter node, and setting the node's TState to allow >>>> the entering thread to proceed. It seems our recent update to >>>> gcc 8.2 resulted in the compiler reordering those two actions, >>>> meaning that the ObjectWaiter pointer could now be pointing into a >>>> stack location with random contents. 
That might manifest as a SEGV >>>> or we may treat random memory as a pthread_mutex_t and get an EINVAL >>>> (or potentially other errors) on pthread_mutex_lock. >>>> >>>> Testing: mach5 tiers 1-3 (sanity - the added barrier can't break >>>> anything) >>>> >>>> Thanks, >>>> David >>> From claes.redestad at oracle.com Mon Apr 8 08:41:23 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 10:41:23 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero Message-ID: Hi, by adding a bit to String that is true iff String.hash has been calculated as being 0, we can get rid of the corner case where such hash codes are recalculated on every call. Peter Levart came up with an elegant scheme for ensuring that we can keep using non-volatile stores without explicit fencing and still reap the benefits of this[1], and I've synced up the hotspot code that deals with the String.hash value to mirror that logic. Bug: https://bugs.openjdk.java.net/browse/JDK-8221836 Webrev: http://cr.openjdk.java.net/~redestad/8221836/open.01/ Since there exist small padding gaps in the current object layout of strings (on all VM bitness and compressed oops varieties), adding this boolean does not add any extra footprint per String instance. Testing: tier1-3, verified a speed-up in targeted microbenchmarks. Thanks! /Claes [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059480.html From shade at redhat.com Mon Apr 8 08:56:11 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 8 Apr 2019 10:56:11 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: Message-ID: On 4/8/19 10:41 AM, Claes Redestad wrote: > by adding a bit to String that is true iff String.hash has been calculated as being 0, we can get > rid of the corner case where such hash > codes are recalculated on every call. 
> > Peter Levart came up with an elegant scheme for ensuring that we can keep > using non-volatile stores without explicit fencing and still reap the > benefits of this[1], and I've synced up the hotspot code that deals with > the String.hash value to mirror that logic. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221836 > Webrev: http://cr.openjdk.java.net/~redestad/8221836/open.01/ > > Since there exist small padding gaps in the current object layout of > strings (on all VM bitness and compressed oops varieties), adding this > boolean does not add any extra footprint per String instance. Regardless, I think this change does not carry its weight. Introducing special paths for handling something as obscure as zero hash code, which then raises questions about correctness (I had a hard time convincing myself that code is concurrency-safe), seems rather odd to me. It is a sane engineering tradeoff to make code more maintainable by accepting a performance hit in 2^(-32) of cases. -Aleksey From claes.redestad at oracle.com Mon Apr 8 09:25:11 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 11:25:11 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: Message-ID: On 2019-04-08 10:56, Aleksey Shipilev wrote: > Regardless, I think this change does not carry its weight. Introducing special paths for handling > something as obscure as zero hash code, which then raises questions about correctness (I had a hard > time convincing myself that code is concurrency-safe), seems rather odd to me. It is a sane > engineering tradeoff to make code more maintainable by accepting a performance hit in 2^(-32) of cases. Sure, String::hashCode/hash_code locally becomes a bit more complex, but I view this as being a net improvement on the total amount of special handling we need to do for Strings and their hash codes. 
While the performance gain on most real world use cases is likely to be non-existent, there exist some past concerns with injecting zero hash Strings into poorly implemented caches. This patch adds some defense-in-depth to help avoid issues that could otherwise arise from use of Strings as keys in some hashing data structures. /Claes From shade at redhat.com Mon Apr 8 09:35:53 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 8 Apr 2019 11:35:53 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: Message-ID: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> On 4/8/19 11:25 AM, Claes Redestad wrote: > On 2019-04-08 10:56, Aleksey Shipilev wrote: >> Regardless, I think this change does not carry its weight. Introducing special paths for handling >> something as obscure as zero hash code, which then raises questions about correctness (I had a hard >> time convincing myself that code is concurrency-safe), seems rather odd to me. It is a sane >> engineering tradeoff to make code more maintainable by accepting a performance hit in 2^(-32) of >> cases. > > Sure, String::hashCode/hash_code locally becomes a bit more complex, but > I view this as being a net improvement on the total amount of special > handling we need to do for Strings and their hash codes. I don't see it. The change *added* new handling for the flag in all those places we used to handle zero hash code, and then some. > While the performance gain on most real world use cases is likely to be > non-existent, there exist some past concerns with injecting zero hash > Strings into poorly implemented caches. This patch adds some > defense-in-depth to help avoid issues that could otherwise arise from use of > Strings as keys in some hashing data structures. That does not make much sense to me: it is much easier to construct hashcode collisions than to generate unique strings with zero hashcodes. 
Alternative hashing was there to mitigate that, and it would also handle zero hash attacks. -Aleksey From claes.redestad at oracle.com Mon Apr 8 09:42:54 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 11:42:54 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> Message-ID: <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> On 2019-04-08 11:35, Aleksey Shipilev wrote: >> Sure, String::hashCode/hash_code locally becomes a bit more complex, but >> I view this as being a net improvement on the total amount of special >> handling we need to do for Strings and their hash codes. > I don't see it. The change *added* new handling for the flag in all those places we used to handle > zero hash code, and then some. There's a few simple boilerplate methods added and the logic of hash_code(string) is consolidated to mimic String::hashCode, but code at the real call-sites like stringDedupTable and stringTable is simplified. /Claes From adinn at redhat.com Mon Apr 8 10:15:29 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 8 Apr 2019 11:15:29 +0100 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> Message-ID: <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> On 08/04/2019 10:42, Claes Redestad wrote: > On 2019-04-08 11:35, Aleksey Shipilev wrote: >>> Sure, String::hashCode/hash_code locally becomes a bit more complex, but >>> I view this as being a net improvement on the total amount of special >>> handling we need to do for Strings and their hash codes. >> I don't see it. The change *added* new handling for the flag in all >> those places we used to handle >> zero hash code, and then some. 
> > There's a few simple boilerplate methods added and the logic of > hash_code(string) is consolidated to mimic String::hashCode, but code at > the real call-sites like stringDedupTable and stringTable is simplified. Aleksey, I'm definitely buying Claes's argument on this point. Also, I think your other quibble suffers from "what-aboutism" -- the fact that there are other ways for perverse performance issues to manifest (hashcode collisions) doesn't mean that this gap should not be plugged. However, you also said in your opening criticism "I had a hard time convincing myself that code is concurrency-safe" I think that is a more telling complaint. Can you elaborate on why you found it hard to convince yourself of this? (I know what I think is the issue and I don't view it as an /especially/ thorny problem). Claes, is there a reason why you named the argument to method hash_is_set 'string' when every other method uses the name 'java_string'? Is your 'j' key a tad sticky? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From peter.levart at gmail.com Mon Apr 8 10:24:58 2019 From: peter.levart at gmail.com (Peter Levart) Date: Mon, 8 Apr 2019 12:24:58 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: Message-ID: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> I think the main benefit of this patch is the emptyString.hashCode() speedup. By holding a boolean flag in the String object itself, there is one less de-reference to be made on the fast path in the case of an empty string. This shows in the microbenchmark and would show even more if code iterated over many different instances of empty strings that don't share the underlying array, invoking .hashCode() on them. Which, I admit, is not a frequent case in practice, but hey, it is a speedup after all.
Regards, Peter On 4/8/19 10:56 AM, Aleksey Shipilev wrote: > On 4/8/19 10:41 AM, Claes Redestad wrote: >> by adding a bit to String that is true iff String.hash has been calculated as being 0, we can get >> rid of the corner case where such hash >> codes are recalculated on every call. >> >> Peter Levart came up with an elegant scheme for ensuring that we can keep >> using non-volatile stores without explicit fencing and still reap the >> benefits of this[1], and I've synced up the hotspot code that deals with >> the String.hash value to mirror that logic. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221836 >> Webrev: http://cr.openjdk.java.net/~redestad/8221836/open.01/ >> >> Since there exist small padding gaps in the current object layout of >> strings (on all VM bitness and compressed oops varieties), adding this >> boolean does not add any extra footprint per String instance. > Regardless, I think this change does not carry its weight. Introducing special paths for handling > something as obscure as zero hash code, which then raises questions about correctness (I had a hard > time convincing myself that code is concurrency-safe), seems rather odd to me. It is a sane > engineering tradeoff to make code more maintainable by accepting a performance hit in 2^(-32) of cases.
> > -Aleksey > From shade at redhat.com Mon Apr 8 10:28:03 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 8 Apr 2019 12:28:03 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> Message-ID: <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> On 4/8/19 12:15 PM, Andrew Dinn wrote: > On 08/04/2019 10:42, Claes Redestad wrote: >> On 2019-04-08 11:35, Aleksey Shipilev wrote: >>>> Sure, String::hashCode/hash_code locally becomes a bit more complex, but >>>> I view this as being a net improvement on the total amount of special >>>> handling we need to do for Strings and their hash codes. >>> I don't see it. The change *added* new handling for the flag in all >>> those places we used to handle >>> zero hash code, and then some. >> >> There's a few simple boilerplate methods added and the logic of >> hash_code(string) is consolidated to mimic String::hashCode, but code at >> the real call-sites like stringDedupTable and stringTable is simplified. Again, I don't see it. The same cleanup (moving hash computation code to java_lang_String::hash*) can be done without introducing the flag? > Aleksey, I'm definitely buying Claes argument on this point. Also, I > think your other quibble suffers from "what-aboutism" -- the fact that > there are other ways for perverse performance issues to manifest > (hashcode collisions) doesn't mean that this gap should not be plugged. Read carefully: I said that alternative hashing that is there to mitigate hashcode collisions *also* takes care of zero hashcode attacks. So if we do care about obscure zero hashcode attack, we can piggyback on the already implemented mechanism that is there to mitigate the much broader attack. 
> However, you also said in your opening criticism > > "I had hard time convincing myself that code is concurrency-safe" > > I think that is a more telling complaint. Can you elaborate on why you > found it hard to convince yourself of this? (I know what I think is the Because the whole thing in current code is "benign data race" on hash field. Pulling in another field into race needs careful consideration if it breaks the benignity. It apparently does not, but the cognitive complexity involved in reading that code makes the minuscule benefit much more questionable. -Aleksey From claes.redestad at oracle.com Mon Apr 8 10:59:11 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 12:59:11 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> Message-ID: On 2019-04-08 12:15, Andrew Dinn wrote: > Claes, is there a reason why you named the argument to method > hash_is_set 'string' when every other method uses the name > 'java_string'? Is you 'j' key a tad sticky? I took the cue from set_hash which uses the 'string' naming, but yes, you're right to point out naming of arguments in this API is now a bit inconsistent. 
/Claes From claes.redestad at oracle.com Mon Apr 8 11:00:54 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 13:00:54 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> Message-ID: <99cbd4e8-ab41-97ba-1a68-fff63f033e66@oracle.com> On 2019-04-08 12:28, Aleksey Shipilev wrote: > Because the whole thing in current code is "benign data race" on hash field. Pulling in another > field into race needs careful consideration if it breaks the benignity. It apparently does not, but > the cognitive complexity involved in reading that code makes the minuscule benefit much more > questionable. Can some carefully worded comments in the vicinity ease your concern? /Claes From shade at redhat.com Mon Apr 8 11:12:39 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 8 Apr 2019 13:12:39 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <99cbd4e8-ab41-97ba-1a68-fff63f033e66@oracle.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <99cbd4e8-ab41-97ba-1a68-fff63f033e66@oracle.com> Message-ID: On 4/8/19 1:00 PM, Claes Redestad wrote: > On 2019-04-08 12:28, Aleksey Shipilev wrote: >> Because the whole thing in current code is "benign data race" on hash field. Pulling in another >> field into race needs careful consideration if it breaks the benignity. It apparently does not, but >> the cognitive complexity involved in reading that code makes the minuscule benefit much more >> questionable. > > Can some carefully worded comments in the vicinity ease your concern? 
No. I am against deviating from the benign data race template, for either current or future changes, without the clearly overwhelming benefit of doing so. We have enough bugs as it is to risk exposing more bugs for the cases where the benefit is non-compelling. Therefore, my concern would be alleviated if we don't do this change at all. -Aleksey From peter.levart at gmail.com Mon Apr 8 11:28:20 2019 From: peter.levart at gmail.com (Peter Levart) Date: Mon, 8 Apr 2019 13:28:20 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> Message-ID: <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> On 4/8/19 12:28 PM, Aleksey Shipilev wrote: >> However, you also said in your opening criticism >> >> "I had hard time convincing myself that code is concurrency-safe" >> >> I think that is a more telling complaint. Can you elaborate on why you >> found it hard to convince yourself of this? (I know what I think is the > Because the whole thing in current code is "benign data race" on hash field. Pulling in another > field into race needs careful consideration if it breaks the benignity. It apparently does not, but > the cognitive complexity involved in reading that code makes the minuscule benefit much more > questionable. > > -Aleksey > Hi Aleksey, The reasoning is very similar as with just one field. With one field (hash) the thread sees either the default value (0) or a non-zero value calculated either by this thread sometime before or by a concurrent thread that has already stored it. Regardless of ordering, the thread either uses the non-zero value or (re)calculates it (again). 
The value calculation is deterministic and uses immutable published state (the array), so it always calculates the same value for the same object. Idempotence is guaranteed. The same reasoning can be extended to a general case where there are many fields used for caching of a calculated state from some immutable published state. The constraint is that the calculation must be deterministic and must also deterministically choose which of the many fields used for caching is to be modified. Only one field may be modified, never more than one. The thread therefore sees either the default values of all fields or the default values of all but one field which has been set by either this thread sometime before or by a concurrent thread. Regardless of ordering, the thread either uses the state combined from the default values of all fields but one and a non-default value of a single field or (re)calculates the non-default value of the single field. The value calculation is deterministic, uses immutable published state and deterministically chooses the field to modify, so it always calculates the same "next" state for the object. Idempotence is guaranteed. Regards, Peter From shade at redhat.com Mon Apr 8 11:40:43 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 8 Apr 2019 13:40:43 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> Message-ID: On 4/8/19 1:28 PM, Peter Levart wrote: > On 4/8/19 12:28 PM, Aleksey Shipilev wrote: >>> However, you also said in your opening criticism >>> >>> "I had hard time convincing myself that code is concurrency-safe" >>> >>> I think that is a more telling complaint. 
Can you elaborate on why you >>> found it hard to convince yourself of this? (I know what I think is the >> Because the whole thing in current code is "benign data race" on hash field. Pulling in another >> field into race needs careful consideration if it breaks the benignity. It apparently does not, but >> the cognitive complexity involved in reading that code makes the minuscule benefit much more >> questionable. > > The reasoning is very similar as with just one field. With one field (hash) the thread sees either > the default value (0) or a non-zero value calculated either by this thread sometime before or by a > concurrent thread that has already stored it. Regardless of ordering, the thread either uses the > non-zero value or (re)calculates it (again). The value calculation is deterministic and uses > immutable published state (the array), so it always calculates the same value for the same object. > Idempotence is guaranteed. > > The same reasoning can be extended to a general case where there are many fields used for caching of > a calculated state from some immutable published state. The constraint is that the calculation must > be deterministic and must also deterministically choose which of the many fields used for caching is > to be modified. Only one field may be modified, never more than one. The thread therefore sees > either the default values of all fields or the default values of all but one field which has been > set by either this thread sometime before or by a concurrent thread. Regardless of ordering, the > thread either uses the state combined from the default values of all fields but one and a > non-default value of a single field or (re)calculates the non-default value of the single field. The > value calculation is deterministic, uses immutable published state and deterministically chooses the > field to modify, so it always calculates the same "next" state for the object. Idempotence is > guaranteed. 
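The two-field reasoning quoted above can be made concrete with a small Java sketch. This is an illustration only, not the patch under review: the class name `CachedHashText` and the field names `hash` and `hashIsZero` are assumptions chosen for the example, and the hash function simply mirrors the usual 31-based polynomial over the characters.

```java
// Illustrative sketch only; names are hypothetical and this is not the reviewed patch.
final class CachedHashText {
    private final char[] value;   // immutable state the hash is derived from
    private int hash;             // cached hash; 0 means "not cached, or genuinely zero"
    private boolean hashIsZero;   // set only when the computed hash turned out to be 0

    CachedHashText(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;                  // read the racy field once into a local
        if (h == 0 && !hashIsZero) {   // both fields at default: compute (or recompute)
            h = computeHash(value);
            if (h == 0) {
                hashIsZero = true;     // write exactly one of the two fields...
            } else {
                hash = h;              // ...never both
            }
        }
        return h;
    }

    // Deterministic function of the immutable array, so every thread computes the same value.
    static int computeHash(char[] chars) {
        int h = 0;
        for (char c : chars) {
            h = 31 * h + c;
        }
        return h;
    }
}
```

Because each call stores to at most one field and the computation is deterministic, a racing reader either sees both defaults and recomputes the same result, or sees one published value and uses it.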
Thank you, the mere existence of this wall of text solidifies my argument: the need to invoke the argument like that is exactly the cognitive complexity I've been talking about, and it speaks about maintainability/risk cost, while benefits are still around the machine epsilon. Let's just draw the line on micro-optimizations, okay? This one is interesting experiment in itself, and it certainly passes the "hold my beer" curiosity threshold, but it does not pass the "should we actually do this" bar for me. -Aleksey From zgu at redhat.com Mon Apr 8 12:35:49 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 8 Apr 2019 08:35:49 -0400 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> Message-ID: > Zhengyu: can you please test this again on your platforms. Again I can't > test the alternate stack matching as I have no systems where it fails > (nor can I test slowedebug other than Linux). Still good. 
Thanks, -Zhengyu > > Thanks, > David > ----- > > On 7/04/2019 5:10 pm, David Holmes wrote: >> Hi Chris, >> >> On 7/04/2019 4:51 pm, Chris Plummer wrote: >>> Hi David, >>> >>> On 4/6/19 11:06 PM, David Holmes wrote: >>>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>>> >>>>>> ?? That isn't part of any of the existing stacktraces. >>>>> See the following comment from Zhengyu in the CR: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>>> >>>> That comment simply includes a fragment of a stack which happens to >>>> include JVM_DefineModule and makes no further mention of it. I don't >>>> recall anyone saying that we should now be including that frame in >>>> the check. >>>> >>>> Do you want the test extended to also check for that frame? >>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule >>> is the additional frame that now appears in the detail output for >>> that call chain. So yes, the test should include it. If the inlining >>> of ModuleEntryTable::new_entry() had always happened, then the test >>> would originally have checked for the stacktrace as it appears in the >>> CR comment. >> >> I see - to be clear you want to always check for 4 frames, so the >> additional frame is only checked for the alternate stack. >> >>>>>> >>>>>>> Since you've added the following: >>>>>>> >>>>>>> 103 if (!okToHaveAllocateHeap) { >>>>>>> 104 output.shouldNotContain("AllocateHeap"); >>>>>>> 105 } >>>>>> >>>>>> I didn't add that - see old code line 80. >>>>> Ok, but my comment below still applies since this check is in place.
>>>>>> >>>>>>> You can simplify the following: >>>>>>> >>>>>>> 123 if (okToHaveAllocateHeap) { >>>>>>> 124 expectedStackTrace = stackTraceAllocateHeap; >>>>>>> 125 if (stackTraceMatches(expectedStackTrace, output)) { >>>>>>> 126 return; >>>>>>> 127 } >>>>>>> 128 } else { >>>>>>> >>>>>>> There is no need for the okToHaveAllocateHeap check here anymore. >>>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>>> slight improvement in flexibility in that it would no longer >>>>>>> require the slowdebug builds to match stackTraceAllocateHeap. >>>>>>> They could match any of the 3. You could then put all 3 allowed >>>>>>> stacktraces in an array and check them in a loop if you wish. >>>>>> >>>>>> The only change I have made (which might be obscured by the >>>>>> structure) is that if stackTraceDefault fails to match I then try >>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>>> unchanged. >>>>>> >>>>>> By the same argument you made I think it best to only expect the >>>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>>> notice when something changes - again I've made no change in this >>>>>> regard. >>>>> Since line 104 already verified that AllocateHeap does not appear >>>>> except possibly in slow debug heaps, it is harmless to check all >>>>> builds against the stacktrace that includes AllocateHeap. >>>> >>>> "Harmless" but a waste of time checking for a stack that we know >>>> can't match. The current version was at your suggestion: >>>> >>>> "You would need to check for all 3, limiting the AllocateHeap() one >>>> to just being allowed on solaris and windows slowdebug as it is now." >>> That was before I realized there was already an explicit check for >>> AllocateHeap() to not be allowed except for slowdebug ones.
Once I >>> realized that, it occurred to me that checking for all 3 stacktraces >>> in a loop would simplify the logic. >>>> >>>> Checking all three returns to my original version (modulo not >>>> removing the check for the AllocateHeap frame, and fixing the >>>> matching logic). >>> Your original version checked for a large number of permutations that >>> included any 3 of 5 specified frames, not checks for any of 3 >>> specific stacktraces (of 4 frames each). >> >> That was never the intent and what I was referring to when I said "and >> fixing the matching logic". >> >>>> >>>>> Also, if a slowdebug platform were to change to no longer include >>>>> AllocateHeap, checking it against the other two stacktraces would >>>>> allow the test to continue to pass without modification. >>>> >>>> This is counter to your earlier argument that we should be using >>>> this test to specifically check for such changes in compiler >>>> behaviour and update the platform specific guards accordingly. If >>>> you allow it to go either way then we would never remove the guard >>>> even when it was no longer needed on any platform. >>> But this is one compiler inlining behavior change that is ok. If >>> AllocateHeap() suddenly starts being inlined by slowdebug builds, >>> that is actually a good thing, and we would end up modifying the test >>> to allow it. So why not allow it now? >>>> >>>>> For these two reasons I was suggesting just always check all 3 >>>>> stacktraces until one passes. It would simplify the logic some. >>>> >>>> I'd need to change a number of other things make the main logic >>>> simpler (ie loop over all three stacks) but the error reporting part >>>> will be more awkward. 
And Thomas already complained about the number >>>> of times we scan the entire process output doing this matching, so >>>> this would make it worse - unless I completely change the way we do >>>> the matching, which then introduces more complexity and more >>>> likelihood of introducing new bugs. >>>> >>>> Let me know how you want to proceed. >>> >>> The loop idea was just to make the code simpler. If you feel it will >>> slow things down unacceptably, then I'm fine with the logic as-is in >>> v2, but you need to add JVM_DefineModule to the new stacktrace. >> >> Okay I intend to add the missing 4th frame, and print both potential >> stacks on failure, but otherwise leave at V2. >> >> Thanks, >> David >> ----- >> >>> thanks, >>> >>> Chris >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>>> >>>>>>> The following is no longer correct: >>>>>>> >>>>>>> 140 throw new RuntimeException("Expected stack trace missing from output: " + expectedStackTrace); >>>>>>> >>>>>>> In your current approach, expectedStackTrace is just the last >>>>>>> stacktrace we tried. Since we may try more than one, maybe all >>>>>>> the ones that failed to match should be listed (or none listed if >>>>>>> just too messy). >>>>>> >>>>>> It reports the last failing stacktrace, out of a possible two. >>>>>> Perhaps I can print both ... you want something in the jtr file so >>>>>> that it can be triaged without having to go and look up the test >>>>>> code. >>>>> Yeah, just pointing out that only printing one stacktrace might >>>>> lead the .jtr reader down the wrong path. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>>> >>>>>>>> Checks for alternate stack now. Added lots of comments and misc >>>>>>>> fixups.
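The "put all allowed stacktraces in an array and check them in a loop" idea debated in this thread might look roughly like the sketch below. It is a minimal illustration under stated assumptions: `StackTraceMatcher` and `firstMatch` are hypothetical names, the frame patterns in the usage note are made up, and the test's real `stackTraceMatches` helper is not reproduced here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not code from CheckForProperDetailStackTrace.java.
final class StackTraceMatcher {
    /**
     * Tries each allowed stack trace pattern in order against the process output.
     * Returns the index of the first pattern that is found, or -1 if none matches.
     */
    static int firstMatch(String output, String... allowedTraces) {
        for (int i = 0; i < allowedTraces.length; i++) {
            // '.' does not match '\n' by default, so ".*frame.*\n" stays on one output line.
            if (Pattern.compile(allowedTraces[i]).matcher(output).find()) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller could pass the default, alternate and AllocateHeap patterns in that order, report every pattern on a -1 result, and still keep a separate `shouldNotContain("AllocateHeap")` guard for builds where that frame must not appear. Note that each pattern is a separate scan of the output, which is the cost David mentions.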
>>>>>>>> >>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>>> linux-x64). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>>> Thinking about this a bit more, there is still the potential >>>>>>>>> for some confusion if this test fails again in the future due >>>>>>>>> to the top frame missing. Is it missing because it got inlined >>>>>>>>> or is it missing because the frame skipping code skipped an >>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just >>>>>>>>> hastily add another valid stacktrace to the test but instead >>>>>>>>> investigates to make sure the issue is indeed that the method >>>>>>>>> got inlined. >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Okay I will simply check for the third alternative. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for this >>>>>>>>>>> callsite, and be updated in the future as needed. Not ideal, >>>>>>>>>>> but I don't see a better solution. It's similar to the >>>>>>>>>>> situation described in JDK-8163899 which covered the >>>>>>>>>>> fragility of the NMT frame skipping code. 
In the end it was >>>>>>>>>>> decided it would be easier to just fix issues as they >>>>>>>>>>> came up rather than engineer a solution that wasn't as >>>>>>>>>>> fragile. I think this test falls in the same category. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>>> os::malloc - now I get it. But I don't understand your final >>>>>>>>>>>> comment: >>>>>>>>>>>> >>>>>>>>>>>> > Looking at this code also reminds me of a reason to have the test >>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame skipping code >>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a needed frame >>>>>>>>>>>> > at the top. The way the test was written it would detect this. With your >>>>>>>>>>>> > changes it will not. It would just revert to always matching on 3 frames >>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>>> >>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>>> frames but one (or more) may be missing - i.e. how can I tell >>>>>>>>>>>> the difference between "Frame A was inlined" and "Frame A was >>>>>>>>>>>> skipped by mistake" ??
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>>> Hi David, >>>>>>>>>>>>> >>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the other >>>>>>>>>>>>>>>>>>> bugs I've mentioned: JDK-8133749, JDK-8133747, and >>>>>>>>>>>>>>>>>>> JDK-8133740. These bugs result in one or two frames >>>>>>>>>>>>>>>>>>> appearing in the stacktrace that should be skipped. >>>>>>>>>>>>>>>>>>> Notably NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>>>>>>> os::get_native_stack(). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 73 // We should never see either of these frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>>> 74 output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>>> missed it. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed >>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds.
>>>>>>>>>>>>>>>>>>> Although these builds should have honored the >>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable that >>>>>>>>>>>>>>>>>>> it was not in slowdebug builds. However, I would not >>>>>>>>>>>>>>>>>>> want to allow AllocateHeap() to appear in a product >>>>>>>>>>>>>>>>>>> build, and best not to see it in fastdebug either. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this test >>>>>>>>>>>>>>>>>> to be checking for something specific to the native >>>>>>>>>>>>>>>>>> compiler. The previous handling of AllocateHeap seemed >>>>>>>>>>>>>>>>>> to be there simply because it was the only way to deal >>>>>>>>>>>>>>>>>> with an optional frame - but now that's handled >>>>>>>>>>>>>>>>>> generically. >>>>>>>>>>>>>>>>> Its appearance means you effectively only have 3 >>>>>>>>>>>>>>>>> frames to identify callsites instead of 4. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of those >>>>>>>>>>>>>>>> (new_entry) could actually be inlined as well, resulting >>>>>>>>>>>>>>>> in only 3 matches. So that is what the revised test >>>>>>>>>>>>>>>> checks for: at least 3 matches. Often there will be 4 >>>>>>>>>>>>>>>> matches. >>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() >>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test >>>>>>>>>>>>>>> should detect when this happens so the NMT implementation >>>>>>>>>>>>>>> can address the issue. >>>>>>>>>>>>>> >>>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>>> how/what NMT detail is doing in this regard.
>>>>>>>>>>>>> >>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters >>>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is >>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is controlled >>>>>>>>>>>>> by NMT_TrackingStackDepth. Making NMT_TrackingStackDepth >>>>>>>>>>>>> bigger means more refinement of the callsites (thus more >>>>>>>>>>>>> callsites), but a clearer picture of what actually led to >>>>>>>>>>>>> the os:malloc(). >>>>>>>>>>>>> >>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and >>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>>> not distinguish between these two calls paths to >>>>>>>>>>>>> os:mallco(), and will consider both paths to be the same >>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would >>>>>>>>>>>>> always be a, b, c, and d. However, bump up >>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as >>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame >>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, b, >>>>>>>>>>>>> c, and d as the other 4 frames. >>>>>>>>>>>>> >>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then every >>>>>>>>>>>>> allocation that is the result of doing a "new" of any >>>>>>>>>>>>> CHeapObj subtype will have AllocateHeap() in its callsite, >>>>>>>>>>>>> which effectively lowers they callsite refinement by 1. >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>>>>>>>> ? 52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? Was >>>>>>>>>>>>>>>> it getting inlined already when AllocateHeap was not? 
>>>>>>>>>>>>>>>> Even so we still end up with 4 frames matching normally. >>>>>>>>>>>>>>> I noticed that last night also and scratch my head over >>>>>>>>>>>>>>> it for a while and then went to bed. The only explanation >>>>>>>>>>>>>>> I could come up with is that allocate_new_entry() is >>>>>>>>>>>>>>> getting inlined, and as a result (due to being a >>>>>>>>>>>>>>> slowdebug build and doing minimal inlining) >>>>>>>>>>>>>>> AllocateHeap() was not inlined. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If it does appear in a product build, a solution should >>>>>>>>>>>>>>>>> be looked into to get rid of it. If the port owner >>>>>>>>>>>>>>>>> decides it can't get rid of it (or is unwilling to), >>>>>>>>>>>>>>>>> then an exception should be added to the test like was >>>>>>>>>>>>>>>>> done for solaris and windows slowdebug builds. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's >>>>>>>>>>>>>>>> ability to inline that function and just happen to be >>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a >>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do it? >>>>>>>>>>>>>>>> The Visual Studio docs state: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds >>>>>>>>>>>>>>>> and could change with any update to the compiler. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline >>>>>>>>>>>>>>>> - specifically -xinline only has an effect at ?xO3 or >>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>>> slowdebug. And there are other cases where it won't >>>>>>>>>>>>>>>> honour the ALWAYSINLINE. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if we >>>>>>>>>>>>>>>> want to ensure inlining when not optimising: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>>> unless you specify the ?always_inline? attribute for the >>>>>>>>>>>>>>>> function, like this: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we need >>>>>>>>>>>>>>>> something a bit more explicit than this test to >>>>>>>>>>>>>>>> determine that. >>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned >>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3 >>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug >>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think it's >>>>>>>>>>>>>>> appropriate to have a test to make sure we are not seeing >>>>>>>>>>>>>>> any of these 3 methods/functions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT >>>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>>> >>>>>>>>>>>>> Not that I know of. I discovered them while looking at the >>>>>>>>>>>>> various bugs that led to NativeCallStack::NativeCallStack() >>>>>>>>>>>>> and os::get_native_stack() (and sometimes both) being in >>>>>>>>>>>>> the callsite. 
Reviewing the bugs I referred to will give >>>>>>>>>>>>> you an idea of where to look. One good place to look at >>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>>> code there that controls how many frames to skip based on >>>>>>>>>>>>> on the platform and whether optimized or not. Also some >>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing >>>>>>>>>>>>> in this method. >>>>>>>>>>>>> >>>>>>>>>>>>> Looking at this code also reminds me of a reason to have >>>>>>>>>>>>> the test continue to check for all 4 specific frames. If >>>>>>>>>>>>> the frame skipping code skips an extra frame, then the >>>>>>>>>>>>> callsite will be missing a needed frame at the top. The way >>>>>>>>>>>>> the test was written it would detect this. With your >>>>>>>>>>>>> changes it will not. It would just revert to always >>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping >>>>>>>>>>>>> bug would go unnoticed. >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Right that was an error on my part. I thought the existing >>>>>>>>>>>>>> MULTILINE pattern matching with .* would also find >>>>>>>>>>>>>> non-sequential lines and so I was acting similarly. I will >>>>>>>>>>>>>> re-think this. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also >>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not >>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in >>>>>>>>>>>>>>>>>>> slowdebug builds. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames to >>>>>>>>>>>>>>>>>>>> be missing based on empirical observations. So to >>>>>>>>>>>>>>>>>>>> date we have seen two frames that may or may not be >>>>>>>>>>>>>>>>>>>> inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as now >>>>>>>>>>>>>>>>>>>> it is just an optional frame. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test >>>>>>>>>>>>>>>>>>>> as you intended? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> From peter.levart at gmail.com Mon Apr 8 12:44:31 2019 From: peter.levart at gmail.com (Peter Levart) Date: Mon, 8 Apr 2019 14:44:31 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> Message-ID: On 4/8/19 1:40 PM, Aleksey Shipilev wrote: > On 4/8/19 1:28 PM, Peter Levart wrote: >> The reasoning is very similar as with just one field. With one field (hash) the thread sees either >> the default value (0) or a non-zero value calculated either by this thread sometime before or by a >> concurrent thread that has already stored it. Regardless of ordering, the thread either uses the >> non-zero value or (re)calculates it (again). The value calculation is deterministic and uses >> immutable published state (the array), so it always calculates the same value for the same object. >> Idempotence is guaranteed. >> >> The same reasoning can be extended to a general case where there are many fields used for caching of >> a calculated state from some immutable published state. The constraint is that the calculation must >> be deterministic and must also deterministically choose which of the many fields used for caching is >> to be modified. Only one field may be modified, never more than one. The thread therefore sees >> either the default values of all fields or the default values of all but one field which has been >> set by either this thread sometime before or by a concurrent thread. 
Regardless of ordering, the >> thread either uses the state combined from the default values of all fields but one and a >> non-default value of a single field or (re)calculates the non-default value of the single field. The >> value calculation is deterministic, uses immutable published state and deterministically chooses the >> field to modify, so it always calculates the same "next" state for the object. Idempotence is >> guaranteed. > Thank you, the mere existence of this wall of text solidifies my argument: the need to invoke the > argument like that is exactly the cognitive complexity I've been talking about, and it speaks about > maintainability/risk cost, while benefits are still around the machine epsilon. I tried to write the two descriptions side by side to show that the 2nd is not more complex than the 1st. It's just using longer "nouns". The sentences are otherwise equivalent and there's additional text that describes the "nouns". I could have done a better job though... So here's a 2nd try: The String hash code caching (as it is written today) is an example of a benign data race that can be described as caching of lazily calculated state from immutable published state, both modeled in the same object. A data race is benign if: - the published state which is used as input of the calculation is immutable - the calculation is deterministic - threads observe the cached calculated state of the object to be updated just once atomically. This means that there are only two different observable states of the object: an "initial" state where the calculated cached data is not set and an "updated" state where the calculated cached data is set. Java fields up to 32 bits wide (+ reference fields regardless of width) exhibit atomic updates.
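The three conditions listed above can be illustrated with a minimal Java sketch. This is not the JDK's String implementation: the class name CachedHashText, its byte[] field and the hash function are assumptions chosen for illustration only. It shows the single-field case, where the only write to cached state is one int store.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration; NOT java.lang.String.
final class CachedHashText {
    // Immutable published state: assigned once in the constructor, never modified.
    private final byte[] value;
    // Cached calculated state: 0 means "not computed yet".
    private int hash;

    CachedHashText(String s) {
        this.value = s.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public int hashCode() {
        int h = hash;                 // one racy read of the cache field
        if (h == 0) {
            // Deterministic calculation over immutable input: every thread
            // that gets here computes the same value.
            for (byte b : value) {
                h = 31 * h + (b & 0xff);
            }
            // Only a single field is modified, so the object state is
            // updated atomically (an int write is atomic in Java).
            hash = h;
        }
        // Note: if the computed value is 0 it is recalculated on every call,
        // which is the cost discussed under 8221836.
        return h;
    }

    public static void main(String[] args) throws InterruptedException {
        CachedHashText t = new CachedHashText("hello");
        int[] seen = new int[2];
        Thread a = new Thread(() -> seen[0] = t.hashCode());
        Thread b = new Thread(() -> seen[1] = t.hashCode());
        a.start();
        b.start();
        a.join();
        b.join();
        // Both threads observe the same hash, whichever one stored it first.
        System.out.println(seen[0] == seen[1]);
    }
}
```

Whether a thread reads 0 and recomputes or reads the stored value, it returns the same result, so no synchronization or volatile is needed in this sketch.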
So if the update of the object state (transition from "initial" to "updated" state) is performed by a write of a deterministically calculated value to a single deterministically chosen field of no more than 32 bits (or a reference field), the whole object state is observed to change atomically and the data race is benign. Current and proposed caching differ only in the number of fields used for caching the calculated state, but both adhere to the above rules. So the reasoning stays the same as with current code. It only takes a little to realize that it's all about a single field that is updated while the presence of other fields (zero or more) doesn't change the picture since they are constant for the whole lifetime of the object. If you're afraid that a future maintainer of that code would not realize that, then a simple comment put into the String.hashCode method and the java_lang_String::set_hash C++ method that would say something like the following: // only a single field may be modified so that the Object state is updated atomically ...is surely going to help him/her keep the String free from bugs... Regards, Peter From robin.westberg at oracle.com Mon Apr 8 12:47:03 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Mon, 8 Apr 2019 14:47:03 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> Message-ID: Hi again, Here's an updated version where I've moved the naked_short_nanosleep function into the Posix class, to avoid future cross-platform use. (It's still used in the SpinYield and TimedYield implementations though).
Full webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.01/ Incremental: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00-01/ Best regards, Robin > On 5 Apr 2019, at 13:54, Robin Westberg wrote: > > Hi David, > >> On 5 Apr 2019, at 12:10, David Holmes wrote: >> >> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>> Hi David, >>> Thanks for taking a look! >>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>> >>>> Hi Robin, >>>> >>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>> Hi all, >>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>> >>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>> That said, perhaps os::naked_short_nanosleep should be removed; the only remaining usage is in SpinYield, which turns to sleeping after giving up on yielding. For Windows, it may be more appropriate to switch to Sleep(0) in that case. >> >> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). > > Okay, then I'd prefer to drop the os::naked_short_nanosleep.
As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. > > I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. > >> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) > > Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best way to actually obtain 1ms resolution, though; plain Sleep(1) usually sleeps around 2ms. > > Best regards, > Robin > >> >> Thanks, >> David >> >>> Best regards, >>> Robin >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Various benchmarks (specjbb2015, specjvm2008) show better numbers for Windows and no regressions for Linux.
>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>> Testing: tier1 >>>>> Best regards, >>>>> Robin From david.holmes at oracle.com Mon Apr 8 13:15:04 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Apr 2019 23:15:04 +1000 Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output In-Reply-To: References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com> <93910021-c901-1cf5-9b71-558088804ee2@oracle.com> <3ff96342-5f30-c59a-aa1f-c6f5af522d80@oracle.com> <8ba5dc82-a4a8-5d1f-50bb-bff4acd6531b@oracle.com> <843f926f-72a5-8e4a-cf43-2d91a612c885@oracle.com> <64531cb6-61f9-6e49-daf7-1cda9169855d@oracle.com> <23e19ab6-3eb1-47c0-f39c-006a70ccbf51@oracle.com> <95af5015-f831-7ffe-2972-7f60663740b2@oracle.com> <9f9f80e6-bec0-cc24-7347-3abeaeb29295@oracle.com> <11cd5d70-a5c5-ef54-3a4d-d33f2cadcecd@oracle.com> <93450077-f0b4-409a-ec88-51df17f5e311@oracle.com> <1597082c-5972-2c0a-c853-0aca6f2cea01@oracle.com> <828659d7-c8a1-b509-84f6-4a9596e617c8@oracle.com> <404ed0ac-2785-79a0-8fc3-d306af6166ca@oracle.com> <3f191e86-c4ce-a286-a7aa-03467e26301e@oracle.com> Message-ID: <0c00260a-6b3d-d019-ea5d-ddb7b5be509b@oracle.com> On 8/04/2019 10:35 pm, Zhengyu Gu wrote: >> Zhengyu: can you please test this again on your platforms. Again I >> can't test the alternate stack matching as I have no systems where it >> fails (nor can I test slowedebug other than Linux). > > Still good. Thanks! 
David > Thanks, > > -Zhengyu > >> >> Thanks, >> David >> ----- >> >> On 7/04/2019 5:10 pm, David Holmes wrote: >>> Hi Chris, >>> >>> On 7/04/2019 4:51 pm, Chris Plummer wrote: >>>> Hi David, >>>> >>>> On 4/6/19 11:06 PM, David Holmes wrote: >>>>> On 6/04/2019 4:24 pm, Chris Plummer wrote: >>>>>> On 4/5/19 9:13 PM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> On 6/04/2019 3:09 am, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Why was the JVM_DefineModule frame left off of stackTraceAlternate? >>>>>>> >>>>>>> ?? That isn't part of any of the existing stacktraces. >>>>>> See the following comment from Zhengyu in the CR: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218458?focusedCommentId=14242865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14242865 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> That comment simply includes a fragment of a stack which happens to >>>>> include JVM_DefineModule and makes no further mention of it. I >>>>> don't recall anyone saying that we should now be including that >>>>> frame in the check. >>>>> >>>>> Do you want the test extended to also check for that frame? >>>> Because ModuleEntryTable::new_entry() got inlined, JVM_DefineModule >>>> is the additional frame that now appears in the detail output for >>>> that call chain. So yes, the test should include it. If the inlining >>>> of ModuleEntryTable::new_entry() had always happened, then the test >>>> would originally have checked for the stacktrace as it appears in >>>> the CR comment. >>> >>> I see - to be clear you want to always check for 4 frames, so the >>> additional frame is only checked for the alternate stack. >>> >>>>>>> >>>>>>>> Since you've added the following: >>>>>>>> >>>>>>>> ??103???????? if (!okToHaveAllocateHeap) { >>>>>>>> ??104???????????? output.shouldNotContain("AllocateHeap"); >>>>>>>> ??105???????? } >>>>>>> >>>>>>> I didn't add that - see old code line 80. 
>>>>>> Ok, but my comment below still applies since this check is in place. >>>>>>> >>>>>>>> You can simplify the following: >>>>>>>> >>>>>>>> ??123???????? if (okToHaveAllocateHeap) { >>>>>>>> ??124???????????? expectedStackTrace = stackTraceAllocateHeap; >>>>>>>> ??125???????????? if (stackTraceMatches(expectedStackTrace, >>>>>>>> output)) { >>>>>>>> ??126???????????????? return; >>>>>>>> ??127???????????? } >>>>>>>> ??128???????? } else { >>>>>>>> >>>>>>>> The is no need for the okToHaveAllocateHeap check here anymore. >>>>>>>> Just check all 3 allowed stacktraces until one passes. This is a >>>>>>>> slight improvement in flexibility in that it would no longer >>>>>>>> require the slowdebug builds to match stackTraceAllocateHeap. >>>>>>>> They could match any of the 3. You could then put all 3 allowed >>>>>>>> stacktraces in an array and check them in a loop if you wish. >>>>>>> >>>>>>> The only change I have made (which might be obscured by the >>>>>>> structure) is that if stackTraceDefault fails to match I then try >>>>>>> stackTraceAlternate. The handling of okToHaveAllocateHeap is >>>>>>> unchanged. >>>>>>> >>>>>>> By the same argument you made I think it best to only expect the >>>>>>> AllocateHeap stack on those slowdebug platforms, so that we can >>>>>>> notice when something changes - again I've mode no change in this >>>>>>> regard. >>>>>> Since line 104 already verified that AllocateHeap does not appear >>>>>> except possibly in slow debug heaps, it is harmless to check all >>>>>> builds against the stacktrace that includes AllocateHeap. >>>>> >>>>> "Harmless" but a waste of time checking for a stack that we know >>>>> can't match. The current version was at your suggestion: >>>>> >>>>> "You would need to check for all 3, limiting the AllocateHeap() one >>>>> to just being allowed on solaris and windows slowdebug as it is now." 
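The "check all allowed stacktraces in a loop" idea discussed in this thread could look roughly like the following Java sketch. It is not the code of CheckForProperDetailStackTrace.java: the class StackTraceCheck, the method matchesAny and the abridged two-frame patterns are assumptions for illustration, and the real test expects four frames per trace. The frame names are taken from the thread. The sketch also shows why each candidate must be matched as consecutive lines rather than as frames found anywhere in the output.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the actual jtreg test.
final class StackTraceCheck {

    /** Returns true if any candidate regex matches consecutive lines of output. */
    static boolean matchesAny(List<String> candidates, String output) {
        for (String regex : candidates) {
            // DOTALL is not set, so ".*" stops at a line break. Each "\n" in a
            // candidate therefore forces the next frame onto the very next line,
            // and frames cannot be picked from unrelated stack traces.
            if (Pattern.compile(regex, Pattern.MULTILINE).matcher(output).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Abridged, illustrative patterns (two frames each, not the real traces).
        List<String> allowed = List.of(
            ".*Hashtable.*allocate_new_entry.*\n.*ModuleEntryTable.*new_entry.*\n",
            ".*AllocateHeap.*\n.*ModuleEntryTable.*new_entry.*\n",
            ".*Hashtable.*allocate_new_entry.*\n.*JVM_DefineModule.*\n");

        // Made-up NMT-like output for the demonstration.
        String output = "[0x01] Hashtable::allocate_new_entry(...)\n"
                      + "[0x02] JVM_DefineModule\n";

        System.out.println(matchesAny(allowed, output));
    }
}
```

On failure, a test built this way would have to report all candidates rather than only the last one tried, which is the reporting concern raised in the thread.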
>>>> That was before I realized there was already an explicit check for >>>> AllocateHeap() to not be allowed except for slowdebug ones. Once I >>>> realized that, it occurred to me that checking for all 3 stacktraces >>>> in a loop would simplify the logic. >>>>> >>>>> Checking all three returns to my original version (modulo not >>>>> removing the check for the AllocateHeap frame, and fixing the >>>>> matching logic). >>>> Your original version checked for a large number of permutations >>>> that included any 3 of 5 specified frames, not checks for any of 3 >>>> specific stacktraces (of 4 frames each). >>> >>> That was never the intent and what I was referring to when I said >>> "and fixing the matching logic". >>> >>>>> >>>>>> Also, if a slowdebug platform were to change to no longer include >>>>>> AllocateHeap, checking it against the other two stacktraces would >>>>>> allow the test to continue to pass without modification. >>>>> >>>>> This is counter to your earlier argument that we should be using >>>>> this test to specifically check for such changes in compiler >>>>> behaviour and update the platform specific guards accordingly. If >>>>> you allow it to go either way then we would never remove the guard >>>>> even when it was no longer needed on any platform. >>>> But this is one compiler inlining behavior change that is ok. If >>>> AllocateHeap() suddenly starts being inlined by slowdebug builds, >>>> that is actually a good thing, and we would end up modifying the >>>> test to allow it. So why not allow it now? >>>>> >>>>>> For these two reasons I was suggesting just always check all 3 >>>>>> stacktraces until one passes. It would simplify the logic some. >>>>> >>>>> I'd need to change a number of other things make the main logic >>>>> simpler (ie loop over all three stacks) but the error reporting >>>>> part will be more awkward. 
And Thomas already complained about the >>>>> number of times we scan the entire process output doing this >>>>> matching, so this would make it worse - unless I completely change >>>>> the way we do the matching, which then introduces more complexity >>>>> and more likelihood of introducing new bugs. >>>>> >>>>> Let me know how you want to proceed. >>>> >>>> The loop idea was just to make the code simpler. If you feel it will >>>> slow things down unacceptably, then I'm fine with the logic as-is in >>>> v2, but you need to add JVM_DefineModule to the new stacktrace. >>> >>> Okay I intend to add the missing 4th frame, and print both potential >>> stacks on failure, but otherwise leave at V2. >>> >>> Thanks, >>> David >>> ----- >>> >>>> thanks, >>>> >>>> Chris >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>>> >>>>>>>> The following is no longer correct: >>>>>>>> >>>>>>>> ??140???????? throw new RuntimeException("Expected stack trace >>>>>>>> missing from output: " + expectedStackTrace); >>>>>>>> >>>>>>>> In your current approach, expectedStackTrace is just the last >>>>>>>> stacktrace we tried. Since we may try more than one, maybe all >>>>>>>> the ones that failed to match should be listed (or none listed >>>>>>>> if just too messy). >>>>>>> >>>>>>> It reports the last failing stacktrace, out of a possible two. >>>>>>> Perhaps I can print both ... you want something in the jtr file >>>>>>> so that it can be triaged without having to go and look up the >>>>>>> test code. >>>>>> Yeah, just pointing out that only printing one stacktrace might >>>>>> lead the .jtr reader down the wrong path. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 4/5/19 12:04 AM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev.v2/ >>>>>>>>> >>>>>>>>> Checks for alternate stack now. 
Added lots of comments and misc >>>>>>>>> fixups. >>>>>>>>> >>>>>>>>> Zhengyu: please re-test (I can't test any slowdebug except >>>>>>>>> linux-x64). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/04/2019 4:01 pm, Chris Plummer wrote: >>>>>>>>>> Thinking about this a bit more, there is still the potential >>>>>>>>>> for some confusion if this test fails again in the future due >>>>>>>>>> to the top frame missing. Is it missing because it got inlined >>>>>>>>>> or is it missing because the frame skipping code skipped an >>>>>>>>>> extra frame? Hopefully whoever deals with it doesn't just >>>>>>>>>> hastily add another valid stacktrace to the test but instead >>>>>>>>>> investigates to make sure the issue is indeed that the method >>>>>>>>>> got inlined. >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 4/4/19 10:56 PM, David Holmes wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> Okay I will simply check for the third alternative. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 3:53 pm, Chris Plummer wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> >>>>>>>>>>>> For the callsite that this test is checking for, right now >>>>>>>>>>>> there appear to be 3 possible stacktraces: the "normal" one, >>>>>>>>>>>> the one that includes AllocateHeap() on solaris and windows >>>>>>>>>>>> slowdebug builds, and the one Zhengyu is now seeing on >>>>>>>>>>>> linux-x64. You would need to check for all 3, limiting the >>>>>>>>>>>> AllocateHeap() one to just being allowed on solaris and >>>>>>>>>>>> windows slowdebug as it is now. So basically this test needs >>>>>>>>>>>> to cover all (allowable) stacktraces that we've seen for >>>>>>>>>>>> this callsite, and be updated in the future as needed. Not >>>>>>>>>>>> ideal, but I don't see a better solution. It's similar to >>>>>>>>>>>> the situation described in JDK-8163899 which covered the >>>>>>>>>>>> fragility of the NMT frame skipping code. 
In the end it was >>>>>>>>>>>> decided it would be easier to just deal fix issues as they >>>>>>>>>>>> came up rather then engineer a solution that wasn't as >>>>>>>>>>>> fragile. I think this test falls in the same category. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 4/4/19 10:11 PM, David Holmes wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the explanation about the frame counting from >>>>>>>>>>>>> os::malloc - now I get it. But I don't understand your >>>>>>>>>>>>> final comment: >>>>>>>>>>>>> >>>>>>>>>>>>> > Looking at this code also reminds me of a reason to have >>>>>>>>>>>>> the test >>>>>>>>>>>>> > continue to check for all 4 specific frames. If the frame >>>>>>>>>>>>> skipping code >>>>>>>>>>>>> > skips an extra frame, then the callsite will be missing a >>>>>>>>>>>>> needed frame >>>>>>>>>>>>> > at the top. The way the test was written it would detect >>>>>>>>>>>>> this. With your >>>>>>>>>>>>> > changes it will not. It would just revert to always >>>>>>>>>>>>> matching on 3 frames >>>>>>>>>>>>> > instead of 4, and the frame skipping bug would go unnoticed. >>>>>>>>>>>>> >>>>>>>>>>>>> How can I fix this bug if I have to check for 4 specific >>>>>>>>>>>>> frames but one (or more) may be missing - i.e how can I >>>>>>>>>>>>> tell the different between "Frame A was inlined" and "Frame >>>>>>>>>>>>> A was skipped by mistake" ?? 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/04/2019 2:17 pm, Chris Plummer wrote: >>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 4/4/19 6:28 PM, David Holmes wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 5/04/2019 1:48 am, Chris Plummer wrote: >>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 4/4/19 12:14 AM, David Holmes wrote: >>>>>>>>>>>>>>>>> On 4/04/2019 4:35 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> On 4/3/19 11:23 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 4/04/2019 4:12 pm, Chris Plummer wrote: >>>>>>>>>>>>>>>>>>>> Hi David, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I have concerns that this will hide some of the >>>>>>>>>>>>>>>>>>>> other bugs I've mentioned: JDK-8133749, JDK-8133747, >>>>>>>>>>>>>>>>>>>> and JDK-8133740. These bugs result in 1 or two >>>>>>>>>>>>>>>>>>>> frames appearing in the stacktrace that should be >>>>>>>>>>>>>>>>>>>> skipped. Notably NativeCallStack::NativeCallStack() >>>>>>>>>>>>>>>>>>>> and os::get_native_stack(). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The test still checks those are not present first: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 73???????? // We should never see either of these >>>>>>>>>>>>>>>>>>> frames because they are supposed to be skipped. */ >>>>>>>>>>>>>>>>>>> 74 >>>>>>>>>>>>>>>>>>> output.shouldNotContain("NativeCallStack::NativeCallStack"); >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> 75 output.shouldNotContain("os::get_native_stack"); >>>>>>>>>>>>>>>>>> Ah yes. I skimmed over the test looking for it but >>>>>>>>>>>>>>>>>> missed it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Also, AllocateHeap() should normally not be in the >>>>>>>>>>>>>>>>>>>> stack trace, but the test has specifically allowed >>>>>>>>>>>>>>>>>>>> for it for windows and solaris slowdebug builds. 
>>>>>>>>>>>>>>>>>>>> Although these builds should have honored the >>>>>>>>>>>>>>>>>>>> ALWAYSINLINE directive, it was deemed acceptable >>>>>>>>>>>>>>>>>>>> that it was not in slowdebug builds. However, I >>>>>>>>>>>>>>>>>>>> would not want to allow AllocateHeap() to appear in >>>>>>>>>>>>>>>>>>>> a product build, and best not to see it in fastdebug >>>>>>>>>>>>>>>>>>>> either. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This is a test of NMT detail not a test of whether a >>>>>>>>>>>>>>>>>>> given compiler chooses to inline something like >>>>>>>>>>>>>>>>>>> AllocateHeap. I don't think it is the job of this >>>>>>>>>>>>>>>>>>> test to be checking for something specific to the >>>>>>>>>>>>>>>>>>> native compiler. The previous handling of >>>>>>>>>>>>>>>>>>> AllocateHeap seemed to be there simply because it was >>>>>>>>>>>>>>>>>>> the only way to deal with an optional frame - but now >>>>>>>>>>>>>>>>>>> that's handled generically. >>>>>>>>>>>>>>>>>> It's appearance means you effectively only have 3 >>>>>>>>>>>>>>>>>> frames to identity callsites instead of 4. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Both stacktraces in the old test had 4 elements and >>>>>>>>>>>>>>>>> expected 4 matches. The current bug is that one of >>>>>>>>>>>>>>>>> those (new_entry) could actually be inlined as well, >>>>>>>>>>>>>>>>> resulting in only 3 matches. So that is what the >>>>>>>>>>>>>>>>> revised test checks for: at least 3 matches. Often >>>>>>>>>>>>>>>>> there will be 4 matches. >>>>>>>>>>>>>>>> I think you misunderstood my "3 frames" comment. I was >>>>>>>>>>>>>>>> referring to how many frames NMT uses to identify the >>>>>>>>>>>>>>>> callsite. It wants to use 4, but if AllocateHeap() >>>>>>>>>>>>>>>> doesn't get inlined, it effectively is using 3. The test >>>>>>>>>>>>>>>> should detect when this happens so the NMT >>>>>>>>>>>>>>>> implementation can address the issue. 
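A rough illustration of the point about callsite refinement, written as a standalone C++ sketch rather than as NMT code. It assumes a call path is a list of frame names, innermost first, and that a tracker keeps only a fixed number of innermost frames as the callsite key. The helper names (callsite_key, useful_frames) and the frame names other than AllocateHeap are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A call path, innermost frame first (the frame closest to the allocator).
using CallPath = std::vector<std::string>;

// Hypothetical helper: keep only the `depth` innermost frames, the way a
// fixed-depth tracker would when it builds a callsite key.
CallPath callsite_key(const CallPath& path, std::size_t depth) {
  std::size_t n = path.size() < depth ? path.size() : depth;
  return CallPath(path.begin(),
                  path.begin() + static_cast<CallPath::difference_type>(n));
}

// Count the frames in a key that are something other than `wrapper`,
// i.e. the frames that actually help tell callsites apart.
std::size_t useful_frames(const CallPath& key, const std::string& wrapper) {
  std::size_t count = 0;
  for (const std::string& frame : key) {
    if (frame != wrapper) {
      ++count;
    }
  }
  return count;
}
```

With a depth of 4, a path that starts with a non-inlined AllocateHeap frame leaves only 3 distinguishing frames in the key, and two paths that differ only in their fifth frame collapse into the same key until the depth is raised to 5.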
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You're right I don't understand this part as I don't know >>>>>>>>>>>>>>> how/what NMT detail is doing in this regard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> An NMT callsite is simply the 4 most recent frames (afters >>>>>>>>>>>>>> some pruning) that led to the os:malloc() call. "4" is >>>>>>>>>>>>>> somewhat arbitrary as Thomas pointed out, and is >>>>>>>>>>>>>> controlled by NMT_TrackingStackDepth. Making >>>>>>>>>>>>>> NMT_TrackingStackDepth bigger means more refinement of the >>>>>>>>>>>>>> callsites (thus more callsites), but a clearer picture of >>>>>>>>>>>>>> what actually led to the os:malloc(). >>>>>>>>>>>>>> >>>>>>>>>>>>>> For example, with NMT_TrackingStackDepth == 4, if you have >>>>>>>>>>>>>> a() calls b() calls c() calls d() calls os:malloc(), and >>>>>>>>>>>>>> foo() and bar() both call a(), the NMT detail output will >>>>>>>>>>>>>> not distinguish between these two calls paths to >>>>>>>>>>>>>> os:mallco(), and will consider both paths to be the same >>>>>>>>>>>>>> callsite. The 4 frames in the NMT detail output would >>>>>>>>>>>>>> always be a, b, c, and d. However, bump up >>>>>>>>>>>>>> NMT_TrackingStackDepth to 5 and now NMT will treat them as >>>>>>>>>>>>>> two separate callsites, one with foo() as the bottom frame >>>>>>>>>>>>>> and one with bar() as the bottom frame, and both with a, >>>>>>>>>>>>>> b, c, and d as the other 4 frames. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So my point is if AllocateHeap() is not inlined, then >>>>>>>>>>>>>> every allocation that is the result of doing a "new" of >>>>>>>>>>>>>> any CHeapObj subtype will have AllocateHeap() in its >>>>>>>>>>>>>> callsite, which effectively lowers they callsite >>>>>>>>>>>>>> refinement by 1. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hmmm but now I'm wondering why this trace: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ? 50???? public static String stackTraceAllocateHeap = >>>>>>>>>>>>>>>>> ? 51???????? ".*AllocateHeap.*\n" + >>>>>>>>>>>>>>>>> ? 
52 ".*ModuleEntryTable.*new_entry.*\n" + >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> doesn't include ".*Hashtable.*allocate_new_entry.*"? >>>>>>>>>>>>>>>>> Was it getting inlined already when AllocateHeap was >>>>>>>>>>>>>>>>> not? Even so we still end up with 4 frames matching >>>>>>>>>>>>>>>>> normally. >>>>>>>>>>>>>>>> I noticed that last night also and scratch my head over >>>>>>>>>>>>>>>> it for a while and then went to bed. The only >>>>>>>>>>>>>>>> explanation I could come up with is that >>>>>>>>>>>>>>>> allocate_new_entry() is getting inlined, and as a result >>>>>>>>>>>>>>>> (due to being a slowdebug build and doing minimal >>>>>>>>>>>>>>>> inlining) AllocateHeap() was not inlined. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> If it does appear in a product build, a solution >>>>>>>>>>>>>>>>>> should be looked into to get rid of it. If the port >>>>>>>>>>>>>>>>>> owner decides it can't get rid of it (or is unwilling >>>>>>>>>>>>>>>>>> to), then an exception should be added to the test >>>>>>>>>>>>>>>>>> like was done for solaris and windows slowdebug builds. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Are we specifically trying to test the compiler's >>>>>>>>>>>>>>>>> ability to inline that function and just happen to be >>>>>>>>>>>>>>>>> using this test to verify that? Doesn't seem like a >>>>>>>>>>>>>>>>> suitable place to do this - and why do we need to do >>>>>>>>>>>>>>>>> it? The Visual Studio docs state: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "You cannot force the compiler to inline a particular >>>>>>>>>>>>>>>>> function, even with the __forceinline keyword." >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> so ALWAYSINLINE is just a hint even in product builds >>>>>>>>>>>>>>>>> and could change with any update to the compiler. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For Solaris Studio it is again not guaranteed to inline >>>>>>>>>>>>>>>>> - specifically -xinline only has an effect at ?xO3 or >>>>>>>>>>>>>>>>> higher. Which likely explains why it is ignored in >>>>>>>>>>>>>>>>> slowdebug. 
And there are other cases where it won't >>>>>>>>>>>>>>>>> honour the ALWAYSINLINE. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Even with gcc we seem to be misusing the attribute if >>>>>>>>>>>>>>>>> we want to ensure inlining when not optimising: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "GCC does not inline any functions when not optimizing >>>>>>>>>>>>>>>>> unless you specify the ?always_inline? attribute for >>>>>>>>>>>>>>>>> the function, like this: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /* Prototype.? */ >>>>>>>>>>>>>>>>> inline void foo (const char) >>>>>>>>>>>>>>>>> __attribute__((always_inline));" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> and we don't write it that way. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So if we're that concerned about release builds >>>>>>>>>>>>>>>>> guaranteeing to inline AllocateHeap then I think we >>>>>>>>>>>>>>>>> need something a bit more explicit than this test to >>>>>>>>>>>>>>>>> determine that. >>>>>>>>>>>>>>>> With respect to the 3 methods/functions we don't want to >>>>>>>>>>>>>>>> see in the callsite stacktrace, NMT has made a number of >>>>>>>>>>>>>>>> assumptions on inlining. One of the things the test is >>>>>>>>>>>>>>>> doing is making sure those assumptions are correct. If >>>>>>>>>>>>>>>> incorrect, then you run into issues like I mentioned >>>>>>>>>>>>>>>> above where callsite backtraces effectively only have 3 >>>>>>>>>>>>>>>> unique frames rather than 4 (actually before some bug >>>>>>>>>>>>>>>> fixes it was often just 2 unique frames). So I think >>>>>>>>>>>>>>>> it's appropriate to have a test to make sure we are not >>>>>>>>>>>>>>>> seeing any of these 3 methods/functions. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Okay I get the gist of that. Is there somewhere I can >>>>>>>>>>>>>>> clearly see what this inlining assumptions are that NMT >>>>>>>>>>>>>>> makes? Are they clearly documented? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Not that I know of. 
I discovered them while looking at the >>>>>>>>>>>>>> various bugs that led to >>>>>>>>>>>>>> NativeCallStack::NativeCallStack() and >>>>>>>>>>>>>> os::get_native_stack() (and sometimes both) being in the >>>>>>>>>>>>>> callsite. Reviewing the bugs I referred to will give you >>>>>>>>>>>>>> an idea of where to look. One good place to look at >>>>>>>>>>>>>> NativeCallStack::NativeCallStack(). Lots of special case >>>>>>>>>>>>>> code there that controls how many frames to skip based on >>>>>>>>>>>>>> on the platform and whether optimized or not. Also some >>>>>>>>>>>>>> comments there to help you out. I did a lot of bug fixing >>>>>>>>>>>>>> in this method. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Looking at this code also reminds me of a reason to have >>>>>>>>>>>>>> the test continue to check for all 4 specific frames. If >>>>>>>>>>>>>> the frame skipping code skips an extra frame, then the >>>>>>>>>>>>>> callsite will be missing a needed frame at the top. The >>>>>>>>>>>>>> way the test was written it would detect this. With your >>>>>>>>>>>>>> changes it will not. It would just revert to always >>>>>>>>>>>>>> matching on 3 frames instead of 4, and the frame skipping >>>>>>>>>>>>>> bug would go unnoticed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Now the test also has made inlining assumptions beyond >>>>>>>>>>>>>>>> what NMT has made, and that is really what this bug is >>>>>>>>>>>>>>>> about. In general I think your fix is fine in the way it >>>>>>>>>>>>>>>> relaxes which frames are actually found, but as Thomas >>>>>>>>>>>>>>>> points out, it suffers from not actually looking at a >>>>>>>>>>>>>>>> single stacktrace, but just looking for the specified >>>>>>>>>>>>>>>> frames somewhere in the output (and in the order >>>>>>>>>>>>>>>> specified.) You should probably address this. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Right that was an error on my part. 
I thought the >>>>>>>>>>>>>>> existing MULTILINE pattern matching with .* would also >>>>>>>>>>>>>>> find non-sequential lines and so I was acting similarly. >>>>>>>>>>>>>>> I will re-think this. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Given the changes you made to allow more flexibly in >>>>>>>>>>>>>>>>>>>> which frames appear, I think you need to now also >>>>>>>>>>>>>>>>>>>> make sure the above 3 mentioned frames are not >>>>>>>>>>>>>>>>>>>> present, except for allowing AllocateHeap() in >>>>>>>>>>>>>>>>>>>> slowdebug builds. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 4/3/19 10:53 PM, David Holmes wrote: >>>>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 >>>>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8218458/webrev/ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The actual stack trace reported by NMT detail is >>>>>>>>>>>>>>>>>>>>> affected by the inlining decisions of the native >>>>>>>>>>>>>>>>>>>>> compiler, and on the type of build. So we define an >>>>>>>>>>>>>>>>>>>>> "ideal" stacktrace and then allow for some frames >>>>>>>>>>>>>>>>>>>>> to be missing based on empirical observations. So >>>>>>>>>>>>>>>>>>>>> to date we have seen two frames that may or may not >>>>>>>>>>>>>>>>>>>>> be inlined and so we allow for 2 non-matching entries. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> The special-casing of AllocateHeap is removed as >>>>>>>>>>>>>>>>>>>>> now it is just an optional frame. 
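The "ideal stacktrace with optional frames" idea can be sketched as follows. This is a standalone C++ illustration, not the actual jtreg test (which is Java and works on regex patterns). It assumes one reported trace is available as a list of frame lines, counts how many ideal frames appear in order within that single trace, rejects frames that should always be skipped, and tolerates a bounded number of missing (presumably inlined) frames. The function names and the limit parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Frames = std::vector<std::string>;

// Largest number of ideal frames that occur, in order, within one actual
// trace. A frame "matches" when the actual line contains the ideal name.
// This is the classic longest-common-subsequence table.
std::size_t frames_matched_in_order(const Frames& ideal, const Frames& actual) {
  std::vector<std::vector<std::size_t>> best(
      ideal.size() + 1, std::vector<std::size_t>(actual.size() + 1, 0));
  for (std::size_t i = 1; i <= ideal.size(); ++i) {
    for (std::size_t j = 1; j <= actual.size(); ++j) {
      if (actual[j - 1].find(ideal[i - 1]) != std::string::npos) {
        best[i][j] = best[i - 1][j - 1] + 1;
      } else {
        best[i][j] = std::max(best[i - 1][j], best[i][j - 1]);
      }
    }
  }
  return best[ideal.size()][actual.size()];
}

// Accept a trace if no forbidden frame shows up and at most `max_missing`
// ideal frames are absent.
bool trace_acceptable(const Frames& ideal, const Frames& actual,
                      const Frames& forbidden, std::size_t max_missing) {
  for (const std::string& line : actual) {
    for (const std::string& bad : forbidden) {
      if (line.find(bad) != std::string::npos) {
        return false;
      }
    }
  }
  std::size_t matched = frames_matched_in_order(ideal, actual);
  return ideal.size() - matched <= max_missing;
}
```

Note that a check of this shape cannot tell "frame was inlined" from "frame was skipped by mistake"; it only bounds how many ideal frames may be absent.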
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris: does this maintain the "spirit" of the test
>>>>>>>>>>>>>>>>>>>>> as you intended?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Zhengyu: can you test this on your system(s) please.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>


From daniel.daugherty at oracle.com Mon Apr 8 14:09:50 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 8 Apr 2019 10:09:50 -0400
Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event
In-Reply-To:
References:
Message-ID: <3b27692a-6fc6-e001-eedf-97b579daf7ef@oracle.com>

On 4/7/19 9:49 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8221584
> webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/

src/hotspot/share/prims/jvmtiRawMonitor.cpp
    No comments.

Thumbs up!

Dan

>
> I'm really just sponsoring this fix as the problem was diagnosed by
> Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the
> contributors and I'm already one Reviewer.
>
> There's a missing loadstore barrier between extracting the ParkEvent
> from an ObjectWaiter node, and setting the node's TState to allow the
> entering thread to proceed. It seems our recent update to gcc 8.2
> resulted in the compiler reordering those two actions, meaning that
> the ObjectWaiter pointer could now be pointing into a stack location
> with random contents. That might manifest as a SEGV or we may treat
> random memory as a pthread_mutex_t and get an EINVAL (or potentially
> other errors) on pthread_mutex_lock.
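The ordering hazard described above can be illustrated with a small standalone C++ sketch. It is not the HotSpot fix and does not use OrderAccess; it assumes standard std::atomic, where a release store keeps earlier loads and stores from being moved after it. All type and function names here (ParkEventLike, WaiterNode, wake_waiter, may_proceed) are hypothetical stand-ins.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a park/unpark event; not the HotSpot type.
struct ParkEventLike {
  std::atomic<int> unpark_calls{0};
  void unpark() { unpark_calls.fetch_add(1, std::memory_order_relaxed); }
};

enum class WaitState { Waiting, Runnable };

// Models a waiter node that lives on the waiting thread's stack. Once the
// state becomes Runnable the waiter may return and the node may vanish.
struct WaiterNode {
  ParkEventLike* event = nullptr;
  std::atomic<WaitState> state{WaitState::Waiting};
};

// Called by the thread that hands over ownership.
void wake_waiter(WaiterNode* node) {
  // 1. Read everything still needed from the node while it is valid.
  ParkEventLike* ev = node->event;
  // 2. Publish the state change. The release ordering keeps the load in
  //    step 1 from being reordered after this store.
  node->state.store(WaitState::Runnable, std::memory_order_release);
  // 3. From here on, touch only the local copy, never `node`.
  ev->unpark();
}

// Called by the waiting thread to decide whether it may leave its wait loop.
bool may_proceed(const WaiterNode& node) {
  return node.state.load(std::memory_order_acquire) == WaitState::Runnable;
}
```

If the load in step 1 were allowed to drift below the store in step 2, the waker could read the event pointer from stack memory that the waiter has already reused, which is the failure mode reported in the bug.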
>
> Testing: mach5 tiers 1-3 (sanity - the added barrier can't break
> anything)
>
> Thanks,
> David


From shade at redhat.com Mon Apr 8 14:55:35 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 8 Apr 2019 16:55:35 +0200
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To:
References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com>
 <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com>
 <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com>
 <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com>
 <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com>
Message-ID: <9d272e1d-7c49-017c-7313-57636b103003@redhat.com>

On 4/8/19 2:44 PM, Peter Levart wrote:
> If you're afraid that a future maintainer of that code would not realize that, then a simple comment
> put into String.hashCode method and java_lang_String::set_hash C++ method that would say something
> like the following:
>
> // only a single field may be modified so that the Object state is updated atomically
>
> ...is surely going to help him/her keep the String free from bugs...

(sighs) Famous last words!

Let me try again myself. Benign data race is already creepy enough in its original form. You have to
come up with the overwhelming reason to complicate it even further.

Why introduce this complexity to begin with? That's my argument. The complication of this code gives
us almost nothing in return, why make the change? Without the compelling reason to do it, all this
is just a runaway "hold my beer" micro-optimization exercise, which is fun and instructional, but
should not be pushed to the actual JDK. In other words, just because we *can* it does not follow
that we *should*.
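For readers following the argument, the "only a single field may be modified" scheme under debate can be modelled roughly as below. This is a C++ approximation, not the JDK code: relaxed std::atomic fields stand in for Java's plain racy fields, the hash is computed over bytes rather than UTF-16 chars, and the class name and the computations() counter are hypothetical additions for the demonstration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

class CachedHashString {
 public:
  explicit CachedHashString(std::string value) : value_(std::move(value)) {}

  std::uint32_t hash_code() {
    // Read the cached hash once; a non-zero value is always the real hash.
    std::uint32_t h = hash_.load(std::memory_order_relaxed);
    if (h == 0 && !hash_is_zero_.load(std::memory_order_relaxed)) {
      h = compute();
      computations_.fetch_add(1, std::memory_order_relaxed);
      // Write exactly one of the two fields, never both, so a racing
      // reader cannot observe a half-updated pair.
      if (h == 0) {
        hash_is_zero_.store(true, std::memory_order_relaxed);
      } else {
        hash_.store(h, std::memory_order_relaxed);
      }
    }
    return h;
  }

  // Demo-only instrumentation: how often the hash was actually computed.
  int computations() const {
    return computations_.load(std::memory_order_relaxed);
  }

 private:
  // Polynomial hash in the style of String.hashCode (h = 31 * h + c),
  // using unsigned arithmetic so that wrap-around is well defined.
  std::uint32_t compute() const {
    std::uint32_t h = 0;
    for (char c : value_) {
      h = 31u * h + static_cast<unsigned char>(c);
    }
    return h;
  }

  std::string value_;
  std::atomic<std::uint32_t> hash_{0};
  std::atomic<bool> hash_is_zero_{false};
  std::atomic<int> computations_{0};
};
```

The extra flag only pays off for strings whose hash really is zero (the empty string being the obvious case); for every other string the fast path is unchanged, which is the trade-off being weighed in this thread.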
-Aleksey


From adinn at redhat.com Mon Apr 8 15:10:51 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 8 Apr 2019 16:10:51 +0100
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <9d272e1d-7c49-017c-7313-57636b103003@redhat.com>
References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com>
 <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com>
 <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com>
 <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com>
 <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com>
 <9d272e1d-7c49-017c-7313-57636b103003@redhat.com>
Message-ID: <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com>

On 08/04/2019 15:55, Aleksey Shipilev wrote:
> Why introduce this complexity to begin with? That's my argument. The complication of this code gives
> us almost nothing in return, why make the change? Without the compelling reason to do it, all this
> is just a runaway "hold my beer" micro-optimization exercise, which is fun and instructional, but
> should not be pushed to the actual JDK. In other words, just because we *can* it does not follow
> that we *should*.

I think that is a good argument. The vast majority of apps are not going
to see any performance hit from the relatively rare occurrence of zero
hashes. Those rare apps which see a lot of them will still only see a
marginal performance hit.

n.b. I say marginal because I believe that an app which sees enough zero
hashes for the effect not to be marginal (i.e. it is not doing much else)
is probably not an app but a Trojan horse of a (micro?) benchmark
masquerading as an app (timeo Danaos et lecti marcae ferentes -- pardon
my Latin and don't ask what those notches on the couch might mean ;-).

So, I agree with Aleksey that adding a potential maintenance headache
for this little gain is not the right trade-off.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander


From claes.redestad at oracle.com Mon Apr 8 15:24:57 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 8 Apr 2019 17:24:57 +0200
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com>
References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com>
Message-ID: <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com>

Right, this and possibly reducing latency when running with String
deduplication enabled might be the more tangible benefits. Removing a
cause for spurious performance degradations is nice, but mainly
theoretical.

There's likely a pre-existing negative interaction between string dedup
and String archiving that would need to be resolved either way.

I've simplified the patch somewhat and folded set_hash/hash into
hash_code (since direct manipulation of the hash field should be
avoided), along with a comment to try and explain and caution about
the data race:

http://cr.openjdk.java.net/~redestad/8221836/open.02/

Thanks!

/Claes

On 2019-04-08 12:24, Peter Levart wrote:
> I think the most benefit in this patch is the emptyString.hashCode()
> speedup. By holding a boolean flag in the String object itself, there is
> one less de-reference to be made on fast-path in case of empty string.
> Which shows in microbenchmark and would show even more if code iterated
> many different instances of empty strings that don't share the
> underlying array invoking .hashCode() on them. Which, I admit, is not a
> frequent case in practice, but hey, it is a speedup after all.
>
> Regards, Peter


From ivan.gerasimov at oracle.com Mon Apr 8 16:12:36 2019
From: ivan.gerasimov at oracle.com (Ivan Gerasimov)
Date: Mon, 8 Apr 2019 09:12:36 -0700
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To:
References:
Message-ID:

Hi Claes!

Would it make sense to preset hashIsZero = true in the empty string
constructor?

The current code avoids calculating the hashCode for an empty string,
and the new code doesn't seem to do that because hashIsZero = false by
default for each newly constructed copy of the empty string.

With kind regards,
Ivan

On 4/8/19 1:41 AM, Claes Redestad wrote:
> Hi,
>
> by adding a bit to String that is true iff String.hash has been
> calculated as being 0, we can get rid of the corner case where such hash
> codes are recalculated on every call.
>
> Peter Levart came up with an elegant scheme for ensuring that we can keep
> using non-volatile stores without explicit fencing and still reap the
> benefits of this[1], and I've synced up the hotspot code that deals with
> the String.hash value to mirror that logic.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8221836
> Webrev: http://cr.openjdk.java.net/~redestad/8221836/open.01/
>
> Since there exist small padding gaps in the current object layout of
> strings (on all VM bitness and compressed oops varieties), adding this
> boolean does not add any extra footprint per String instance.
>
> Testing: tier1-3, verified a speed-up in targeted microbenchmarks.
>
> Thanks!
>
> /Claes
>
> [1]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059480.html
>

--
With kind regards,
Ivan Gerasimov


From martin.doerr at sap.com Mon Apr 8 16:23:39 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 8 Apr 2019 16:23:39 +0000
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To: <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com>
 <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com>
 <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Hi Dan,

thanks for addressing this issue. I appreciate it.

I wonder if the comments are correct. Does dec_nested_handle_cnt really only need MO_ACQ_REL while inc_nested_handle_cnt needs MO_SEQ_CST?
I don't see comments explaining what was intended to get ordered.
I guess we can just use memory_order_conservative (default). Shouldn't be performance critical.

Best regards,
Martin


-----Original Message-----
From: Daniel D. Daugherty
Sent: Friday, 5 April 2019 18:10
To: Doerr, Martin; Robbin Ehn; hotspot-runtime-dev at openjdk.java.net; Carsten Varming; Roman Kennke
Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints

Filed:

    JDK-8222034 Thread-SMR functions should be updated to remove work around
    https://bugs.openjdk.java.net/browse/JDK-8222034

Martin and Robbin, please check it out and make sure that I captured
things correctly...

Dan


On 4/5/19 12:01 PM, Daniel D. Daugherty wrote:
> On 4/5/19 8:37 AM, Doerr, Martin wrote:
>> Hi everybody,
>>
>>> I think this was fixed with:
>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other
>>> RMW atomics
>>> You should get a leading sync and trailing one with the default
>>> conservative
>>> model and thus get proper memory ordering.
>>> Martin, am I correct?
>> Exactly. Thanks for pointing this out. PPC uses the strongest
>> possible ordering semantics with memory_order_conservative (default
>> parameter).
>> I've seen that comment about PPC in "void
>> ThreadsList::inc_nested_handle_cnt()". This function could get replaced.
>
> Okay so we need a new bug to update these two Thread-SMR functions:
>
> src/hotspot/share/runtime/threadSMR.cpp:
>
> void ThreadsList::dec_nested_handle_cnt() {
>   // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec
>   // backend on PPC does not yet conform to these requirements. Therefore
>   // the decrement is simulated with an Atomic::sub(1, &addr).
>   // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism
>   // is not generally safe to use.
>   Atomic::sub(1, &_nested_handle_cnt);
> }
>
> void ThreadsList::inc_nested_handle_cnt() {
>   // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>   // backend on PPC does not yet conform to these requirements. Therefore
>   // the increment is simulated with a load phi; cas phi + 1; loop.
>   // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism
>   // is not generally safe to use.
>   intx sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>   for (;;) {
>     if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) {
>       return;
>     } else {
>       sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>     }
>   }
> }
>
> I'll file a new bug, loop in Robbin, Erik O and Martin, and make
> sure we're all in agreement. Once we decide what Thread-SMR's
> functions look like, I'll adapt my Async Monitor Deflation
> functions...
>
> Dan
>
>
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Robbin Ehn
>> Sent: Friday, 5 April 2019 14:07
>> To: daniel.daugherty at oracle.com;
>> hotspot-runtime-dev at openjdk.java.net; Carsten Varming;
>> Roman Kennke; Doerr, Martin
>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>>
>> Hi Dan,
>>
>> (Martin, there is a question for you last in this email)
>>
>> After first pass I did not find any real issues.
>> Considering what you had to work with, it looks good!
>>
>> #1
>> There are some asserts which are redundant (to me at least) like:
>> src/hotspot/share/runtime/objectMonitor.cpp
>> L445
>>     if (!dmw->is_marked() && dmw->hash() == 0) {
>>       // This dmw is neutral and has not yet started the restoration
>>       // protocol so we mark a copy of the dmw to begin the protocol.
>>       markOop marked_dmw = dmw->set_marked();
>>       assert(marked_dmw->is_marked() && marked_dmw->hash() == 0,
>>              "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT,
>>              marked_dmw->is_marked(), marked_dmw->hash());
>>
>> That assert is basically a test that set_marked worked?
>>
>> L505
>>       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>         assert(_succ != Self, "invariant");
>>         assert(_owner == Self, "invariant");
>>
>> Assert on _owner checks that our cmpxchg is not broken?
>>
>> I think it's easier to read the code if some of the most obvious
>> asserts are
>> removed. Maybe comments instead.
>>
>> #2
>> Not your doing but I think we should remove TRAPS/Thread * Self and use
>> JavaThread* instead.
>> E.g. so we can change:
>> void ObjectMonitor::EnterI(TRAPS) {
>>     Thread * const Self = THREAD;
>>     assert(Self->is_Java_thread(), "invariant");
>>     assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant");
>>
>> to:
>>
>> void ObjectMonitor::EnterI(JavaThread* Self) {
>>     assert(Self->thread_state() == _thread_blocked, "invariant");
>>
>> #3
>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>    164 inline void ObjectMonitor::inc_ref_count() {
>>    165   // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>    166   // backend on PPC does not yet conform to these requirements. Therefore
>>    167   // the increment is simulated with a load phi; cas phi + 1; loop.
>>    168   // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors
>>    169   // is not safe.
>>
>> I think this was fixed with:
>> 8202080: Introduce ordering semantics for Atomic::add/inc and other
>> RMW atomics
>> You should get a leading sync and trailing one with the default
>> conservative
>> model and thus get proper memory ordering.
>> Martin, am I correct?
>>
>> Thanks, Robbin
>>
>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>
>>>       JDK-8153224 Monitor deflation prolong safepoints
>>>       https://bugs.openjdk.java.net/browse/JDK-8153224
>>>
>>> Here's a link to the OpenJDK wiki that describes my port:
>>>
>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>
>>> Here's a link to Carsten's original webrev:
>>>
>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>
>>> Earlier versions of this patch have been through several rounds of
>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>> Roman for their preliminary code review comments. A very special
>>> thanks to Robbin and Roman for building and testing the patch in
>>> their own environments (including specJBB2015).
>>>
>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>> Oracle's usual set of platforms. Earlier versions have been run
>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>> fastdebug and slowdebug).
>>>
>>> All of the testing done on earlier versions will be redone on the
>>> latest version of the patch.
>>>
>>> Thanks, in advance, for any questions, comments or suggestions.
>>>
>>> Dan
>>>
>>> P.S.
>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>> to characterize/analyze this failure for more than a week now. At
>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>> an existing bug. However, I plan to have a better handle on that
>>> failure before these bits are pushed to the jdk/jdk repo.
> > From claes.redestad at oracle.com Mon Apr 8 16:26:51 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 8 Apr 2019 18:26:51 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: Message-ID: Hi Ivan, not sure that would be an optimization, since you'd trade a conditional write for an unconditional one. The computation itself for the empty string has trivial cost. /Claes On 2019-04-08 18:12, Ivan Gerasimov wrote: > Hi Claes! > > Would it make sense to preset hashIsZero = true in the empty string > constructor? > > The current code avoids calculating the hashCode for an empty string, > and the new code doesn't seem to do that because hashIsZero = false by > default for each newly constructed copy of the empty string. > > With kind regards, > Ivan From daniel.daugherty at oracle.com Mon Apr 8 16:55:01 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 8 Apr 2019 12:55:01 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> Message-ID: <5b0d2152-e336-675b-5c89-45636596a279@oracle.com> Greetings, I took the last repo that I ran through Mach5 tier[1-8] testing and did 10 SPECjbb2015 runs on the 'release' version of those bits. I also did 10 SPECjbb2015 runs on the 'release' version of the baseline bits. Baseline: jdk-13+13 Exp:????? v2.00 (8153224-webrev/3-for-jdk13) plus ????????? special-cleanup-for-global-in-use-list Linux-X64 Machine: ? - Ubuntu 16.04, Dell T7600, 64GB RAM ? - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads MacOSX Machine: ? - MacOS 10.13.6, Mac Mini, mid 2011, 16GB RAM ? - 2 GHz Intel Core i7 (I7-2635QM), 1 CPU x 4 cores x 2 threads Solaris-X64 Machine: ? - Solaris 11.2 SRU5.5, Dell T7600, 64GB RAM ? - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads Average Results for Each OS ???? 
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
       23838.00   22446.90  20738.80        6166.70  Linux-X64 base
       23838.00   22279.40  20262.00        5891.50  Linux-X64 exp
        5841.80    4885.00   4764.00        1495.10  MacOSX base
        5621.00    4701.00   4778.00        1492.10  MacOSX exp
       16125.20   13852.30  12780.50        2791.90  Solaris-X64 base
       15788.70   13861.40  12551.40        2665.90  Solaris-X64 exp

I'm new to SPECjbb2015 so I don't know what "hbIR" and "jOPS" are yet. Based on a bit of googling so far, it appears that for critical-jOPS, higher is better:

- Linux-X64 exp critical-jOPS is ~4.5% lower than Linux-X64 base
- MacOSX base and MacOSX exp critical-jOPS are almost identical
- Solaris-X64 exp critical-jOPS is ~4.5% lower than Solaris-X64 base

I have not tried to research or analyze the other columns yet. The results for each of the 10 runs are shown below.

Dan


Linux-X64 Runs

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          23838      22719     19070           6515  SPECjbb2015.Lin-X64.base.01
          23838      21642     20262           5591  SPECjbb2015.Lin-X64.base.02
          23838      23108     20262           6508  SPECjbb2015.Lin-X64.base.03
          23838      21730     21454           6235  SPECjbb2015.Lin-X64.base.04
          23838      22220     21454           6028  SPECjbb2015.Lin-X64.base.05
          23838      22543     20262           5996  SPECjbb2015.Lin-X64.base.06
          23838      23014     21454           6192  SPECjbb2015.Lin-X64.base.07
          23838      22543     21454           5889  SPECjbb2015.Lin-X64.base.08
          23838      22750     20262           6038  SPECjbb2015.Lin-X64.base.09
          23838      22200     21454           6675  SPECjbb2015.Lin-X64.base.10
---------------  ---------  --------  -------------  --------
       23838.00   22446.90  20738.80        6166.70  average of values

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          23838      21422     20262           6329  SPECjbb2015.Lin-X64.exp.01
          23838      22543     19070           6351  SPECjbb2015.Lin-X64.exp.02
          23838      22100     20262           5005  SPECjbb2015.Lin-X64.exp.03
          23838      22543     20262           5881  SPECjbb2015.Lin-X64.exp.04
          23838      23170     20262           5938  SPECjbb2015.Lin-X64.exp.05
          23838      22543     20262           5744  SPECjbb2015.Lin-X64.exp.06
          23838      22100     20262           5482  SPECjbb2015.Lin-X64.exp.07
          23838      22543     20262           6213  SPECjbb2015.Lin-X64.exp.08
          23838      22100     21454           5637  SPECjbb2015.Lin-X64.exp.09
          23838      21730     20262           6335  SPECjbb2015.Lin-X64.exp.10
---------------  ---------  --------  -------------  --------
       23838.00   22279.40  20262.00        5891.50  average of values


MacOSX Runs

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
           6725       5621      4708           1543  SPECjbb2015.MacOSX.base.01
           5621       4701      4778           1326  SPECjbb2015.MacOSX.base.02
           6725       5621      4708           1475  SPECjbb2015.MacOSX.base.03
           5621       4701      4778           1372  SPECjbb2015.MacOSX.base.04
           5621       4701      4778           1560  SPECjbb2015.MacOSX.base.05
           5621       4701      4778           1471  SPECjbb2015.MacOSX.base.06
           5621       4701      4778           1430  SPECjbb2015.MacOSX.base.07
           5621       4701      4778           1560  SPECjbb2015.MacOSX.base.08
           5621       4701      4778           1581  SPECjbb2015.MacOSX.base.09
           5621       4701      4778           1633  SPECjbb2015.MacOSX.base.10
---------------  ---------  --------  -------------  --------
        5841.80    4885.00   4764.00        1495.10  average of values

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
           5621       4701      4778           1566  SPECjbb2015.MacOSX.exp.01
           5621       4701      4778           1430  SPECjbb2015.MacOSX.exp.02
           5621       4701      4778           1530  SPECjbb2015.MacOSX.exp.03
           5621       4701      4778           1304  SPECjbb2015.MacOSX.exp.04
           5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.05
           5621       4701      4778           1460  SPECjbb2015.MacOSX.exp.06
           5621       4701      4778           1638  SPECjbb2015.MacOSX.exp.07
           5621       4701      4778           1471  SPECjbb2015.MacOSX.exp.08
           5621       4701      4778           1402  SPECjbb2015.MacOSX.exp.09
           5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.10
---------------  ---------  --------  -------------  --------
        5621.00    4701.00   4778.00        1492.10  average of values


Solaris-X64 Runs

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          16584      13957     13267           2332  SPECjbb2015.Sol-X64.base.01
          16584      13837     13267           3123  SPECjbb2015.Sol-X64.base.02
          16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.03
          16584      13837     12438           2667  SPECjbb2015.Sol-X64.base.04
          14743      14210     12532           2920  SPECjbb2015.Sol-X64.base.05
          16584      13837     12438           3534  SPECjbb2015.Sol-X64.base.06
          13837      13497     12453           2226  SPECjbb2015.Sol-X64.base.07
          16584      13837     12438           2265  SPECjbb2015.Sol-X64.base.08
          16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.09
          16584      13837     12438           3146  SPECjbb2015.Sol-X64.base.10
---------------  ---------  --------  -------------  --------
       16125.20   13852.30  12780.50        2791.90  average of values

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          16584      13837     12438           2073  SPECjbb2015.Sol-X64.exp.01
          16584      14353     13267           2667  SPECjbb2015.Sol-X64.exp.02
          16584      13837     12438           2349  SPECjbb2015.Sol-X64.exp.03
          16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.04
          13981      13832     12583           3241  SPECjbb2015.Sol-X64.exp.05
          13837      13575     12453           2621  SPECjbb2015.Sol-X64.exp.06
          13981      13832     12583           2768  SPECjbb2015.Sol-X64.exp.07
          16584      13837     12438           3000  SPECjbb2015.Sol-X64.exp.08
          16584      13837     12438           2952  SPECjbb2015.Sol-X64.exp.09
          16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.10
---------------  ---------  --------  -------------  --------
       15788.70   13861.40  12551.40        2665.90  average of values


On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> Here's a link to the OpenJDK wiki that describes my port:
>
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>
> Here's the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>
> Here's a link to Carsten's original webrev:
>
> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>
> Earlier versions of this patch have been through several rounds of preliminary review. Many thanks to Carsten, Coleen, Robbin, and Roman for their preliminary code review comments. A very special thanks to Robbin and Roman for building and testing the patch in their own environments (including specJBB2015).
>
> This version of the patch has been thru Mach5 tier[1-8] testing on Oracle's usual set of platforms. Earlier versions have been run through my stress kit on my Linux-X64 and Solaris-X64 servers (product, fastdebug, slowdebug). Earlier versions have run Kitchensink for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug and slowdebug). Earlier versions have run my monitor inflation stress tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug and slowdebug).
>
> All of the testing done on earlier versions will be redone on the latest version of the patch.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>
> P.S.
> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java is currently failing in -Xcomp mode on Win* only. I've been trying to characterize/analyze this failure for more than a week now. At this point I'm convinced that Async Monitor Deflation is aggravating an existing bug. However, I plan to have a better handle on that failure before these bits are pushed to the jdk/jdk repo.
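
For readers who want to check the summary figures in the results message above, here is a small standalone C++ sketch that recomputes one of the reported averages and the relative critical-jOPS differences from the posted numbers. The helper names `mean` and `percent_lower` are made up for this illustration, and the sketch is only arithmetic on the table values; it says nothing about run-to-run noise.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a set of benchmark scores (hypothetical helper).
double mean(const std::vector<double>& xs) {
  assert(!xs.empty());
  return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// How much lower 'experiment' is than 'base', as a percentage of 'base'.
double percent_lower(double base, double experiment) {
  return (base - experiment) / base * 100.0;
}

// critical-jOPS for the ten Linux-X64 baseline runs from the table above.
const std::vector<double> linux_base_critical = {
    6515, 5591, 6508, 6235, 6028, 5996, 6192, 5889, 6038, 6675};
```

With the averages from the summary table, `percent_lower(6166.70, 5891.50)` and `percent_lower(2791.90, 2665.90)` both come out at roughly 4.5, while the MacOSX pair differs by well under one percent.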
>

From claes.redestad at oracle.com Mon Apr 8 21:19:22 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 8 Apr 2019 23:19:22 +0200
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com>
References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com>
Message-ID:

On 2019-04-08 17:10, Andrew Dinn wrote:
> On 08/04/2019 15:55, Aleksey Shipilev wrote:
>> Why introduce this complexity to begin with? That's my argument. The complication of this code gives us almost nothing in return, why make the change? Without the compelling reason to do it, all this is just a runaway "hold my beer" micro-optimization exercise, which is fun and instructional, but should not be pushed to the actual JDK. In other words, just because we *can* it does not follow that we *should*.
> I think that is a good argument. The vast majority of apps are not going to see any performance hit from the relatively rare occurrence of zero hashes. Those rare apps which see a lot of them will still only see a marginal performance hit.
>
> n.b. I say marginal because I believe that an app which sees enough zero hashes for the effect not to be marginal (i.e. it is not doing much else) is probably not an app but a Trojan horse of a (micro?) benchmark masquerading as an app (timeo Danaos et lecti marcae ferentes -- pardon my Latin and don't ask what those notches on the couch might mean ;-).
>
> So, I agree with Aleksey that adding a potential maintenance headache for this little gain is not the right trade-off.

First, I disagree strongly that this patch adds significant complexity (especially after recent simplifications[1]) or that it risks increasing maintenance headache down the line.

Secondly, I think the gain w.r.t. defense-in-depth is very real. Sure, we have alt-hashing and similar counter-measures in various JDK collection libraries, but that is unlikely to help every API or library ever invented out there.

Third... well, the other performance gains are of course nice-to-have - improvements to "".hashCode() and allowing String deduplication to not have to filter out such Strings - but I agree that they are likely mostly theoretical for anything real-world.

If this change then exposes a bug in some unexpected place elsewhere (I can only guess what dangers lurk out there... unforeseen interaction with a weaker memory model? some wonky JIT reordering?) then that might even be for the better in the end. If/when that happens, we can opt to back this out (or not) while addressing whatever issue we've unearthed.

/Claes

[1] http://cr.openjdk.java.net/~redestad/8221836/open.02/

From john.r.rose at oracle.com Mon Apr 8 21:31:40 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Apr 2019 14:31:40 -0700
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com>
References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com>
Message-ID: <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com>

I agree that this is a good change, and you can use me as a reviewer.

I disagree with Aleksey; it's a new technique but not complex to document or understand. The two state components are independent in their action; there is no race between their state changes.

Meanwhile, there are two reasons I want the change:

1. Less risk of spurious updates to COW memory segments in shared archives.

2. No risk of hashcode recomputation for the 2^-32 case.
This might seem laughable, until you remember that it's exactly those cases that DOS attackers like to create.

Both are defense in depth, against performance potholes and intentional attacks.

If we spent as much time documenting this change as we spent complaining about its supposed uselessness, we'd be done.

-- John

On Apr 8, 2019, at 8:24 AM, Claes Redestad wrote:
>
> Right, this and possibly reducing latency when running with String deduplication enabled might be the more tangible benefits. Removing a cause for spurious performance degradations is nice, but mainly theoretical. There's likely a pre-existing negative interaction between string dedup and String archiving that would need to be resolved either way.
>
> I've simplified the patch somewhat and folded set_hash/hash into hash_code (since direct manipulation of the hash field should be avoided), along with a comment to try and explain and caution about the data race:
>
> http://cr.openjdk.java.net/~redestad/8221836/open.02/
>
> Thanks!
>
> /Claes
>
> On 2019-04-08 12:24, Peter Levart wrote:
>> I think the most benefit in this patch is the emptyString.hashCode() speedup. By holding a boolean flag in the String object itself, there is one less de-reference to be made on fast-path in case of empty string. Which shows in microbenchmark and would show even more if code iterated many different instances of empty strings that don't share the underlying array invoking .hashCode() on them. Which, I admit, is not a frequent case in practice, but hey, it is a speedup after all.
>> Regards, Peter

From david.holmes at oracle.com Mon Apr 8 21:35:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 9 Apr 2019 07:35:48 +1000
Subject: RFR (XXXS): 8221584: SIGSEGV in os::PlatformEvent::unpark() in JvmtiRawMonitor::raw_exit while posting method exit event
In-Reply-To: <3b27692a-6fc6-e001-eedf-97b579daf7ef@oracle.com>
References: <3b27692a-6fc6-e001-eedf-97b579daf7ef@oracle.com>
Message-ID: <9c044731-e1bf-6a0c-d6c0-2800af383629@oracle.com>

Thanks Dan!

David

On 9/04/2019 12:09 am, Daniel D. Daugherty wrote:
> On 4/7/19 9:49 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221584
>> webrev: http://cr.openjdk.java.net/~dholmes/8221584/webrev/
>
> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>     No comments.
>
> Thumbs up!
>
> Dan
>
>>
>> I'm really just sponsoring this fix as the problem was diagnosed by Robbin Ehn and Stefan Karlsson - thanks guys! :) So they are the contributors and I'm already one Reviewer.
>>
>> There's a missing loadstore barrier between extracting the ParkEvent from an ObjectWaiter node, and setting the node's TState to allow the entering thread to proceed. It seems our recent update to gcc 8.2 resulted in the compiler reordering those two actions, meaning that the ObjectWaiter pointer could now be pointing into a stack location with random contents. That might manifest as a SEGV or we may treat random memory as a pthread_mutex_t and get an EINVAL (or potentially other errors) on pthread_mutex_lock.
>>
>> Testing: mach5 tiers 1-3 (sanity - the added barrier can't break anything)
>>
>> Thanks,
>> David
>

From daniel.daugherty at oracle.com Tue Apr 9 01:04:34 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 8 Apr 2019 21:04:34 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

On 4/5/19 4:59 PM, Karen Kinnear wrote:
> Dan,
>
> Some more minor comments from reading the code:

Thanks for the additional comments. I'm gathering changes for the next round of code review (CR1) so these will be resolved in that round... More below...

> 1. Could you add comments to markOop.hpp about the use in the displaced_mark_word of is_marked to prevent any users of is_marked here from needing to have that information saved/restored?

I _think_ I know what you're looking for here... Perhaps this:

src/hotspot/share/oops/markOop.hpp:

  // ObjectMonitor::install_displaced_markword_in_object() uses
  // is_marked() on ObjectMonitor::_header as part of the restoration
  // protocol for an object's header. In this usage, the mark bit is
  // only ever set (and cleared) on the ObjectMonitor::_header field.
  bool is_marked()   const {
    return (mask_bits(value(), lock_mask_in_place) == marked_value);
  }

> 2. In objectMonitor.hpp
>   in is_busy you clarify the difference in use between _count (which I think you may be changing to _contended) and _ref_count. Could you possibly also comment where you declare them?

I'll do the rename of _count -> _contentions in a subtask of 8153224 like the other cleanups of the monitor subsystem.

Here's the comment in question:

src/hotspot/share/runtime/objectMonitor.hpp:

  intptr_t is_busy() const {
    // TODO-FIXME: merge _count and _waiters.
    // TODO-FIXME: assert _owner == null implies _recursions = 0
    // TODO-FIXME: assert _WaitSet != null implies _count > 0
    // We do not include _ref_count in the is_busy() check because
    // _ref_count is for indicating that the ObjectMonitor* is in
    // use which is orthogonal to whether the ObjectMonitor itself
    // is in use for a locking operation.
    return _count|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList);
  }

I don't think this comment clarifies _count vs. _ref_count. I added the last four lines of the comment and their purpose is to describe why _ref_count isn't used by is_busy(). The TODO-FIXME lines need to be revisited since (at least) the third one is wrong.

Here's the existing comment for _ref_count:

  volatile jint _ref_count;         // ref count for ObjectMonitor*

Here's the existing comment for _count:

  volatile jint  _count;            // reference count to prevent reclamation/deflation
                                    // at stop-the-world time. See ObjectSynchronizer::deflate_monitor().
                                    // _count is approximately |_WaitSet| + |_EntryList|

And here's what I proposed to change it to in my reply to your design review notes:

  volatile jint  _contentions;      // Number of active contentions in enter(). It is used by is_busy()
                                    // along with other fields to determine if an ObjectMonitor can be
                                    // deflated. See ObjectSynchronizer::deflate_monitor().

I think we're good here with the proposed change of comment (and the rename) for the _contentions field along with existing comment for _ref_count and the existing comment for is_busy(). I may delete the third TODO-FIXME line as part of the next cleanup.

> 3. clear_using_JT: would it make sense to have an assertion that _owner is either null or DEFLATER_MARKER?

We could add something like:

  assert(_owner == NULL ||
         (AsyncDeflateIdleMonitors && _owner == DEFLATER_MARKER),
         "Fatal logic error in ObjectMonitor owner!");

and that will catch any races in async monitor deflation where the _owner field is set to a monitor owner value (stack addr or thread*).
For monitor deflation at a safepoint, the non-NULL _owner field is caught in clear() (which calls clear_using_JT()).

> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count
> with comments that caller ensured _count <= 0
> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
>   Am I missing something subtle here or should they be the same guarantees?

Here's the code in question:

src/hotspot/share/runtime/objectMonitor.cpp:

void ObjectMonitor::EnterI(TRAPS) {
  if (_owner == DEFLATER_MARKER) {
    guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
    // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up.

void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) {
    if (_owner == DEFLATER_MARKER) {
      guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");

Reading these two guarantee() calls always throws me off stride because I would have written them like this:

    guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");

and

      guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");

When rewritten like the above, you have:

    "_count > 0" ... _count <= 0

and:

    "_count >= 0" ... "_count < 0"

which is easier for my brain to read... okay... enough sidebar...

Short answer: No the guarantees should not be the same.

Longer answer: EnterI() is called by enter() after enter() has incremented the _count field to indicate the contended state of things. So in EnterI(), "_count > 0" is the right check. ReenterI() is called after wait() has returned (notified or timedout), and the _count field is not used on reentry ops so "_count >= 0" is the right check.
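
The difference between the two checks can be shown with a toy model. This is a minimal sketch, not HotSpot code: `ToyMonitor` and its members are hypothetical stand-ins for ObjectMonitor, _count and _waiters, and the only thing it models is who has already incremented the counter when each slow path runs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, heavily simplified stand-in for ObjectMonitor.
struct ToyMonitor {
  std::atomic<int> count{0};    // models _count: contentions in enter()
  std::atomic<int> waiters{0};  // models _waiters: threads in wait()

  // Models EnterI(): only reached from enter(), which has already
  // counted the calling thread, so the counter is strictly positive.
  void enter_slow_path() {
    assert(count.load() > 0);
  }

  void enter() {
    count.fetch_add(1);  // caller announces its contention first
    enter_slow_path();
    count.fetch_sub(1);  // contention is over once the lock is taken
  }

  // Models ReenterI(): reached after wait() returns. The waiter is
  // tracked in 'waiters', not in 'count', so zero is a legal value.
  void reenter_after_wait() {
    assert(count.load() >= 0);
  }

  void wait_and_reenter() {
    waiters.fetch_add(1);
    // ... the real code would block here until notified or timed out ...
    reenter_after_wait();
    waiters.fetch_sub(1);  // decremented only after re-entry finishes
  }
};
```

In this model a thread in `enter_slow_path()` can never observe a zero counter, while a thread in `reenter_after_wait()` can, which is why the weaker `>= 0` check is the right one on the re-entry path.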

I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, there are two places in EnterI() that do this):

    L501:   if (_owner == DEFLATER_MARKER) {
              // The deflation protocol finished the first part (setting _owner),
              // but it failed the second part (making _count negative) and bailed.
              // Because we're called from enter() we have at least one contention.
              guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
    L504:     // Try to acquire monitor.
    L505:     if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {

    L629:     if (_owner == DEFLATER_MARKER) {
                // The deflation protocol finished the first part (setting _owner),
                // but it failed the second part (making _count negative) and bailed.
                // Because we're called from enter() we have at least one contention.
                guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
    L632:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {

And I'm going to tweak the ReenterI() code like this:

    L759:     if (_owner == DEFLATER_MARKER) {
                // The deflation protocol finished the first part (setting _owner),
                // but it will observe _waiters != 0 and will bail out. Because we're
                // called from wait() we may or may not have any contentions.
                guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller");
    L761:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {

You didn't ask this, but it is okay that _count is only used to track contentions in enter()/EnterI() and is not used to track contentions in wait()/ReenterI().
For the wait()/ReenterI() code path, _waiters is used by is_busy() to observe the busy state for an ObjectMonitor that is being wait()'ed for. The _waiters field is decremented after a waiter has returned from ReenterI() so the _owner field takes over answering the is_busy() question...

> 5. I could use a little help with allocation state transitions, e.g. in deflate_monitor_list_using_JT
>   you see is_new with object set so you mark it as old so next deflation will check it

Here's the code in question:

src/hotspot/share/runtime/synchronizer.cpp:

int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp,
                                                      ObjectMonitor** freeHeadp,
                                                      ObjectMonitor** freeTailp,
                                                      ObjectMonitor** savedMidInUsep) {
    // Only try to deflate if there is an associated Java object and if
    // mid is old (is not newly allocated and is not newly freed).
    if (mid->object() != NULL && mid->is_old() &&
        deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) {
      // Deflation succeeded so update the in-use list.
    } else {
      // mid is considered in-use if it does not have an associated
      // Java object or mid is not old or deflation did not succeed.
      // A mid->is_new() node can be seen here when it is freshly returned
      // by omAlloc() (and skips the deflation code path).
      // A mid->is_old() node can be seen here when deflation failed.
      // A mid->is_free() node can be seen here when a fresh node from
      // omAlloc() is released by omRelease() due to losing the race
      // in inflate().
      if (mid->object() != NULL && mid->is_new()) {
        // mid has an associated Java object and has now been seen
        // as newly allocated so mark it as "old".
        mid->set_allocation_state(ObjectMonitor::Old);
      }

>   - why do you set it to old here rather than in inflate once we set values?

Inflation is used in quite a few places.
If we marked the ObjectMonitor as "Old" in inflate(), then that would make the ObjectMonitor available for deflation by deflate_monitor_using_JT() earlier:

src/hotspot/share/runtime/synchronizer.cpp:

> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid,
>                                                   ObjectMonitor** freeHeadp,
>                                                   ObjectMonitor** freeTailp) {
>   assert(AsyncDeflateIdleMonitors, "sanity check");
>   assert(Thread::current()->is_Java_thread(), "precondition");
>   // A newly allocated ObjectMonitor should not be seen here so we
>   // avoid an endless inflate/deflate cycle.
>   assert(mid->is_old(), "precondition");

So the idea behind only deflating ObjectMonitors that have reached allocation state "Old" is to prevent "an endless inflate/deflate cycle". Here's the relevant section from Carsten's JEP:

> To avoid endless inflation / deflation cycles in the prototype, monitor deflation is only attempted the second time a monitor is seen by the thread marking monitors as deflatable: If the thread (the only thread marking monitors as deflatable; might be service thread or some GC related thread or even a dedicated thread) sees a monitor in state New, then the thread marks the monitor as Old and moves on. So there is little interaction between a thread inflating a lock to a monitor and the deflating thread, the inflating thread just has to make sure the monitor is marked New and this marker is published using appropriate barriers.

There isn't an explicit example in the JEP of what Carsten was thinking of with "an endless inflate/deflate cycle". I didn't try to think of such an example for the OpenJDK wiki either. I simply wrote:

> ObjectMonitor has a new allocation_state field that supports three states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied to ObjectMonitors that have reached the 'Old' state. When the Async Monitor Deflation code sees an ObjectMonitor in the 'New' state, it is changed to the 'Old' state, but is not deflated.
> This prevents a newly allocated ObjectMonitor from being immediately deflated which could cause an inflation<->deflation oscillation.

So let's think about what might happen if an ObjectMonitor is marked as "Old" in inflate(). Here's an example use of inflate() in the "slow enter" code path:

src/hotspot/share/runtime/synchronizer.cpp:

> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
base<    inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
new>     ObjectMonitorHandle omh;
new>     inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter);
new>     do_loop = !omh.om_ptr()->enter(THREAD);

In the "base" code, we took the return from inflate() and used it to call ObjectMonitor::enter(). If we never changed that bit of code and inflate() marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could async deflate the ObjectMonitor while we were trying to call enter() on it... Boom!

So we might think that holding off marking an ObjectMonitor as "Old" can save us... and it can, but not in all cases... :-(

It is entirely possible that our call to slow_enter() is made on an ObjectMonitor that's already marked "Old". In that case, our thread (T-enter) calls inflate() which returns the existing ObjectMonitor* and we use it to call enter(). If the thread (T-deflate) calling deflate_monitor_using_JT() does its magic before T-enter sets the owner field or the count field... Boom!

The previous paragraph is exactly what motivated the _ref_count field, the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr() which increments the ObjectMonitor's ref_count and then checks for async deflation protocol collisions. If there's a collision, then save_om_ptr() returns false and the caller (inflate() in this case) has to retry.
When inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot be deflated and is safe until the ObjectMonitorHandle is destroyed. So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot deflate the ObjectMonitor in the window after inflate() returns and before T-enter sets the owner field or increments the count field. But you know all that already!

So let's bring this back to having inflate() mark the ObjectMonitor as "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). T-deflate cannot deflate it due to ref_count > 0.

Here's another crazy thought... inflate() is the only function that calls omAlloc(), and omAlloc() is the only function that sets "New". If we move the setting of "Old" from deflate_monitor_list_using_JT() to inflate(), then the change from "New" -> "Old" never happens outside of the inflate() call so why do we need the allocation state?

Small dose of reality: I've found having the allocation state to be very helpful when debugging race related crashes. We could make the allocation state be DEBUG_ONLY, but then what about race debugging of product bits... sigh...

> 6. Could you get rid of the new goto's?

I believe there is only one left from Carsten's prototype:

src/hotspot/share/runtime/synchronizer.cpp:

> intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) {
>   } else if (mark->has_monitor()) {
>     ObjectMonitorHandle omh;
>     if (!omh.save_om_ptr(obj, mark)) {
>       // Lost a race with async deflation so try again.
>       assert(AsyncDeflateIdleMonitors, "sanity check");
>       goto Retry;
>     }

I can change FastHashCode() to use the same "while (do_loop)" as the other code that needs to do retries...

> 7. On the updated wiki for the hash race example:
> Racing Threads: "T-hash is about to inc the ref_count field"
> actually - T-hash just did - ref_count == 1 - so maybe change middle values

Actually, we're talking about the set up for the race and the diagram shows "ref_count == 1" and should show "ref_count == 0". So I have fixed that on the "Racing Threads" diagram. In the following "T-deflate Wins" and "T-hash Wins" diagrams, "ref_count == 1" is shown in both initial race results ObjectMonitor box. In "T-deflate Wins", it shows ref_count being restored to 0 in the second ObjectMonitor box.

Thanks for catching this error. I've fixed it on the wiki.

>
> 8. There is an old comment in FastHashCode that
>     // WARNING:
>     //   The displaced header is strictly immutable.
>     // It can NOT be changed in ANY cases.
>
> I presume that only applies to the displaced header for a stack lock - could you possibly update that while you are in the code?

Here's the whole comment:

>     // WARNING:
>     //   The displaced header is strictly immutable.
>     // It can NOT be changed in ANY cases. So we have
>     // to inflate the header into heavyweight monitor
>     // even the current thread owns the lock. The reason
>     // is the BasicLock (stack slot) will be asynchronously
>     // read by other threads during the inflate() function.
>     // Any change to stack may not propagate to other threads
>     // correctly.

That comment applies to the displaced header that's in the BasicLock on the thread's stack and it definitely needs some cleaning up independent of the Async Monitor Deflation project.

> Also in FastHashCode
>     // The only update to the header in the monitor (outside GC)
>     // is install the hash code. If someone add new usage of
>     // displaced header, please update this code
> Can you update that comment as well? I know you've already updated the code logic.

I'll revisit that comment as well. I believe Carsten updated it in his prototype, but I backed out that change when I simplified the hashcode stuff due to ObjectMonitorHandles/ref_count.
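
The ref_count handshake described earlier in this message (save_om_ptr() increments ref_count, then checks for a deflation collision, and the caller retries on failure) can be sketched with a toy model. This is an illustration under stated assumptions, not the HotSpot protocol: `ToyMonitor`, `try_pin`, `unpin` and `try_deflate` are hypothetical names, and a plain `deflating` flag stands in for the real protocol's use of _owner == DEFLATER_MARKER and a negative count.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified stand-in for ObjectMonitor.
struct ToyMonitor {
  std::atomic<int>  ref_count{0};      // models _ref_count
  std::atomic<bool> deflating{false};  // assumption: stands in for DEFLATER_MARKER
};

// Models the idea behind save_om_ptr(): announce interest first, then
// check whether a deflater got there before us. On a collision, undo
// the increment and report failure so the caller can retry.
bool try_pin(ToyMonitor& m) {
  m.ref_count.fetch_add(1);
  if (m.deflating.load()) {
    m.ref_count.fetch_sub(1);
    return false;
  }
  return true;
}

// Drops a reference taken by a successful try_pin().
void unpin(ToyMonitor& m) {
  int previous = m.ref_count.fetch_sub(1);
  assert(previous > 0);
}

// Models the deflater side: claim the monitor first, then back out if
// any thread holds (or is in the middle of taking) a reference.
bool try_deflate(ToyMonitor& m) {
  bool expected = false;
  if (!m.deflating.compare_exchange_strong(expected, true)) {
    return false;  // somebody else is already deflating
  }
  if (m.ref_count.load() > 0) {
    m.deflating.store(false);  // in use; give up for now
    return false;
  }
  return true;
}
```

With the default sequentially consistent atomics, each side writes its own variable before reading the other's, so a racing `try_pin()` and `try_deflate()` cannot both succeed. Both can fail in the same race, in which case each side simply tries again later, which matches the retry loops discussed for inflate() and FastHashCode().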
> So I walked the logic for the hashcode interactions - I didn't find
> any holes.

Thank you for walking most of it in email/wiki.

> In particular, inflate does the save_om_ptr dance to inc_ref_count, so
> this code above will
> be called while preventing async deflation.

Right.

> 9. install_displaced_markword_in_object
> What happens if the cas_set_mark fails?

Here's the code in question:

src/hotspot/share/runtime/objectMonitor.cpp:

> void ObjectMonitor::install_displaced_markword_in_object() {
>   if (dmw->is_marked()) {
>     // The dmw copy is marked which means a hash was not set by a racing
>     // thread. Clear the mark from the copy in preparation for possible
>     // restoration from this thread.
>     assert(dmw->hash() == 0, "must be 0: hash=" INTPTR_FORMAT,
> dmw->hash());
>     dmw = dmw->set_unmarked();
>   }
>   assert(dmw->is_neutral(), "must be a neutral markword");
>
>   oop const obj = (oop) object();
>   // Install displaced markword if object markword still points to this
>   // monitor. Both the mutator trying to enter() and the thread deflating
>   // the monitor will reach this point, but only one can win.
>   // Note: If a mutator won the cmpxchg() race above and installed a hash
>   // in _header, then the updated dmw contains that hash and we'll install
>   // it in the object's markword here.
>   obj->cas_set_mark(dmw, markOopDesc::encode(this));

We don't check the return from cas_set_mark() here intentionally.
If we have just T-enter and T-deflate racing through this code,
then after the "if (dmw->is_marked()) {" block, both threads will
have the same 'dmw' value. One thread will set it and the other
thread will fail to set it, but we don't care because both threads
wanted to set the same value... As a result of the cas_set_mark()
call in both threads, both threads will see the same value in the
object's header (if they happen to look).

I talk about this in the "Either Wins the Second Race" sub-section
on the wiki.
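
The reasoning above about ignoring the cas_set_mark() result can be illustrated with a small standalone sketch. This is not HotSpot code: it uses std::atomic in place of the object's mark word, and the names kMonitorEncoded, kDisplacedMark and restore_header are hypothetical stand-ins chosen for this example. The two attempts are run one after the other so the outcome is deterministic; the point is only that when both racers carry the same new value, the loser's failed CAS leaves the header exactly as the winner set it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for mark word values (not real HotSpot encodings).
constexpr std::uintptr_t kMonitorEncoded = 0xABCD0002u;  // header points at the monitor
constexpr std::uintptr_t kDisplacedMark  = 0x00000001u;  // neutral header to restore

// Try to swing the header from "points at the monitor" to the displaced
// mark. Returns whether this caller's CAS succeeded; callers that only
// want the header restored may ignore the result.
bool restore_header(std::atomic<std::uintptr_t>& header) {
  std::uintptr_t expected = kMonitorEncoded;
  return header.compare_exchange_strong(expected, kDisplacedMark);
}

// Models T-enter and T-deflate both attempting the restoration.
// Whichever goes first wins; the other fails harmlessly.
std::uintptr_t both_try_restore(bool* first_won, bool* second_won) {
  std::atomic<std::uintptr_t> header{kMonitorEncoded};
  *first_won = restore_header(header);   // succeeds
  *second_won = restore_header(header);  // fails: header no longer points at the monitor
  return header.load();
}
```

If the two callers could ever carry different replacement values, ignoring the return would no longer be safe, which is why the quoted code first normalizes dmw with set_unmarked().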
> I get that today this handles the race with enter and
> deflate_monitor_using_JT. If we remove
> the call from enter, is the expectation that we've blocked all others
> who did not set is_marked themselves?
> If we remove the call from enter would it make sense to ensure that
> the cas_set_mark succeeds here?

If we remove the install_displaced_markword_in_object() call from
enter(), then I don't think we need
install_displaced_markword_in_object() at all and can restore the
object's header with:

    // Restore the header back to obj
    obj->release_set_mark(mid->header());

just like ObjectSynchronizer::deflate_monitor(). The question is
whether we think install_displaced_markword_in_object() buys us
something other than "help" in restoring the object's header.

> 10. Is there any benefit in a bit of stress testing with something
> like a temporary flag that deflates in
> mAlloc each time it is called?

Maybe? :-) Something like DeflateAsyncMonitorsALot? Can you elaborate
on your thinking a bit?

> Looking forward to the performance runs as well as the latency numbers.

I posted the SPECjbb2015 numbers from this past weekend earlier today.
Rather disappointing on my T7600's... Neutral on my MacMini...

When you say "latency numbers", what do you mean? Do you mean how long
ObjectMonitors that could be deflated are kept inflated? Or do you mean
something else?

I think I've responded to everything. Please let me know if I missed
something...

Dan

>
> thanks,
> Karen
>
>> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty
>> > wrote:
>>
>> Filed:
>>
>>     JDK-8222034 Thread-SMR functions should be updated to remove work
>> around
>> https://bugs.openjdk.java.net/browse/JDK-8222034
>>
>> Martin and Robbin, please check it out and make sure that I captured
>> things correctly...
>>
>> Dan
>>
>>
>>
>> On 4/5/19 12:01 PM, Daniel D.
Daugherty wrote: >>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>> Hi everybody, >>>> >>>>> I think was fixed with: >>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and >>>>> other RMW atomics >>>>> You should get a leading sync and trailing one with the default >>>>> conservative >>>>> model and thus get proper memory ordering. >>>>> Martin, I'm I correct? >>>> Exactly. Thanks for pointing this out. PPC uses the strongest >>>> possible ordering semantics with memory_order_conservative (default >>>> parameter). >>>> I've seen that comment about PPC in "void >>>> ThreadsList::inc_nested_handle_cnt()". This function could get >>>> replaced. >>> >>> Okay so we need a new bug to update these two Thread-SMR functions: >>> >>> src/hotspot/share/runtime/threadSMR.cpp: >>> >>> void ThreadsList::dec_nested_handle_cnt() { >>> ? // The decrement needs to be MO_ACQ_REL. At the moment, the >>> Atomic::dec >>> ? // backend on PPC does not yet conform to these requirements. >>> Therefore >>> ? // the decrement is simulated with an Atomic::sub(1, &addr). >>> ? // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR >>> mechanism >>> ? // is not generally safe to use. >>> ? Atomic::sub(1, &_nested_handle_cnt); >>> } >>> >>> void ThreadsList::inc_nested_handle_cnt() { >>> ? // The increment needs to be MO_SEQ_CST. At the moment, the >>> Atomic::inc >>> ? // backend on PPC does not yet conform to these requirements. >>> Therefore >>> ? // the increment is simulated with a load phi; cas phi + 1; loop. >>> ? // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR >>> mechanism >>> ? // is not generally safe to use. >>> ? intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> ? for (;;) { >>> ??? if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == >>> sample) { >>> ????? return; >>> ??? } else { >>> ????? sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> ??? } >>> ? 
} >>> } >>> >>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>> sure we're all in agreement. Once we decide that Thread-SMR's >>> functions look like, I'll adapt my Async Monitor Deflation >>> functions... >>> >>> Dan >>> >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn > >>>> Sent: Freitag, 5. April 2019 14:07 >>>> To: daniel.daugherty at oracle.com >>>> ; >>>> hotspot-runtime-dev at openjdk.java.net >>>> ; Carsten Varming >>>> >; Roman Kennke >>>> >; Doerr, Martin >>>> > >>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>> >>>> Hi Dan, >>>> >>>> (Martin there is question for you last in this email) >>>> >>>> After first pass I did not find any real issues. >>>> Considering what you had to work with, it looks good! >>>> >>>> #1 >>>> There are some assert which are redundant (to me at least) like: >>>> src/hotspot/share/runtime/objectMonitor.cpp >>>> L445 >>>> ??? if (!dmw->is_marked() && dmw->hash() == 0) { >>>> ????? // This dmw is neutral and has not yet started the restoration >>>> ????? // protocol so we mark a copy of the dmw to begin the protocol. >>>> ????? markOop marked_dmw = dmw->set_marked(); >>>> ????? assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>> ???????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>> ???????????? marked_dmw->is_marked(), marked_dmw->hash()); >>>> >>>> That assert is basically a test that set_marked worked? >>>> >>>> L505 >>>> ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>>> DEFLATER_MARKER) { >>>> ??????? assert(_succ != Self, "invariant"); >>>> ??????? assert(_owner == Self, "invariant"); >>>> >>>> Assert on _owner checks that our cmpxchg is not broken? >>>> >>>> I think it's easier to read the code if some on the most obvious >>>> asserts are >>>> removed. Maybe comments instead. 
>>>> >>>> #2 >>>> Not your doing but I think we should remove TRAPS/Thread * Self and use >>>> JavaThread* instead. >>>> E.g. so we can change: >>>> void ObjectMonitor::EnterI(TRAPS) { >>>> ??? Thread * const Self = THREAD; >>>> ??? assert(Self->is_Java_thread(), "invariant"); >>>> ??? assert(((JavaThread *) Self)->thread_state() == >>>> _thread_blocked, "invariant"); >>>> >>>> to: >>>> >>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>> ??? assert(Self->thread_state() == _thread_blocked, "invariant"); >>>> >>>> #3 >>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>> ?? 164 inline void ObjectMonitor::inc_ref_count() { >>>> ?? 165?? // The increment needs to be MO_SEQ_CST. At the moment, >>>> the Atomic::inc >>>> ?? 166?? // backend on PPC does not yet conform to these >>>> requirements. Therefore >>>> ?? 167?? // the increment is simulated with a load phi; cas phi + >>>> 1; loop. >>>> ?? 168?? // Without this MO_SEQ_CST Atomic::inc simulation, >>>> AsyncDeflateIdleMonitors >>>> ?? 169?? // is not safe. >>>> >>>> I think was fixed with: >>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other >>>> RMW atomics >>>> You should get a leading sync and trailing one with the default >>>> conservative >>>> model and thus get proper memory ordering. >>>> Martin, I'm I correct? >>>> >>>> Thanks, Robbin >>>> >>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >>>>> Greetings, >>>>> >>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>>>> >>>>> ? ??? 
JDK-8153224 Monitor deflation prolong safepoints >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>> >>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>> Here's the webrev URL: >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>> >>>>> Here's a link to Carsten's original webrev: >>>>> >>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>> >>>>> Earlier versions of this patch have been through several rounds of >>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>> Roman for their preliminary code review comments. A very special >>>>> thanks to Robbin and Roman for building and testing the patch in >>>>> their own environments (including specJBB2015). >>>>> >>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>>> and slowdebug). Earlier versions have run my monitor inflation stress >>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>> fastdebug and slowdebug). >>>>> >>>>> All of the testing done on earlier versions will be redone on the >>>>> latest version of the patch. >>>>> >>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>> >>>>> Dan >>>>> >>>>> P.S. >>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>>> to characterize/analyze this failure for more than a week now. At >>>>> this point I'm convinced that Async Monitor Deflation is aggravating >>>>> an existing bug. 
>>>>> However, I plan to have a better handle on that
>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>
>>>
>>
>

From david.holmes at oracle.com Tue Apr 9 02:45:02 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 9 Apr 2019 12:45:02 +1000
Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows
In-Reply-To: 
References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com>
Message-ID: <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com>

Hi Robin,

On 8/04/2019 10:47 pm, Robin Westberg wrote:
> Hi again,
>
> Here's an updated version where I've moved the naked_short_nanosleep function into the Posix class, to avoid future cross-platform use. (It's still used in the SpinYield and TimedYield implementations though).

But you also changed the existing Windows os::naked_short_sleep to use
the WaitableTimer which is a significant change to make. Is this just
because it will likely have better resolution than the native Sleep
function? Code using os::naked_short_sleep(1) might be unexpectedly
impacted by this if what was a 10ms (or worse) sleep becomes closer to
1ms. PerfDataManager::destroy in particular could be impacted as it's
racy to begin with. That's not to say this isn't a good thing to fix,
just be aware it may have unexpected consequences.

> Full webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.01/
> Incremental: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00-01/

src/hotspot/share/utilities/spinYield.cpp

I'm somewhat dubious about getting rid of non-Posix
naked_short_nanosleep and instead adding a win32 ifdef to this code. It
kind of defeats the purpose of the os abstraction layer - albeit that
Windows can't really do a nanosleep.

Why did you get rid of the sleep_ns parameter and hardwire it to 1000? A
configurable sleep time would be a feature of this utility.
---

src/hotspot/share/utilities/timedYield.hpp

  36 // scheduled on the same cpu as the waiter, we will first try local cpu
  37 // yielding until we reach OS sleep primitive granularity in the waiting

Whether or not the yielding is "cpu local" will depend on the OS and the
scheduler in use. I would just refer to OS native yield.

  40   jlong _yield_time_ns;
  41   jlong _sleep_time_ns;
  42   jlong _max_yield_time_ns;
  43   jlong _yield_granularity_ns;

We are avoiding using j-types in the VM unless they actually pertain to
Java variables. These should be int64_t (to be compatible with the
javaTimeNanos call).

  52   // Perform next round of delay.
  53   void wait();

I think delay() would be a better name than wait().

---

src/hotspot/share/utilities/timedYield.cpp

  42   // Phase 1 - local cpu yielding
  43   if (_yield_time_ns < _max_yield_time_ns) {
  44 #ifdef WIN32
  45     if (SwitchToThread() == 0) {
  46       // Nothing else is ready to run on this cpu, spin a little
  47       while (os::javaTimeNanos() - start < _yield_granularity_ns) {
  48         SpinPause();
  49       }
  50     }
  51 #else
  52     os::Posix::naked_short_nanosleep(_yield_granularity_ns);
  53 #endif
  54     _yield_time_ns += os::javaTimeNanos() - start;
  55     return;
  56   }

I have a few issues with this code. It's breaking the os abstraction
layer again by using an OS ifdef and not using os APIs that exist for
this very purpose - i.e. os::naked_yield(). And for non-Windows it seems
quite bizarre that the "yielding" part of TimedYield is actually
implemented with a sleep and not os::naked_yield!

If the existing os APIs need adjustment or expansion to provide the
functionality desired by this code then I would much prefer to see the
os APIs updated to address that.

That said, given the original problem is that os::naked_short_nanosleep
on Windows is too coarse with the use of WaitableTimer why not just
replace that with a simple loop of the form:

  while (elapsed < sleep_ns) {
    if (SwitchToThread() == 0) {
      SpinPause();
    }
    elapsed = ...
  }

?

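
A portable approximation of the loop suggested above can be written in standard C++. This is only a sketch under stated assumptions: yield_for_ns is a hypothetical helper name, std::this_thread::yield() stands in for SwitchToThread(), and std::chrono::steady_clock stands in for os::javaTimeNanos(). Unlike SwitchToThread(), the standard yield does not report whether another thread was actually scheduled, so the SpinPause() branch has no equivalent here and the loop simply yields until the deadline passes.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: wait at least sleep_ns nanoseconds by yielding
// repeatedly instead of arming a timer. Returns the elapsed time in
// nanoseconds as measured by a monotonic clock.
std::int64_t yield_for_ns(std::int64_t sleep_ns) {
  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  std::int64_t elapsed = 0;
  while (elapsed < sleep_ns) {
    // Give other runnable threads a chance; returns immediately if
    // nothing else is ready to run.
    std::this_thread::yield();
    elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  clock::now() - start).count();
  }
  return elapsed;
}
```

The trade-off is the one raised earlier in the thread: a loop like this can hit sub-millisecond waits, but it burns CPU when nothing else is runnable, whereas a timer-based sleep is cheap but coarse.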
Thanks,
David
-----

> Best regards,
> Robin
>
>> On 5 Apr 2019, at 13:54, Robin Westberg wrote:
>>
>> Hi David,
>>
>>> On 5 Apr 2019, at 12:10, David Holmes wrote:
>>>
>>> On 5/04/2019 7:53 pm, Robin Westberg wrote:
>>>> Hi David,
>>>> Thanks for taking a look!
>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote:
>>>>>
>>>>> Hi Robin,
>>>>>
>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote:
>>>>>> Hi all,
>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used.
>>>>>
>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken.
>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style).
>>>> That said, perhaps os::naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case.
>>>
>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one).
>>
>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous.
>>
>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes.
>>
>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :)
>>
>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms.
>>
>> Best regards,
>> Robin
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> Best regards,
>>>> Robin
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Various benchmarks (specjbb2015, specjvm2008) show better numbers for Windows and no regressions for Linux.
>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>> Testing: tier1 >>>>>> Best regards, >>>>>> Robin > From shade at redhat.com Tue Apr 9 08:11:51 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 9 Apr 2019 10:11:51 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com> References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com> <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com> Message-ID: <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> On 4/8/19 11:31 PM, John Rose wrote: > I agree that this is a good change, and you can use me as a reviewer. Which opens up the process question: are you acting as Project Lead here to resolve the disagreement between Reviewers (the only accepting Reviewer being yourself)? (There are ways for me to yield: for example, accept this patch provisionally, if Claes and/or accepting reviewers agree that any follow-up issue with it triggers the immediate backout, and future attempts to introduce it are rejected given the observed non-trivial cost. This seems like a no-brainer for those who argue there is little risk in doing this.) > I disagree with Aleksey; it's a new technique but not complex > to document or understand. The two state components are > independent in their action; there is no race between their > state changes. Yes, I am a firm believer that concurrency hacks need justifications, regardless of how simple they appear at the moment. Call me paranoid, but I know enough about (Java) concurrency to be paranoid. For tiny benefits, you need to demonstrate the non-existent cost to tip the cost/benefit balance over the acceptance threshold. But even the benefits are questionable: > Meanwhile, there are two reasons I want the change: > > 1. 
Less risk of spurious updates to COW memory segments in > shared archives. There is no risk for current code either. How come adding the new writable field provides "less risk of updates", anyway? With one field we can track the object state in an obvious manner. Spreading that state over two fields makes it less risky how exactly? > 2. No risk of hashcode recomputation for the 2^-32 case. > This might seem laughable, until you remember that it's exactly > those cases that DOS attackers like to create. Alt-hashing covers this obscure case in the course of mitigating much easier and much broader attack on String hashcode. We don't get to wave in every single hack into class libraries under "security" justification, especially when the mitigation already exists. -Aleksey From adinn at redhat.com Tue Apr 9 08:20:31 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 9 Apr 2019 09:20:31 +0100 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com> Message-ID: <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> On 08/04/2019 22:19, Claes Redestad wrote: > First, I disagree strongly that this patch adds significant complexity > (especially after recent simplifications[1]) or that it risks increasing > maintenance headache down the line. It all depends what you mean by significant. It definitely adds complexity. If there are other benefits than the rather questionable one of avoiding infrequent zero hashes they may well justify that complexity. > Secondly, I think the gain w.r.t. defense-in-depth is very real. 
Sure, > we have alt-hashing and similar counter-measures in various JDK > collection libraries, but that is unlikely to help every API or library > ever invented out there. This is a better argument. > Third... well, the other performance gains are of course nice-to- > have - improvements to "".hashCode() and allowing String deduplication > to not have to filter out such Strings - but I agree that they are > likely mostly theoretical for anything real-world. Ok, so the argument is DID then? I'll buy that. > If this change then exposes a bug in some unexpected place elsewhere > (I can only guess what dangers lurks out there... unforeseen interaction > with a weaker memory model? some wonky JIT reordering?) then that might > even be for the better in the end. If/when that happens, we can opt to > back this out (or not) while addressing whatever issue we've unearthed. I don't believe Aleksey is suggesting that some hidden memory ordering or JIT transformation bug may come to bite us. As I understand it the problem he is concerned with is subsequent injection of such a bug i.e. some developer 1) not recognising that the code as it stands only works in the presence of these re-ordering possibilities by careful design and 2) mistakenly changing the code so that those possibilities are no longer bypassed. I agree that is a real concern. If the patch is to go in -- and I concede there is an acceptable argument for that -- then it needs commenting to help avoid this happening. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Tue Apr 9 08:36:17 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 9 Apr 2019 09:36:17 +0100 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com> <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com> <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> Message-ID: <6304c50d-bb9e-6a80-ba1d-462bc04c7f21@redhat.com> On 09/04/2019 09:11, Aleksey Shipilev wrote: > On 4/8/19 11:31 PM, John Rose wrote: >> I agree that this is a good change, and you can use me as a reviewer. > > Which opens up the process question: are you acting as Project Lead here to resolve the disagreement > between Reviewers (the only accepting Reviewer being yourself)? > > (There are ways for me to yield: for example, accept this patch provisionally, if Claes and/or > accepting reviewers agree that any follow-up issue with it triggers the immediate backout, and > future attempts to introduce it are rejected given the observed non-trivial cost. This seems like a > no-brainer for those who argue there is little risk in doing this.) Hmm, well, ... I just posted a reply to Claes accepting this patch on DID grounds, /assuming/ the code is suitably commented. However, I think your extra provision here is thoroughly reasonable. I'd even be happy to extend it with a promise that you can claim 'I told you so' on list if this ever happens (in CAPS, if you must :-). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From claes.redestad at oracle.com Tue Apr 9 08:42:16 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 9 Apr 2019 10:42:16 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com> <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> Message-ID: Hi Andrew, On 2019-04-09 10:20, Andrew Dinn wrote: > If the patch is to go in -- and I concede there is an acceptable argument for that -- then it > needs commenting to help avoid this happening. open.02 already adds what I believed to be sufficient commenting, have you taken this into consideration? http://cr.openjdk.java.net/~redestad/8221836/open.02/src/java.base/share/classes/java/lang/String.java.udiff.html /Claes From peter.levart at gmail.com Tue Apr 9 08:53:32 2019 From: peter.levart at gmail.com (Peter Levart) Date: Tue, 9 Apr 2019 10:53:32 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com> <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com> <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> Message-ID: <803a506d-c427-c8c0-5c56-730de7edd919@gmail.com> Hi Aleksey, On 4/9/19 10:11 AM, Aleksey Shipilev wrote: >> 2. No risk of hashcode recomputation for the 2^-32 case. >> This might seem laughable, until you remember that it's exactly >> those cases that DOS attackers like to create. 
> Alt-hashing covers this obscure case in the course of mitigating much easier and much broader attack > on String hashcode. We don't get to wave in every single hack into class libraries under "security" > justification, especially when the mitigation already exists. > > -Aleksey > Which alt-hashing are you talking about? The one which was removed from Java code of String in transition from JDK 7 -> JDK 8 ? AFAIK, there's no alt-caching for pure java code for Strings any more (there's something for internal JVM use). It was dropped when (Concurrent)HashMap got tree-ification. Regards, Peter From peter.levart at gmail.com Tue Apr 9 09:04:36 2019 From: peter.levart at gmail.com (Peter Levart) Date: Tue, 9 Apr 2019 11:04:36 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com> <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> Message-ID: <543624a9-70df-3588-68f0-9246789c74a9@gmail.com> On 4/9/19 10:42 AM, Claes Redestad wrote: > Hi Andrew, > > On 2019-04-09 10:20, Andrew Dinn wrote: >> If the patch is to go in -- and I concede there is an acceptable >> argument for that -- then it >> needs commenting to help avoid this happening. > > open.02 already adds what I believed to be sufficient commenting, have > you taken this into consideration? > > http://cr.openjdk.java.net/~redestad/8221836/open.02/src/java.base/share/classes/java/lang/String.java.udiff.html > > > /Claes Perhaps you could add the same comment to the C++ java_lang_String::hash_code(oop java_string) method too? 
Regards, Peter From adinn at redhat.com Tue Apr 9 09:05:39 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 9 Apr 2019 10:05:39 +0100 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com> <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> Message-ID: <183620f8-831f-8754-d3db-b7c3c1328b21@redhat.com> On 09/04/2019 09:42, Claes Redestad wrote: > Hi Andrew, > > On 2019-04-09 10:20, Andrew Dinn wrote: >> If the patch is to go in -- and I concede there is an acceptable >> argument for that -- then it >> needs commenting to help avoid this happening. > > open.02 already adds what I believed to be sufficient commenting, have > you taken this into consideration? public int hashCode() { + // The hash or hashIsZero fields are subject to a benign data race, + // making it crucial to ensure that any observable result of the + // calculation in this method stays correct under any possible read of + // these fields. One necessary restriction to allow this to be correct + // without explicit memory fences or similar concurrency primitives is + // that we can ever only write to one of these two fields for a given + // String instance. int h = hash; It would probably also be good if you extended the comment to document the status quo i.e. as Peter noted that the assigned values are computed deterministically from immutable state. Perhaps this: public int hashCode() { + // The hash or hashIsZero fields are subject to a benign data race, + // making it crucial to ensure that any observable result of the + // calculation in this method stays correct under any possible read of + // these fields. 
One necessary restriction to allow this to be correct + // without explicit memory fences or similar concurrency primitives is + // that we can ever only write to one of these two fields for a given + // String instance. Clearly the assigned values must also be computed + // deterministically from immutable state. int h = hash; regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From shade at redhat.com Tue Apr 9 09:14:13 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 9 Apr 2019 11:14:13 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: <803a506d-c427-c8c0-5c56-730de7edd919@gmail.com> References: <3453bd0a-37e6-957e-f1ea-f86fe7a9ff17@gmail.com> <06bf9a51-7ed2-9b01-994c-f08832e3cf46@oracle.com> <0A870E14-A24E-4C67-9D93-BC6A9FE4EB50@oracle.com> <5392bd57-d2b1-0cc5-fbd0-7c1993c31729@redhat.com> <803a506d-c427-c8c0-5c56-730de7edd919@gmail.com> Message-ID: <411e41fe-cea3-9ef2-78de-99d4f46cdfa4@redhat.com> On 4/9/19 10:53 AM, Peter Levart wrote: > On 4/9/19 10:11 AM, Aleksey Shipilev wrote: >>> 2. No risk of hashcode recomputation for the 2^-32 case. >>> This might seem laughable, until you remember that it's exactly >>> those cases that DOS attackers like to create. >> Alt-hashing covers this obscure case in the course of mitigating much easier and much broader attack >> on String hashcode. We don't get to wave in every single hack into class libraries under "security" >> justification, especially when the mitigation already exists. > > Which alt-hashing are you talking about? The one which was removed from Java code of String in > transition from JDK 7 -> JDK 8 ? > > AFAIK, there's no alt-caching for pure java code for Strings any more (there's something for > internal JVM use). It was dropped when (Concurrent)HashMap got tree-ification. 
Oh snap, I misremembered things. Okay, so treeification mitigates the attack on colliding hashcodes by
going the String.compareTo route, thus resolving the O(n^2) algorithmic attack. I still don't see how a
zero hashcode presents a similar attack surface: computing the hashcode for the incoming string
would take the same time, regardless of what hashcode value it produces.

Although I can see the defense-in-depth argument more clearly now, I am still on the fence about
whether it warrants a fix.

-Aleksey

From claes.redestad at oracle.com Tue Apr 9 12:21:05 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 9 Apr 2019 14:21:05 +0200
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <183620f8-831f-8754-d3db-b7c3c1328b21@redhat.com>
References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com>
 <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com>
 <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com>
 <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com>
 <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com>
 <9d272e1d-7c49-017c-7313-57636b103003@redhat.com>
 <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com>
 <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com>
 <183620f8-831f-8754-d3db-b7c3c1328b21@redhat.com>
Message-ID: <9b67c12d-8109-e3f7-1d82-b6aecd2ffb39@oracle.com>

On 2019-04-09 11:05, Andrew Dinn wrote:
> It would probably also be good if you extended the comment to document
> the status quo i.e. as Peter noted that the assigned values are computed
> deterministically from immutable state.

How about this:

http://cr.openjdk.java.net/~redestad/8221836/open.03/

Thanks!
/Claes

From adinn at redhat.com Tue Apr 9 13:03:15 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 9 Apr 2019 14:03:15 +0100
Subject: RFR: 8221836: Avoid recalculating String.hash when zero
In-Reply-To: <9b67c12d-8109-e3f7-1d82-b6aecd2ffb39@oracle.com>
References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com>
 <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com>
 <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com>
 <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com>
 <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com>
 <9d272e1d-7c49-017c-7313-57636b103003@redhat.com>
 <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com>
 <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com>
 <183620f8-831f-8754-d3db-b7c3c1328b21@redhat.com>
 <9b67c12d-8109-e3f7-1d82-b6aecd2ffb39@oracle.com>
Message-ID:

On 09/04/2019 13:21, Claes Redestad wrote:
> On 2019-04-09 11:05, Andrew Dinn wrote:
>> It would probably also be good if you extended the comment to document
>> the status quo i.e. as Peter noted that the assigned values are computed
>> deterministically from immutable state.
>
> How about this:
> http://cr.openjdk.java.net/~redestad/8221836/open.03/

Yes, that looks fine.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From coleen.phillimore at oracle.com Tue Apr 9 14:43:33 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 Apr 2019 10:43:33 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> Message-ID: <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> On 4/5/19 11:30 AM, gerard ziemski wrote: > hi Coleen, > > Thank you for the review. My comments are inline below: > > On 4/5/19 7:35 AM, coleen.phillimore at oracle.com wrote: >> >> Hi Gerard,?? This is somewhat of a first pass review. >> >> I like the change a? lot.? I have a couple of suggestions. >> >> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/utilities/statistics.hpp.html >> >> >> Can you rename this file tableStatistics.cpp/hpp because "statistics" >> is too general and the class is called TableStatistics. > I deliberately named the file "statistics.hpp", because I assume we > will be adding more JFR events in the future, and this file could hold > all the related code, which for now just comprises of table statistics > as you pointed out. Hi I don't agree with that.? I think if you want more different JFR statistics you could add files where they belong as differentStatistics.hpp > > >> >> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html >> >> >> Is there anyway to parameterize these functions and/or add them to >> TableStatistics? > I didn't want to add JFR dependency to TableStatistics. 
I'm unsure > what I can do more here, and whether it deserves the effort - > TableStatistics basically serves as a struct for passing event > attributes around, but I'm open to suggestions. > I didn't think this should be moved from jfrPeriodic.cpp.? I thought it could be something like an X macro. Or just make this bit a function that they all call with event as parameter. + event.set_numberOfBuckets(statistics._number_of_buckets); + event.set_numberOfEntries(statistics._number_of_entries); + event.set_totalFootprint(statistics._total_footprint); + event.set_maximumBucketCount(statistics._maximum_bucket_size); + event.set_averageBucketCount(statistics._average_bucket_size); + event.set_varianceOfBucketCount(statistics._variance_of_bucket_size); + event.set_stdDevOfBucketCount(statistics._stddev_of_bucket_size); + event.set_insertionRate(statistics._add_rate); + event.set_removalRate(statistics._remove_rate); + event.commit(); > >> >> Also, when Stefan is done with the ResolvedMethodTable, you can add >> that too in a separate RFE >> https://bugs.openjdk.java.net/browse/JDK-8221393 > Thank you, I linked them. Thanks! Coleen > > >> >> Thanks, >> Coleen >> >> >> On 4/4/19 3:52 PM, gerard ziemski wrote: >>> Thank you Erik for clarifications. >>> >>> I have implemented all your suggestions, which you can find here >>> http://cr.openjdk.java.net/~gziemski/8185525_rev2 >>> >>> I started Mach5 tier1-6 test to test the changes ... >>> >>> >>> cheers >>> >>> On 4/4/19 1:16 PM, Erik Gahlin wrote: >>>> On 2019-04-04 17:39, gerard ziemski wrote: >>>>> hi Erik, >>>>> >>>>> >>>>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>>>> Hi Gerard, >>>>>> >>>>>> Here are some comments about the metadata (to make it consistent >>>>>> with other events). >>>>>> >>>>>> The events should not be in the "Java Application" category since >>>>>> they are JVM events. You could perhaps put them in "Java Virtual >>>>>> Machine, Runtime, Tables". 
Some comments about the names and >>>>>> labels of fields. >>>>>> >>>>>> - Label: Number of buckets => Bucket Count >>>>>> - Label: Number of entries => Entry Count >>>>>> - Label: Total footprint => Total Footprint >>>>>> >>>>>> Could you remove descriptions that are exactly the same as the >>>>>> label. >>>>>> >>>>>> - Label: Maximum bucket size => Maximum Bucket Size >>>>>> - Label: Average bucket size => Average Bucket Size >>>>>> - Label: Variance of bucket? size => Bucket Size Variance >>>>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>>>> - Label: Standard deviation of bucket size => Bucket Size >>>>>> Standard Deviation" >>>>>> >>>>>> Instead of using the word "size", it may make more sense to use >>>>>> the word "count" here as well, i.e "Average Bucket Count", or >>>>>> maybe I'm missing something? Is there a difference? >>>>>> >>>>>> I wonder how useful standard deviation and variance is? If >>>>>> support engineers are looking at a recording, or JMC adds a rule >>>>>> for the events, what would a good or bad value be? Is it possible >>>>>> to use the information for troubleshooting? >>>>> >>>>> While I'm working on all the above changes you suggested, we can >>>>> discuss the standard devation and variance. >>>>> >>>>> I added them because they are part of the jcmd "VM.symboltable >>>>> -verbose" command, so we are consistent. >>>> OK >>>>> >>>>> Now, regarding how useful they are, I always understood them as a >>>>> sign of imbalanced table distribution, and without a proper >>>>> histogram, this is the best description of the histogram shape. In >>>>> reality, however, I think that if they identify an issue, then we >>>>> might have a very curious distribution (some sort of hash table >>>>> attack), or we have an issue with our hash function for the >>>>> particular usage case. >>>>> >>>>> Still, I'd personally elect to keep them. 
>>>>> >>>>> Let me ask you a different question though, Is it expensive to >>>>> have 2 doubles as part of an event (5 events per second)? >>>> Doubles can't be compressed so each value will take 8 bytes. I >>>> don't think the precision of a double is needed, so you could >>>> change it into a float and save a few bytes. >>>> >>>> Most user will not care about JVM internals and a lower rate than >>>> once per second is probably sufficient for support engineers to >>>> spot that something is wrong. >>>> >>>> The Thread Context Switch Rate event is emitted once every ten >>>> seconds. I think the same rate could be used here. >>>> >>>>> And if so, is there currently (or planned) granularity for >>>>> controlling not just which events to record, but also which >>>>> attributes? >>>>> >>>> No. >>>> >>>> If overhead becomes an issues, it's usually better to emit all the >>>> information, but at a lower rate.? That way, users can find out >>>> that the information exists, and increase the rate if a higher >>>> resolution is needed to solve their specific issue. >>>> >>>>>> >>>>>> - Name: addRate => insertionRate >>>>>> - Label: Rate of addition =>? Insertation Rate >>>>>> - Name: removeRate => removalRate >>>>>> - Label: Rate of removal => Removal Rate >>>>> >>>>> Will do. >>>>> >>>>>> >>>>>> I'm missing unit tests for the events. Could you please add in >>>>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e >>>>>> the average not exceeding max, no negative values etc. >>>>> >>>>> Working on it, do we need separate test per each event (table), or >>>>> just one table will suffice (ex. StringTable)? >>>> They are kind of similar, so I think one test file is sufficient, >>>> but we should sanity check data for all events. >>>> >>>> Thanks >>>> Erik >>>> >>>>> >>>>> Thank you for the feedback! >>>>> >>>>> >>>>> cheers >>>>>> >>>>>> Thanks! 
>>>>>> Erik
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this feature, which adds tracing events for the
>>>>>>> internal hash tables.
>>>>>>>
>>>>>>> The following attributes are implemented:
>>>>>>>
>>>>>>> [...]
>>>>>>> [...]
>>>>>>> [...] label="Total footprint" description="Total memory footprint (the
>>>>>>> table itself plus all of the entries)" />
>>>>>>> [...]
>>>>>>> [...]
>>>>>>> [...] label="Variance of bucket sizes" description="How far bucket
>>>>>>> lengths are spread out from their average value" />
>>>>>>> [...]
>>>>>>> [...] description="How many items were added since last event (per
>>>>>>> second)" />
>>>>>>> [...] description="How many items were removed since last event (per
>>>>>>> second)" />
>>>>>>>
>>>>>>> This event was implemented for the following system tables:
>>>>>>>
>>>>>>> SymbolTable
>>>>>>> StringTable
>>>>>>> Placeholder Table
>>>>>>> LoaderConstraints Table
>>>>>>> ProtectionDomainCache Table
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525
>>>>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in
>>>>>>> progress?)
>>>>>>>
>>>>>>>
>>>>>>> Cheers
>>
>>
>

From gerard.ziemski at oracle.com Tue Apr 9 15:44:42 2019
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Tue, 9 Apr 2019 10:44:42 -0500
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To: <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com>
References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>
 <5CA4F10F.2010805@oracle.com>
 <5CA649FE.9070904@oracle.com>
 <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com>
 <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com>
Message-ID:

Thank you Coleen for more feedback!
On 4/9/19 9:43 AM, coleen.phillimore at oracle.com wrote:
>>> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/utilities/statistics.hpp.html
>>>
>>>
>>> Can you rename this file tableStatistics.cpp/hpp because
>>> "statistics" is too general and the class is called TableStatistics.
>> I deliberately named the file "statistics.hpp", because I assume we
>> will be adding more JFR events in the future, and this file could
>> hold all the related code, which for now just comprises of table
>> statistics as you pointed out.
>
> Hi I don't agree with that. I think if you want more different JFR
> statistics you could add files where they belong as
> differentStatistics.hpp

Done.

>>> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html
>>>
>>>
>>> Is there anyway to parameterize these functions and/or add them to
>>> TableStatistics?
>> I didn't want to add JFR dependency to TableStatistics. I'm unsure
>> what I can do more here, and whether it deserves the effort -
>> TableStatistics basically serves as a struct for passing event
>> attributes around, but I'm open to suggestions.
>>
>
> I didn't think this should be moved from jfrPeriodic.cpp. I thought
> it could be something like an X macro.
>
> Or just make this bit a function that they all call with event as
> parameter.
>
> + event.set_numberOfBuckets(statistics._number_of_buckets);
> + event.set_numberOfEntries(statistics._number_of_entries);
> + event.set_totalFootprint(statistics._total_footprint);
> + event.set_maximumBucketCount(statistics._maximum_bucket_size);
> + event.set_averageBucketCount(statistics._average_bucket_size);
> + event.set_varianceOfBucketCount(statistics._variance_of_bucket_size);
> + event.set_stdDevOfBucketCount(statistics._stddev_of_bucket_size);
> + event.set_insertionRate(statistics._add_rate);
> + event.set_removalRate(statistics._remove_rate);
> + event.commit();

Each of those JFR events are an instance of a different class, so the
best I can do is a macro here (otherwise I'd have to create a base
class for the TableStatistics events from which to extend our 6 table
events, but I'm not sure JFR architecture supports that - it generates
class automatically from the event's meta description)

Updated webrev http://cr.openjdk.java.net/~gziemski/8185525_rev4/


cheers

From daniel.daugherty at oracle.com Tue Apr 9 15:53:36 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 9 Apr 2019 11:53:36 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com>
 <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com>
 <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Hi Carsten,

Thanks for responding to Karen's code review comments.

Karen, I have a query for you down at the end of my reply...

More below...

On 4/5/19 11:01 PM, Carsten Varming wrote:
> Dear Karen,
>
> Please see inline answers.
>
> On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear
> wrote:
>
>
> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees
> that 0 < _count
> with comments that caller ensured _count <= 0
> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
> Am I missing something subtle here or should they be the same
> guarantees?
>
>
> In ::enter _count is incremented when the thread is trying to acquire
> the monitor and decremented after the monitor has been acquired. The 0
> < _count assertion is between those two points in the code. A thread
> acquiring a monitor and then calling wait will increment _count and
> then decrement _count as part of acquiring the monitor, thus _count
> can be 0 by the time the thread calls wait and when ReenterI is called.

I had a similar answer and I'm planning to tweak the comments and the
guarantees a bit in the next round of code review (CR1); please see my
reply to Karen's CR for the proposed changes.

>
> 9. install_displaced_markword_in_object
> What happens if the cas_set_mark fails?
> I get that today this handles the race with enter and
> deflate_monitor_using_JT. If we remove
> the call from enter, is the expectation that we've blocked all
> others who did not set is_marked themselves?
> If we remove the call from enter would it make sense to ensure
> that the cas_set_mark succeeds here?
>
>
> I designed my original patch such that no thread would ever wait for
> the deflating thread to finish deflating a monitor. If you remove
> install_displaced_markword_in_object from enter, then the entering
> thread can end up busy waiting by continuously reading the monitor
> pointer from the object mark word and then realizing that the monitor
> is being deflated and it should retry by going back to reading the
> object mark word. This bad behavior is completely avoided by calling
> install_displaced_markword_in_object.

Here's the code in question:

src/hotspot/share/runtime/objectMonitor.cpp:

> bool ObjectMonitor::enter(TRAPS) {
>   // Prevent deflation. See ObjectSynchronizer::deflate_monitor() and is_busy().
>   // Ensure the object-monitor relationship remains stable while there's contention.
>   const jint count = Atomic::add(1, &_count);
>   if (count <= 0 && _owner == DEFLATER_MARKER) {
>     // Async deflation in progress. Help deflater thread install
>     // the mark word (in case deflater thread is slow).
>     install_displaced_markword_in_object();
>     Self->_Stalled = 0;
>     return false;  // Caller should retry. Never mind about _count as this monitor has been deflated.
>   }

Our thread (T-enter) observes that the ObjectMonitor is being deflated
by T-deflate, calls install_displaced_markword_in_object() and returns
false to the caller which causes a retry. Restoring the header/dmw from
the ObjectMonitor to the object's header here isn't needed for correctness
so it could be dropped (and would simplify the code).

Your counterpoint is if we drop the call, then T-enter could do retry
after retry if T-deflate is slow to get to its
install_displaced_markword_in_object() call. If T-enter calls
install_displaced_markword_in_object(), then T-enter will do a single
retry because the object T-enter is trying to lock will no longer have
an ObjectMonitor.

Okay I finally grok it... I think we need to clarify the comment a bit:

>   if (count <= 0 && _owner == DEFLATER_MARKER) {
>     // Async deflation is in progress. Attempt to restore the
>     // header/dmw to the object's header so that we only retry once
>     // if the deflater thread happens to be slow.
>     install_displaced_markword_in_object();

> In my original patch no thread would ever wait for a deflating thread
> to finish. This property got lost in FastHashCode as that function
> evolved since I wrote my patch, but I think this property is worth
> preserving where possible. It might even be worth looking at
> FastHashCode to see if we can re-establish this property.

Async Monitor Deflation causes races with FastHashCode() when the
target object has an existing ObjectMonitor.
Here's the base code:

> 768   } else if (mark->has_monitor()) {
> 769     monitor = mark->monitor();
> 770     temp = monitor->header();
> 771     assert(temp->is_neutral(), "invariant");
> 772     hash = temp->hash();
> 773     if (hash) {
> 774       return hash;
> 775     }
> 776     // Skip to the following code to reduce code size

The 'monitor' fetched on L769 is unstable due to Async Monitor Deflation
and can cause an incorrect hash value to be returned. The solution is to
protect the ObjectMonitor*:

> 775   } else if (mark->has_monitor()) {
> 776     ObjectMonitorHandle omh;
> 777     if (!omh.save_om_ptr(obj, mark)) {
> 778       // Lost a race with async deflation so try again.
> 779       assert(AsyncDeflateIdleMonitors, "sanity check");
> 780       goto Retry;
> 781     }
> 782     monitor = omh.om_ptr();
> 783     temp = monitor->header();
> 784     assert(temp->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)temp));
> 785     hash = temp->hash();
> 786     if (hash != 0) {
> 787       return hash;
> 788     }
> 789     // Skip to the following code to reduce code size

where L776-L782 handle the protection duty and possible retry. So we
have to protect the ObjectMonitor*, but, like enter(), we could call
install_displaced_markword_in_object() when we retry which would limit
T-hash to a single retry.

ObjectSynchronizer::inflate() has a similar collision and retry issue:

> 1456     // CASE: inflated
> 1457     if (mark->has_monitor()) {
> 1458       if (!omh_p->save_om_ptr(object, mark)) {
> 1459         // Lost a race with async deflation so try again.
> 1460         assert(AsyncDeflateIdleMonitors, "sanity check");
> 1461         continue;
> 1462       }

In this situation, inflate() discovers that the object already has an
ObjectMonitor; the object may not have had one when inflate() was called,
but it has one now. That particular race predates this project. In any
case, inflate() wants to return a stable ObjectMonitor* in the
ObjectMonitorHandle, but if save_om_ptr() returns false, then inflate()
has to retry.
The only reason for save_om_ptr() to return false is due to a collision with Async Monitor Deflation. Like enter, we could call install_displaced_markword_in_object() when we retry which would limit inflate() to a single retry. Okay, I've evolved from thinking we could simplify the code by dropping install_displaced_markword_in_object() to thinking that I understand what install_displaced_markword_in_object() brings to the party. And now I'm proposing that we add 2 more install_displaced_markword_in_object() calls to limit retries on two more code paths. Karen, are you convinced that install_displaced_markword_in_object() is useful? Dan > > I hope this helps. > > Best, > Carsten > >> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty >> > > wrote: >> >> Filed: >> >> ??? JDK-8222034 Thread-SMR functions should be updated to remove >> work around >> https://bugs.openjdk.java.net/browse/JDK-8222034 >> >> Martin and Robbin, please check it out and make sure that I captured >> things correctly... >> >> Dan >> >> >> >> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: >>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>> Hi everybody, >>>> >>>>> I think was fixed with: >>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and >>>>> other RMW atomics >>>>> You should get a leading sync and trailing one with the >>>>> default conservative >>>>> model and thus get proper memory ordering. >>>>> Martin, I'm I correct? >>>> Exactly. Thanks for pointing this out. PPC uses the strongest >>>> possible ordering semantics with memory_order_conservative >>>> (default parameter). >>>> I've seen that comment about PPC in "void >>>> ThreadsList::inc_nested_handle_cnt()". This function could get >>>> replaced. >>> >>> Okay so we need a new bug to update these two Thread-SMR functions: >>> >>> src/hotspot/share/runtime/threadSMR.cpp: >>> >>> void ThreadsList::dec_nested_handle_cnt() { >>> ? // The decrement needs to be MO_ACQ_REL. At the moment, the >>> Atomic::dec >>> ? 
// backend on PPC does not yet conform to these requirements. >>> Therefore >>> ? // the decrement is simulated with an Atomic::sub(1, &addr). >>> ? // Without this MO_ACQ_REL Atomic::dec simulation, the nested >>> SMR mechanism >>> ? // is not generally safe to use. >>> ? Atomic::sub(1, &_nested_handle_cnt); >>> } >>> >>> void ThreadsList::inc_nested_handle_cnt() { >>> ? // The increment needs to be MO_SEQ_CST. At the moment, the >>> Atomic::inc >>> ? // backend on PPC does not yet conform to these requirements. >>> Therefore >>> ? // the increment is simulated with a load phi; cas phi + 1; loop. >>> ? // Without this MO_SEQ_CST Atomic::inc simulation, the nested >>> SMR mechanism >>> ? // is not generally safe to use. >>> ? intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> ? for (;;) { >>> ??? if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) >>> == sample) { >>> ????? return; >>> ??? } else { >>> ????? sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> ??? } >>> ? } >>> } >>> >>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>> sure we're all in agreement. Once we decide that Thread-SMR's >>> functions look like, I'll adapt my Async Monitor Deflation >>> functions... >>> >>> Dan >>> >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn >>> > >>>> Sent: Freitag, 5. April 2019 14:07 >>>> To: daniel.daugherty at oracle.com >>>> ; >>>> hotspot-runtime-dev at openjdk.java.net >>>> ; Carsten Varming >>>> >; Roman Kennke >>>> >; Doerr, Martin >>>> > >>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>> >>>> Hi Dan, >>>> >>>> (Martin there is question for you last in this email) >>>> >>>> After first pass I did not find any real issues. >>>> Considering what you had to work with, it looks good! >>>> >>>> #1 >>>> There are some assert which are redundant (to me at least) like: >>>> src/hotspot/share/runtime/objectMonitor.cpp >>>> L445 >>>> ??? 
if (!dmw->is_marked() && dmw->hash() == 0) { >>>> ????? // This dmw is neutral and has not yet started the >>>> restoration >>>> ????? // protocol so we mark a copy of the dmw to begin the >>>> protocol. >>>> ????? markOop marked_dmw = dmw->set_marked(); >>>> ????? assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>> ???????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>> ???????????? marked_dmw->is_marked(), marked_dmw->hash()); >>>> >>>> That assert is basically a test that set_marked worked? >>>> >>>> L505 >>>> ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>>> DEFLATER_MARKER) { >>>> ??????? assert(_succ != Self, "invariant"); >>>> ??????? assert(_owner == Self, "invariant"); >>>> >>>> Assert on _owner checks that our cmpxchg is not broken? >>>> >>>> I think it's easier to read the code if some on the most >>>> obvious asserts are >>>> removed. Maybe comments instead. >>>> >>>> #2 >>>> Not your doing but I think we should remove TRAPS/Thread * Self >>>> and use >>>> JavaThread* instead. >>>> E.g. so we can change: >>>> void ObjectMonitor::EnterI(TRAPS) { >>>> ??? Thread * const Self = THREAD; >>>> ??? assert(Self->is_Java_thread(), "invariant"); >>>> ??? assert(((JavaThread *) Self)->thread_state() == >>>> _thread_blocked, "invariant"); >>>> >>>> to: >>>> >>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>> ??? assert(Self->thread_state() == _thread_blocked, "invariant"); >>>> >>>> #3 >>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>> ?? 164 inline void ObjectMonitor::inc_ref_count() { >>>> ?? 165?? // The increment needs to be MO_SEQ_CST. At the >>>> moment, the Atomic::inc >>>> ?? 166?? // backend on PPC does not yet conform to these >>>> requirements. Therefore >>>> ?? 167?? // the increment is simulated with a load phi; cas phi >>>> + 1; loop. >>>> ?? 168?? // Without this MO_SEQ_CST Atomic::inc simulation, >>>> AsyncDeflateIdleMonitors >>>> ?? 169?? // is not safe. 
>>>> >>>> I think was fixed with: >>>> 8202080: Introduce ordering semantics for Atomic::add/inc and >>>> other RMW atomics >>>> You should get a leading sync and trailing one with the default >>>> conservative >>>> model and thus get proper memory ordering. >>>> Martin, I'm I correct? >>>> >>>> Thanks, Robbin >>>> >>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >>>>> Greetings, >>>>> >>>>> Welcome to the OpenJDK review thread for my port of Carsten's >>>>> work on: >>>>> >>>>> ? ??? JDK-8153224 Monitor deflation prolong safepoints >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>> >>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>> Here's the webrev URL: >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>> >>>>> Here's a link to Carsten's original webrev: >>>>> >>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>> >>>>> Earlier versions of this patch have been through several rounds of >>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>> Roman for their preliminary code review comments. A very special >>>>> thanks to Robbin and Roman for building and testing the patch in >>>>> their own environments (including specJBB2015). >>>>> >>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>> (product, fastdebug, slowdebug).Earlier versions have run >>>>> Kitchensink >>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>> fastdebug >>>>> and slowdebug). Earlier versions have run my monitor inflation >>>>> stress >>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>> fastdebug and slowdebug). 
>>>>> >>>>> All of the testing done on earlier versions will be redone on the >>>>> latest version of the patch. >>>>> >>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>> >>>>> Dan >>>>> >>>>> P.S. >>>>> One subtest in >>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>>> to characterize/analyze this failure for more than a week now. At >>>>> this point I'm convinced that Async Monitor Deflation is >>>>> aggravating >>>>> an existing bug. However, I plan to have a better handle on that >>>>> failure before these bits are pushed to the jdk/jdk repo. >>> >>> >> > From coleen.phillimore at oracle.com Tue Apr 9 15:57:15 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 Apr 2019 11:57:15 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> Message-ID: <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> On 4/9/19 11:44 AM, gerard ziemski wrote: > Thank you Coleen for more feedback! > > > On 4/9/19 9:43 AM, coleen.phillimore at oracle.com wrote: >>>> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/utilities/statistics.hpp.html >>>> >>>> >>>> Can you rename this file tableStatistics.cpp/hpp because >>>> "statistics" is too general and the class is called TableStatistics. >>> I deliberately named the file "statistics.hpp", because I assume we >>> will be adding more JFR events in the future, and this file could >>> hold all the related code, which for now just comprises of table >>> statistics as you pointed out. >> >> Hi I don't agree with that.? 
I think if you want more different JFR
>> statistics you could add files where they belong as
>> differentStatistics.hpp
>
> Done.

Thanks!

>
>
>>>> http://cr.openjdk.java.net/~gziemski/8185525_rev2/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html
>>>>
>>>>
>>>> Is there anyway to parameterize these functions and/or add them to
>>>> TableStatistics?
>>> I didn't want to add JFR dependency to TableStatistics. I'm unsure
>>> what I can do more here, and whether it deserves the effort -
>>> TableStatistics basically serves as a struct for passing event
>>> attributes around, but I'm open to suggestions.
>>>
>>
>> I didn't think this should be moved from jfrPeriodic.cpp. I thought
>> it could be something like an X macro.
>>
>> Or just make this bit a function that they all call with event as
>> parameter.
>>
>> + event.set_numberOfBuckets(statistics._number_of_buckets);
>> + event.set_numberOfEntries(statistics._number_of_entries);
>> + event.set_totalFootprint(statistics._total_footprint);
>> + event.set_maximumBucketCount(statistics._maximum_bucket_size);
>> + event.set_averageBucketCount(statistics._average_bucket_size);
>> + event.set_varianceOfBucketCount(statistics._variance_of_bucket_size);
>> + event.set_stdDevOfBucketCount(statistics._stddev_of_bucket_size);
>> + event.set_insertionRate(statistics._add_rate);
>> + event.set_removalRate(statistics._remove_rate);
>> + event.commit();
>
> Each of those JFR events are an instance of a different class, so the
> best I can do is a macro here (otherwise I'd have to create a base
> class for the TableStatistics events from which to extend our 6 table
> events, but I'm not sure JFR architecture supports that - it generates
> class automatically from the event's meta description)
>
> Updated webrev http://cr.openjdk.java.net/~gziemski/8185525_rev4/

Yes, that looks better to me.

+ //statistics.print(tty, "SymbolTable");

You should remove commented out code.
You can always add it back locally if you want to debug it again. (I
don't need to see this change).

thanks,
Coleen

>
>
> cheers

From daniel.daugherty at oracle.com Tue Apr 9 16:13:51 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 9 Apr 2019 12:13:51 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Martin,

I added this comment to JDK-8222034. Hopefully Erik O will chime in
there since he wrote the original code...

Dan

On 4/8/19 12:23 PM, Doerr, Martin wrote:
> Hi Dan,
>
> thanks for addressing this issue. I appreciate it.
>
> I wonder if the comments are correct. Does dec_nested_handle_cnt really only need MO_ACQ_REL while inc_nested_handle_cnt needs MO_SEQ_CST?
> I don't see comments explaining what was intended to get ordered.
>
> I guess we can just use memory_order_conservative (default). Shouldn't be performance critical.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Daniel D. Daugherty
> Sent: Friday, 5 April 2019 18:10
> To: Doerr, Martin ; Robbin Ehn ; hotspot-runtime-dev at openjdk.java.net; Carsten Varming ; Roman Kennke
> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>
> Filed:
>
>     JDK-8222034 Thread-SMR functions should be updated to remove work around
>     https://bugs.openjdk.java.net/browse/JDK-8222034
>
> Martin and Robbin, please check it out and make sure that I captured
> things correctly...
>
> Dan
>
>
>
> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote:
>> On 4/5/19 8:37 AM, Doerr, Martin wrote:
>>> Hi everybody,
>>>
>>>> I think this was fixed with:
>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other
>>>> RMW atomics
>>>> You should get a leading sync and trailing one with the default
>>>> conservative
>>>> model and thus get proper memory ordering.
>>>> Martin, am I correct?
>>> Exactly. Thanks for pointing this out. PPC uses the strongest
>>> possible ordering semantics with memory_order_conservative (default
>>> parameter).
>>> I've seen that comment about PPC in "void
>>> ThreadsList::inc_nested_handle_cnt()". This function could get replaced.
>> Okay so we need a new bug to update these two Thread-SMR functions:
>>
>> src/hotspot/share/runtime/threadSMR.cpp:
>>
>> void ThreadsList::dec_nested_handle_cnt() {
>>   // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec
>>   // backend on PPC does not yet conform to these requirements. Therefore
>>   // the decrement is simulated with an Atomic::sub(1, &addr).
>>   // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism
>>   // is not generally safe to use.
>>   Atomic::sub(1, &_nested_handle_cnt);
>> }
>>
>> void ThreadsList::inc_nested_handle_cnt() {
>>   // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>   // backend on PPC does not yet conform to these requirements. Therefore
>>   // the increment is simulated with a load phi; cas phi + 1; loop.
>>   // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism
>>   // is not generally safe to use.
>>   intx sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>>   for (;;) {
>>     if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) {
>>       return;
>>     } else {
>>       sample = OrderAccess::load_acquire(&_nested_handle_cnt);
>>     }
>>   }
>> }
>>
>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make
>> sure we're all in agreement. Once we decide what Thread-SMR's
>> functions look like, I'll adapt my Async Monitor Deflation
>> functions...
>>
>> Dan
>>
>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Robbin Ehn
>>> Sent: Friday, 5.
>>> April 2019 14:07
>>> To: daniel.daugherty at oracle.com;
>>> hotspot-runtime-dev at openjdk.java.net; Carsten Varming
>>> ; Roman Kennke ; Doerr, Martin
>>>
>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints
>>>
>>> Hi Dan,
>>>
>>> (Martin, there is a question for you last in this email)
>>>
>>> After first pass I did not find any real issues.
>>> Considering what you had to work with, it looks good!
>>>
>>> #1
>>> There are some asserts which are redundant (to me at least) like:
>>> src/hotspot/share/runtime/objectMonitor.cpp
>>> L445
>>>     if (!dmw->is_marked() && dmw->hash() == 0) {
>>>       // This dmw is neutral and has not yet started the restoration
>>>       // protocol so we mark a copy of the dmw to begin the protocol.
>>>       markOop marked_dmw = dmw->set_marked();
>>>       assert(marked_dmw->is_marked() && marked_dmw->hash() == 0,
>>>              "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT,
>>>              marked_dmw->is_marked(), marked_dmw->hash());
>>>
>>> That assert is basically a test that set_marked worked?
>>>
>>> L505
>>>       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>>         assert(_succ != Self, "invariant");
>>>         assert(_owner == Self, "invariant");
>>>
>>> Assert on _owner checks that our cmpxchg is not broken?
>>>
>>> I think it's easier to read the code if some of the most obvious
>>> asserts are removed. Maybe comments instead.
>>>
>>> #2
>>> Not your doing but I think we should remove TRAPS/Thread * Self and use
>>> JavaThread* instead.
>>> E.g. so we can change:
>>> void ObjectMonitor::EnterI(TRAPS) {
>>>     Thread * const Self = THREAD;
>>>     assert(Self->is_Java_thread(), "invariant");
>>>     assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant");
>>>
>>> to:
>>>
>>> void ObjectMonitor::EnterI(JavaThread* Self) {
>>>     assert(Self->thread_state() == _thread_blocked, "invariant");
>>>
>>> #3
>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>> inline void ObjectMonitor::inc_ref_count() {
>>>   // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>>   // backend on PPC does not yet conform to these requirements. Therefore
>>>   // the increment is simulated with a load phi; cas phi + 1; loop.
>>>   // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors
>>>   // is not safe.
>>>
>>> I think this was fixed with:
>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other
>>> RMW atomics
>>> You should get a leading sync and trailing one with the default
>>> conservative
>>> model and thus get proper memory ordering.
>>> Martin, am I correct?
>>>
>>> Thanks, Robbin
>>>
>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>>
>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>>
>>>> Here's a link to the OpenJDK wiki that describes my port:
>>>>
>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>>
>>>> Here's a link to Carsten's original webrev:
>>>>
>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>>
>>>> Earlier versions of this patch have been through several rounds of
>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>>> Roman for their preliminary code review comments. A very special
>>>> thanks to Robbin and Roman for building and testing the patch in
>>>> their own environments (including specJBB2015).
>>>>
>>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>>> Oracle's usual set of platforms.
>>>> Earlier versions have been run
>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>>> fastdebug and slowdebug).
>>>>
>>>> All of the testing done on earlier versions will be redone on the
>>>> latest version of the patch.
>>>>
>>>> Thanks, in advance, for any questions, comments or suggestions.
>>>>
>>>> Dan
>>>>
>>>> P.S.
>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>>> to characterize/analyze this failure for more than a week now. At
>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>> an existing bug. However, I plan to have a better handle on that
>>>> failure before these bits are pushed to the jdk/jdk repo.
>>

From erik.gahlin at oracle.com Tue Apr 9 20:50:55 2019
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Tue, 9 Apr 2019 22:50:55 +0200
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To:
References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com>
Message-ID: <5CAD05AF.1090700@oracle.com>

Thanks Gerard,

In metadata.xml (and possibly elsewhere) can you change the fields

"varianceOfBucketCount" to "bucketCountVariance"
"stdDevOfBucketCount" to "bucketCountStandardDeviation"

I noticed that events are only emitted if we are able to take the resize
lock. Can this be fixed? What prevents us from always getting the data?
That's how other periodic events work and losing data sometimes may lead
to subtle bugs that are hard to understand and replicate in systems that
rely on the information. Could we retry on a failure?
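
As a rough illustration of the "retry on a failure" question above, a bounded retry around a try-lock might look like the sketch below. It uses `std::mutex` purely as a stand-in for the table's resize lock; `TableStats`, `sample_with_retry` and the attempt count are hypothetical names and values, not HotSpot or JFR API.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the statistics gathered under the lock.
struct TableStats {
  long entries = 0;
};

// Try to take `lock` up to `max_attempts` times. On success, run `read`
// while holding the lock, store its result in `out` and return true.
// Return false if the lock was never acquired, so the caller can decide
// whether to skip the event for this period.
template <typename Reader>
bool sample_with_retry(std::mutex& lock, int max_attempts, Reader read, TableStats& out) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    if (lock.try_lock()) {
      // Adopt the already-held lock so it is released on scope exit.
      std::lock_guard<std::mutex> guard(lock, std::adopt_lock);
      out = read();
      return true;
    }
    std::this_thread::yield();  // give the current lock holder a chance to finish
  }
  return false;
}
```

A bounded retry keeps the periodic event thread from blocking indefinitely, at the cost of still dropping a sample when the lock stays contended for every attempt.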
If it is very problematic to fix, it may be OK to skip the events, but
then tests would need to be updated to take that into account
(retrying). Otherwise we may get intermittent failures.

Thanks
Erik

> hi Erik,
>
>
> On 4/3/19 12:44 PM, Erik Gahlin wrote:
>> Hi Gerard,
>>
>> Here are some comments about the metadata (to make it consistent with
>> other events).
>>
>> The events should not be in the "Java Application" category since
>> they are JVM events. You could perhaps put them in "Java Virtual
>> Machine, Runtime, Tables". Some comments about the names and labels
>> of fields.
>>
>> - Label: Number of buckets => Bucket Count
>> - Label: Number of entries => Entry Count
>> - Label: Total footprint => Total Footprint
>>
>> Could you remove descriptions that are exactly the same as the label.
>>
>> - Label: Maximum bucket size => Maximum Bucket Size
>> - Label: Average bucket size => Average Bucket Size
>> - Label: Variance of bucket size => Bucket Size Variance
>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation
>> - Label: Standard deviation of bucket size => Bucket Size Standard Deviation
>>
>> Instead of using the word "size", it may make more sense to use the
>> word "count" here as well, i.e "Average Bucket Count", or maybe I'm
>> missing something? Is there a difference?
>>
>> I wonder how useful standard deviation and variance is? If support
>> engineers are looking at a recording, or JMC adds a rule for the
>> events, what would a good or bad value be? Is it possible to use the
>> information for troubleshooting?
>
> While I'm working on all the above changes you suggested, we can
> discuss the standard deviation and variance.
>
> I added them because they are part of the jcmd "VM.symboltable
> -verbose" command, so we are consistent.
>
> Now, regarding how useful they are, I always understood them as a sign
> of imbalanced table distribution, and without a proper histogram, this
> is the best description of the histogram shape.
> In reality, however, I
> think that if they identify an issue, then we might have a very
> curious distribution (some sort of hash table attack), or we have an
> issue with our hash function for the particular usage case.
>
> Still, I'd personally elect to keep them.
>
> Let me ask you a different question though, Is it expensive to have 2
> doubles as part of an event (5 events per second)? And if so, is there
> currently (or planned) granularity for controlling not just which
> events to record, but also which attributes?
>
>>
>> - Name: addRate => insertionRate
>> - Label: Rate of addition => Insertion Rate
>> - Name: removeRate => removalRate
>> - Label: Rate of removal => Removal Rate
>
> Will do.
>
>>
>> I'm missing unit tests for the events. Could you please add in
>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the
>> average not exceeding max, no negative values etc.
>
> Working on it, do we need separate test per each event (table), or
> just one table will suffice (ex. StringTable)?
>
> Thank you for the feedback!
>
>
> cheers
>>
>> Thanks!
>> Erik
>>
>>> Hi all,
>>>
>>> Please review this feature, which adds tracing events for the
>>> internal hash tables.
>>>
>>> The following attributes are implemented:
>>>
>>> description="Number of buckets" />
>>> description="Number of all entries" />
>>> label="Total footprint" description="Total memory footprint (the
>>> table itself plus all of the entries)" />
>>>
>>> />
>>>
>>> description="How many items were added since last event (per
>>> second)" />
>>> description="How many items were removed since last event (per
>>> second)" />
>>>
>>> This event was implemented for the following system tables:
>>>
>>> SymbolTable
>>> StringTable
>>> Placeholder Table
>>> LoaderConstraints Table
>>> ProtectionDomainCache Table
>>>
>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525
>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in progress?)
>>>
>>>
>>> Cheers
>>>
>>
>>
>

From coleen.phillimore at oracle.com Tue Apr 9 23:58:55 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 9 Apr 2019 19:58:55 -0400
Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code
Message-ID:

Some code was left from removing code UseMembar and further improvements
encouraged by dholmes. I also removed a couple redundant and unneeded
clear_unhandled_oops calls. The one in ThreadBlockInVMWithDeadlockCheck
is in the calling code Monitor::lock, and the thread in vm from native
makes no sense.

Tested with runtime jtreg tests, and mach5 tier1-3 in progress.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8222231

Thanks,
Coleen

From david.holmes at oracle.com Wed Apr 10 02:04:46 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 10 Apr 2019 12:04:46 +1000
Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code
In-Reply-To:
References:
Message-ID:

Hi Coleen,

Thanks for doing this! It is all a lot simpler now.

Reviewed with enthusiasm.
:)

David

On 10/04/2019 9:58 am, coleen.phillimore at oracle.com wrote:
> Some code was left from removing code UseMembar and further improvements
> encouraged by dholmes. I also removed a couple redundant and unneeded
> clear_unhandled_oops calls. The one in ThreadBlockInVMWithDeadlockCheck
> is in the calling code Monitor::lock, and the thread in vm from native
> makes no sense.
>
> Tested with runtime jtreg tests, and mach5 tier1-3 in progress.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8222231
>
> Thanks,
> Coleen

From varming at gmail.com Wed Apr 10 02:25:03 2019
From: varming at gmail.com (Carsten Varming)
Date: Tue, 9 Apr 2019 22:25:03 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Hi Dan,

On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> On 4/5/19 4:59 PM, Karen Kinnear wrote:
>
> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count
> with comments that caller ensured _count <= 0
> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
> Am I missing something subtle here or should they be the same guarantees?
>
>
> Here's the code in question:
>
> src/hotspot/share/runtime/objectMonitor.cpp:
>
> void ObjectMonitor::EnterI(TRAPS) {
>
>   if (_owner == DEFLATER_MARKER) {
>     guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>     // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up.
>
> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) {
>
>   if (_owner == DEFLATER_MARKER) {
>     guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>
>
> Reading these two guarantee() calls always throws me off stride
> because I would have written them like this:
>
>     guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>
> and
>
>     guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>
> When rewritten like the above, you have:
>
>     "_count > 0" ... _count <= 0
>
> and:
>
>     "_count >= 0" ... "_count < 0"
>
> which is easier for my brain to read... okay... enough sidebar...
>

He he. I have pretty much eliminated > and >= from my written vocabulary.
It makes life simpler. Trust me. :)

> Short answer: No the guarantees should not be the same.
>
> Longer answer: EnterI() is called by enter() after enter() has
> incremented the _count field to indicate the contended state of
> things. So in EnterI(), "_count > 0" is the right check.
> ReenterI() is called after wait() has returned (notified or
> timedout), and the _count field is not used on reentry ops so
> "_count >= 0" is the right check.
>
> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes,
> there are two places in EnterI() that do this):
>
> L501:     if (_owner == DEFLATER_MARKER) {
>             // The deflation protocol finished the first part (setting _owner),
>             // but it failed the second part (making _count negative) and bailed.
>             // Because we're called from enter() we have at least one contention.
>             guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
> L504:       // Try to acquire monitor.
> L505:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
> L629:     if (_owner == DEFLATER_MARKER) {
>             // The deflation protocol finished the first part (setting _owner),
>             // but it failed the second part (making _count negative) and bailed.
>             // Because we're called from enter() we have at least one contention.
>             guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
> L632:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
> And I'm going to tweak the ReenterI() code like this:
>
> L759:     if (_owner == DEFLATER_MARKER) {
>             // The deflation protocol finished the first part (setting _owner),
>             // but it will observe _waiters != 0 and will bail out. Because we're
>             // called from wait() we may or may not have any contentions.
>             guarantee(count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller");
> L761:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>
>
> You didn't ask this, but it is okay that _count is only used to track
> contentions in enter()/EnterI() and is not used to track contentions
> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is
> used by is_busy() to observe the busy state for an ObjectMonitor that
> is being wait()'ed for. The _waiters field is decremented after a
> waiter has returned from ReenterI() so the _owner field takes over
> answering the is_busy() question...
>
>
> 5. I could use a little help with allocation state transitions,
> e.g.
> in deflate_monitor_list_using_JT
> you see is_new with object set so you mark it as old so next deflation
> will check it
>
>
> Here's the code in question:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>
> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp,
>                                                       ObjectMonitor** freeHeadp,
>                                                       ObjectMonitor** freeTailp,
>                                                       ObjectMonitor** savedMidInUsep) {
>
>     // Only try to deflate if there is an associated Java object and if
>     // mid is old (is not newly allocated and is not newly freed).
>     if (mid->object() != NULL && mid->is_old() &&
>         deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) {
>       // Deflation succeeded so update the in-use list.
>
>     } else {
>       // mid is considered in-use if it does not have an associated
>       // Java object or mid is not old or deflation did not succeed.
>       // A mid->is_new() node can be seen here when it is freshly returned
>       // by omAlloc() (and skips the deflation code path).
>       // A mid->is_old() node can be seen here when deflation failed.
>       // A mid->is_free() node can be seen here when a fresh node from
>       // omAlloc() is released by omRelease() due to losing the race
>       // in inflate().
>
>       if (mid->object() != NULL && mid->is_new()) {
>         // mid has an associated Java object and has now been seen
>         // as newly allocated so mark it as "old".
>         mid->set_allocation_state(ObjectMonitor::Old);
>       }
>
> - why do you set it to old here rather than in inflate once we set
> values?
>
>
> Inflation is used in quite a few places.
> If we marked the
> ObjectMonitor as "Old" in inflate(), then that would make the
> ObjectMonitor available for deflation by deflate_monitor_using_JT()
> earlier:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>
> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid,
>                                                   ObjectMonitor** freeHeadp,
>                                                   ObjectMonitor** freeTailp) {
>   assert(AsyncDeflateIdleMonitors, "sanity check");
>   assert(Thread::current()->is_Java_thread(), "precondition");
>   // A newly allocated ObjectMonitor should not be seen here so we
>   // avoid an endless inflate/deflate cycle.
>   assert(mid->is_old(), "precondition");
>
>
> So the idea behind only deflating ObjectMonitors that have reached
> allocation state "Old" is to prevent "an endless inflate/deflate cycle".
> Here's the relevant section from Carsten's JEP:
> > To avoid endless inflation / deflation cycles in the prototype, monitor
> > deflation is only attempted the second time a monitor is seen by the
> > thread marking monitors as deflatable: If the thread (the only thread
> > marking monitors as deflatable; might be service thread or some GC
> > related thread or even a dedicated thread) sees a monitor in state New,
> > then the thread marks the monitor as Old and moves on. So there is
> > little interaction between a thread inflating a lock to a monitor and
> > the deflating thread, the inflating thread just has to make sure the
> > monitor is marked New and this marker is published using appropriate
> > barriers.
>
>
> There isn't an explicit example in the JEP of what Carsten was thinking
> of with "an endless inflate/deflate cycle". I didn't try to think of
> such an example for the OpenJDK wiki either. I simply wrote:
>

I think I was thinking about a cycle where a Java object exhibits a
monitor inflation, then deflation, then inflation, then deflation. Each
inflation will be with a new monitor.
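
To make the New/Old rule from the JEP excerpt quoted above concrete, here is a deliberately simplified, single-threaded sketch. It is not the HotSpot code: `Monitor`, `has_object`, `busy` and `deflation_pass` are hypothetical stand-ins for ObjectMonitor, `object() != NULL`, `is_busy()` and the deflation list walk, and all barriers and race handling are omitted.

```cpp
#include <cassert>
#include <vector>

enum class AllocationState { Free, New, Old };

// Simplified stand-in for an in-use ObjectMonitor.
struct Monitor {
  AllocationState state = AllocationState::New;  // as if freshly allocated
  bool has_object = true;                        // stands in for object() != NULL
  bool busy = false;                             // stands in for is_busy()
};

// One walk of the deflater over the in-use list; returns how many
// monitors were deflated during this pass.
int deflation_pass(std::vector<Monitor>& in_use) {
  int deflated = 0;
  for (Monitor& mid : in_use) {
    if (mid.has_object && mid.state == AllocationState::Old && !mid.busy) {
      // Second (or later) sighting of an idle monitor: deflate it.
      mid.state = AllocationState::Free;
      mid.has_object = false;
      ++deflated;
    } else if (mid.has_object && mid.state == AllocationState::New) {
      // First sighting: only mark it Old so that a later pass may deflate it.
      mid.state = AllocationState::Old;
    }
  }
  return deflated;
}
```

In this model a freshly inflated monitor always survives at least one deflation pass, which is the property meant to damp an inflate/deflate oscillation.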
This behavior could increase the number of
monitors allocated, especially with my original patch as I recycled
monitors only after a safepoint. Now that I think about it again, such a
cycle is incredibly unlikely as it would require repeated contention on
the java object, yet the monitors must not be busy when the deflator
thread comes by. And this scenario has to repeat itself. This all seems
pretty unlikely.

> > ObjectMonitor has a new allocation_state field that supports three
> > states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied
> > to ObjectMonitors that have reached the 'Old' state. When the Async
> > Monitor Deflation code sees an ObjectMonitor in the 'New' state, it
> > is changed to the 'Old' state, but is not deflated. This prevents a
> > newly allocated ObjectMonitor from being immediately deflated which
> > could cause an inflation<->deflation oscillation.
>
>
> So let's think about what might happen if an ObjectMonitor is marked
> as "Old" in inflate(). Here's an example use of inflate() in the
> "slow enter" code path:
>
> src/hotspot/share/runtime/synchronizer.cpp:
>
> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
>
> base<   inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
>
> new>    ObjectMonitorHandle omh;
> new>    inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter);
> new>    do_loop = !omh.om_ptr()->enter(THREAD);
>
> In the "base" code, we took the return from inflate() and used it to call
> ObjectMonitor::enter(). If we never changed that bit of code and inflate()
> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could
> async deflate the ObjectMonitor while we were trying to call enter() on
> it... Boom! So we might think that holding off marking an ObjectMonitor
> as "Old" can save us... and it can, but not in all cases... :-(
>
> It is entirely possible that our call to slow_enter() is made on an
> ObjectMonitor that's already marked "Old".
> In that case, our thread
> (T-enter) calls inflate() which returns the existing ObjectMonitor*
> and we use it to call enter(). If the thread (T-deflate) calling
> deflate_monitor_using_JT() does its magic before T-enter sets the
> owner field or the count field... Boom!
>
> The previous paragraph is exactly what motivated the _ref_count field,
> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle*
> parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr()
> which increments the ObjectMonitor's ref_count and then checks for async
> deflation protocol collisions. If there's a collision, then save_om_ptr()
> returns false and the caller (inflate() in this case) has to retry. When
> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot
> be deflated and is safe until the ObjectMonitorHandle is destroyed.
>
> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot
> deflate the ObjectMonitor in the window after inflate() returns and
> before T-enter sets the owner field or increments the count field. But
> you know all that already!
>
> So let's bring this back to having inflate() mark the ObjectMonitor as
> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0,
> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate().
> T-deflate cannot deflate it due to ref_count > 0.
>
> Here's another crazy thought... inflate() is the only function that
> calls omAlloc(), and omAlloc() is the only function that sets "New".
> If we move the setting of "Old" from deflate_monitor_list_using_JT()
> to inflate(), then the change from "New" -> "Old" never happens
> outside of the inflate() call so why do we need the allocation state?
>
> Small dose of reality: I've found having the allocation state to be
> very helpful when debugging race related crashes. We could make the
> allocation state be DEBUG_ONLY, but then what about race debugging of
> product bits... sigh...
>
>
> 6.
> Could you get rid of the new goto's?
>
>
> I believe there is only one left from Carsten's prototype:
>

You make it sound like I was throwing gotos around left and right. :) If
you count continue and break statements, then you might have been right.

I'll break my response here, so we can return to regular structured
programming, ;-)

Carsten

From david.holmes at oracle.com Wed Apr 10 05:15:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 10 Apr 2019 15:15:48 +1000
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID: <620e300d-a941-5e55-1814-f37c4b559655@oracle.com>

Hi Carsten, Dan,

I'd like to pick up on one topic - a higher-level discussion about the
timing of the ObjectMonitor lifecycle as they currently are and with
these changes:

Carsten wrote:
> I think I was thinking about a cycle where a Java object exhibits
> a monitor inflation, then deflation, then inflation, then deflation.
> Each inflation will be with a new monitor. This behavior could
> increase the number of monitors allocated, especially with my
> original patch as I recycled monitors only after a safepoint. Now
> that I think about it again, such a cycle is incredibly unlikely as
> it would require repeated contention on the java object, yet the
> monitors must not be busy when the deflator thread comes by. And this
> scenario has to repeat itself. This all seems pretty unlikely.

So logically every Object has associated with it an ObjectMonitor but if
we created the ObjectMonitor at the same time as the Object and kept it
alive while the Object was alive then we would double our memory use (if
not worse). So we lazily create ObjectMonitors only when we need them:
contention, Object.wait() use, hashcode use.
We could then leave the ObjectMonitors around as long as the Objects are
alive, but again this has implications for memory use. So we deflate
idle ObjectMonitors to reclaim memory (though in practice it is more
complex and we maintain pools of them to speed up allocation). If we
aggressively deflate as soon as an ObjectMonitor is idle then we risk
getting into inflate->deflate->inflate cycles. The likelihood may be low
but if you hit this pathology in your code you will probably be unhappy
about the effects on performance. So instead, IIUC, we use some measure
of "memory pressure" and only try to deflate under certain conditions.
But I'm unclear exactly what those conditions are today, and whether
they change with async monitor deflation. Can you enlighten me please?

Thanks,
David

On 10/04/2019 12:25 pm, Carsten Varming wrote:
> Hi Dan,
>
> On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
>
>> On 4/5/19 4:59 PM, Karen Kinnear wrote:
>>
>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count
>> with comments that caller ensured _count <= 0
>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
>> Am I missing something subtle here or should they be the same guarantees?
>>
>>
>> Here's the code in question:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp:
>>
>> void ObjectMonitor::EnterI(TRAPS) {
>>
>>   if (_owner == DEFLATER_MARKER) {
>>     guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>>     // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up.
>>
>> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) {
>>
>>   if (_owner == DEFLATER_MARKER) {
>>     guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>>
>>
>> Reading these two guarantee() calls always throws me off stride
>> because I would have written them like this:
>>
>>     guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>>
>> and
>>
>>     guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here");
>>
>> When rewritten like the above, you have:
>>
>>     "_count > 0" ... _count <= 0
>>
>> and:
>>
>>     "_count >= 0" ... "_count < 0"
>>
>> which is easier for my brain to read... okay... enough sidebar...
>>
>
> He he. I have pretty much eliminated > and >= from my written vocabulary.
> It makes life simpler. Trust me. :)
>
>
>> Short answer: No the guarantees should not be the same.
>>
>> Longer answer: EnterI() is called by enter() after enter() has
>> incremented the _count field to indicate the contended state of
>> things. So in EnterI(), "_count > 0" is the right check.
>> ReenterI() is called after wait() has returned (notified or
>> timedout), and the _count field is not used on reentry ops so
>> "_count >= 0" is the right check.
>>
>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes,
>> there are two places in EnterI() that do this):
>>
>> L501:     if (_owner == DEFLATER_MARKER) {
>>             // The deflation protocol finished the first part (setting _owner),
>>             // but it failed the second part (making _count negative) and bailed.
>>             // Because we're called from enter() we have at least one contention.
>>             guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>> L504:       // Try to acquire monitor.
>> L505: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >> DEFLATER_MARKER) { >> >> L629: if (_owner == DEFLATER_MARKER) { >> // The deflation protocol finished the first part (setting >> _owner), >> // but it failed the second part (making _count negative) >> and bailed. >> // Because we're called from enter() we have at least one >> contention. >> guarantee(count> 0 , "_owner == DEFLATER_MARKER && _count >> <= 0 should have been handled by the caller"); >> L632: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >> DEFLATER_MARKER) { >> >> And I'm going to tweak the ReenterI() code like this: >> >> L759: if (_owner == DEFLATER_MARKER) { >> // The deflation protocol finished the first part (setting >> _owner), >> // but it will observe _waiters != 0 and will bail out. >> Because we're >> // called from wait() we may or may not have any >> contentions. >> guarantee(count >= 0, "Impossible: _owner == >> DEFLATER_MARKER && _count < 0 should have been handled by the caller"); >> L761: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >> DEFLATER_MARKER) { >> >> >> You didn't ask this, but it is okay that _count is only used to track >> contentions in enter()/EnterI() and is not used to track contentions >> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is >> used by is_busy() to observe the busy state for an ObjectMonitor that >> is being wait()'ed for. The _waiters field is decremented after a >> waiter has returned from ReenterI() so the _owner field takes over >> answering the is_busy() question... >> >> >> 5. I could use a little help with allocation state transitions, >> e.g. 
in deflate_monitor_list_using_JT >> you see is_new with object set so you mark it as old so next deflation >> will check it >> >> >> Here's the code in question: >> >> src/hotspot/share/runtime/synchronizer.cpp: >> >> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** >> listHeadp, >> ObjectMonitor** >> freeHeadp, >> ObjectMonitor** >> freeTailp, >> ObjectMonitor** >> savedMidInUsep) { >> >> // Only try to deflate if there is an associated Java object and if >> // mid is old (is not newly allocated and is not newly freed). >> if (mid->object() != NULL && mid->is_old() && >> deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { >> // Deflation succeeded so update the in-use list. >> >> } else { >> // mid is considered in-use if it does not have an associated >> // Java object or mid is not old or deflation did not succeed. >> // A mid->is_new() node can be seen here when it is freshly returned >> // by omAlloc() (and skips the deflation code path). >> // A mid->is_old() node can be seen here when deflation failed. >> // A mid->is_free() node can be seen here when a fresh node from >> // omAlloc() is released by omRelease() due to losing the race >> // in inflate(). >> >> if (mid->object() != NULL && mid->is_new()) { >> // mid has an associated Java object and has now been seen >> // as newly allocated so mark it as "old". >> mid->set_allocation_state(ObjectMonitor::Old); >> } >> >> - why do you set it to old here rather than in inflate once we set >> values? >> >> >> Inflation is used in quite a few places. 
If we marked the >> ObjectMonitor as "Old" in inflate(), then that would make the >> ObjectMonitor available for deflation by deflate_monitor_using_JT() >> earlier: >> >> src/hotspot/share/runtime/synchronizer.cpp: >> >> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >> ObjectMonitor** >> freeHeadp, >> ObjectMonitor** >> freeTailp) { >> assert(AsyncDeflateIdleMonitors, "sanity check"); >> assert(Thread::current()->is_Java_thread(), "precondition"); >> // A newly allocated ObjectMonitor should not be seen here so we >> // avoid an endless inflate/deflate cycle. >> assert(mid->is_old(), "precondition"); >> >> >> So the idea behind only deflating ObjectMonitors that have reached >> allocation state "Old" is to prevent "an endless inflate/deflate cycle". >> Here's the relevant section from Carsten's JEP: >> >> To avoid endless inflation / deflation cycles in the prototype, monitor >> >> deflation is only attempted the second time a monitor is seen by the >> >> thread marking monitors as deflatable: If the thread (the only thread >> >> marking monitors as deflatable; might be service thread or some GC >> >> related thread or even a dedicated thread) sees a monitor in state New, >> >> then the thread marks the monitor as Old and moves on. So there is >> >> little interaction between a thread inflating a lock to a monitor and >> >> the deflating thread, the inflating thread just has to make sure the >> >> monitor is marked New and this marker is published using appropriate >> >> barriers. >> >> >> There isn't an explicit example in the JEP of what Carsten was thinking >> of with "an endless inflate/deflate cycle". I didn't try to think of >> such an example for the OpenJDK wiki either. I simply wrote: >> > > I think I was thinking about a cycle where a Java object exhibits a monitor > inflation, then deflation, then inflation, then deflation. Each inflation > will be with a new monitor.
This behavior could increase the number of > monitors allocated, especially with my original patch as I recycled > monitors only after a safepoint. Now that I think about it again, such a > cycle is incredibly unlikely as it would require repeated contention on the > Java object, yet the monitors must not be busy when the deflater thread > comes by. And this scenario has to repeat itself. This all seems pretty > unlikely. > > ObjectMonitor has a new allocation_state field that supports three >> >> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >> >> to ObjectMonitors that have reached the 'Old' state. When the Async >> >> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >> >> is changed to the 'Old' state, but is not deflated. This prevents a >> >> newly allocated ObjectMonitor from being immediately deflated which >> >> could cause an inflation<->deflation oscillation. >> >> >> So let's think about what might happen if an ObjectMonitor is marked >> as "Old" in inflate(). Here's an example use of inflate() in the >> "slow enter" code path: >> >> src/hotspot/share/runtime/synchronizer.cpp: >>> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) { >> >> base< inflate(THREAD, obj(), >> inflate_cause_monitor_enter)->enter(THREAD); >> >> new> ObjectMonitorHandle omh; >> new> inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); >> new> do_loop = !omh.om_ptr()->enter(THREAD); >> >> In the "base" code, we took the return from inflate() and used it to call >> ObjectMonitor::enter(). If we never changed that bit of code and inflate() >> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could >> async deflate the ObjectMonitor while we were trying to call enter() on >> it... Boom! So we might think that holding off marking an ObjectMonitor >> as "Old" can save us... and it can, but not in all cases...
:-( >> >> It is entirely possible that our call to slow_enter() is made on an >> ObjectMonitor that's already marked "Old". In that case, our thread >> (T-enter) calls inflate() which returns the existing ObjectMonitor* >> and we use it to call enter(). If the thread (T-deflate) calling >> deflate_monitor_using_JT() does its magic before T-enter sets the >> owner field or the count field... Boom! >> >> The previous paragraph is exactly what motivated the _ref_count field, >> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* >> parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr() >> which increments the ObjectMonitor's ref_count and then checks for async >> deflation protocol collisions. If there's a collision, then save_om_ptr() >> returns false and the caller (inflate() in this case) has to retry. When >> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot >> be deflated and is safe until the ObjectMonitorHandle is destroyed. >> >> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot >> deflate the ObjectMonitor in the window after inflate() returns and >> before T-enter sets the owner field or increments the count field. But >> you know all that already! >> >> So let's bring this back to having inflate() mark the ObjectMonitor as >> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, >> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). >> T-deflate cannot deflate it due to ref_count > 0. >> >> Here's another crazy thought... inflate() is the only function that >> calls omAlloc(), and omAlloc() is the only function that sets "New". >> If we move the setting of "Old" from deflate_monitor_list_using_JT() >> to inflate(), then the change from "New" -> "Old" never happens >> outside of the inflate() call so why do we need the allocation state? 
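For readers following the allocation-state and ref_count discussion above, here is a minimal standalone sketch of the idea. It is not HotSpot code: ToyMonitor, ToyHandle and toy_deflate_pass are hypothetical names, std::atomic stands in for HotSpot's Atomic:: operations, and the collision detection and retry that save_om_ptr() is described as doing are deliberately left out, so the sketch is only meaningful when exercised from a single thread.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for illustration only; not the HotSpot types.
enum class AllocState { Free, New, Old };

struct ToyMonitor {
  std::atomic<int> ref_count{0};                   // > 0 while a handle refers to us
  std::atomic<AllocState> state{AllocState::New};  // set to New on allocation
  std::atomic<bool> deflated{false};
};

// RAII guard in the spirit of ObjectMonitorHandle: it holds a reference
// for as long as it is alive.
class ToyHandle {
  ToyMonitor* _om;
 public:
  explicit ToyHandle(ToyMonitor* om) : _om(om) { _om->ref_count.fetch_add(1); }
  ~ToyHandle() { _om->ref_count.fetch_sub(1); }
  ToyHandle(const ToyHandle&) = delete;
  ToyHandle& operator=(const ToyHandle&) = delete;
  ToyMonitor* om_ptr() const { return _om; }
};

// One deflater visit to a monitor. A New monitor is only marked Old;
// an Old monitor is deflated only if nothing currently references it.
// NOTE: the check-then-deflate below is not atomic; a real protocol
// needs to handle a racing handle acquisition.
bool toy_deflate_pass(ToyMonitor* om) {
  AllocState s = om->state.load();
  if (s == AllocState::New) {
    om->state.store(AllocState::Old);
    return false;
  }
  if (s == AllocState::Old && om->ref_count.load() == 0) {
    om->deflated.store(true);
    om->state.store(AllocState::Free);
    return true;
  }
  return false;
}
```

Walking one monitor through the sketch: the first deflater pass only marks it Old, a pass made while a ToyHandle is alive does nothing, and a pass made after the handle has been destroyed deflates it.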
>> >> Small dose of reality: I've found having the allocation state to be >> very helpful when debugging race related crashes. We could make the >> allocation state be DEBUG_ONLY, but then what about race debugging of >> product bits... sigh... >> >> >> 6. Could you get rid of the new goto's? >> >> >> I believe there is only one left from Carsten's prototype: >> > > You make it sound like I was throwing gotos around left and right. :) If > you count continue and break statements, then you might have been right. > > I'll break my response here, so we can return to regular structured > programming, ;-) > Carsten > From claes.redestad at oracle.com Wed Apr 10 09:55:53 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 10 Apr 2019 11:55:53 +0200 Subject: RFR: 8221836: Avoid recalculating String.hash when zero In-Reply-To: References: <3f953f60-a3c4-0881-9aa4-1b1b31568a87@redhat.com> <924616cb-8c2f-3f9f-aae2-09b1d72927d6@oracle.com> <88548c52-3630-9c8b-c222-872ea5d1b58b@redhat.com> <7ee48121-d4a5-08d7-0f4a-ddf116764e0d@redhat.com> <098306c5-2c39-7e72-a2e5-1d7ad9b75f2b@gmail.com> <9d272e1d-7c49-017c-7313-57636b103003@redhat.com> <0d75c78f-c255-147f-ed80-af940ce7b509@redhat.com> <1bfa62d2-fa96-da46-63c0-7e4cf4cf40fd@redhat.com> <183620f8-831f-8754-d3db-b7c3c1328b21@redhat.com> <9b67c12d-8109-e3f7-1d82-b6aecd2ffb39@oracle.com> Message-ID: <249b958e-8fc2-1535-479c-180a44b27eb8@oracle.com> On 2019-04-09 15:03, Andrew Dinn wrote: >> How about this: >> http://cr.openjdk.java.net/~redestad/8221836/open.03/ > Yes, that looks fine. Thanks, Andrew. I'll push this shortly.
/Claes From robin.westberg at oracle.com Wed Apr 10 14:02:51 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Wed, 10 Apr 2019 16:02:51 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> Message-ID: Hi David, Thanks for the detailed review! > On 9 Apr 2019, at 04:45, David Holmes wrote: > > Hi Robin, > > On 8/04/2019 10:47 pm, Robin Westberg wrote: >> Hi again, >> Here's an updated version where I've moved the naked_short_nanosleep function into the Posix class, to avoid future cross-platform use. (It's still used in the SpinYield and TimedYield implementations though). > > But you also changed the existing Windows os::naked_short_sleep to use the WaitableTimer which is a significant change to make. Is this just because it will likely have better resolution than the native Sleep function? > > Code using os::naked_short_sleep(1) might be unexpectedly impacted by this if what was a 10ms (or worse) sleep becomes closer to 1ms. PerfDataManager::destroy in particular could be impacted as it's racy to begin with. That's not to say this isn't a good thing to fix, just be aware it may have unexpected consequences. You are right, my thinking was that it would be nice to retain the more exact version found in the nanosleep implementation that I was removing. But I'd be fine with doing that as a separate change and run additional testing on it. I can file a separate RFE to keep track of it.
>> Full webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.01/ >> Incremental: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00-01/ > > src/hotspot/share/utilities/spinYield.cpp > > I'm somewhat dubious about getting rid of non-Posix naked_short_nanosleep and instead adding a win32 ifdef to this code. It kind of defeats the purpose of the os abstraction layer - albeit that Windows can't really do a nanosleep. > > Why did you get rid of the sleep_ns parameter and hardwire it to 1000? A configurable sleep time would be a feature of this utility. The original implementation of SpinYield had the sleep hardwired to 1 ms os::naked_short_sleep - when os::naked_short_nanosleep was introduced this parameter was added as well, with a default value of 1000. But there's no code that actually sets the parameter, and it's a bit misleading that the parameter accepts nanoseconds when that cannot be acted upon on Windows. So I figured that it would be better to remove the parameter but retain the existing behavior. But this should probably be revisited when the fate of TimedYield has been decided. > --- > > src/hotspot/share/utilities/timedYield.hpp > > 36 // scheduled on the same cpu as the waiter, we will first try local cpu > 37 // yielding until we reach OS sleep primitive granularity in the waiting > > Whether or not the yielding is "cpu local" will depend on the OS and the scheduler in use. I would just refer to OS native yield. The intention is to coerce the scheduler to perform a cpu-local yield if possible - more on that later. > 40 jlong _yield_time_ns; > 41 jlong _sleep_time_ns; > 42 jlong _max_yield_time_ns; > 43 jlong _yield_granularity_ns; > > We are avoiding using j-types in the VM unless they actually pertain to Java variables. These should be int64_t (to be compatible with javaTimeNanos call). Thanks, will change. > 52 // Perform next round of delay. > 53 void wait(); > > I think delay() would be a better name than wait().
This was inspired by the SpinYield utility, but I certainly wouldn't mind renaming it. Perhaps SpinYield::wait should be renamed as well to keep the symmetry? > --- > > src/hotspot/share/utilities/timedYield.cpp > > 42 // Phase 1 - local cpu yielding > 43 if (_yield_time_ns < _max_yield_time_ns) { > 44 #ifdef WIN32 > 45 if (SwitchToThread() == 0) { > 46 // Nothing else is ready to run on this cpu, spin a little > 47 while (os::javaTimeNanos() - start < _yield_granularity_ns) { > 48 SpinPause(); > 49 } > 50 } > 51 #else > 52 os::Posix::naked_short_nanosleep(_yield_granularity_ns); > 53 #endif > 54 _yield_time_ns += os::javaTimeNanos() - start; > 55 return; > 56 } > > I have a few issues with this code. It's breaking the os abstraction layer again by using an OS ifdef and not using os APIs that exist for this very purpose - i.e. os::naked_yield(). And for non-Windows it seems quite bizarre the "yielding" part of TimedYield is actually implemented with a sleep and not os::naked_yield! The "root" problem here is that the existing os primitives unfortunately do not quite map to the goal of this utility. Let me try to break it down a bit and see if it makes sense: The purpose of TimedYield is to wait for a thread rendezvous, for as short a time as possible. (In this case, waiting for threads to notice that the safepoint poll has been armed). Ideally, we are aiming for sub-millisecond waiting times here. If these threads are scheduled on other cpu's this is pretty simple - we could spin or nanosleep and they would make progress. However - if one or more of these threads are scheduled to run on the current cpu things become interesting. Waiting for the OS scheduler to move threads to different cpu's can take milliseconds - much slower than what is possible to achieve. So, we want to try performing a cpu-local yield at first. On Windows, this maps reasonably well to os::naked_yield: void os::naked_yield() { // Consider passing back the return value from SwitchToThread().
SwitchToThread(); } But for example on Linux, there is instead this: // Linux CFS scheduler (since 2.6.23) does not guarantee sched_yield(2) will // actually give up the CPU. Since skip buddy (v2.6.28): // // * Sets the yielding task as skip buddy for current CPU's run queue. // * Picks next from run queue, if empty, picks a skip buddy (can be the yielding task). // * Clears skip buddies for this run queue (yielding task no longer a skip buddy). // // An alternative is calling os::naked_short_nanosleep with a small number to avoid // getting re-scheduled immediately. // void os::naked_yield() { sched_yield(); } In both cases, we may get rescheduled immediately - on Windows this is indicated in the return value from SwitchToThread, but on Linux we don't know. On Windows, it is then fine to spin a little while as there is nothing else ready to run. But on Linux, the CFS scheduler penalizes spinning as the runtime counter is increased, which will hurt the waiter when the time comes to perform actual work. So we don't want to spin on a no-op sched_yield, we have to use nanosleep instead. But then we are back to the original problem - the current nanosleep is not what we want to do on Windows in this situation. So why not change nanosleep on Windows: > If the existing os api's need adjustment or expansion to provide the functionality desired by this code then I would much prefer to see the os API's updated to address that. > > That said, given the original problem is that os::naked_short_nanosleep on Windows is too coarse with the use of WaitableTimer why not just replace that with a simple loop of the form: > > while (elapsed < sleep_ns) { > if (SwitchToThread() == 0) { > SpinPause(); > elapsed = ? > } So this would actually work fine in this case - but it's probably not what you would expect from a sleep function in the general case. On Linux, you would get control back after the provided nanosecond period even if another thread executed in the meantime.
But on Windows, you are potentially giving up your entire timeslice if another thread is ready to run - this would be much worse than plain naked_short_sleep as you may not get control back for another 15 ms or so. That all being said, switching the Windows naked_short_nanosleep to the above implementation would be just fine - but I really think it should be renamed in that case. Perhaps something like os::timed_yield(jlong ns) would make sense? The additional backoff mechanism in TimedYield can be reverted back to being handled by the safepointing code. The reason I made TimedYield into a separate utility was that it may be useful in other places as well, but such future use can of course be handled separately if the need actually arises. Best regards, Robin > > ? > > Thanks, > David > ----- > > >> Best regards, >> Robin >>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>> >>> Hi David, >>> >>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>> >>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>> Hi David, >>>>> Thanks for taking a look! >>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>> >>>>>> Hi Robin, >>>>>> >>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>> Hi all, >>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>> >>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken.
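As a point of reference for the yield-until-elapsed loop proposed earlier in this thread, a portable approximation might look like the sketch below. It rests on several assumptions: timed_yield_sketch is a hypothetical name rather than an existing os:: function, std::this_thread::yield() stands in for SwitchToThread() or sched_yield(), and because the standard yield does not report whether another thread actually ran, the sketch cannot choose between spinning and yielding the way the Windows-specific loop does.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper, not part of HotSpot's os:: API.
// Yields repeatedly until at least wait_ns nanoseconds have elapsed and
// returns the elapsed time that was actually observed, which can be much
// larger than wait_ns if another thread takes the whole timeslice.
int64_t timed_yield_sketch(int64_t wait_ns) {
  using clock = std::chrono::steady_clock;
  const clock::time_point start = clock::now();
  int64_t elapsed_ns = 0;
  while (elapsed_ns < wait_ns) {
    std::this_thread::yield();  // give other runnable threads a chance
    elapsed_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                     clock::now() - start).count();
  }
  return elapsed_ns;
}
```

The return value makes the caveat raised in the thread visible: the function promises only a lower bound on the wait, so a caller that cares about overshoot can inspect how long it really was away.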
>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>> That said, perhaps os::naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For Windows, it may be more appropriate to switch to Sleep(0) in that case. >>>> >>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>> >>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>> >>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>> >>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>> >>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms.
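The resolution figures quoted above are platform dependent, and a small measurement along the following lines is one way to check them on a given machine. This is only a sketch under stated assumptions: observed_sleep_ns is a hypothetical helper, and std::this_thread::sleep_for stands in for the OS sleep primitives under discussion (it is not the same call as Sleep() or SetWaitableTimer(), so numbers obtained this way are indicative at best).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: requests a sleep of requested_ns nanoseconds and
// returns how long the calling thread was actually away, in nanoseconds.
int64_t observed_sleep_ns(int64_t requested_ns) {
  using clock = std::chrono::steady_clock;
  const clock::time_point start = clock::now();
  std::this_thread::sleep_for(std::chrono::nanoseconds(requested_ns));
  const clock::time_point end = clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

Calling it in a loop with a small request such as 1000 ns and looking at the spread of the results gives a rough picture of the effective sleep granularity on the platform at hand.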
>>> >>> Best regards, >>> Robin >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Best regards, >>>>> Robin >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Various benchmarks (specjbb2015, specjvm2008) show better numbers for Windows and no regressions for Linux. >>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>> Testing: tier1 >>>>>>> Best regards, >>>>>>> Robin From coleen.phillimore at oracle.com Wed Apr 10 14:36:58 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 10:36:58 -0400 Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code In-Reply-To: References: Message-ID: Thank you David! Coleen On 4/9/19 10:04 PM, David Holmes wrote: > Hi Coleen, > > Thanks for doing this! It is all a lot simpler now. Reviewed with > enthusiasm. :) > > David > > On 10/04/2019 9:58 am, coleen.phillimore at oracle.com wrote: >> Some code was left from removing code UseMembar and further >> improvements encouraged by dholmes. I also removed a couple >> redundant and unneeded clear_unhandled_oops calls. The one in >> ThreadBlockInVMWithDeadlockCheck is in the calling code >> Monitor::lock, and the thread in vm from native makes no sense. >> >> Tested with runtime jtreg tests, and mach5 tier1-3 in progress. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222231 >> >> Thanks, >> Coleen From patricio.chilano.mateo at oracle.com Wed Apr 10 16:01:33 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 10 Apr 2019 12:01:33 -0400 Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code In-Reply-To: References: Message-ID: Hi Coleen! Change looks good to me! Just a small comment.
Since we now will not clear unhandled oops in the TBIVMWDC jacket anymore, I think we should add that check in Monitor::wait() like we do in Monitor::lock(). Thanks! Patricio On 4/10/19 10:36 AM, coleen.phillimore at oracle.com wrote: > > Thank you David! > Coleen > > On 4/9/19 10:04 PM, David Holmes wrote: >> Hi Coleen, >> >> Thanks for doing this! It is all a lot simpler now. Reviewed with >> enthusiasm. :) >> >> David >> >> On 10/04/2019 9:58 am, coleen.phillimore at oracle.com wrote: >>> Some code was left from removing code UseMembar and further >>> improvements encouraged by dholmes. I also removed a couple >>> redundant and unneeded clear_unhandled_oops calls. The one in >>> ThreadBlockInVMWithDeadlockCheck is in the calling code >>> Monitor::lock, and the thread in vm from native makes no sense. >>> >>> Tested with runtime jtreg tests, and mach5 tier1-3 in progress. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8222231 >>> >>> Thanks, >>> Coleen > From daniel.daugherty at oracle.com Wed Apr 10 16:24:49 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 10 Apr 2019 12:24:49 -0400 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around Message-ID: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> Greetings, I have a very small fix for the following bug: JDK-8222034 Thread-SMR functions should be updated to remove work around https://bugs.openjdk.java.net/browse/JDK-8222034 Webrev URL: http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ I would like to hear from Erik Osterlund and Martin Doerr on this review. Of course, anyone else is also welcome to chime in. This fix has been tested with a Mach5 tier[1-3] run. Thanks, in advance, for any questions, comments or suggestions. Dan From daniel.daugherty at oracle.com Wed Apr 10 16:38:59 2019 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Wed, 10 Apr 2019 12:38:59 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Message-ID: Hi Carsten, Thanks for chiming back in on this thread... More below... On 4/9/19 10:25 PM, Carsten Varming wrote: > Hi Dan, > > On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty > > wrote: > > On 4/5/19 4:59 PM, Karen Kinnear wrote: >> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees >> that 0 < _count >> with comments that caller ensured _count <= 0 >> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >> Am I missing something subtle here or should they be the same >> guarantees? > > Here's the code in question: > > src/hotspot/share/runtime/objectMonitor.cpp: > > void ObjectMonitor::EnterI(TRAPS) { > > if (_owner == DEFLATER_MARKER) { > guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= > 0 should have been handled by the caller"); > // Deflater thread tried to lock this monitor, but it failed > to make _count negative and gave up. > > void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { > > if (_owner == DEFLATER_MARKER) { > guarantee(0 <= _count, "Impossible: _owner == > DEFLATER_MARKER && _count < 0, monitor must not be owned by > deflater thread here"); > > > Reading these two guarantee() calls always throws me off stride > because I would have written them like this: > > guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= > 0 should have been handled by the caller"); > > and > > guarantee(_count >= 0, "Impossible: _owner == > DEFLATER_MARKER && _count < 0, monitor must not be owned by > deflater thread here"); > > When rewritten like the above, you have: > > "_count > 0" ... _count <= 0 > > and: > > "_count >= 0" ...
"_count < 0" > > which is easier for my brain to read... okay... enough sidebar... > > > He he. I have pretty much eliminated > and >= from my written > vocabulary. It makes life simpler. Trust me. :) Interesting... we'll have to discuss that (off thread)... > Short answer: No the guarantees should not be the same. > > Longer answer: EnterI() is called by enter() after enter() has > incremented the _count field to indicate the contended state of > things. So in EnterI(), "_count > 0" is the right check. > ReenterI() is called after wait() has returned (notified or > timedout), and the _count field is not used on reentry ops so > "_count >= 0" is the right check. > > I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, > there are two places in EnterI() that do this): > > L501: if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part > (setting _owner), > // but it failed the second part (making _count > negative) and bailed. > // Because we're called from enter() we have at > least one contention. > guarantee(count > 0, "_owner == DEFLATER_MARKER && > _count <= 0 should have been handled by the caller"); > L504: // Try to acquire monitor. > L505: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) > == DEFLATER_MARKER) { > > L629: if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part > (setting _owner), > // but it failed the second part (making _count > negative) and bailed. > // Because we're called from enter() we have at > least one contention. > guarantee(count> 0 , "_owner == DEFLATER_MARKER && > _count <= 0 should have been handled by the caller"); > L632: if (Atomic::cmpxchg(Self, &_owner, > DEFLATER_MARKER) == DEFLATER_MARKER) { > > And I'm going to tweak the ReenterI() code like this: > > L759:
if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part > (setting _owner), > // but it will observe _waiters != 0 and will bail > out. Because we're > // called from wait() we may or may not have any > contentions. > guarantee(count >= 0, "Impossible: _owner == > DEFLATER_MARKER && _count < 0 should have been handled by the > caller"); > L761: if (Atomic::cmpxchg(Self, &_owner, > DEFLATER_MARKER) == DEFLATER_MARKER) { > > > You didn't ask this, but it is okay that _count is only used to track > contentions in enter()/EnterI() and is not used to track contentions > in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is > used by is_busy() to observe the busy state for an ObjectMonitor that > is being wait()'ed for. The _waiters field is decremented after a > waiter has returned from ReenterI() so the _owner field takes over > answering the is_busy() question... > > >> 5. I could use a little help with allocation state transitions, >> e.g. in deflate_monitor_list_using_JT >> you see is_new with object set so you mark it as old so next >> deflation will check it > > Here's the code in question: > > src/hotspot/share/runtime/synchronizer.cpp: > > int > ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** > listHeadp, > ObjectMonitor** freeHeadp, > ObjectMonitor** freeTailp, > ObjectMonitor** savedMidInUsep) { > > // Only try to deflate if there is an associated Java object > and if > // mid is old (is not newly allocated and is not newly freed). > if (mid->object() != NULL && mid->is_old() && > deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { > // Deflation succeeded so update the in-use list. > > } else { > // mid is considered in-use if it does not have an associated > // Java object or mid is not old or deflation did not succeed. >
// A mid->is_new() node can be seen here when it is freshly > returned > // by omAlloc() (and skips the deflation code path). > // A mid->is_old() node can be seen here when deflation failed. > // A mid->is_free() node can be seen here when a fresh node from > // omAlloc() is released by omRelease() due to losing the race > // in inflate(). > > if (mid->object() != NULL && mid->is_new()) { > // mid has an associated Java object and has now been seen > // as newly allocated so mark it as "old". > mid->set_allocation_state(ObjectMonitor::Old); > } > >> - why do you set it to old here rather than in inflate once we >> set values? > > Inflation is used in quite a few places. If we marked the > ObjectMonitor as "Old" in inflate(), then that would make the > ObjectMonitor available for deflation by deflate_monitor_using_JT() > earlier: > > src/hotspot/share/runtime/synchronizer.cpp: >> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >> ObjectMonitor** freeHeadp, >> ObjectMonitor** freeTailp) { >> assert(AsyncDeflateIdleMonitors, "sanity check"); >> assert(Thread::current()->is_Java_thread(), "precondition"); >> // A newly allocated ObjectMonitor should not be seen here so we >> // avoid an endless inflate/deflate cycle. >> assert(mid->is_old(), "precondition"); > > So the idea behind only deflating ObjectMonitors that have reached > allocation state "Old" is to prevent "an endless inflate/deflate > cycle". > Here's the relevant section from Carsten's JEP: > >> To avoid endless inflation / deflation cycles in the prototype, >> monitor >> deflation is only attempted the second time a monitor is seen by the >> thread marking monitors as deflatable: If the thread (the only thread >> marking monitors as deflatable; might be service thread or some GC >> related thread or even a dedicated thread) sees a monitor in >> state New, >> then the thread marks the monitor as Old and moves on.
So there is >> little interaction between a thread inflating a lock to a monitor and >> the deflating thread, the inflating thread just has to make sure the >> monitor is marked New and this marker is published using appropriate >> barriers. > > There isn't an explicit example in the JEP of what Carsten was > thinking > of with "an endless inflate/deflate cycle". I didn't try to think of > such an example for the OpenJDK wiki either. I simply wrote: > > > I think I was thinking about a cycle where a Java object exhibits a > monitor inflation, then deflation, then inflation, then deflation. > Each inflation will be with a new monitor. This behavior could > increase the number of monitors allocated, especially with my original > patch as I recycled monitors only after a safepoint. Now that I think > about it again, such a cycle is incredibly unlikely as it would > require repeated contention on the java object, yet the monitors must > not be busy when the deflator thread comes by. And this scenario has > to repeat itself. This all seems pretty unlikely. So it sounds like you're okay with moving the setting of "Old" into inflate(). As always, the proof will be in the stress testing... :-) > >> ObjectMonitor has a new allocation_state field that supports three >> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >> to ObjectMonitors that have reached the 'Old' state. When the Async >> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >> is changed to the 'Old' state, but is not deflated. This prevents a >> newly allocated ObjectMonitor from being immediately deflated which >> could cause an inflation<->deflation oscillation. > > So let's think about what might happen if an ObjectMonitor is marked > as "Old" in inflate(). Here's an example use of inflate() in the > "slow enter" code path: > > src/hotspot/share/runtime/synchronizer.cpp: > > void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, > TRAPS) { > > base< 
inflate(THREAD, obj(), > inflate_cause_monitor_enter)->enter(THREAD); > > new> ObjectMonitorHandle omh; > new> inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); > new> do_loop = !omh.om_ptr()->enter(THREAD); > > In the "base" code, we took the return from inflate() and used it > to call > ObjectMonitor::enter(). If we never changed that bit of code and > inflate() > marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() > could > async deflate the ObjectMonitor while we were trying to call > enter() on > it... Boom! So we might think that holding off marking an > ObjectMonitor > as "Old" can save us... and it can, but not in all cases... :-( > > It is entirely possible that our call to slow_enter() is made on an > ObjectMonitor that's already marked "Old". In that case, our thread > (T-enter) calls inflate() which returns the existing ObjectMonitor* > and we use it to call enter(). If the thread (T-deflate) calling > deflate_monitor_using_JT() does its magic before T-enter sets the > owner field or the count field... Boom! > > The previous paragraph is exactly what motivated the _ref_count field, > the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* > parameter to inflate(). inflate() calls > ObjectMonitorHandle::save_om_ptr() > which increments the ObjectMonitor's ref_count and then checks for > async > deflation protocol collisions. If there's a collision, then > save_om_ptr() > returns false and the caller (inflate() in this case) has to > retry. When > inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot > be deflated and is safe until the ObjectMonitorHandle is destroyed. > > So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot > deflate the ObjectMonitor in the window after inflate() returns and > before T-enter sets the owner field or increments the count field. But > you know all that already! 
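The ref_count guard described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Monitor`, `MonitorHandle`, `save()`, `try_deflate()` and the `deflating` flag are hypothetical stand-ins invented for the sketch, not the actual ObjectMonitor / ObjectMonitorHandle code from the webrev, and the real async deflation protocol (using _owner, _count and DEFLATER_MARKER) is more involved.

```cpp
#include <atomic>

// Hypothetical, simplified monitor: only the fields the sketch needs.
struct Monitor {
  std::atomic<int> ref_count{0};        // number of live handles
  std::atomic<bool> deflating{false};   // stand-in for "deflation in progress"
};

// Hypothetical RAII helper, loosely modeled on the ObjectMonitorHandle idea:
// while a handle holds a monitor, the deflater must leave it alone.
class MonitorHandle {
 public:
  MonitorHandle() = default;
  MonitorHandle(const MonitorHandle&) = delete;
  MonitorHandle& operator=(const MonitorHandle&) = delete;
  ~MonitorHandle() {
    if (om_ != nullptr) om_->ref_count.fetch_sub(1);
  }

  // Publish our interest first, then check for a collision with deflation.
  // Returns false on collision; the caller is expected to retry.
  bool save(Monitor* m) {
    m->ref_count.fetch_add(1);
    if (m->deflating.load()) {
      m->ref_count.fetch_sub(1);  // back out and let the caller retry
      return false;
    }
    om_ = m;
    return true;
  }

  Monitor* get() const { return om_; }

 private:
  Monitor* om_ = nullptr;
};

// Deflater side: announce the attempt, then bail out if anyone holds a handle.
bool try_deflate(Monitor* m) {
  m->deflating.store(true);
  if (m->ref_count.load() > 0) {
    m->deflating.store(false);  // someone is using the monitor; give up
    return false;
  }
  return true;  // no handle can be saved from here on
}
```

With the default sequentially consistent atomics, at least one of the two sides observes the other, so a handle is never saved on a monitor that ends up deflated; in the worst case both sides back off and the entering thread retries.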
> > So let's bring this back to having inflate() mark the ObjectMonitor as > "Old"... Since inflate() returns an ObjectMonitor with the > ref_count > 0, > it doesn't matter if the ObjectMonitor is marked as "Old" in > inflate(). > T-deflate cannot deflate it due to ref_count > 0. > > Here's another crazy thought... inflate() is the only function that > calls omAlloc(), and omAlloc() is the only function that sets "New". > If we move the setting of "Old" from deflate_monitor_list_using_JT() > to inflate(), then the change from "New" -> "Old" never happens > outside of the inflate() call so why do we need the allocation state? > > Small dose of reality: I've found having the allocation state to be > very helpful when debugging race related crashes. We could make the > allocation state be DEBUG_ONLY, but then what about race debugging of > product bits... sigh... > > >> 6. Could you get rid of the new goto's? > > I believe there is only one left from Carsten's prototype: > > > You make it sound like I was throwing gotos around left and right. :) Sorry, that wasn't my intent! :-) There were just three. I got rid of two in my port, but left that last one because I really didn't want to reformat the FastHashCode() function... I figured I would have to if someone called me on it... and Karen did... > If you count continue and break statements, then you might have been > right. No shortage of those in the monitor subsystem... :-) > I'll break my response here, so we can return to regular structured > programming, ;-) > Carsten Thanks again for chiming in! 
Dan From gerard.ziemski at oracle.com Wed Apr 10 17:06:17 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Wed, 10 Apr 2019 12:06:17 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CAD05AF.1090700@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> Message-ID: <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> Thank you Erik for more feedback. New webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev5 On 4/9/19 3:50 PM, Erik Gahlin wrote: > Thanks Gerard, > > In metadata.xml (and possibly elsewhere) can you change the fields > > "varianceOfBucketCount" to "bucketCountVariance" > "stdDevOfBucketCount" to "bucketCountStandardDeviation" I changed those, but I also changed: "maximumBucketCount" to "bucketCountMaximum" "averageBucketCount" to "bucketCountAverage" to be fully consistent. > > I noticed that events are only emitted if we are able to take the > resize lock. Can this be fixed? What prevents us from always getting > the data? That's how other periodic events work and losing data > sometimes may lead to subtle bugs that are hard to understand and > replicate in systems that rely on the information. Could we retry on a > failure? Good observation. If the resize lock is taken, then it's not likely that whoever owns it will be done soon, so retrying is most likely not going to succeed right away. Is it OK to tie up the JFR periodic thread for some time? If so, how long? If the lock is taken, then it means that someone is scanning through the entire table, or the table is being resized. 
Either way, we're not losing data, but are just temporarily blind - I don't see a problem here for long-running apps: they will start receiving events eventually (which happens every 10 sec by default). > > If it is very problematic to fix, it may be OK to skip the events, but > then tests would need to be updated to take that into account > (retrying). Otherwise we may get intermittent failures. At the startup of our jtreg JFR test, no one, besides us, should take the lock, so if we don't get the event, because someone else is holding it (too small system hash table that gets resized up immediately after VM starts up), we probably would want to know about it, so a failure here might be in fact welcome. cheers > > Thanks > Erik > >> hi Erik, >> >> >> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>> Hi Gerard, >>> >>> Here are some comments about the metadata (to make it consistent >>> with other events). >>> >>> The events should not be in the "Java Application" category since >>> they are JVM events. You could perhaps put them in "Java Virtual >>> Machine, Runtime, Tables". Some comments about the names and labels >>> of fields. >>> >>> - Label: Number of buckets => Bucket Count >>> - Label: Number of entries => Entry Count >>> - Label: Total footprint => Total Footprint >>> >>> Could you remove descriptions that are exactly the same as the label. >>> >>> - Label: Maximum bucket size => Maximum Bucket Size >>> - Label: Average bucket size => Average Bucket Size >>> - Label: Variance of bucket size => Bucket Size Variance >>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>> - Label: Standard deviation of bucket size => Bucket Size Standard >>> Deviation" >>> >>> Instead of using the word "size", it may make more sense to use the >>> word "count" here as well, i.e "Average Bucket Count", or maybe I'm >>> missing something? Is there a difference? >>> >>> I wonder how useful standard deviation and variance is? 
If support >>> engineers are looking at a recording, or JMC adds a rule for the >>> events, what would a good or bad value be? Is it possible to use the >>> information for troubleshooting? >> >> While I'm working on all the above changes you suggested, we can >> discuss the standard deviation and variance. >> >> I added them because they are part of the jcmd "VM.symboltable >> -verbose" command, so we are consistent. >> >> Now, regarding how useful they are, I always understood them as a >> sign of imbalanced table distribution, and without a proper >> histogram, this is the best description of the histogram shape. In >> reality, however, I think that if they identify an issue, then we >> might have a very curious distribution (some sort of hash table >> attack), or we have an issue with our hash function for the >> particular usage case. >> >> Still, I'd personally elect to keep them. >> >> Let me ask you a different question though: is it expensive to have 2 >> doubles as part of an event (5 events per second)? And if so, is >> there currently (or planned) granularity for controlling not just >> which events to record, but also which attributes? >> >>> >>> - Name: addRate => insertionRate >>> - Label: Rate of addition => Insertion Rate >>> - Name: removeRate => removalRate >>> - Label: Rate of removal => Removal Rate >> >> Will do. >> >>> >>> I'm missing unit tests for the events. Could you please add in >>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests. i.e the >>> average not exceeding max, no negative values etc. >> >> Working on it. Do we need a separate test for each event (table), or >> will just one table suffice (ex. StringTable)? >> >> Thank you for the feedback! >> >> >> cheers >>> >>> Thanks! >>> Erik >>> >>>> Hi all, >>>> >>>> Please review this feature, which adds tracing events for the >>>> internal hash tables. 
>>>> >>>> The following attributes are implemented: >>>> >>>> >>>> >>>> >>> label="Total footprint" description="Total memory footprint (the >>>> table itself plus all of the entries)" /> >>>> >>>> >>> /> >>>> >>>> >>> description="How many items were added since last event (per >>>> second)" /> >>>> >>> description="How many items were removed since last event (per >>>> second)" /> >>>> >>>> This event was implemented for the following system tables: >>>> >>>> SymbolTable >>>> StringTable >>>> Placeholder Table >>>> LoaderConstraints Table >>>> ProtectionDomainCache Table >>>> >>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>> progress?) >>>> >>>> >>>> Cheers >>>> >>> >>> >> > > From gerard.ziemski at oracle.com Wed Apr 10 17:10:00 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Wed, 10 Apr 2019 12:10:00 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> Message-ID: <6ba4c5dc-1346-d377-808d-05c16c769e3b@oracle.com> Thank you Coleen! On 4/9/19 10:57 AM, coleen.phillimore at oracle.com wrote: >>> >>> I didn't think this should be moved from jfrPeriodic.cpp. I thought >>> it could be something like an X macro. >>> >>> Or just make this bit a function that they all call with event as >>> parameter. 
>>> >>> + event.set_numberOfBuckets(statistics._number_of_buckets); >>> + event.set_numberOfEntries(statistics._number_of_entries); >>> + event.set_totalFootprint(statistics._total_footprint); >>> + event.set_maximumBucketCount(statistics._maximum_bucket_size); >>> + event.set_averageBucketCount(statistics._average_bucket_size); >>> + event.set_varianceOfBucketCount(statistics._variance_of_bucket_size); >>> + event.set_stdDevOfBucketCount(statistics._stddev_of_bucket_size); >>> + event.set_insertionRate(statistics._add_rate); >>> + event.set_removalRate(statistics._remove_rate); >>> + event.commit(); >> >> Each of those JFR events is an instance of a different class, so the >> best I can do is a macro here (otherwise I'd have to create a base >> class for the TableStatistics events from which to extend our 6 table >> events, but I'm not sure JFR architecture supports that - it >> generates the class automatically from the event's meta description) >> >> Updated webrev http://cr.openjdk.java.net/~gziemski/8185525_rev4/ > > Yes, that looks better to me. > > + //statistics.print(tty, "SymbolTable"); > > You should remove commented out code. You can always add it back > locally if you want to debug it again. (I don't need to see this change). It was pointed out to me that we can use templates instead of a macro here, so I tried it and I like it: http://cr.openjdk.java.net/~gziemski/8185525_rev5 cheers From daniel.daugherty at oracle.com Wed Apr 10 17:09:52 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 10 Apr 2019 13:09:52 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <620e300d-a941-5e55-1814-f37c4b559655@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> <620e300d-a941-5e55-1814-f37c4b559655@oracle.com> Message-ID: On 4/10/19 1:15 AM, David Holmes wrote: > Hi Carsten, Dan, > > I'd like to pick up on one topic - a higher-level discussion about the > timing of the ObjectMonitor lifecycle as they currently are and with > these changes: > > Carsten wrote: >> I think I was thinking about a cycle where a Java object exhibits >> a monitor inflation, then deflation, then inflation, then deflation. >> Each inflation will be with a new monitor. This behavior could >> increase the number of monitors allocated, especially with my >> original patch as I recycled monitors only after a safepoint. Now >> that I think about it again, such a cycle is incredible unlikely as >> it would require repeated contention on the> java object, yet the >> monitors must not be busy when the deflator thread comes by. And this >> scenario has to repeat itself. This all seems pretty unlikely. > > So logically every Object has associated with it an ObjectMonitor but > if we created the ObjectMonitor at the same time as the Object and > kept it alive while the Object was alive then we would double our > memory use (if not worse). Generally worse. In one of my recent debug sessions on MacOSX with product bits, I had to figure out sizeof(ObjectMonitor) for memory dumping purposes and it was 224 bytes. > So we lazily create ObjectMonitors only when we need them: contention, > Object.wait() use, hashcode use. Clarification: hashcode with contention. hashcode by itself does not require inflation. > We could then leave the ObjectMonitors around as long as the Objects > are alive, but again this has implications for memory use. 
> > So we deflate idle ObjectMonitors to reclaim memory (though in > practice it is more complex and we maintain pools of them to speed up > allocation). > > If we aggressively deflate as soon as an ObjectMonitor is idle then we > risk getting into inflate->deflate->inflate cycles. The likelihood may > be low but if you hit this pathology in your code you will probably be > unhappy about the effects on performance. > > So instead, IIUC, we use some measure of "memory pressure" and only > try to deflate under certain conditions. But I'm unclear exactly what > those conditions are today, and whether they change with async monitor > deflation. Can you enlighten me please? Without trying to describe the existing trigger mechanisms for monitor deflation (lots of details), Async Monitor Deflation uses the same safepoint cleanup trigger points for _initiating_ monitor deflation. However, unlike safepoint cleanup work which will finish the job during the current safepoint, Async Monitor Deflation will start after the safepoint that initiated the monitor deflation, but there is no guarantee when the ServiceThread or the JavaThreads will finish their deflation work (maybe not before the next safepoint). The v2.00/3-for-jdk13 webrev has the "must be seen twice to deflate" algorithm in place. So for any given ObjectMonitor, the first time it is seen by ObjectSynchronizer::deflate_monitor_list_using_JT(), it will not be deflated even if eligible (allocation state == "New"). The second time that it is seen by deflate_monitor_list_using_JT() (allocation state == "Old"), it is eligible for deflation. In this round of code review, we are talking about setting the allocation state to "Old" in inflate() which would make an ObjectMonitor eligible for deflation in the next round of deflation. 
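A minimal sketch of the "must be seen twice to deflate" rule described above, assuming a much simplified model. The `Monitor` struct, the `AllocationState` enum and `deflation_pass()` are hypothetical names made up for this illustration; only the Free/New/Old state names come from the thread, and the real deflate_monitor_list_using_JT() also manages the in-use and free lists and the async deflation protocol.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the ObjectMonitor allocation state.
enum class AllocationState { Free, New, Old };

// Hypothetical, simplified monitor used only for this sketch.
struct Monitor {
  const void* object = nullptr;                   // associated object, nullptr if none
  AllocationState state = AllocationState::Free;
  bool busy = false;                              // stand-in for is_busy()
};

// One deflation pass over the in-use monitors.
// A monitor seen as New is only promoted to Old; it becomes eligible for
// deflation the next time a pass sees it. Returns the number deflated.
std::size_t deflation_pass(std::vector<Monitor>& in_use) {
  std::size_t deflated = 0;
  for (Monitor& mid : in_use) {
    if (mid.object == nullptr) {
      continue;  // no associated object: nothing to deflate
    }
    if (mid.state == AllocationState::Old) {
      if (!mid.busy) {
        // Second (or later) sighting and idle: deflate it.
        mid.object = nullptr;
        mid.state = AllocationState::Free;
        ++deflated;
      }
    } else if (mid.state == AllocationState::New) {
      // First sighting: mark as Old, but do not deflate yet.
      mid.state = AllocationState::Old;
    }
  }
  return deflated;
}
```

In this model, moving the New -> Old transition into inflate() would amount to creating monitors directly in the Old state, so the first pass that finds them idle could already deflate them.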
One of my tasks is to update the existing comments about ObjectMonitor life cycle (I'll use another base monitor subsystem subtask); I don't think they were updated with the monitor list changes so there's a bit of catch up work to do there. I also need to update them for Async Monitor Deflation life cycle changes and that will be part of this project. Dan > > Thanks, > David > > On 10/04/2019 12:25 pm, Carsten Varming wrote: >> Hi Dan, >> >> On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty < >> daniel.daugherty at oracle.com> wrote: >> >>> On 4/5/19 4:59 PM, Karen Kinnear wrote: >>> >>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that >>> 0 < >>> _count >>> with comments that caller ensured _count <= 0 >>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >>> ? Am I missing something subtle here or should they be the same >>> guarantees? >>> >>> >>> Here's the code in question: >>> >>> src/hotspot/share/runtime/objectMonitor.cpp: >>> >>> void ObjectMonitor::EnterI(TRAPS) { >>> >>> ?? if (_owner == DEFLATER_MARKER) { >>> ???? guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 >>> should >>> have been handled by the caller"); >>> ???? // Deflater thread tried to lock this monitor, but it failed to >>> make >>> _count negative and gave up. >>> >>> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { >>> >>> ???? if (_owner == DEFLATER_MARKER) { >>> ?????? guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && >>> _count < 0, monitor must not be owned by deflater thread here"); >>> >>> >>> Reading these two guarantee() calls always throws me off stride >>> because I would have written them like this: >>> >>> ???? guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 >>> should >>> have been handled by the caller"); >>> >>> and >>> >>> ?????? 
guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && >>> _count < 0, monitor must not be owned by deflater thread here"); >>> >>> When rewritten like the above, you have: >>> >>> ???? "_count > 0" ... _count <= 0 >>> >>> and: >>> >>> ???? "_count >= 0" ... "_count < 0" >>> >>> which is easier for my brain to read... okay... enough sidebar... >>> >> >> He he. I have pretty much eliminated > and >= from my written >> vocabulary. >> It makes life simpler. Trust me. :) >> >> >>> Short answer: No the guarantees should not be the same. >>> >>> Longer answer: EnterI() is called by enter() after enter() has >>> incremented the _count field to indicate the contended state of >>> things. So in EnterI(), "_count > 0" is the right check. >>> ReenterI() is called after wait() has returned (notified or >>> timedout), and the _count field is not used on reentry ops so >>> "_count >= 0" is the right check. >>> >>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, >>> there are two places in EnterI() that do this): >>> >>> ???? L501:?? if (_owner == DEFLATER_MARKER) { >>> ?????????????? // The deflation protocol finished the first part >>> (setting >>> _owner), >>> ?????????????? // but it failed the second part (making _count >>> negative) >>> and bailed. >>> ?????????????? // Because we're called from enter() we have at least >>> one >>> contention. >>> ?????????????? guarantee(count > 0, "_owner == DEFLATER_MARKER && >>> _count <= >>> 0 should have been handled by the caller"); >>> ???? L504:???? // Try to acquire monitor. >>> ???? L505:???? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>> DEFLATER_MARKER) { >>> >>> ???? L629:???? if (_owner == DEFLATER_MARKER) { >>> ???????????????? // The deflation protocol finished the first part >>> (setting >>> _owner), >>> ???????????????? // but it failed the second part (making _count >>> negative) >>> and bailed. >>> ???????????????? 
// Because we're called from enter() we have at >>> least one >>> contention. >>> ???????????????? guarantee(count> 0 , "_owner == DEFLATER_MARKER && >>> _count >>> <= 0 should have been handled by the caller"); >>> ???? L632:?????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>> DEFLATER_MARKER) { >>> >>> And I'm going to tweak the ReenterI() code like this: >>> >>> ???? L759:???? if (_owner == DEFLATER_MARKER) { >>> ???????????????? // The deflation protocol finished the first part >>> (setting >>> _owner), >>> ???????????????? // but it will observe _waiters != 0 and will bail >>> out. >>> Because we're >>> ???????????????? // called from wait() we may or may not have any >>> contentions. >>> ???????????????? guarantee(count >= 0, "Impossible: _owner == >>> DEFLATER_MARKER && _count < 0 should have been handled by the caller"); >>> ???? L761:?????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>> DEFLATER_MARKER) { >>> >>> >>> You didn't ask this, but it is okay that _count is only used to track >>> contentions in enter()/EnterI() and is not used to track contentions >>> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is >>> used by is_busy() to observe the busy state for an ObjectMonitor that >>> is being wait()'ed for. The _waiters field is decremented after a >>> waiter has returned from ReenterI() so the _owner field takes over >>> answering the is_busy() question... >>> >>> >>> 5. I could use a little help with allocation state transitions, >>> e.g. in deflate_monitor_list_using_JT >>> ?? you see is_new with object set so you mark it as old so next >>> deflation >>> will check it >>> >>> >>> Here's the code in question: >>> >>> src/hotspot/share/runtime/synchronizer.cpp: >>> >>> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** >>> listHeadp, >>> ObjectMonitor** >>> freeHeadp, >>> ObjectMonitor** >>> freeTailp, >>> ObjectMonitor** >>> savedMidInUsep) { >>> >>> ???? 
// Only try to deflate if there is an associated Java object >>> and if >>> ???? // mid is old (is not newly allocated and is not newly freed). >>> ???? if (mid->object() != NULL && mid->is_old() && >>> ???????? deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { >>> ?????? // Deflation succeeded so update the in-use list. >>> >>> ???? } else { >>> ?????? // mid is considered in-use if it does not have an associated >>> ?????? // Java object or mid is not old or deflation did not succeed. >>> ?????? // A mid->is_new() node can be seen here when it is freshly >>> returned >>> ?????? // by omAlloc() (and skips the deflation code path). >>> ?????? // A mid->is_old() node can be seen here when deflation failed. >>> ?????? // A mid->is_free() node can be seen here when a fresh node from >>> ?????? // omAlloc() is released by omRelease() due to losing the race >>> ?????? // in inflate(). >>> >>> ?????? if (mid->object() != NULL && mid->is_new()) { >>> ???????? // mid has an associated Java object and has now been seen >>> ???????? // as newly allocated so mark it as "old". >>> ???????? mid->set_allocation_state(ObjectMonitor::Old); >>> ?????? } >>> >>> ?? - why do you set it to old here rather than in inflate once we set >>> values? >>> >>> >>> Inflation is used in quite a few places. If we marked the >>> ObjectMonitor as "Old" in inflate(), then that would make the >>> ObjectMonitor available for deflation by deflate_monitor_using_JT() >>> earlier: >>> >>> src/hotspot/share/runtime/synchronizer.cpp: >>> >>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >>> ObjectMonitor** >>> freeHeadp, >>> ObjectMonitor** >>> freeTailp) { >>> ?? assert(AsyncDeflateIdleMonitors, "sanity check"); >>> ?? assert(Thread::current()->is_Java_thread(), "precondition"); >>> ?? // A newly allocated ObjectMonitor should not be seen here so we >>> ?? // avoid an endless inflate/deflate cycle. >>> ?? 
assert(mid->is_old(), "precondition"); >>> >>> >>> So the idea behind only deflating ObjectMonitors that have reached >>> allocation state "Old" is to prevent "an endless inflate/deflate >>> cycle". >>> Here's the relevant section from Carsten's JEP: >>> >>> To avoid endless inflation / deflation cycles in the prototype, monitor >>> >>> deflation is only attempted the second time a monitor is seen by the >>> >>> thread marking monitors as deflatable: If the thread (the only thread >>> >>> marking monitors as deflatable; might be service thread or some GC >>> >>> related thread or even a dedicated thread) sees a monitor in state New, >>> >>> then the thread marks the monitor as Old and moves on. So there is >>> >>> little interaction between a thread inflating a lock to a monitor and >>> >>> the deflating thread, the inflating thread just has to make sure the >>> >>> monitor is marked New and this marker is published using appropriate >>> >>> barriers. >>> >>> >>> There isn't an explicit example in the JEP of what Carsten was thinking >>> of with "an endless inflate/deflate cycle". I didn't try to think of >>> such an example for the OpenJDK wiki either. I simple wrote: >>> >> >> I think I was thinking about a cycle where a Java object exhibits a >> monitor >> inflation, then deflation, then inflation, then deflation. Each >> inflation >> will be with a new monitor. This behavior could increase the number of >> monitors allocated, especially with my original patch as I recycled >> monitors only after a safepoint. Now that I think about it again, such a >> cycle is incredible unlikely as it would require repeated contention >> on the >> java object, yet the monitors must not be busy when the deflator thread >> comes by. And this scenario has to repeat itself. This all seems pretty >> unlikely. >> >> ObjectMonitor has a new allocation_state field that supports three >>> >>> states: 'Free', 'New', 'Old'. 
Async Monitor Deflation is only applied >>> >>> to ObjectMonitors that have reached the 'Old' state. When the Async >>> >>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >>> >>> is changed to the 'Old' state, but is not deflated. This prevents a >>> >>> newly allocated ObjectMonitor from being immediately deflated which >>> >>> could cause an inflation<->deflation oscillation. >>> >>> >>> So let's think about what might happen if an ObjectMonitor is marked >>> as "Old" in inflate(). Here's an example use of inflate() in the >>> "slow enter" code path: >>> >>> src/hotspot/share/runtime/synchronizer.cpp: >>>> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, >>>> TRAPS) { >>> >>> base>> inflate_cause_monitor_enter)->enter(THREAD); >>> >>> new>???? ObjectMonitorHandle omh; >>> new>???? inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); >>> new>???? do_loop = !omh.om_ptr()->enter(THREAD); >>> >>> In the "base" code, we took the return from inflate() and used it to >>> call >>> ObjectMonitor::enter(). If we never changed that bit of code and >>> inflate() >>> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() >>> could >>> async deflate the ObjectMonitor while we were trying to call enter() on >>> it... Boom! So we might think that holding off marking an ObjectMonitor >>> as "Old" can save us... and it can, but not in all cases... :-( >>> >>> It is entirely possible that our call to slow_enter() is made on an >>> ObjectMonitor that's already marked "Old". In that case, our thread >>> (T-enter) calls inflate() which returns the existing ObjectMonitor* >>> and we use it to call enter(). If the thread (T-deflate) calling >>> deflate_monitor_using_JT() does its magic before T-enter sets the >>> owner field or the count field... Boom! >>> >>> The previous paragraph is exactly what motivated the _ref_count field, >>> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* >>> parameter to inflate(). 
inflate() calls >>> ObjectMonitorHandle::save_om_ptr() >>> which increments the ObjectMonitor's ref_count and then checks for >>> async >>> deflation protocol collisions. If there's a collision, then >>> save_om_ptr() >>> returns false and the caller (inflate() in this case) has to retry. >>> When >>> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot >>> be deflated and is safe until the ObjectMonitorHandle is destroyed. >>> >>> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot >>> deflate the ObjectMonitor in the window after inflate() returns and >>> before T-enter sets the owner field or increments the count field. But >>> you know all that already! >>> >>> So let's bring this back to having inflate() mark the ObjectMonitor as >>> "Old"... Since inflate() returns an ObjectMonitor with the ref_count >>> > 0, >>> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). >>> T-deflate cannot deflate it due to ref_count > 0. >>> >>> Here's another crazy thought... inflate() is the only function that >>> calls omAlloc(), and omAlloc() is the only function that sets "New". >>> If we move the setting of "Old" from deflate_monitor_list_using_JT() >>> to inflate(), then the change from "New" -> "Old" never happens >>> outside of the inflate() call so why do we need the allocation state? >>> >>> Small dose of reality: I've found having the allocation state to be >>> very helpful when debugging race related crashes. We could make the >>> allocation state be DEBUG_ONLY, but then what about race debugging of >>> product bits... sigh... >>> >>> >>> 6. Could you get rid of the new goto?s? >>> >>> >>> I believe there is only one left from Carsten's prototype: >>> >> >> You make it sound like I was throwing gotos around left and right. :) If >> you count continue and break statements, then you might have been right. 
>> >> I'll break my response here, so we can return to regular structured >> programming, ;-) >> Carsten >> From martin.doerr at sap.com Wed Apr 10 17:41:26 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 10 Apr 2019 17:41:26 +0000 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around In-Reply-To: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> References: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> Message-ID: Hi Dan, thank you for cleaning this up. New implementation looks good. Also thanks for improving comments. Maybe Erik can check the comments, too. Best regards, Martin -----Original Message----- From: Daniel D. Daugherty Sent: Wednesday, 10 April 2019 18:25 To: hotspot-runtime-dev at openjdk.java.net; Erik Österlund ; Doerr, Martin Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around Greetings, I have a very small fix for the following bug: JDK-8222034 Thread-SMR functions should be updated to remove work around https://bugs.openjdk.java.net/browse/JDK-8222034 Webrev URL: http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ I would like to hear from Erik Osterlund and Martin Doerr on this review. Of course, anyone else is also welcome to chime in. This fix has been tested with a Mach5 tier[1-3] run. Thanks, in advance, for any questions, comments or suggestions. Dan From thomas.stuefe at gmail.com Wed Apr 10 17:42:20 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 10 Apr 2019 19:42:20 +0200 Subject: RFR(s): 8222015: Small VM.metaspace improvements In-Reply-To: References: Message-ID: Thank you Jiangli! Your recommendations make sense, I'll work them into the next webrev. Best Regards, Thomas On Mon, Apr 8, 2019 at 5:30 AM Jiangli Zhou wrote: > Hi Thomas, > > This seems good to me. I have a few minor suggestions below, but please > feel free to keep your existing code without changing. 
> > - For consistency with the existing code and VM.metaspace output, it might > be worth renaming _num_classes_cds to _num_classes_shared, and > _num_classes_cds_by_spacetype to _num_classes_shared_by_spacetype. > > - src/hotspot/share/memory/metaspace/printCLDMetaspaceInfoClosure.cpp > You could replace the following MetaspaceShared::is_in_shared_metaspace(k) > call with k->is_shared() if 'k' is guaranteed to be a valid Klass.
>
>   void do_klass(Klass* k) {
>     _num_classes ++;
>     if (MetaspaceShared::is_in_shared_metaspace(k)) {
>       _num_classes_cds ++;
>     }
>   }
>
> - src/hotspot/share/memory/metaspace/printMetaspaceInfoKlassClosure.cpp
>
>   // Print a 's' for shared classes
>   _out->put(MetaspaceShared::is_in_shared_metaspace(k) ? 's': ' ');
>
> Same suggestion as the above. > > Thanks and regards, > Jiangli > > > On Fri, Apr 5, 2019 at 3:07 AM Thomas Stüfe > wrote: > >> Hi all, >> >> may I please have a review for this collection of small improvements to >> the >> VM.metaspace diagnostic command? >> >> - it now clearly marks classes whose metadata reside in cds >> - it shows the number of classes loaded, incl. those from cds, in the >> overviews too. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8222015 >> cr: >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.00/webrev/ >> >> Example output: >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-by-spacetype.txt >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders.txt >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders-showclasses.txt >> (scroll >> down -> cds classes are now marked with 's') >> >> Thank you, >> >> Thomas >> > From daniel.daugherty at oracle.com Wed Apr 10 17:42:53 2019 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Wed, 10 Apr 2019 13:42:53 -0400 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around In-Reply-To: References: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> Message-ID: Martin, thanks for the quick review! On 4/10/19 1:41 PM, Doerr, Martin wrote: > Hi Dan, > > thank you for cleaning this up. New implementation looks good. Thanks! > Also thanks for improving comments. Maybe Erik can check the comments, too. Yup! Since I'm whacking his code and comments... :-) Dan > > Best regards, > Martin > > > -----Original Message----- > From: Daniel D. Daugherty > Sent: Wednesday, 10 April 2019 18:25 > To: hotspot-runtime-dev at openjdk.java.net; Erik Österlund; Doerr, Martin > Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around > > Greetings, > > I have a very small fix for the following bug: > > JDK-8222034 Thread-SMR functions should be updated to remove work > around > https://bugs.openjdk.java.net/browse/JDK-8222034 > > Webrev URL: http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ > > I would like to hear from Erik Osterlund and Martin Doerr on this review. > Of course, anyone else is also welcome to chime in. > > This fix has been tested with a Mach5 tier[1-3] run. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan From coleen.phillimore at oracle.com Wed Apr 10 18:12:46 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 14:12:46 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> Message-ID: <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> On 4/10/19 1:06 PM, gerard ziemski wrote: > Thank you Erik for more feedback.
> > New webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev5 > > > On 4/9/19 3:50 PM, Erik Gahlin wrote: >> Thanks Gerard, >> >> In metadata.xml (and possibly elsewhere) can you change the fields >> >> "varianceOfBucketCount" to "bucketCountVariance" >> "stdDevOfBucketCount" to "bucketCountStandardDeviation" > > I changed those, but I also changed: > > "maximumBucketCount" to "bucketCountMaximum" > "averageBucketCount" to "bucketCountAverage" > > to be fully consistent. > > >> >> I noticed that events are only emitted if we are able to take the >> resize lock. Can this be fixed? What prevents us from always getting >> the data? That's how other periodic events work and losing data >> sometimes may lead to subtle bugs that are hard to understand and >> replicate in systems that rely on the information. Could we retry on >> a failure? > Good observation. If the resize lock is taken, then it's not likely > that whoever owns it will be done soon, so retrying is most likely not > going to succeed right away. Is it OK to tie up the JFR periodic thread > for some time? If so, how long? > > If the lock is taken, then it means that someone is scanning through > the entire table, or the table is being resized. Either way, we're not > losing data, but are just temporarily blind - I don't see a problem > here for long-running apps, they will start receiving events > eventually (which happen every 10 sec by default) Robbin was talking about allowing scanning the table while resizing, i.e. not having the resize_lock, if we can accept that there might be some entries double counted. Coleen > >> >> If it is very problematic to fix, it may be OK to skip the events, >> but then tests would need to be updated to take that into account >> (retrying). Otherwise we may get intermittent failures.
> At the startup of our jtreg JFR test, no one, besides us, should take > the lock, so if we don't get the event, because someone else is > holding it (a system hash table that is too small and gets resized > immediately after the VM starts up), we probably would want to know about > it, so a failure here might in fact be welcome. > > > cheers > > > >> >> Thanks >> Erik >> >>> hi Erik, >>> >>> >>> On 4/3/19 12:44 PM, Erik Gahlin wrote: >>>> Hi Gerard, >>>> >>>> Here are some comments about the metadata (to make it consistent >>>> with other events). >>>> >>>> The events should not be in the "Java Application" category since >>>> they are JVM events. You could perhaps put them in "Java Virtual >>>> Machine, Runtime, Tables". Some comments about the names and labels >>>> of fields. >>>> >>>> - Label: Number of buckets => Bucket Count >>>> - Label: Number of entries => Entry Count >>>> - Label: Total footprint => Total Footprint >>>> >>>> Could you remove descriptions that are exactly the same as the label? >>>> >>>> - Label: Maximum bucket size => Maximum Bucket Size >>>> - Label: Average bucket size => Average Bucket Size >>>> - Label: Variance of bucket size => Bucket Size Variance >>>> - Name: stdDevOfBucketSize => bucketSizeStandardDeviation >>>> - Label: Standard deviation of bucket size => Bucket Size Standard >>>> Deviation >>>> >>>> Instead of using the word "size", it may make more sense to use the >>>> word "count" here as well, i.e. "Average Bucket Count", or maybe I'm >>>> missing something? Is there a difference? >>>> >>>> I wonder how useful standard deviation and variance are? If support >>>> engineers are looking at a recording, or JMC adds a rule for the >>>> events, what would a good or bad value be? Is it possible to use >>>> the information for troubleshooting? >>> >>> While I'm working on all the above changes you suggested, we can >>> discuss the standard deviation and variance.
>>> >>> I added them because they are part of the jcmd "VM.symboltable >>> -verbose" command, so we are consistent. >>> >>> Now, regarding how useful they are, I always understood them as a >>> sign of imbalanced table distribution, and without a proper >>> histogram, this is the best description of the histogram shape. In >>> reality, however, I think that if they identify an issue, then we >>> might have a very curious distribution (some sort of hash table >>> attack), or we have an issue with our hash function for the >>> particular usage case. >>> >>> Still, I'd personally elect to keep them. >>> >>> Let me ask you a different question though: is it expensive to have >>> 2 doubles as part of an event (5 events per second)? And if so, is >>> there currently (or planned) granularity for controlling not just >>> which events to record, but also which attributes? >>> >>>> >>>> - Name: addRate => insertionRate >>>> - Label: Rate of addition => Insertion Rate >>>> - Name: removeRate => removalRate >>>> - Label: Rate of removal => Removal Rate >>> >>> Will do. >>> >>>> >>>> I'm missing unit tests for the events. Could you please add some in >>>> /test/jdk/jdk/jfr/event/runtime. They can be sanity tests, i.e. the >>>> average not exceeding max, no negative values etc. >>> >>> Working on it, do we need a separate test for each event (table), or >>> will just one table suffice (e.g. StringTable)? >>> >>> Thank you for the feedback! >>> >>> >>> cheers >>>> >>>> Thanks! >>>> Erik >>>> >>>>> Hi all, >>>>> >>>>> Please review this feature, which adds tracing events for the >>>>> internal hash tables.
>>>>> >>>>> The following attributes are implemented: >>>>> >>>>> >>>>> >>>>> >>>> label="Total footprint" description="Total memory footprint (the >>>>> table itself plus all of the entries)" /> >>>>> >>>>> >>>> label="Variance of bucket sizes" description="How far bucket >>>>> lengths are spread out from their average value" /> >>>>> >>>>> >>>> description="How many items were added since last event (per >>>>> second)" /> >>>>> >>>> description="How many items were removed since last event (per >>>>> second)" /> >>>>> >>>>> This event was implemented for the following system tables: >>>>> >>>>> SymbolTable >>>>> StringTable >>>>> Placeholder Table >>>>> LoaderConstraints Table >>>>> ProtectionDomainCache Table >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev1/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185525 >>>>> Testing: Mach5 tier1,2,3 (another Mach5 tier1,2,3,4,5,6,7 in >>>>> progress...) >>>>> >>>>> >>>>> Cheers >>>>> >>>> >>>> >>> >> >> > From daniel.daugherty at oracle.com Wed Apr 10 19:24:40 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 10 Apr 2019 15:24:40 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <5b0d2152-e336-675b-5c89-45636596a279@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <5b0d2152-e336-675b-5c89-45636596a279@oracle.com> Message-ID: So I've been analyzing monitorinflation logs from SPECjbb2015 runs. It takes about 45 minutes for a SPECjbb2015 run to finish on my Linux box. In the baseline bits: Total deflating time: 0.9314706 secs. Total deflating count: 2582566 In the v2.00 bits: Total deflating time: 1.5767698 secs. Total deflating count: 2505602 Yes, that is 1 second in 45 minutes for the baseline and 1.6 seconds in 45 minutes for the v2.00 bits. That strongly indicates that the mechanics of async monitor deflation are not the cause of the 4.5% slowdown in SPECjbb2015. It must be something else...
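As a rough cross-check of the reasoning above, the sketch below puts the two deflation totals in proportion to the length of a run. It assumes that "about 45 minutes" means 2700 seconds of wall-clock time; the helper names `share_of_run_percent` and `extra_deflation_percent` are made up for this illustration and are not part of HotSpot or SPECjbb2015.

```cpp
#include <cassert>

// Percentage of a run's wall-clock time spent in some activity.
// Hypothetical helper for illustration only.
double share_of_run_percent(double activity_secs, double run_secs) {
  assert(run_secs > 0.0);
  return 100.0 * activity_secs / run_secs;
}

// Figures quoted in the message above.
const double kRunSecs         = 45.0 * 60.0;  // assumption: "about 45 minutes"
const double kBaseDeflateSecs = 0.9314706;    // baseline bits
const double kExpDeflateSecs  = 1.5767698;    // v2.00 bits

// Extra deflation time in v2.00, as a percentage of the whole run.
double extra_deflation_percent() {
  return share_of_run_percent(kExpDeflateSecs - kBaseDeflateSecs, kRunSecs);
}
```

Under that assumption both totals stay well below 0.1% of the run and the difference between them is roughly 0.02%, which is far smaller than a 4.5% drop in critical-jOPS.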
I'm looking at safepoint stats next... Dan On 4/8/19 12:55 PM, Daniel D. Daugherty wrote: > Greetings, > > I took the last repo that I ran through Mach5 tier[1-8] testing and did > 10 SPECjbb2015 runs on the 'release' version of those bits. I also did > 10 SPECjbb2015 runs on the 'release' version of the baseline bits. > > Baseline: jdk-13+13 > Exp: v2.00 (8153224-webrev/3-for-jdk13) plus special-cleanup-for-global-in-use-list > > Linux-X64 Machine: > - Ubuntu 16.04, Dell T7600, 64GB RAM > - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads > > MacOSX Machine: > - MacOS 10.13.6, Mac Mini, mid 2011, 16GB RAM > - 2 GHz Intel Core i7 (I7-2635QM), 1 CPU x 4 cores x 2 threads > > Solaris-X64 Machine: > - Solaris 11.2 SRU5.5, Dell T7600, 64GB RAM > - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads > > Average Results for Each OS
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>
>        23838.00   22446.90  20738.80        6166.70  Linux-X64 base
>        23838.00   22279.40  20262.00        5891.50  Linux-X64 exp
>
>         5841.80    4885.00   4764.00        1495.10  MacOSX base
>         5621.00    4701.00   4778.00        1492.10  MacOSX exp
>
>        16125.20   13852.30  12780.50        2791.90  Solaris-X64 base
>        15788.70   13861.40  12551.40        2665.90  Solaris-X64 exp
>
> I'm new to SPECjbb2015 so I don't know what "hbIR" and "jOPS" are yet. > Based on a bit of googling so far, it appears that for critical-jOPS, > higher is better: > > - Linux-X64 exp critical-jOPS is ~4.5% lower than Linux-X64 base > - MacOSX base and MacOSX exp critical-jOPS are almost identical > - Solaris-X64 exp critical-jOPS is ~4.5% lower than Solaris-X64 base > > I have not tried to research or analyze the other columns yet. > > The results for each of the 10 runs are shown below.
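The percentages quoted for critical-jOPS can be reproduced from the averages table by comparing each platform's exp average against the same platform's base average. The sketch below does that; `percent_change`, `Averages`, and `kCriticalJops` are illustrative names, not anything from SPECjbb2015 or the patch.

```cpp
#include <cassert>

// Relative change of `exp` versus `base`, in percent (negative means lower).
// Hypothetical helper for this illustration only.
double percent_change(double base, double exp) {
  assert(base != 0.0);
  return 100.0 * (exp - base) / base;
}

// Average critical-jOPS values copied from the "Average Results" table.
struct Averages {
  const char* os;
  double base;
  double exp;
};

const Averages kCriticalJops[] = {
  {"Linux-X64",   6166.70, 5891.50},
  {"MacOSX",      1495.10, 1492.10},
  {"Solaris-X64", 2791.90, 2665.90},
};
```

With these inputs the Linux-X64 and Solaris-X64 changes both come out near -4.5%, and the MacOSX change is about -0.2%, each measured against that platform's own baseline.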
> > Dan > > > > Linux-X64 Runs
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>           23838      22719     19070           6515  SPECjbb2015.Lin-X64.base.01
>           23838      21642     20262           5591  SPECjbb2015.Lin-X64.base.02
>           23838      23108     20262           6508  SPECjbb2015.Lin-X64.base.03
>           23838      21730     21454           6235  SPECjbb2015.Lin-X64.base.04
>           23838      22220     21454           6028  SPECjbb2015.Lin-X64.base.05
>           23838      22543     20262           5996  SPECjbb2015.Lin-X64.base.06
>           23838      23014     21454           6192  SPECjbb2015.Lin-X64.base.07
>           23838      22543     21454           5889  SPECjbb2015.Lin-X64.base.08
>           23838      22750     20262           6038  SPECjbb2015.Lin-X64.base.09
>           23838      22200     21454           6675  SPECjbb2015.Lin-X64.base.10
> ---------------  ---------  --------  -------------  --------
>        23838.00   22446.90  20738.80        6166.70  average of values
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>           23838      21422     20262           6329  SPECjbb2015.Lin-X64.exp.01
>           23838      22543     19070           6351  SPECjbb2015.Lin-X64.exp.02
>           23838      22100     20262           5005  SPECjbb2015.Lin-X64.exp.03
>           23838      22543     20262           5881  SPECjbb2015.Lin-X64.exp.04
>           23838      23170     20262           5938  SPECjbb2015.Lin-X64.exp.05
>           23838      22543     20262           5744  SPECjbb2015.Lin-X64.exp.06
>           23838      22100     20262           5482  SPECjbb2015.Lin-X64.exp.07
>           23838      22543     20262           6213  SPECjbb2015.Lin-X64.exp.08
>           23838      22100     21454           5637  SPECjbb2015.Lin-X64.exp.09
>           23838      21730     20262           6335  SPECjbb2015.Lin-X64.exp.10
> ---------------  ---------  --------  -------------  --------
>        23838.00   22279.40  20262.00        5891.50  average of values
>
>
> MacOSX Runs
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>            6725       5621      4708           1543  SPECjbb2015.MacOSX.base.01
>            5621       4701      4778           1326  SPECjbb2015.MacOSX.base.02
>            6725       5621      4708           1475  SPECjbb2015.MacOSX.base.03
>            5621       4701      4778           1372  SPECjbb2015.MacOSX.base.04
>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.05
>            5621       4701      4778           1471  SPECjbb2015.MacOSX.base.06
>            5621       4701      4778           1430  SPECjbb2015.MacOSX.base.07
>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.08
>            5621       4701      4778           1581  SPECjbb2015.MacOSX.base.09
>            5621       4701      4778           1633  SPECjbb2015.MacOSX.base.10
> ---------------  ---------  --------  -------------  --------
>         5841.80    4885.00   4764.00        1495.10  average of values
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>            5621       4701      4778           1566  SPECjbb2015.MacOSX.exp.01
>            5621       4701      4778           1430  SPECjbb2015.MacOSX.exp.02
>            5621       4701      4778           1530  SPECjbb2015.MacOSX.exp.03
>            5621       4701      4778           1304  SPECjbb2015.MacOSX.exp.04
>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.05
>            5621       4701      4778           1460  SPECjbb2015.MacOSX.exp.06
>            5621       4701      4778           1638  SPECjbb2015.MacOSX.exp.07
>            5621       4701      4778           1471  SPECjbb2015.MacOSX.exp.08
>            5621       4701      4778           1402  SPECjbb2015.MacOSX.exp.09
>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.10
> ---------------  ---------  --------  -------------  --------
>         5621.00    4701.00   4778.00        1492.10  average of values
>
>
> Solaris-X64 Runs
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>           16584      13957     13267           2332  SPECjbb2015.Sol-X64.base.01
>           16584      13837     13267           3123  SPECjbb2015.Sol-X64.base.02
>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.03
>           16584      13837     12438           2667  SPECjbb2015.Sol-X64.base.04
>           14743      14210     12532           2920  SPECjbb2015.Sol-X64.base.05
>           16584      13837     12438           3534  SPECjbb2015.Sol-X64.base.06
>           13837      13497     12453           2226  SPECjbb2015.Sol-X64.base.07
>           16584      13837     12438           2265  SPECjbb2015.Sol-X64.base.08
>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.09
>           16584      13837     12438           3146  SPECjbb2015.Sol-X64.base.10
> ---------------  ---------  --------  -------------  --------
>        16125.20   13852.30  12780.50        2791.90  average of values
>
>      hbIR           hbIR
> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
> ---------------  ---------  --------  -------------  --------
>           16584      13837     12438           2073  SPECjbb2015.Sol-X64.exp.01
>           16584      14353     13267           2667  SPECjbb2015.Sol-X64.exp.02
>           16584      13837     12438           2349  SPECjbb2015.Sol-X64.exp.03
>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.04
>           13981      13832     12583           3241  SPECjbb2015.Sol-X64.exp.05
>           13837      13575     12453           2621  SPECjbb2015.Sol-X64.exp.06
>           13981      13832     12583           2768  SPECjbb2015.Sol-X64.exp.07
>           16584      13837     12438           3000  SPECjbb2015.Sol-X64.exp.08
>           16584      13837     12438           2952  SPECjbb2015.Sol-X64.exp.09
>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.10
> ---------------  ---------  --------  -------------  --------
>        15788.70   13861.40  12551.40        2665.90  average of values
>
> > On 3/24/19 9:57 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> Welcome to the OpenJDK review thread for my port of Carsten's work on: >> >> JDK-8153224 Monitor deflation prolong safepoints >> https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> Here's a link to the OpenJDK wiki that describes my port: >> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >> >> Here's the webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >> >> Here's a link to Carsten's original webrev: >> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >> >> Earlier versions of this patch have been through several rounds of >> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >> Roman for their preliminary code review comments. A very special >> thanks to Robbin and Roman for building and testing the patch in >> their own environments (including specJBB2015). >> >> This version of the patch has been thru Mach5 tier[1-8] testing on >> Oracle's usual set of platforms.
Earlier versions have been run >> through my stress kit on my Linux-X64 and Solaris-X64 servers >> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink >> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >> and slowdebug). Earlier versions have run my monitor inflation stress >> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >> fastdebug and slowdebug). >> >> All of the testing done on earlier versions will be redone on the >> latest version of the patch. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> >> P.S. >> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >> is currently failing in -Xcomp mode on Win* only. I've been trying >> to characterize/analyze this failure for more than a week now. At >> this point I'm convinced that Async Monitor Deflation is aggravating >> an existing bug. However, I plan to have a better handle on that >> failure before these bits are pushed to the jdk/jdk repo. >> > > From coleen.phillimore at oracle.com Wed Apr 10 19:27:40 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 15:27:40 -0400 Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code In-Reply-To: References: Message-ID: On 4/10/19 12:01 PM, Patricio Chilano wrote: > Hi Coleen! > > Change looks good to me! Just a small comment. Since we now will not > clear unhandled oops in the TBIVMWDC jacket anymore, I think we should > add that check in Monitor::wait() like we do in Monitor::lock(). Patricio, Thank you for reviewing this. Good find! http://cr.openjdk.java.net/~coleenp/2019/8222231.02/webrev/ I added it and retested runThese jck tests with -XX:+CheckUnhandledOops. Thanks, Coleen > > Thanks! > Patricio > > On 4/10/19 10:36 AM, coleen.phillimore at oracle.com wrote: >> >> Thank you David! >> Coleen >> >> On 4/9/19 10:04 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> Thanks for doing this!
It is all a lot simpler now. Reviewed with >>> enthusiasm. :) >>> >>> David >>> >>> On 10/04/2019 9:58 am, coleen.phillimore at oracle.com wrote: >>>> Some code was left from removing code UseMembar and further >>>> improvements encouraged by dholmes. I also removed a couple >>>> redundant and unneeded clear_unhandled_oops calls. The one in >>>> ThreadBlockInVMWithDeadlockCheck is in the calling code >>>> Monitor::lock, and the thread in vm from native makes no sense. >>>> >>>> Tested with runtime jtreg tests, and mach5 tier1-3 in progress. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222231 >>>> >>>> Thanks, >>>> Coleen >> > From gerard.ziemski at oracle.com Wed Apr 10 20:03:57 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Wed, 10 Apr 2019 15:03:57 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> Message-ID: On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote: >>> >>> I noticed that events are only emitted if we are able to take the >>> resize lock. Can this be fixed? What prevents us from always getting >>> the data? That's how other periodic events work and losing data >>> sometimes may lead to subtle bugs that are hard to understand and >>> replicate in systems that rely on the information. Could we retry on >>> a failure? >> Good observation. If the resize lock is taken, then it's not likely >> that whoever owns it will be done soon, so retrying is most likely >> not going to succeed right away. Is it OK to tie up the JFR periodic >> thread for some time? If so, how long?
>> >> If the lock is taken, then it means that someone is scanning through >> the entire table, or the table is being resized. Either way, we're >> not losing data, but are just temporarily blind - I don't see a >> problem here for long-running apps, they will start receiving >> events eventually (which happen every 10 sec by default) > > Robbin was talking about allowing scanning the table while resizing, > i.e. not having the resize_lock, if we can accept that there might be > some entries double counted. Yes, we could do that - are you suggesting that this is what we should do? Personally, I think I'd prefer not to emit the event at all, rather than emit one that might be wrong (that's exactly what we do currently for jcmd print statistics). Erik, Robbin, do you have a preference here? cheers From coleen.phillimore at oracle.com Wed Apr 10 21:56:09 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 17:56:09 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> Message-ID: <9b492479-5f03-47a9-bb6b-014cee395545@oracle.com> On 4/10/19 4:03 PM, gerard ziemski wrote: > > > On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> I noticed that events are only emitted if we are able to take the >>>> resize lock. Can this be fixed? What prevents us from always >>>> getting the data? That's how other periodic events work and losing >>>> data sometimes may lead to subtle bugs that are hard to understand and >>>> replicate in systems that rely on the information. Could we retry >>>> on a failure? >>> Good observation.
If the resize lock is taken, then it's not likely >>> that whoever owns it will be done soon, so retrying is most likely >>> not going to succeed right away. Is it OK to tie up the JFR periodic >>> thread for some time? If so, how long? >>> >>> If the lock is taken, then it means that someone is scanning through >>> the entire table, or the table is being resized. Either way, we're >>> not losing data, but are just temporarily blind - I don't see a >>> problem here for long-running apps, they will start receiving >>> events eventually (which happen every 10 sec by default) >> >> Robbin was talking about allowing scanning the table while resizing, >> i.e. not having the resize_lock, if we can accept that there might be >> some entries double counted. > > Yes, we could do that - are you suggesting that this is what we should > do? Not for this change. Coleen > Personally, I think I'd prefer not to emit the event at all, rather > than emit one that might be wrong (that's exactly what we do currently > for jcmd print statistics). > > Erik, Robbin, do you have a preference here?
> > > cheers > From coleen.phillimore at oracle.com Wed Apr 10 22:01:45 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 18:01:45 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <6ba4c5dc-1346-d377-808d-05c16c769e3b@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> <6ba4c5dc-1346-d377-808d-05c16c769e3b@oracle.com> Message-ID: http://cr.openjdk.java.net/~gziemski/8185525_rev5/src/hotspot/share/utilities/tableStatistics.cpp.html Sorry I didn't notice this before but these constructors should use initializer lists. This:
TableRateStatistics::TableRateStatistics() {
  _added_items = 0;
  _removed_items = 0;

  _time_stamp = 0;
  _seconds_stamp = 0.0;
  _added_items_stamp = 0;
  _added_items_stamp_prev = 0;
  _removed_items_stamp = 0;
  _removed_items_stamp_prev = 0;
}
Should be:
TableRateStatistics::TableRateStatistics() :
  _added_items(0), _removed_items(0), _time_stamp(0), etc. {}
Kim could tell you why this is better but he's on vacation. http://cr.openjdk.java.net/~gziemski/8185525_rev5/src/hotspot/share/jfr/periodic/jfrPeriodic.cpp.udiff.html The template looks great! Thanks, Coleen On 4/10/19 1:10 PM, gerard ziemski wrote: > Thank you Coleen! > > > On 4/9/19 10:57 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> I didn't think this should be moved from jfrPeriodic.cpp. I >>>> thought it could be something like an X macro. >>>> >>>> Or just make this bit a function that they all call with event as >>>> parameter.
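Two C++ idioms come up in this message: constructor initializer lists, and a single function that every table event can call. The sketch below shows both under stated assumptions. `TableRateStatistics` here is a cut-down stand-in with assumed field types, `TableStatistics` and `DemoTableEvent` are hypothetical, and `fill_and_commit` is an invented name; the real JFR event classes are generated, and the thread does not show what the template in the rev5 webrev actually looks like.

```cpp
#include <cstddef>

// Cut-down stand-in for the class quoted above; field types are assumptions.
struct TableRateStatistics {
  std::size_t _added_items;
  std::size_t _removed_items;
  long        _time_stamp;
  double      _seconds_stamp;

  // Members are initialized in the initializer list (in declaration order)
  // instead of being assigned in the constructor body.
  TableRateStatistics()
    : _added_items(0), _removed_items(0),
      _time_stamp(0), _seconds_stamp(0.0) {}
};

// Hypothetical statistics snapshot; names follow the quoted setter calls.
struct TableStatistics {
  std::size_t _number_of_buckets;
  std::size_t _number_of_entries;
  double      _add_rate;
  double      _remove_rate;
};

// Hypothetical event type offering the same setters as the quoted code.
struct DemoTableEvent {
  std::size_t buckets = 0;
  std::size_t entries = 0;
  double insertion_rate = 0.0;
  double removal_rate = 0.0;
  bool committed = false;

  void set_numberOfBuckets(std::size_t v) { buckets = v; }
  void set_numberOfEntries(std::size_t v) { entries = v; }
  void set_insertionRate(double v) { insertion_rate = v; }
  void set_removalRate(double v) { removal_rate = v; }
  void commit() { committed = true; }
};

// One function template works for any event class with these setters,
// so unrelated event classes need neither a common base class nor a macro.
template <typename EventT>
void fill_and_commit(EventT& event, const TableStatistics& s) {
  event.set_numberOfBuckets(s._number_of_buckets);
  event.set_numberOfEntries(s._number_of_entries);
  event.set_insertionRate(s._add_rate);
  event.set_removalRate(s._remove_rate);
  event.commit();
}
```

For built-in members such as these counters the initializer list is mostly a matter of consistency; for class-type members it avoids default-constructing and then assigning, and for const or reference members it is the only option.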
>>>>
>>>> + event.set_numberOfBuckets(statistics._number_of_buckets);
>>>> + event.set_numberOfEntries(statistics._number_of_entries);
>>>> + event.set_totalFootprint(statistics._total_footprint);
>>>> + event.set_maximumBucketCount(statistics._maximum_bucket_size);
>>>> + event.set_averageBucketCount(statistics._average_bucket_size);
>>>> + event.set_varianceOfBucketCount(statistics._variance_of_bucket_size);
>>>> + event.set_stdDevOfBucketCount(statistics._stddev_of_bucket_size);
>>>> + event.set_insertionRate(statistics._add_rate);
>>>> + event.set_removalRate(statistics._remove_rate);
>>>> + event.commit();
>>> >>> Each of those JFR events is an instance of a different class, so >>> the best I can do is a macro here (otherwise I'd have to create a >>> base class for the TableStatistics events from which to extend our 6 >>> table events, but I'm not sure the JFR architecture supports that - it >>> generates the classes automatically from the event's meta description) >>> >>> Updated webrev http://cr.openjdk.java.net/~gziemski/8185525_rev4/ >> >> Yes, that looks better to me. >> >> + //statistics.print(tty, "SymbolTable"); >> >> You should remove commented out code. You can always add it back >> locally if you want to debug it again. (I don't need to see this >> change). > > It was pointed out to me that we can use templates instead of a macro > here, so I tried it and I like it: > > http://cr.openjdk.java.net/~gziemski/8185525_rev5 > > > cheers > From patricio.chilano.mateo at oracle.com Wed Apr 10 22:07:25 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 10 Apr 2019 18:07:25 -0400 Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code In-Reply-To: References: Message-ID: <1b49d0e9-4eea-fe3a-7064-f77fa1cb4ab4@oracle.com> Hi Coleen, On 4/10/19 3:27 PM, coleen.phillimore at oracle.com wrote: > > On 4/10/19 12:01 PM, Patricio Chilano wrote: >> Hi Coleen! >> >> Change looks good to me! Just a small comment.
Since we now will not >> clear unhandled oops in the TBIVMWDC jacket anymore, I think we >> should add that check in Monitor::wait() like we do in Monitor::lock(). > > Patricio, Thank you for reviewing this. Good find! > > http://cr.openjdk.java.net/~coleenp/2019/8222231.02/webrev/ > > I added it and retested runThese jck tests with -XX:+CheckUnhandledOops. Looks good to me! Thanks, Patricio > Thanks, > Coleen > >> >> Thanks! >> Patricio >> >> On 4/10/19 10:36 AM, coleen.phillimore at oracle.com wrote: >>> >>> Thank you David! >>> Coleen >>> >>> On 4/9/19 10:04 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> Thanks for doing this! It is all a lot simpler now. Reviewed with >>>> enthusiasm. :) >>>> >>>> David >>>> >>>> On 10/04/2019 9:58 am, coleen.phillimore at oracle.com wrote: >>>>> Some code was left from removing code UseMembar and further >>>>> improvements encouraged by dholmes. I also removed a couple >>>>> redundant and unneeded clear_unhandled_oops calls. The one in >>>>> ThreadBlockInVMWithDeadlockCheck is in the calling code >>>>> Monitor::lock, and the thread in vm from native makes no sense. >>>>> >>>>> Tested with runtime jtreg tests, and mach5 tier1-3 in progress. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222231.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222231 >>>>> >>>>> Thanks, >>>>> Coleen >>> >> > From coleen.phillimore at oracle.com Wed Apr 10 22:08:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Apr 2019 18:08:12 -0400 Subject: RFR (S) 8222231: Clean up interfaceSupport.inline.hpp duplicated code In-Reply-To: <1b49d0e9-4eea-fe3a-7064-f77fa1cb4ab4@oracle.com> References: <1b49d0e9-4eea-fe3a-7064-f77fa1cb4ab4@oracle.com> Message-ID: <42296e33-116a-22b6-79b0-5f8aeef31001@oracle.com> Thanks Patricio!
Coleen

From mikhailo.seledtsov at oracle.com Thu Apr 11 01:06:36 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Wed, 10 Apr 2019 18:06:36 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
Message-ID: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>

Please review this small (trivial) change that excludes docker tests from
hotspot_runtime. The rationale for this change is that docker tests require
a specially configured environment (docker engine installed, test user being
a member of the docker group, docker proxy or docker mirror repo configured).
This may lead to unexpected errors when docker tests are run as part of the
hotspot_runtime group in general environment(s).

JBS: https://bugs.openjdk.java.net/browse/JDK-8222299

Change:

diff --git a/test/hotspot/jtreg/TEST.groups b/test/hotspot/jtreg/TEST.groups
--- a/test/hotspot/jtreg/TEST.groups
+++ b/test/hotspot/jtreg/TEST.groups
@@ -44,7 +44,8 @@
    -gc/nvdimm

 hotspot_runtime = \
-  runtime
+  runtime \
+  -runtime/containers/docker

 hotspot_handshake = \
   runtime/handshake

Testing:

  jtreg -l /ws/hg/jdk/jdk/work01/open/test/hotspot/jtreg/:hotspot_runtime

  jtreg -l /ws/hg/jdk/jdk/work01/open/test/hotspot/jtreg/:hotspot_runtime | grep docker

Thank you,
Misha

From igor.ignatyev at oracle.com Thu Apr 11 01:13:19 2019
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Wed, 10 Apr 2019 18:13:19 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
Message-ID: <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>

Should these tests be filtered out using @requires?

-- Igor

From mikhailo.seledtsov at oracle.com Thu Apr 11 01:39:24 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Wed, 10 Apr 2019 18:39:24 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To: <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
Message-ID: <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>

Hi Igor,

Thank you for taking a look.

On 4/10/19 6:13 PM, Igor Ignatev wrote:
> Should these tests be filtered out using @requires?
Checking all the conditions for this via @requires would require building a
test docker image (or at least downloading the base/FROM image) during the
evaluation of @requires, which would be unacceptably slow, especially given
that @requires is evaluated each time one runs a jtreg command for any test
in hotspot.

Alternatively, if the current approach is undesirable, we can throw
jtreg.Skipped exception if the docker base image fails to download. Let me
know if this is your preference.

Misha

From igor.ignatyev at oracle.com Thu Apr 11 01:44:28 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 10 Apr 2019 18:44:28 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To: <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
Message-ID: <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>

Hi Misha,

although it is not formally stated anywhere, :hotspot_runtime is expected to
include _all_ jtreg-jtreg runtime related tests, that's to say I'd expect to
have docker tests included into this group, regardless of a host's
ability/inability to run them. Hence I'd prefer us to use Skipped exception
in these tests.

Thanks,
-- Igor

From david.holmes at oracle.com Thu Apr 11 01:58:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 11 Apr 2019 11:58:40 +1000
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To: <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
 <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
Message-ID:

Hi Misha,

On 11/04/2019 11:44 am, Igor Ignatyev wrote:
> Hi Misha,
>
> although it is not formally stated anywhere, :hotspot_runtime is expected to
> include _all_ jtreg-jtreg runtime related tests, that's to say I'd expect to
> have docker tests included into this group, regardless of a host's
> ability/inability to run them. hence I'd prefer us to use Skipped exception
> in these tests.

I agree with Igor - :hotspot_runtime is supposed to include everything. We
exclude specific tests when we run more specific test groups like
tier1_runtime. (Personally I no longer use groups for local testing but just
directories.)

If @requires is infeasible (and I can easily see it is) then using "skipped"
exception is one approach - however I'm also concerned about wasting time
failing to run these tests.
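
The "skipped" exception approach mentioned above could look roughly like the following. This is only a minimal sketch, not the actual docker test code: `SkippedException` here is a locally defined stand-in for the exception in the JDK test library, and `runIfSupported` and the environment probe are hypothetical names introduced for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the test library's "skipped" exception.
class SkippedException extends RuntimeException {
    SkippedException(String reason) {
        super(reason);
    }
}

class DockerTestSketch {
    // Runs the test body only if the environment probe succeeds;
    // otherwise reports "skipped" instead of "failed".
    static String runIfSupported(BooleanSupplier environmentProbe, Runnable testBody) {
        if (!environmentProbe.getAsBoolean()) {
            throw new SkippedException("Docker environment is not configured");
        }
        testBody.run();
        return "passed";
    }

    public static void main(String[] args) {
        try {
            // Assumption: a real probe would check the docker engine and base image.
            System.out.println(runIfSupported(() -> false, () -> { }));
        } catch (SkippedException e) {
            System.out.println("skipped: " + e.getMessage());
        }
    }
}
```

The cost concern raised in this message still applies: the probe itself has to run before the test can decide to skip.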

Another option perhaps is to move "containers" to be a top-level test group
alongside runtime, rather than within it?

Thanks,
David

From aoqi at loongson.cn Thu Apr 11 02:34:41 2019
From: aoqi at loongson.cn (Ao Qi)
Date: Thu, 11 Apr 2019 10:34:41 +0800
Subject: RFR(trivial): JDK-8222300: Zero build broken
Message-ID:

Hi,

Zero build is broken after JDK-8222231. Could you please review this fix?

Bug:
https://bugs.openjdk.java.net/browse/JDK-8222300

Webrev:
http://cr.openjdk.java.net/~aoqi/8222300/webrev.00/

Tested:
linux-x86_64-{server, minimal, zero}-{fastdebug, release} build,
linux-x86_64-server-release hotspot:tier1

Cheers,
Ao Qi

From mikhailo.seledtsov at oracle.com Thu Apr 11 02:34:36 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Wed, 10 Apr 2019 19:34:36 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To: <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
 <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
Message-ID: <48842c0c-0cfb-7b9c-bcf1-6d9ad2bd5368@oracle.com>

On 4/10/19 6:44 PM, Igor Ignatyev wrote:
> Hi Misha,
>
> although it is not formally stated anywhere, :hotspot_runtime is expected to
> include _all_ jtreg-jtreg runtime related tests, that's to say I'd expect to
> have docker tests included into this group, regardless of a host's
> ability/inability to run them. hence I'd prefer us to use Skipped exception
> in these tests.

Thank you Igor. I understand this implicit requirement, and agree with you.

Misha

From david.holmes at oracle.com Thu Apr 11 02:38:52 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 11 Apr 2019 12:38:52 +1000
Subject: RFR(trivial): JDK-8222300: Zero build broken
In-Reply-To:
References:
Message-ID:

Hi,

On 11/04/2019 12:34 pm, Ao Qi wrote:
> Hi,
>
> Zero build is broken after JDK-8222231. Could you please review this fix?

Sorry about that. Fix looks good and trivial. I will sponsor this for you.

Thanks,
David

From mikhailo.seledtsov at oracle.com Thu Apr 11 02:38:04 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Wed, 10 Apr 2019 19:38:04 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To:
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
 <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
Message-ID:

Hi David,

Thank you for your input.

On 4/10/19 6:58 PM, David Holmes wrote:
> Another option perhaps is to move "containers" to be a top-level test
> group alongside runtime, rather than within it?

I agree. This is another good alternative. Would you recommend moving these
tests to test/hotspot/jtreg/containers/..., or should they move under
another "top-level" directory, such as test/hotspot/jtreg/misc/containers
or similar?

Igor, do you have any input or opinion on this alternative?

Thank you,
Misha

From aoqi at loongson.cn Thu Apr 11 03:21:11 2019
From: aoqi at loongson.cn (Ao Qi)
Date: Thu, 11 Apr 2019 11:21:11 +0800
Subject: RFR(trivial): JDK-8222300: Zero build broken
In-Reply-To:
References:
Message-ID:

On Thu, Apr 11, 2019 at 10:38 AM David Holmes wrote:
> Sorry about that. Fix looks good and trivial. I will sponsor this for you.

Thank you, David.

From david.holmes at oracle.com Thu Apr 11 03:17:42 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 11 Apr 2019 13:17:42 +1000
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To:
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
 <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
Message-ID:

On 11/04/2019 12:38 pm, mikhailo.seledtsov at oracle.com wrote:
> I agree. This is another good alternative. Would you recommend moving
> these tests to test/hotspot/jtreg/containers/...,

Yes that is what I was thinking.

Thanks,
David

From igor.ignatyev at oracle.com Thu Apr 11 06:13:39 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 10 Apr 2019 23:13:39 -0700
Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group
In-Reply-To:
References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com>
 <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com>
 <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com>
 <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com>
Message-ID: <29ADAB28-7883-408F-A0C0-1F9FC4ADD62D@oracle.com>

> Igor, do you have any input or opinion on this alternative?

I'm fine w/ 'test/hotspot/jtreg/(misc)/containers/' given we create
:hotspot_containers/:hotspot_misc to include these tests and your changes
don't change our current tier definitions.

-- Igor

From robbin.ehn at oracle.com Thu Apr 11 06:44:14 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 11 Apr 2019 08:44:14 +0200
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To: <9b492479-5f03-47a9-bb6b-014cee395545@oracle.com>
References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>
 <5CA4F10F.2010805@oracle.com>
 <5CAD05AF.1090700@oracle.com>
 <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com>
 <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com>
 <9b492479-5f03-47a9-bb6b-014cee395545@oracle.com>
Message-ID: <25d8b067-0137-27f9-0ead-9eb21e3afe9b@oracle.com>

>> Personally, I think I'd prefer not to emit the event at all, rather than emit
>> one that might be wrong (that's exactly what we do currently for jcmd print
>> statistics).

Skipping the event sounds fine?

/Robbin

>>
>> Erik, Robbin, do you have a preference here?
>>
>>
>> cheers
>>
>

From thomas.stuefe at gmail.com Thu Apr 11 07:33:12 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 11 Apr 2019 09:33:12 +0200
Subject: RFR (S): 8218458: [TESTBUG] runtime/NMT/CheckForProperDetailStackTrace.java fails with Expected stack trace missing from output
In-Reply-To: <495c2471-b034-77ed-0150-10f525a5384a@oracle.com>
References: <71514a13-8373-c401-2f34-4347bb951f13@oracle.com>
 <93910021-c901-1cf5-9b71-558088804ee2@oracle.com>
 <495c2471-b034-77ed-0150-10f525a5384a@oracle.com>
Message-ID:

Hi David,

it's okay. If the intent of this test is to check that C-Heap allocating
callstacks are printed in a clear manner with enough significant frames to
identify the caller vs just to test that NMT detail printing in general
works, the test is needed in its complexity. I still think these are two
separate issues but it's fine to test them in one go.

Cheers, Thomas

On Mon, Apr 8, 2019 at 1:40 AM David Holmes wrote:

> Hi Thomas,
>
> My apologies, I did not mean to ignore your input here. Thanks for
> taking a look and pointing out the scanning error in my original proposal.
>
> Hopefully you are okay with the simpler approach that Chris has advocated.
>
> Thanks,
> David
>
> On 4/04/2019 5:12 pm, Thomas Stüfe wrote:
> > Hi David, Chris,
> >
> > I think this is an improvement and goes in the right direction. Those
> > hard-wired inline guesses always made me twitch a bit.
> >
> > The patch looks fine to me in its current form, since it is already an
> > improvement. So the following remarks are "optional":
> >
> > - Since all we want to do is to test that NMT detail printing works, we
> > do not have to use one of the malloc paths; I have the feeling the mmap
> > paths are more "inline stable" since they usually end up in one of the
> > ReservedSpace child class constructors which do not get inlined.
> >
> > Like this:
> >
> > [0x0000000706400000 - 0x0000000800000000] reserved 4091904KB for Java Heap from
> > [0x00007f9b514cff07] ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, bool)+0xb7
> > [0x00007f9b514d08d8] ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned long, bool)+0x5f8
> > [0x00007f9b514d0f3a] ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, bool, char const*) [clone .part.29]+0x9a
> > [0x00007f9b51450331] Universe::reserve_heap(unsigned long, unsigned long)+0xe1
> >
> > Or this:
> >
> > [0x00007f9b308c5000 - 0x00007f9b3f8c5000] reserved 245760KB for Code from
> > [0x00007f9b514cad02] ReservedCodeSpace::ReservedCodeSpace(unsigned long, unsigned long, bool)+0xa2
> > [0x00007f9b505bcfb7] CodeCache::reserve_heap_memory(unsigned long)+0xe7
> > [0x00007f9b505bd75b] CodeCache::initialize_heaps()+0x2db
> > [0x00007f9b505bde45] CodeCache::initialize()+0x1b5
> >
> > will be stacks you always will see.
> >
> > - I do not like scanning the whole output for each single stack frame.
> > The test may give false positives. I would like it more if we were to
> > read the file line by line, and when the first pattern line matches,
> > check that subsequent lines match too. This is how we do call stack
> > matching at SAP for similar tests.
> > This is also more efficient since you do not re-scan the whole output
> > each time.
> >
> > In general:
> >
> > NMT is really very useful. We could think about
> > increasing NMT_TrackingStackDepth, since 4 is obviously not a lot. 6 or
> > 8 would be better. I do not believe the memory footprint increase would
> > be significant, but of course we would have to measure.
> >
> > Thanks! Thomas
> >
> > On Thu, Apr 4, 2019 at 8:36 AM Chris Plummer wrote:
> >
> > On 4/3/19 11:23 PM, David Holmes wrote:
> > > Hi Chris,
> > >
> > > On 4/04/2019 4:12 pm, Chris Plummer wrote:
> > >> Hi David,
> > >>
> > >> I have concerns that this will hide some of the other bugs I've
> > >> mentioned: JDK-8133749, JDK-8133747, and JDK-8133740. These bugs
> > >> result in 1 or two frames appearing in the stacktrace that should be
> > >> skipped. Notably NativeCallStack::NativeCallStack() and
> > >> os::get_native_stack().
> > >
> > > The test still checks those are not present first:
> > >
> > >     // We should never see either of these frames because they
> > >     // are supposed to be skipped. */
> > >     output.shouldNotContain("NativeCallStack::NativeCallStack");
> > >     output.shouldNotContain("os::get_native_stack");
> > Ah yes. I skimmed over the test looking for it but missed it.
> > >
> > >> Also, AllocateHeap() should normally not be in the stack trace, but
> > >> the test has specifically allowed for it for windows and solaris
> > >> slowdebug builds. Although these builds should have honored the
> > >> ALWAYSINLINE directive, it was deemed acceptable that it was not in
> > >> slowdebug builds. However, I would not want to allow AllocateHeap()
> > >> to appear in a product build, and best not to see it in fastdebug
> > >> either.
> > >
> > > This is a test of NMT detail not a test of whether a given compiler
> > > chooses to inline something like AllocateHeap. I don't think it is the
> > > job of this test to be checking for something specific to the native
> > > compiler.
The previous handling of AllocateHeap seemed to be there > > > simply because it was the only way to deal with an optional frame > - > > > but now that's handled generically. > > It's appearance means you effectively only have 3 frames to identity > > callsites instead of 4. If it does appear in a product build, a > > solution > > should be looked into to get rid of it. If the port owner decides it > > can't get rid of it (or is unwilling to), then an exception should be > > added to the test like was done for solaris and windows slowdebug > > builds. > > > > thanks, > > > > Chris > > > > > > Thanks, > > > David > > > > > >> Given the changes you made to allow more flexibly in which frames > > >> appear, I think you need to now also make sure the above 3 > > mentioned > > >> frames are not present, except for allowing AllocateHeap() in > > >> slowdebug builds. > > >> > > >> thanks, > > >> > > >> Chris > > >> > > >> On 4/3/19 10:53 PM, David Holmes wrote: > > >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218458 > > >>> Webrev: http://cr.openjdk.java.net/~dholmes/8218458/webrev/ > > >>> > > >>> The actual stack trace reported by NMT detail is affected by the > > >>> inlining decisions of the native compiler, and on the type of > > build. > > >>> So we define an "ideal" stacktrace and then allow for some > > frames to > > >>> be missing based on empirical observations. So to date we have > > seen > > >>> two frames that may or may not be inlined and so we allow for 2 > > >>> non-matching entries. > > >>> > > >>> The special-casing of AllocateHeap is removed as now it is just > an > > >>> optional frame. > > >>> > > >>> Chris: does this maintain the "spirit" of the test as you > intended? > > >>> > > >>> Zhengyu: can you test this on your system(s) please. 
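The "ideal stack trace with a tolerance for missing frames" approach described above can be sketched in a few lines. This is an illustration only and not the code of CheckForProperDetailStackTrace.java (which is a Java test): the function name, its parameters, and the frame names used in any call are hypothetical. The sketch walks the actual output once, in order, and counts how many expected frames could not be found.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the frames in `expected` appear in
// `actual` in the same order (other lines may sit between them), with at
// most `max_missing` expected frames absent, e.g. because they were inlined.
bool matches_with_optional_frames(const std::vector<std::string>& actual,
                                  const std::vector<std::string>& expected,
                                  std::size_t max_missing) {
  std::size_t pos = 0;      // next line of `actual` to examine
  std::size_t missing = 0;  // expected frames not found so far
  for (const std::string& frame : expected) {
    std::size_t i = pos;
    // Scan forward from the last match for a line containing this frame.
    while (i < actual.size() && actual[i].find(frame) == std::string::npos) {
      ++i;
    }
    if (i == actual.size()) {
      // Frame not present after the last match: treat it as inlined away.
      if (++missing > max_missing) return false;
    } else {
      pos = i + 1;  // later frames must appear after this one
    }
  }
  return true;
}
```

Because the scan resumes after the previous match, the output is not re-scanned from the start for every frame, which is close to the line-by-line matching suggested earlier in the thread.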
> > >>> > > >>> Thanks, > > >>> David > > >> > > >> > > > > > From claes.redestad at oracle.com Thu Apr 11 09:43:52 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Thu, 11 Apr 2019 11:43:52 +0200 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <5b0d2152-e336-675b-5c89-45636596a279@oracle.com> Message-ID: Hi Dan, critical-jOPS in SPECjbb2015 is designed to be sensitive to regressions in latency of the benchmark operations, sometimes to a fault. So keep in mind that what you're seeing could very well be attributed to noise. Such as an accidental result of pauses happening - by chance or design - when the benchmark is critically assessing latency SLAs. Correlating safepoint pauses across the benchmark run with benchmark logs might inform if there's an increase in spikes/latencies during sensitive phases of the benchmark that could contribute to a sustained critJOPS regression. Sample questions to answer: - is the time spent deflating more spread out before/after? - is there indication of back-to-back safepoints happening? Thanks! /Claes On 2019-04-10 21:24, Daniel D. Daugherty wrote: > So I?ve been analyzing monitorinflation logs from SPECjbb2015 runs. It > takes about 45 minutes for a SPECjbb15 run to finish on my Linux box. > > In the baseline bits: > ? Total deflating time: 0.9314706 secs. > ? Total deflating count: 2582566 > > In the v2.00 bits: > ? Total deflating time: 1.5767698 secs. > ? Total deflating count: 2505602 > > Yes, that is 1 second in 45 minutes for the baseline and 1.6 seconds > in 45 minutes for the v2.00 bits. That strongly indicates that the > mechanics of async monitor deflation is not the cause of the 4.5% > slowdown in SPECjbb2015. It must be something else... > > I'm looking at safepoint stats next... > > Dan > > > On 4/8/19 12:55 PM, Daniel D. 
Daugherty wrote:
>> Greetings,
>>
>> I took the last repo that I ran through Mach5 tier[1-8] testing and did
>> 10 SPECjbb2015 runs on the 'release' version of those bits. I also did
>> 10 SPECjbb2015 runs on the 'release' version of the baseline bits.
>>
>> Baseline: jdk-13+13
>> Exp:      v2.00 (8153224-webrev/3-for-jdk13) plus
>>           special-cleanup-for-global-in-use-list
>>
>> Linux-X64 Machine:
>>   - Ubuntu 16.04, Dell T7600, 64GB RAM
>>   - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2
>> threads
>>
>> MacOSX Machine:
>>   - MacOS 10.13.6, Mac Mini, mid 2011, 16GB RAM
>>   - 2 GHz Intel Core i7 (I7-2635QM), 1 CPU x 4 cores x 2 threads
>>
>> Solaris-X64 Machine:
>>   - Solaris 11.2 SRU5.5, Dell T7600, 64GB RAM
>>   - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2
>> threads
>>
>> Average Results for Each OS
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>
>>        23838.00   22446.90  20738.80        6166.70  Linux-X64 base
>>        23838.00   22279.40  20262.00        5891.50  Linux-X64 exp
>>
>>         5841.80    4885.00   4764.00        1495.10  MacOSX base
>>         5621.00    4701.00   4778.00        1492.10  MacOSX exp
>>
>>        16125.20   13852.30  12780.50        2791.90  Solaris-X64 base
>>        15788.70   13861.40  12551.40        2665.90  Solaris-X64 exp
>>
>> I'm new to SPECjbb2015 so I don't know what "hbIR" and "jOPS" are yet.
>> Based on a bit of googling so far, it appears that for critical-jOPS,
>> higher is better:
>>
>> - Linux-X64 exp critical-jOPS is ~4.5% lower than Linux-X64 base
>> - MacOSX base and MacOSX exp critical-jOPS are almost identical
>> - Solaris-X64 exp critical-jOPS is ~4.5% lower than Solaris-X64 base
>>
>> I have not tried to research or analyze the other columns yet.
>>
>> The results for each of the 10 runs are shown below.
>>
>> Dan
>>
>>
>>
>> Linux-X64 Runs
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>           23838      22719     19070           6515  SPECjbb2015.Lin-X64.base.01
>>           23838      21642     20262           5591  SPECjbb2015.Lin-X64.base.02
>>           23838      23108     20262           6508  SPECjbb2015.Lin-X64.base.03
>>           23838      21730     21454           6235  SPECjbb2015.Lin-X64.base.04
>>           23838      22220     21454           6028  SPECjbb2015.Lin-X64.base.05
>>           23838      22543     20262           5996  SPECjbb2015.Lin-X64.base.06
>>           23838      23014     21454           6192  SPECjbb2015.Lin-X64.base.07
>>           23838      22543     21454           5889  SPECjbb2015.Lin-X64.base.08
>>           23838      22750     20262           6038  SPECjbb2015.Lin-X64.base.09
>>           23838      22200     21454           6675  SPECjbb2015.Lin-X64.base.10
>> ---------------  ---------  --------  -------------  --------
>>        23838.00   22446.90  20738.80        6166.70  average of values
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>           23838      21422     20262           6329  SPECjbb2015.Lin-X64.exp.01
>>           23838      22543     19070           6351  SPECjbb2015.Lin-X64.exp.02
>>           23838      22100     20262           5005  SPECjbb2015.Lin-X64.exp.03
>>           23838      22543     20262           5881  SPECjbb2015.Lin-X64.exp.04
>>           23838      23170     20262           5938  SPECjbb2015.Lin-X64.exp.05
>>           23838      22543     20262           5744  SPECjbb2015.Lin-X64.exp.06
>>           23838      22100     20262           5482  SPECjbb2015.Lin-X64.exp.07
>>           23838      22543     20262           6213  SPECjbb2015.Lin-X64.exp.08
>>           23838      22100     21454           5637  SPECjbb2015.Lin-X64.exp.09
>>           23838      21730     20262           6335  SPECjbb2015.Lin-X64.exp.10
>> ---------------  ---------  --------  -------------  --------
>>        23838.00   22279.40  20262.00        5891.50  average of values
>>
>>
>> MacOSX Runs
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>            6725       5621      4708           1543  SPECjbb2015.MacOSX.base.01
>>            5621       4701      4778           1326  SPECjbb2015.MacOSX.base.02
>>            6725       5621      4708           1475  SPECjbb2015.MacOSX.base.03
>>            5621       4701      4778           1372  SPECjbb2015.MacOSX.base.04
>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.05
>>            5621       4701      4778           1471  SPECjbb2015.MacOSX.base.06
>>            5621       4701      4778           1430  SPECjbb2015.MacOSX.base.07
>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.08
>>            5621       4701      4778           1581  SPECjbb2015.MacOSX.base.09
>>            5621       4701      4778           1633  SPECjbb2015.MacOSX.base.10
>> ---------------  ---------  --------  -------------  --------
>>         5841.80    4885.00   4764.00        1495.10  average of values
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>            5621       4701      4778           1566  SPECjbb2015.MacOSX.exp.01
>>            5621       4701      4778           1430  SPECjbb2015.MacOSX.exp.02
>>            5621       4701      4778           1530  SPECjbb2015.MacOSX.exp.03
>>            5621       4701      4778           1304  SPECjbb2015.MacOSX.exp.04
>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.05
>>            5621       4701      4778           1460  SPECjbb2015.MacOSX.exp.06
>>            5621       4701      4778           1638  SPECjbb2015.MacOSX.exp.07
>>            5621       4701      4778           1471  SPECjbb2015.MacOSX.exp.08
>>            5621       4701      4778           1402  SPECjbb2015.MacOSX.exp.09
>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.10
>> ---------------  ---------  --------  -------------  --------
>>         5621.00    4701.00   4778.00        1492.10  average of values
>>
>>
>> Solaris-X64 Runs
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>           16584      13957     13267           2332  SPECjbb2015.Sol-X64.base.01
>>           16584      13837     13267           3123  SPECjbb2015.Sol-X64.base.02
>>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.03
>>           16584      13837     12438           2667  SPECjbb2015.Sol-X64.base.04
>>           14743      14210     12532           2920  SPECjbb2015.Sol-X64.base.05
>>           16584      13837     12438           3534  SPECjbb2015.Sol-X64.base.06
>>           13837      13497     12453           2226  SPECjbb2015.Sol-X64.base.07
>>           16584      13837     12438           2265  SPECjbb2015.Sol-X64.base.08
>>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.09
>>           16584      13837     12438           3146  SPECjbb2015.Sol-X64.base.10
>> ---------------  ---------  --------  -------------  --------
>>        16125.20   13852.30  12780.50        2791.90  average of values
>>
>>      hbIR           hbIR
>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>> ---------------  ---------  --------  -------------  --------
>>           16584      13837     12438           2073  SPECjbb2015.Sol-X64.exp.01
>>           16584      14353     13267           2667  SPECjbb2015.Sol-X64.exp.02
>>           16584      13837     12438           2349  SPECjbb2015.Sol-X64.exp.03
>>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.04
>>           13981      13832     12583           3241  SPECjbb2015.Sol-X64.exp.05
>>           13837      13575     12453           2621  SPECjbb2015.Sol-X64.exp.06
>>           13981      13832     12583           2768  SPECjbb2015.Sol-X64.exp.07
>>           16584      13837     12438           3000  SPECjbb2015.Sol-X64.exp.08
>>           16584      13837     12438           2952  SPECjbb2015.Sol-X64.exp.09
>>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.10
>> ---------------  ---------  --------  -------------  --------
>>        15788.70   13861.40  12551.40        2665.90  average of values
>>
>>
>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>
>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>
>>> Here's a link to the OpenJDK wiki that describes my port:
>>>
>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>
>>> Here's a link to Carsten's original webrev:
>>>
>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>
>>> Earlier versions of this patch have been through several rounds of
>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>> Roman for their preliminary code review comments. A very special
>>> thanks to Robbin and Roman for building and testing the patch in
>>> their own environments (including specJBB2015).
>>> >>> This version of the patch has been thru Mach5 tier[1-8] testing on >>> Oracle's usual set of platforms. Earlier versions have been run >>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>> and slowdebug). Earlier versions have run my monitor inflation stress >>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>> fastdebug and slowdebug). >>> >>> All of the testing done on earlier versions will be redone on the >>> latest version of the patch. >>> >>> Thanks, in advance, for any questions, comments or suggestions. >>> >>> Dan >>> >>> P.S. >>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>> is currently failing in -Xcomp mode on Win* only. I've been trying >>> to characterize/analyze this failure for more than a week now. At >>> this point I'm convinced that Async Monitor Deflation is aggravating >>> an existing bug. However, I plan to have a better handle on that >>> failure before these bits are pushed to the jdk/jdk repo. >>> >> >> > From coleen.phillimore at oracle.com Thu Apr 11 12:45:05 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Apr 2019 08:45:05 -0400 Subject: RFR(trivial): JDK-8222300: Zero build broken In-Reply-To: References: Message-ID: <7335eda0-5f90-8490-6eb6-9b9c57bc3dda@oracle.com> Thank you for finding and fixing this so quickly. Coleen On 4/10/19 11:21 PM, Ao Qi wrote: > On Thu, Apr 11, 2019 at 10:38 AM David Holmes wrote: >> Hi, >> >> On 11/04/2019 12:34 pm, Ao Qi wrote: >>> Hi, >>> >>> Zero build is broken after JDK-8222231. Could you please review this fix? >> Sorry about that. Fix looks good and trivial. I will sponsor this for you. > Thank you, David. 
> >> Thanks, >> David >> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8222300 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~aoqi/8222300/webrev.00/ >>> >>> Tested: >>> linux-x86_64-{server, minimal, zero}-{fastdebug, release} build, >>> linux-x86_64-server-release hotspot:tier1 >>> >>> Cheers, >>> Ao Qi >>> From mikhailo.seledtsov at oracle.com Thu Apr 11 15:26:41 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 11 Apr 2019 08:26:41 -0700 Subject: RFR(T): 8222299: [TESTBUG] Docker tests should be excluded from hotspot_runtime group In-Reply-To: <29ADAB28-7883-408F-A0C0-1F9FC4ADD62D@oracle.com> References: <2a68544f-655d-140f-4a0b-bff6ec09df57@oracle.com> <0F704EEE-CBF2-4C75-9357-049B0DD9717B@oracle.com> <96a548bf-d86d-00aa-698f-d3f956a6c473@oracle.com> <73DE96B1-C9B0-4F1A-BFD6-20B803957C65@oracle.com> <29ADAB28-7883-408F-A0C0-1F9FC4ADD62D@oracle.com> Message-ID: <9f1880e2-c3fb-21a1-2d80-96682381b801@oracle.com> David, Igor, ? Thank you for looking at my change, and providing your advice. I will then proceed with the following: ? - hg move the container tests from test/hotspot/jtreg/runtime/containers to test/hotspot/jtreg/containers ? - will create :hotspot_containers group ? - update test definitions accordingly (no changes except to use hotspot_containers group instead of the directory path) I will change the JBS issue description, but keep using the same issue. I will post the updated webrev shortly, with the updated subject. I am closing this thread. Thank you, Misha On 4/10/19 11:13 PM, Igor Ignatyev wrote: >> Igor,? do you have any input or opinion on this alternative? > > I'm fine w/ 'test/hotspot/jtreg/(misc)/containers/' given we create > :hotspot_containers/:hotspot_misc to include these tests and your > changes don't change our current tier definitions. > > -- Igor > >> On Apr 10, 2019, at 7:38 PM, mikhailo.seledtsov at oracle.com >> wrote: >> >> Hi David, >> >> ? Thank you for your input. 
>> >> >> On 4/10/19 6:58 PM, David Holmes wrote: >>> Hi Misha, >>> >>> On 11/04/2019 11:44 am, Igor Ignatyev wrote: >>>> Hi Misha, >>>> >>>> although it is not formally stated anywhere, :hotspot_runtime is >>>> expected to include _all_ jtreg-jtreg runtime related tests, that's >>>> to say I'd expect to have docker tests included into this group, >>>> regardless of a host's ability/inability to run them. hence I'd >>>> prefer us to use Skipped exception in these tests. >>> >>> I agree with Igor - :hotspot_runtime is supposed to include >>> everything. We exclude specific tests when we run more specific test >>> groups like tier1_runtime. (Personally I no longer use groups for >>> local testing but just directories.) >>> >>> If @requires is infeasible (and I can easily see it is) then using >>> "skipped" exception is one approach - however I'm also concerned >>> about wasting time failing to run these tests. >>> >>> Another option perhaps is to move "containers" to be a top-level >>> test group alongside runtime, rather than within it? >> I agree. This is another good alternative. Would you recommend moving >> these tests to test/hotspot/jtreg/containers/..., or should they move >> under another "top-level" directory, such as >> test/hotspot/jtreg/*misc*/containers? or similar ? >> >> Igor,? do you have any input or opinion on this alternative? >> >> >> Thank you, >> Misha >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Apr 10, 2019, at 6:39 PM, mikhailo.seledtsov at oracle.com wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> Thank you for taking a look. >>>>> >>>>> >>>>> On 4/10/19 6:13 PM, Igor Ignatev wrote: >>>>>> Should these test be filtered out using @requires? 
>>>>> Checking all the conditions for this via @requires will require >>>>> building a test docker image (or at least downloading the >>>>> base/FROM) image in evaluation of at-requires, which will be >>>>> unacceptably long, especially given that @requires is evaluated >>>>> each time one runs jtreg command for any test in hotspot. >>>>> >>>>> Alternatively, if the current approach is undesirable, we can >>>>> throw jtreg.Skipped exception if docker base image fails to >>>>> download. Let me know if this is your preference. >>>>> >>>>> Misha >>>>>> >>>>>> ? Igor >>>>>> >>>>>>> On Apr 10, 2019, at 6:06 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>> >>>>>>> Please review this small (trivial) change that excludes docker >>>>>>> tests from hotspot_runtime. The rational for this change is that >>>>>>> docker tests require specially configured environment (docker >>>>>>> engine installed, test user being member of docker group, docker >>>>>>> proxy or docker mirror repo configured). This may lead to >>>>>>> unexpected errors when docker tests are ran as part of >>>>>>> hotspot_runtime group in general environment(s). >>>>>>> >>>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >>>>>>> >>>>>>> Change: >>>>>>> >>>>>>> diff --git a/test/hotspot/jtreg/TEST.groups >>>>>>> b/test/hotspot/jtreg/TEST.groups >>>>>>> --- a/test/hotspot/jtreg/TEST.groups >>>>>>> +++ b/test/hotspot/jtreg/TEST.groups >>>>>>> @@ -44,7 +44,8 @@ >>>>>>> ??? -gc/nvdimm >>>>>>> >>>>>>> ? hotspot_runtime = \ >>>>>>> -? runtime >>>>>>> +? runtime \ >>>>>>> +? -runtime/containers/docker >>>>>>> >>>>>>> ? hotspot_handshake = \ >>>>>>> ??? runtime/handshake >>>>>>> >>>>>>> Testing: >>>>>>> >>>>>>> ?? jtreg -l >>>>>>> /ws/hg/jdk/jdk/work01/open/test/hotspot/jtreg/:hotspot_runtime >>>>>>> >>>>>>> ?? 
jtreg -l >>>>>>> /ws/hg/jdk/jdk/work01/open/test/hotspot/jtreg/:hotspot_runtime | >>>>>>> grep docker >>>>>>> >>>>>>> >>>>>>> Thank you, >>>>>>> >>>>>>> Misha >>>>>>> >>>>> >>>> >> > From daniel.daugherty at oracle.com Thu Apr 11 15:34:38 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 11 Apr 2019 11:34:38 -0400 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around In-Reply-To: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> References: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> Message-ID: <7b1e76e8-5996-1ba4-931e-5a2ce463c655@oracle.com> I could use a second reviewer for this one. I've heard from Martin, but not from Erik O. so I wonder if he is on vacation... Dan On 4/10/19 12:24 PM, Daniel D. Daugherty wrote: > Greetings, > > I have a very small fix for the following bug: > > ??? JDK-8222034 Thread-SMR functions should be updated to remove work > around > ??? https://bugs.openjdk.java.net/browse/JDK-8222034 > > Webrev URL: > http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ > > I would like to hear from Erik Osterlund and Martin Doerr on this review. > Of course, anyone else is also welcome to chime in. > > This fix has been tested with a Mach5 tier[1-3] run. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > From coleen.phillimore at oracle.com Thu Apr 11 15:39:52 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Apr 2019 11:39:52 -0400 Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT Message-ID: Summary: Replace IRT entry points with JRT. Tested with hs tier1-3 and built zero.? And grepped from the right level directory this time. 
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8222297 Thanks, Coleen From gerard.ziemski at oracle.com Thu Apr 11 17:08:25 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Thu, 11 Apr 2019 12:08:25 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> <6ba4c5dc-1346-d377-808d-05c16c769e3b@oracle.com> Message-ID: <83ea1c8d-09ab-2aa3-e3c9-8bd221727ebe@oracle.com> On 4/10/19 5:01 PM, coleen.phillimore at oracle.com wrote: > > http://cr.openjdk.java.net/~gziemski/8185525_rev5/src/hotspot/share/utilities/tableStatistics.cpp.html > > Sorry I didn't notice this before but these constructors should have > initializers like: > > 31 TableRateStatistics::TableRateStatistics() { > 32 _added_items = 0; > 33 _removed_items = 0; > 34 > 35 _time_stamp = 0; > 36 _seconds_stamp = 0.0; > 37 _added_items_stamp = 0; > 38 _added_items_stamp_prev = 0; > 39 _removed_items_stamp = 0; > 40 _removed_items_stamp_prev = 0; > 41 } > Should be: > 31 TableRateStatistics::TableRateStatistics() : > 32 _added_items(0), _removed_items(0), _time_stamp(0), etc. {} > Kim could tell you why this is better but he's on vacation. Done. I also went back to src/hotspot/share/jfr/periodic/jfrPeriodic.cpp and changed TableEventFiller::fill() to be static, since we don't actually need an instance of TableEventFiller to do its job. 
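The initializer-list form suggested in the quoted review comment can be sketched as follows. This is a minimal illustration, not the actual HotSpot class: `RateStatsSketch` is a hypothetical stand-in, the member names are taken from the quoted snippet, and the member types are assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the class under review; member types are assumed.
struct RateStatsSketch {
  std::size_t  _added_items;
  std::size_t  _removed_items;
  std::int64_t _time_stamp;
  double       _seconds_stamp;
  std::size_t  _added_items_stamp;
  std::size_t  _added_items_stamp_prev;
  std::size_t  _removed_items_stamp;
  std::size_t  _removed_items_stamp_prev;

  // Each member is initialized directly, in declaration order, instead of
  // being left uninitialized and then assigned in the constructor body.
  RateStatsSketch()
      : _added_items(0),
        _removed_items(0),
        _time_stamp(0),
        _seconds_stamp(0.0),
        _added_items_stamp(0),
        _added_items_stamp_prev(0),
        _removed_items_stamp(0),
        _removed_items_stamp_prev(0) {}
};
```

For plain integer and floating-point members the two forms behave the same at run time. The initializer list is the only option for const and reference members, and for members of class type it avoids a default construction followed by an assignment, so using it everywhere keeps constructors uniform.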
webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev6 cheers From lois.foltan at oracle.com Thu Apr 11 17:26:43 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 11 Apr 2019 13:26:43 -0400 Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT In-Reply-To: References: Message-ID: <479ab9f9-4cb6-a1ef-d09c-55d4d7614efa@oracle.com> Looks good. Lois On 4/11/2019 11:39 AM, coleen.phillimore at oracle.com wrote: > Summary: Replace IRT entry points with JRT. > > Tested with hs tier1-3 and built zero.? And grepped from the right > level directory this time. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222297 > > Thanks, > Coleen From erik.osterlund at oracle.com Thu Apr 11 17:36:12 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Thu, 11 Apr 2019 19:36:12 +0200 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around In-Reply-To: <7b1e76e8-5996-1ba4-931e-5a2ce463c655@oracle.com> References: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> <7b1e76e8-5996-1ba4-931e-5a2ce463c655@oracle.com> Message-ID: <99F4156A-4F4A-436D-A853-7CA16F7EE261@oracle.com> Hi Dan, Comments sound about right. Looks good. Thanks for cleaning this up. /Erik > On 11 Apr 2019, at 17:34, Daniel D. Daugherty wrote: > > I could use a second reviewer for this one. I've heard from Martin, > but not from Erik O. so I wonder if he is on vacation... > > Dan > > >> On 4/10/19 12:24 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a very small fix for the following bug: >> >> JDK-8222034 Thread-SMR functions should be updated to remove work around >> https://bugs.openjdk.java.net/browse/JDK-8222034 >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ >> >> I would like to hear from Erik Osterlund and Martin Doerr on this review. >> Of course, anyone else is also welcome to chime in. 
>> >> This fix has been tested with a Mach5 tier[1-3] run. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> > From daniel.daugherty at oracle.com Thu Apr 11 17:56:52 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 11 Apr 2019 13:56:52 -0400 Subject: RFR(XXS): 8222034: Thread-SMR functions should be updated to remove work around In-Reply-To: <99F4156A-4F4A-436D-A853-7CA16F7EE261@oracle.com> References: <92633a98-5d05-3dd5-8fd9-4a63b8b25071@oracle.com> <7b1e76e8-5996-1ba4-931e-5a2ce463c655@oracle.com> <99F4156A-4F4A-436D-A853-7CA16F7EE261@oracle.com> Message-ID: <98131eb8-b582-17c7-9a22-30561418043a@oracle.com> Erik, Thanks for the review! Dan On 4/11/19 1:36 PM, Erik Osterlund wrote: > Hi Dan, > > Comments sound about right. Looks good. Thanks for cleaning this up. > > /Erik > >> On 11 Apr 2019, at 17:34, Daniel D. Daugherty wrote: >> >> I could use a second reviewer for this one. I've heard from Martin, >> but not from Erik O. so I wonder if he is on vacation... >> >> Dan >> >> >>> On 4/10/19 12:24 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I have a very small fix for the following bug: >>> >>> JDK-8222034 Thread-SMR functions should be updated to remove work around >>> https://bugs.openjdk.java.net/browse/JDK-8222034 >>> >>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8222034-webrev/0_for_jdk13/ >>> >>> I would like to hear from Erik Osterlund and Martin Doerr on this review. >>> Of course, anyone else is also welcome to chime in. >>> >>> This fix has been tested with a Mach5 tier[1-3] run. >>> >>> Thanks, in advance, for any questions, comments or suggestions. 
>>> >>> Dan >>> From coleen.phillimore at oracle.com Thu Apr 11 17:56:10 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Apr 2019 13:56:10 -0400 Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT In-Reply-To: <479ab9f9-4cb6-a1ef-d09c-55d4d7614efa@oracle.com> References: <479ab9f9-4cb6-a1ef-d09c-55d4d7614efa@oracle.com> Message-ID: <5fff2316-8136-bc73-0076-2ee78790e11c@oracle.com> Thanks Lois! Coleen On 4/11/19 1:26 PM, Lois Foltan wrote: > Looks good. > Lois > > On 4/11/2019 11:39 AM, coleen.phillimore at oracle.com wrote: >> Summary: Replace IRT entry points with JRT. >> >> Tested with hs tier1-3 and built zero.? And grepped from the right >> level directory this time. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222297 >> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Thu Apr 11 17:58:19 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Apr 2019 13:58:19 -0400 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <83ea1c8d-09ab-2aa3-e3c9-8bd221727ebe@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CA649FE.9070904@oracle.com> <4862ac3f-f165-7d04-6248-514e588e70a2@oracle.com> <6b9dcad0-5563-0e6b-1c37-c8a4c40d407a@oracle.com> <72cbb5f1-c843-2995-12da-31eb22734c30@oracle.com> <6ba4c5dc-1346-d377-808d-05c16c769e3b@oracle.com> <83ea1c8d-09ab-2aa3-e3c9-8bd221727ebe@oracle.com> Message-ID: <262b4365-fef8-3caa-7e32-f4bf5097dbc4@oracle.com> On 4/11/19 1:08 PM, gerard ziemski wrote: > > > On 4/10/19 5:01 PM, coleen.phillimore at oracle.com wrote: >> >> http://cr.openjdk.java.net/~gziemski/8185525_rev5/src/hotspot/share/utilities/tableStatistics.cpp.html >> >> Sorry I didn't notice this before but these constructors should have >> initializers like: >> >> 31 
TableRateStatistics::TableRateStatistics() {
>> 32   _added_items = 0;
>> 33   _removed_items = 0;
>> 34
>> 35   _time_stamp = 0;
>> 36   _seconds_stamp = 0.0;
>> 37   _added_items_stamp = 0;
>> 38   _added_items_stamp_prev = 0;
>> 39   _removed_items_stamp = 0;
>> 40   _removed_items_stamp_prev = 0;
>> 41 }
>> Should be:
>> 31 TableRateStatistics::TableRateStatistics() :
>> 32   _added_items(0), _removed_items(0), _time_stamp(0), etc. {}
>> Kim could tell you why this is better but he's on vacation.
>
> Done.
>
> I also went back to src/hotspot/share/jfr/periodic/jfrPeriodic.cpp and
> changed TableEventFiller::fill() to be static, since we don't actually
> need an instance of TableEventFiller to do its job.
>
> webrev: http://cr.openjdk.java.net/~gziemski/8185525_rev6
>
>

Yes, this looks really good!
Thanks,
Coleen

> cheers

From daniel.daugherty at oracle.com Thu Apr 11 19:02:07 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 11 Apr 2019 15:02:07 -0400
Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT
In-Reply-To:
References:
Message-ID: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com>

On 4/11/19 11:39 AM, coleen.phillimore at oracle.com wrote:
> Summary: Replace IRT entry points with JRT.
>
> Tested with hs tier1-3 and built zero. And grepped from the right
> level directory this time.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev

src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp
    No comments.

src/hotspot/cpu/arm/interpreterRT_arm.cpp
    No comments.

src/hotspot/cpu/ppc/interpreterRT_ppc.cpp
    No comments.

src/hotspot/cpu/s390/interpreterRT_s390.cpp
    No comments.

src/hotspot/cpu/sparc/interpreterRT_sparc.cpp
    No comments.

src/hotspot/cpu/x86/interpreterRT_x86_32.cpp
    No comments.

src/hotspot/cpu/x86/interpreterRT_x86_64.cpp
    No comments.

src/hotspot/cpu/zero/cppInterpreter_zero.cpp
    No comments.

src/hotspot/cpu/zero/interpreterRT_zero.cpp
    No comments.
src/hotspot/share/interpreter/interpreterRuntime.cpp
src/hotspot/share/runtime/interfaceSupport.inline.hpp
    old L435: #define IRT_LEAF(result_type, header)
    old L438:     debug_only(NoSafepointVerifier __nspv(true);)
    new L432: #define JRT_LEAF(result_type, header)
    new L435:   debug_only(JRTLeafVerifier __jlv;)
        src/hotspot/share/runtime/interfaceSupport.cpp:

          JRTLeafVerifier::JRTLeafVerifier()
            : NoSafepointVerifier(true, JRTLeafVerifier::should_verify_GC())
          {
          }

        src/hotspot/share/runtime/safepointVerifiers.hpp:

          NoSafepointVerifier(bool activated = true, bool verifygc = true ) :
            NoGCVerifier(verifygc),
            _activated(activated) {

        IRT_LEAF creates a NoSafepointVerifier with first ctr param == true
        and the second ctr param == default true.

        JRT_LEAF creates a JRTLeafVerifier subclassed on NoSafepointVerifier
        with first ctr param == true and second ctr param based on
        JRTLeafVerifier::should_verify_GC() which can return either
        true or false depending on the calling thread's state. If the
        thread's state == _thread_in_Java, then the return == true.
        If the thread's state == _thread_in_native, then the return == false.

        As long as all the IRT_LEAF uses are thread state == _thread_in_Java
        then this is an equivalent change.

        I found these uses of IRT_LEAF:

          SharedRuntime::fixup_callers_callsite()
          InterpreterRuntime::bcp_to_di()
          InterpreterRuntime::verify_mdp()
          InterpreterRuntime::interpreter_contains()
          InterpreterRuntime::popframe_move_outgoing_args()
          InterpreterRuntime::trace_bytecode()

        I have not checked to see if these IRT_LEAF functions are
        ever called from thread state == _thread_in_native locations,
        but if they are, then we will no longer 'verifygc' with the
        JRT_LEAF switch.
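
[Editor's sketch, not part of the original message. The difference described
above can be modeled in a few lines of standalone C++. This is only a rough
illustration under stated assumptions: ThreadState, NoSafepointVerifierModel,
should_verify_gc, irt_leaf_verifier and jrt_leaf_verifier are hypothetical
names invented here, not HotSpot code, and only the two thread states
mentioned in the review are modeled.]

```cpp
// Hypothetical stand-in for the thread states named in the review;
// the real HotSpot enum has more states than these two.
enum class ThreadState { InJava, InNative };

// Simplified model of NoSafepointVerifier(bool activated = true,
// bool verifygc = true): it only records the two constructor arguments.
struct NoSafepointVerifierModel {
  bool activated;
  bool verifygc;
  explicit NoSafepointVerifierModel(bool activated = true, bool verifygc = true)
    : activated(activated), verifygc(verifygc) {}
};

// Model of JRTLeafVerifier::should_verify_GC() as the review describes it:
// true for a thread in Java, false for a thread in native.
inline bool should_verify_gc(ThreadState state) {
  switch (state) {
    case ThreadState::InJava:   return true;
    case ThreadState::InNative: return false;
  }
  return false;  // unreachable for the two modeled states
}

// What the old IRT_LEAF set up: the second argument keeps its default (true),
// so the thread state plays no role.
inline NoSafepointVerifierModel irt_leaf_verifier(ThreadState) {
  return NoSafepointVerifierModel(true);
}

// What JRT_LEAF sets up: the second argument depends on the thread state.
inline NoSafepointVerifierModel jrt_leaf_verifier(ThreadState state) {
  return NoSafepointVerifierModel(true, should_verify_gc(state));
}
```

[In this model the two helpers agree for ThreadState::InJava and differ only
in the verifygc field for ThreadState::InNative, which is the case the review
flags as a possible lost check.]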
src/hotspot/share/runtime/sharedRuntime.cpp
    No comments.

Your call on what to do about the difference that I found between
IRT_LEAF and JRT_LEAF. We could be losing a 'verifygc' check here,
but...

Dan

> bug link https://bugs.openjdk.java.net/browse/JDK-8222297
>
> Thanks,
> Coleen

From rkennke at redhat.com Thu Apr 11 20:58:54 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 11 Apr 2019 22:58:54 +0200
Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part
Message-ID:

An upcoming feature in Shenandoah requires that GC can intercept loading
the Klass* of an object. I'd like to introduce a GC interface for that.
This change covers the runtime part.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8222281
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/

The change takes the 3 variants of oopDesc::klass() and funnels them
through the Access API to be intercepted by the GC in any way it wants.
Behaviour- and performance-wise it should be identical (assuming the
compiler can make sense of the Access API).

I see that there might be an opportunity here to make the if
(UseCompressedClassPointers) check pre-resolved, but I don't know how to
do that. I also don't really see how that is supposed to work for loads
and stores wrt UseCompressedOops either: in order to select the proper
functions on first call, and subsequently go through that selected
function, it would have to be done in the *_init() functions. But I
don't see any selection code there. Instead, it seems to be in
PreRuntimeDispatch? I must be missing something. I left the selection in
the raw implementation. Maybe we want to sort this out?

I must say that I fought with myself whether or not I should add to the
madness that is the Access API. What should have been a 1-line-addition
to an API (e.g. BarrierSet) turned out to become:
7 files changed, 86 insertions(+), 20 deletions(-)
and two days of work. And god forbid we ever have to change or even fix
anything there.
I was about to just not do it in Access API at all, but somewhere else
instead, but then I did not want to introduce a schism there, so I bit
the bullet. Maybe we should consider turning this into a proper C++
interface instead? This is just unmaintainable madness.

Testing: tier1 fine. Will submit into jdk/submit shortly, works with the
Shenandoah prototype that I have here (iow, the API is good)

Can I please get a review?

Thanks,
Roman

From coleen.phillimore at oracle.com Thu Apr 11 21:06:35 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 11 Apr 2019 17:06:35 -0400
Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT
In-Reply-To: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com>
References: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com>
Message-ID:

Dan, Thank you for reviewing.

On 4/11/19 3:02 PM, Daniel D. Daugherty wrote:
> On 4/11/19 11:39 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Replace IRT entry points with JRT.
>>
>> Tested with hs tier1-3 and built zero. And grepped from the right
>> level directory this time.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev
>
> src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp
>     No comments.
>
> src/hotspot/cpu/arm/interpreterRT_arm.cpp
>     No comments.
>
> src/hotspot/cpu/ppc/interpreterRT_ppc.cpp
>     No comments.
>
> src/hotspot/cpu/s390/interpreterRT_s390.cpp
>     No comments.
>
> src/hotspot/cpu/sparc/interpreterRT_sparc.cpp
>     No comments.
>
> src/hotspot/cpu/x86/interpreterRT_x86_32.cpp
>     No comments.
>
> src/hotspot/cpu/x86/interpreterRT_x86_64.cpp
>     No comments.
>
> src/hotspot/cpu/zero/cppInterpreter_zero.cpp
>     No comments.
>
> src/hotspot/cpu/zero/interpreterRT_zero.cpp
>     No comments.
>
> src/hotspot/share/interpreter/interpreterRuntime.cpp
> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>     old L435: #define IRT_LEAF(result_type, header)
>     old L438:
debug_only(NoSafepointVerifier __nspv(true);)
>     new L432: #define JRT_LEAF(result_type, header)
>     new L435:   debug_only(JRTLeafVerifier __jlv;)
>         src/hotspot/share/runtime/interfaceSupport.cpp:
>
>           JRTLeafVerifier::JRTLeafVerifier()
>             : NoSafepointVerifier(true, JRTLeafVerifier::should_verify_GC())
>           {
>           }
>
>         src/hotspot/share/runtime/safepointVerifiers.hpp:
>
>           NoSafepointVerifier(bool activated = true, bool verifygc = true ) :
>             NoGCVerifier(verifygc),
>             _activated(activated) {
>
>         IRT_LEAF creates a NoSafepointVerifier with first ctr param == true
>         and the second ctr param == default true.
>
>         JRT_LEAF creates a JRTLeafVerifier subclassed on NoSafepointVerifier
>         with first ctr param == true and second ctr param based on
>         JRTLeafVerifier::should_verify_GC() which can return either
>         true or false depending on the calling thread's state. If the
>         thread's state == _thread_in_Java, then the return == true.
>         If the thread's state == _thread_in_native, then the return == false.
>
>         As long as all the IRT_LEAF uses are thread state == _thread_in_Java
>         then this is an equivalent change.
>
>         I found these uses of IRT_LEAF:
>
>           SharedRuntime::fixup_callers_callsite()

This is called from the c2i adapter so would be thread_in_java.

>           InterpreterRuntime::bcp_to_di()
>           InterpreterRuntime::verify_mdp()
>           InterpreterRuntime::interpreter_contains()
>           InterpreterRuntime::popframe_move_outgoing_args()
>           InterpreterRuntime::trace_bytecode()
>

I assume these are thread_in_java too, since they have access to
metadata (except interpreter_contains, whose callers have access to
metadata). It would be dangerous for in_native threads to touch
metadata directly.

> I have not checked to see if these IRT_LEAF functions are
>         ever called from thread state == _thread_in_native locations,
>         but if they are, then we will no longer 'verifygc' with the
>         JRT_LEAF switch.
>

It seems like if one of these IRT calls was from native the

NoSafepointVerifier( true /* no safepoints */, false /* don't verify GC */)

might be a fix to these methods. Because of the comment:

  case _thread_in_native:
    // A native thread is not subject to safepoints.
    // Even while it is in a leaf routine, GC is ok
    return false;

But I think it's the case that the behavior is the same, or more
consistent if one of these is a native LEAF call that had IRT_LEAF.
Trying to resolve these subtle differences is the trouble with mostly
duplicated code :(

Thanks!
Coleen

> src/hotspot/share/runtime/sharedRuntime.cpp
>     No comments.
>
>
> Your call on what to do about the difference that I found between
> IRT_LEAF and JRT_LEAF. We could be losing a 'verifygc' check here,
> but...
>
> Dan
>
>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8222297
>>
>> Thanks,
>> Coleen
>

From calvin.cheung at oracle.com Thu Apr 11 21:18:25 2019
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 11 Apr 2019 14:18:25 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
Message-ID: <5CAFAF21.3030007@oracle.com>

This is a follow-up on the preliminary code review sent by Jiangli in
January[1].

Highlights of changes since then:
1. New vm option for dumping a dynamic archive
(-XX:ArchiveClassesAtExit=) and enhancement to the existing
-XX:SharedArchiveFile option. Please refer to the corresponding CSR[2]
for details.
2. New way to run existing AppCDS tests in dynamic CDS archive mode.
At the jtreg command line, the user can run many existing AppCDS tests in
dynamic CDS archive mode by specifying the following:
    -vmoptions:-Dtest.dynamic.cds.archive=true /open/test/hotspot/jtreg:hotspot_appcds_dynamic
We will have a follow-up RFE to determine in which tier the above tests
should be run.
3. Added more tests.
4. Various bug fixes to improve stability.

RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
webrev: http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
(The webrev is based on top of the following rev:
http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)

Testing:
   - mach5 tiers 1-3 (including the new tests)
   - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few tests
     require more investigation)

thanks,
Calvin

[1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
[2] https://bugs.openjdk.java.net/browse/JDK-8221706

From mikhailo.seledtsov at oracle.com Thu Apr 11 22:46:09 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Thu, 11 Apr 2019 15:46:09 -0700
Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers
Message-ID: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com>

Please review this change that moves HotSpot container tests into their
own directory under test/hotspot/jtreg and creates a hotspot_containers
test group. Since container tests require a specially set up/configured
environment, it is best to put them into their own group, so they can be
executed in a properly configured environment. The details of this
change, such as the new location, were recommended by David and Igor
during an earlier public review for this issue (8222299: [TESTBUG] Docker
tests should be excluded from hotspot_runtime group).

Also, as part of this review, Igor recommended using the jtreg.skipped
exception if the docker image build fails, which I agreed to and
implemented.

    JBS: https://bugs.openjdk.java.net/browse/JDK-8222299
    Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/
    Testing:
       Ran the affected tests on a machine configured for Docker testing.
       Ran them two ways, via specifying the directory as well as by using
       a newly created group - PASS

Thank you,
Misha

From david.holmes at oracle.com Thu Apr 11 23:24:38 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 12 Apr 2019 09:24:38 +1000
Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT
In-Reply-To:
References: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com>
Message-ID: <0af7792f-a691-4737-d60d-99152860f05a@oracle.com>

Hi Coleen,

I was busy doing some archaeology on this code so didn't notice the RFR.
Glad Dan picked up on the only difference with the "verifiers" in the
LEAF variants.

FTR the differences here are historical. JRT was added first and shortly
needed to manifest the current thread directly. IRT were added later and
the "thread" was an implicit argument. But by July 1998 the two ENTRY
macros were the same. The only difference was the verifier in the LEAF,
and some custom variants of each macro that no longer exist.

Conceptually I've always thought there was a difference in how the
interpreter needed to "enter the runtime" versus the compilers. So the
different macros made that clear. But if the requirements are
essentially identical, and always have been, then the distinction just
confuses things.

I'm not clear what your resolution is here? Just accept that maybe we
lost a verify-gc call as Dan noted?

Thanks,
David

On 12/04/2019 7:06 am, coleen.phillimore at oracle.com wrote:
>
> Dan, Thank you for reviewing.
>
> On 4/11/19 3:02 PM, Daniel D. Daugherty wrote:
>> On 4/11/19 11:39 AM, coleen.phillimore at oracle.com wrote:
>>> Summary: Replace IRT entry points with JRT.
>>>
>>> Tested with hs tier1-3 and built zero. And grepped from the right
>>> level directory this time.
>>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev >> >> src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp >> ??? No comments. >> >> src/hotspot/cpu/arm/interpreterRT_arm.cpp >> ??? No comments. >> >> src/hotspot/cpu/ppc/interpreterRT_ppc.cpp >> ??? No comments. >> >> src/hotspot/cpu/s390/interpreterRT_s390.cpp >> ??? No comments. >> >> src/hotspot/cpu/sparc/interpreterRT_sparc.cpp >> ??? No comments. >> >> src/hotspot/cpu/x86/interpreterRT_x86_32.cpp >> ??? No comments. >> >> src/hotspot/cpu/x86/interpreterRT_x86_64.cpp >> ??? No comments. >> >> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >> ??? No comments. >> >> src/hotspot/cpu/zero/interpreterRT_zero.cpp >> ??? No comments. >> >> src/hotspot/share/interpreter/interpreterRuntime.cpp >> src/hotspot/share/runtime/interfaceSupport.inline.hpp >> ??? old L435: #define IRT_LEAF(result_type, header) >> ??? old L438: ??? debug_only(NoSafepointVerifier __nspv(true);) >> ??? new L432: #define JRT_LEAF(result_type, header) >> ??? new L435: ? debug_only(JRTLeafVerifier __jlv;) >> ??????? src/hotspot/share/runtime/interfaceSupport.cpp: >> >> ?? ?? ??? JRTLeafVerifier::JRTLeafVerifier() >> ? ? ? ? ? ? : NoSafepointVerifier(true, >> JRTLeafVerifier::should_verify_GC()) >> ? ? ????? { >> ? ? ????? } >> >> ??????? src/hotspot/share/runtime/safepointVerifiers.hpp: >> >> ????????? NoSafepointVerifier(bool activated = true, bool verifygc = >> true ) : >> ? ? ? ? ? ? NoGCVerifier(verifygc), >> ? ? ? ? ? ? _activated(activated) { >> >> ??????? IRT_LEAF creates a NoSafepointVerifier with first ctr param == >> true >> ??????? and the second ctr param == default true. >> >> ??????? JRT_LEAF creates a JRTLeafVerifier subclassed on >> NoSafepointVerifier >> ??????? with first ctr param == true and second ctr param based on >> ??????? JRTLeafVerifier::should_verify_GC() which can return either >> ??????? true or false depending on the calling thread's state. If the >> ??????? 
thread's state == _thread_in_Java, then the return == true. >> ??????? If the thread's state == _thread_in_native, then the return == >> false. >> >> ??????? As long as all the IRT_LEAF uses are thread state == >> _thread_in_Java >> ??????? then this is an equivalent change. >> >> ??????? I found these uses of IRT_LEAF: >> >> ??? ?? ?? SharedRuntime::fixup_callers_callsite() > > This is called from the c2i adapter so would be thread_in_java. > >> ??? InterpreterRuntime::bcp_to_di() >> ?? ?? ??? InterpreterRuntime::verify_mdp() >> ?? ?? ??? InterpreterRuntime::interpreter_contains() >> ?? ?? ??? InterpreterRuntime::popframe_move_outgoing_args() >> ?? ?? ??? InterpreterRuntime::trace_bytecode() >> > > I assume these are thread_in_java too, since they have access to > metadata (except interpreter_contains, whose callers have access to > metadata).? This would be dangerous for in_native threads to touch > metadata directly. > >> I have not checked to see if these IRT_LEAF functions are >> ??????? ever called from thread state == _thread_in_native locations, >> ??????? but if they are, then we will no longer 'verifygc' with the >> ??????? JRT_LEAF switch. >> > It seems like if one of these IRT calls was from native the > > NoSafepointVerifier( true /* no safepoints */, false /* don't verify GC > */) might be a fix to these methods.? Because of the comment: > > ? case _thread_in_native: > ??? // A native thread is not subject to safepoints. > ??? // Even while it is in a leaf routine, GC is ok > ??? return false; > > But I think it's the case that the behavior is the same, or more > consistent if one of these is a native LEAF call that had IRT_LEAF. > Trying to resolve these subtle differences is the trouble with mostly > duplicated code :( > > Thanks! > Coleen > >> src/hotspot/share/runtime/sharedRuntime.cpp >> ??? No comments. >> >> >> Your call on what to do about the difference that I found between >> IRT_LEAF and JRT_LEAF. 
We could be losing a 'verifygc' check here, >> but... >> >> Dan >> >> >>> bug link https://bugs.openjdk.java.net/browse/JDK-8222297 >>> >>> Thanks, >>> Coleen >> > From david.holmes at oracle.com Fri Apr 12 00:40:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Apr 2019 10:40:00 +1000 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> Message-ID: <4db0f322-da30-976e-86eb-ba05894d2f51@oracle.com> Hi Misha, This seems fine to me. Thanks, David On 12/04/2019 8:46 am, mikhailo.seledtsov at oracle.com wrote: > Please review this change that moves HotSpot container tests into their > own directory under test/hotspot/jtreg and > creates hotspot_containers test group. Since container tests require > specially setup/configured environment, it is best to group them into > their own group, > so they could be executed in properly configured environment. The > details of this change, such as new location were > recommended by David and Igor during earlier public review for this > issue (8222299: [TESTBUG] Docker tests should be excluded from > hotspot_runtime group). > > Also, as part of this review, Igor recommended to use jtreg.skipped > exception if docker image build fails, which I agreed to > and implemented. > > ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 > ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ > ??? Testing: > ?????? Ran the affected tests on a machine configured for Docker testing. > ?????? 
Ran them two ways, via specifying the directory as well as by > using a newly created group - PASS > > > Thank you, > Misha > From fujie at loongson.cn Fri Apr 12 00:57:43 2019 From: fujie at loongson.cn (Jie Fu) Date: Fri, 12 Apr 2019 08:57:43 +0800 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> Message-ID: Hi Misha, It might be better to update the test doc[1][2] together. Thanks. Best regards, Jie [1] http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 [2] http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: > Please review this change that moves HotSpot container tests into > their own directory under test/hotspot/jtreg and > creates hotspot_containers test group. Since container tests require > specially setup/configured environment, it is best to group them into > their own group, > so they could be executed in properly configured environment. The > details of this change, such as new location were > recommended by David and Igor during earlier public review for this > issue (8222299: [TESTBUG] Docker tests should be excluded from > hotspot_runtime group). > > Also, as part of this review, Igor recommended to use jtreg.skipped > exception if docker image build fails, which I agreed to > and implemented. > > ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 > ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ > ??? Testing: > ?????? Ran the affected tests on a machine configured for Docker testing. > ?????? 
Ran them two ways, via specifying the directory as well as by > using a newly created group - PASS > > > Thank you, > Misha > From david.holmes at oracle.com Fri Apr 12 01:10:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Apr 2019 11:10:55 +1000 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> Message-ID: <90c1de8c-328c-b547-f4eb-49b3011168ad@oracle.com> On 12/04/2019 10:57 am, Jie Fu wrote: > Hi Misha, > > It might be better to update the test doc[1][2] together. > Thanks. Yes good point Jie! David > Best regards, > Jie > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 > [2] > http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 > > > On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: >> Please review this change that moves HotSpot container tests into >> their own directory under test/hotspot/jtreg and >> creates hotspot_containers test group. Since container tests require >> specially setup/configured environment, it is best to group them into >> their own group, >> so they could be executed in properly configured environment. The >> details of this change, such as new location were >> recommended by David and Igor during earlier public review for this >> issue (8222299: [TESTBUG] Docker tests should be excluded from >> hotspot_runtime group). >> >> Also, as part of this review, Igor recommended to use jtreg.skipped >> exception if docker image build fails, which I agreed to >> and implemented. >> >> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >> ??? Testing: >> ?????? Ran the affected tests on a machine configured for Docker testing. >> ?????? 
Ran them two ways, via specifying the directory as well as by >> using a newly created group - PASS >> >> >> Thank you, >> Misha >> > From mikhailo.seledtsov at oracle.com Fri Apr 12 01:47:23 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 11 Apr 2019 18:47:23 -0700 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> Message-ID: <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> Thank you Jie, Good catch. I will update the docs. Misha On 4/11/19 5:57 PM, Jie Fu wrote: > Hi Misha, > > It might be better to update the test doc[1][2] together. > Thanks. > > Best regards, > Jie > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 > [2] > http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 > > > On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: >> Please review this change that moves HotSpot container tests into >> their own directory under test/hotspot/jtreg and >> creates hotspot_containers test group. Since container tests require >> specially setup/configured environment, it is best to group them into >> their own group, >> so they could be executed in properly configured environment. The >> details of this change, such as new location were >> recommended by David and Igor during earlier public review for this >> issue (8222299: [TESTBUG] Docker tests should be excluded from >> hotspot_runtime group). >> >> Also, as part of this review, Igor recommended to use jtreg.skipped >> exception if docker image build fails, which I agreed to >> and implemented. >> >> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >> ??? Testing: >> ?????? Ran the affected tests on a machine configured for Docker >> testing. >> ?????? 
Ran them two ways, via specifying the directory as well as by >> using a newly created group - PASS >> >> >> Thank you, >> Misha >> > From mikhailo.seledtsov at oracle.com Fri Apr 12 01:59:55 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 11 Apr 2019 18:59:55 -0700 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> Message-ID: <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> David, Jie, thank you for reviews. Here is the webrev with update to the docs (html and .md): http://cr.openjdk.java.net/~mseledtsov/8222299.01/ Thank you, Misha On 4/11/19 6:47 PM, mikhailo.seledtsov at oracle.com wrote: > Thank you Jie, > > Good catch. I will update the docs. > > > Misha > > > On 4/11/19 5:57 PM, Jie Fu wrote: >> Hi Misha, >> >> It might be better to update the test doc[1][2] together. >> Thanks. >> >> Best regards, >> Jie >> >> [1] >> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 >> [2] >> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 >> >> >> On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: >>> Please review this change that moves HotSpot container tests into >>> their own directory under test/hotspot/jtreg and >>> creates hotspot_containers test group. Since container tests require >>> specially setup/configured environment, it is best to group them >>> into their own group, >>> so they could be executed in properly configured environment. The >>> details of this change, such as new location were >>> recommended by David and Igor during earlier public review for this >>> issue (8222299: [TESTBUG] Docker tests should be excluded from >>> hotspot_runtime group). 
>>> >>> Also, as part of this review, Igor recommended to use jtreg.skipped >>> exception if docker image build fails, which I agreed to >>> and implemented. >>> >>> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >>> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >>> ??? Testing: >>> ?????? Ran the affected tests on a machine configured for Docker >>> testing. >>> ?????? Ran them two ways, via specifying the directory as well as by >>> using a newly created group - PASS >>> >>> >>> Thank you, >>> Misha >>> >> > From david.holmes at oracle.com Fri Apr 12 02:11:05 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Apr 2019 12:11:05 +1000 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> Message-ID: Looks good! Thanks, David On 12/04/2019 11:59 am, mikhailo.seledtsov at oracle.com wrote: > David, Jie, thank you for reviews. > > Here is the webrev with update to the docs (html and .md): > > http://cr.openjdk.java.net/~mseledtsov/8222299.01/ > > > Thank you, > > Misha > > > On 4/11/19 6:47 PM, mikhailo.seledtsov at oracle.com wrote: >> Thank you Jie, >> >> Good catch. I will update the docs. >> >> >> Misha >> >> >> On 4/11/19 5:57 PM, Jie Fu wrote: >>> Hi Misha, >>> >>> It might be better to update the test doc[1][2] together. >>> Thanks. 
>>> >>> Best regards, >>> Jie >>> >>> [1] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 >>> [2] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 >>> >>> >>> On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: >>>> Please review this change that moves HotSpot container tests into >>>> their own directory under test/hotspot/jtreg and >>>> creates hotspot_containers test group. Since container tests require >>>> specially setup/configured environment, it is best to group them >>>> into their own group, >>>> so they could be executed in properly configured environment. The >>>> details of this change, such as new location were >>>> recommended by David and Igor during earlier public review for this >>>> issue (8222299: [TESTBUG] Docker tests should be excluded from >>>> hotspot_runtime group). >>>> >>>> Also, as part of this review, Igor recommended to use jtreg.skipped >>>> exception if docker image build fails, which I agreed to >>>> and implemented. >>>> >>>> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >>>> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >>>> ??? Testing: >>>> ?????? Ran the affected tests on a machine configured for Docker >>>> testing. >>>> ?????? Ran them two ways, via specifying the directory as well as by >>>> using a newly created group - PASS >>>> >>>> >>>> Thank you, >>>> Misha >>>> >>> >> > From fujie at loongson.cn Fri Apr 12 02:12:48 2019 From: fujie at loongson.cn (Jie Fu) Date: Fri, 12 Apr 2019 10:12:48 +0800 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> Message-ID: <133ba807-fbc1-340d-a65a-0bea2cfc086a@loongson.cn> Thank you Misha for updating the doc. 
On 2019/4/12 9:59, mikhailo.seledtsov at oracle.com wrote: > David, Jie, thank you for reviews. > > Here is the webrev with update to the docs (html and .md): > > http://cr.openjdk.java.net/~mseledtsov/8222299.01/ > > > Thank you, > > Misha > > > On 4/11/19 6:47 PM, mikhailo.seledtsov at oracle.com wrote: >> Thank you Jie, >> >> Good catch. I will update the docs. >> >> >> Misha >> >> >> On 4/11/19 5:57 PM, Jie Fu wrote: >>> Hi Misha, >>> >>> It might be better to update the test doc[1][2] together. >>> Thanks. >>> >>> Best regards, >>> Jie >>> >>> [1] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 >>> >>> [2] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 >>> >>> >>> >>> On 2019/4/12 6:46, mikhailo.seledtsov at oracle.com wrote: >>>> Please review this change that moves HotSpot container tests into >>>> their own directory under test/hotspot/jtreg and >>>> creates hotspot_containers test group. Since container tests >>>> require specially setup/configured environment, it is best to group >>>> them into their own group, >>>> so they could be executed in properly configured environment. The >>>> details of this change, such as new location were >>>> recommended by David and Igor during earlier public review for this >>>> issue (8222299: [TESTBUG] Docker tests should be excluded from >>>> hotspot_runtime group). >>>> >>>> Also, as part of this review, Igor recommended to use jtreg.skipped >>>> exception if docker image build fails, which I agreed to >>>> and implemented. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >>>> Testing: >>>> Ran the affected tests on a machine configured for Docker >>>> testing. >>>>
Ran them two ways, via specifying the directory as well as >>>> by using a newly created group - PASS >>>> >>>> >>>> Thank you, >>>> Misha >>>> >>> >> > From per.liden at oracle.com Fri Apr 12 05:18:41 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 12 Apr 2019 07:18:41 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: References: Message-ID: Hi Roman, On 04/11/2019 10:58 PM, Roman Kennke wrote: > An upcoming feature in Shenandoah requires that GC can intercept loading > the Klass* of an object. I'd like to introduce a GC interface for that. Could you please give some more insight into what this feature is and why it needs to intercept these loads? thanks, Per > This change covers the runtime part. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8222281 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ > > The change takes the 3 variants of oopDesc::klass() and funnels them > through the Access API to be intercepted by the GC in any way it wants. > Behaviour- and performance-wise it should be identical (assuming the > compiler can make sense of the Access API). > > I see that there might be an opportunity here to make the if > (UseCompressedClassPointers) check pre-resolved, but I don't know how to > do that. I also don't really see how that is supposed to work for loads > and stores wrt UseCompressedOops either: in order to select the proper > functions on first call, and subsequently go through that selected > function, it would have to be done in the *_init() functions. But I > don't see any selection code there. Instead, it seems to be in > PreRuntimeDispatch? I must be missing something. I left the selection in > the raw implementation. Maybe we want to sort this out? > > > I must say that I fought with myself whether or not I should add to the > madness that is the Access API. What should have been a 1-line-addition > to an API (e.g. 
BarrierSet) turned out to become: > > 7 files changed, 86 insertions(+), 20 deletions(-) > > and two days of work. And god forbid we ever have to change or even fix > anything there. > > I was about to just not do it in Access API at all, but somewhere else > instead, but then I did not want to introduce a schism there, so I bit > the bullet. Maybe we should consider to turn this into a proper C++ > interface instead? This is just unmaintainable madness. > > > Testing: tier1 fine. Will submit into jdk/submit shortly, works with the > Shenandoah prototype that I have here (iow, the API is good) > > Can I please get a review? > > Thanks, > Roman > From rkennke at redhat.com Fri Apr 12 06:23:04 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 12 Apr 2019 08:23:04 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: References: Message-ID: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> On 12 April 2019 07:18:41 CEST, Per Liden wrote: >Hi Roman, > >On 04/11/2019 10:58 PM, Roman Kennke wrote: >> An upcoming feature in Shenandoah requires that GC can intercept >loading >> the Klass* of an object. I'd like to introduce a GC interface for >that. > >Could you please give some more insight into what this feature is and >why it needs to intercept these loads? Oh yeah, apparently it was late last night. ;-) We want to eliminate Shenandoah's extra word for the forwarding pointer and stick it in the Klass* word for forwarded objects, with a little bit of encoding to distinguish a forward pointer from a Klass*. Therefore we need to see Klass* loads to check that encoding and possibly load the Klass* from the forwardee instead. Roman >thanks, >Per > >> This change covers the runtime part.
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8222281 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ >> >> The change takes the 3 variants of oopDesc::klass() and funnels them >> through the Access API to be intercepted by the GC in any way it >wants. >> Behaviour- and performance-wise it should be identical (assuming the >> compiler can make sense of the Access API). >> >> I see that there might be an opportunity here to make the if >> (UseCompressedClassPointers) check pre-resolved, but I don't know how >to >> do that. I also don't really see how that is supposed to work for >loads >> and stores wrt UseCompressedOops either: in order to select the >proper >> functions on first call, and subsequently go through that selected >> function, it would have to be done in the *_init() functions. But I >> don't see any selection code there. Instead, it seems to be in >> PreRuntimeDispatch? I must be missing something. I left the selection >in >> the raw implementation. Maybe we want to sort this out? >> >> >> I must say that I fought with myself whether or not I should add to >the >> madness that is the Access API. What should have been a >1-line-addition >> to an API (e.g. BarrierSet) turned out to become: >> >> 7 files changed, 86 insertions(+), 20 deletions(-) >> >> and two days of work. And god forbid we ever have to change or even >fix >> anything there. >> >> I was about to just not do it in Access API at all, but somewhere >else >> instead, but then I did not want to introduce a schism there, so I >bit >> the bullet. Maybe we should consider to turn this into a proper C++ >> interface instead? This is just unmaintainable madness. >> >> >> Testing: tier1 fine. Will submit into jdk/submit shortly, works with >the >> Shenandoah prototype that I have here (iow, the API is good) >> >> Can I please get a review? >> >> Thanks, >> Roman >> -- This message was sent from my Android device with K-9 Mail.
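[Editorial sketch, not part of the thread] The encoding Roman describes above (keeping the forwarding pointer in the Klass* word and tagging it so it can be told apart from a real Klass*) might look roughly like the following. This is only an illustration under stated assumptions: the low bit of the word is assumed to be free because of pointer alignment, and every name here (Obj, klass_or_fwd, kForwardedTag, forward_to, load_klass) is hypothetical; none of it is taken from the webrev or from HotSpot/Shenandoah sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; these are NOT HotSpot or Shenandoah types.
struct Klass { const char* name; };

struct Obj {
  // Holds either a plain Klass* or a tagged pointer to the forwardee.
  uintptr_t klass_or_fwd;
};

// Assumption: objects are at least 2-byte aligned, so bit 0 is free as a tag.
constexpr uintptr_t kForwardedTag = 1;

inline void set_klass(Obj* o, Klass* k) {
  o->klass_or_fwd = reinterpret_cast<uintptr_t>(k);
}

inline bool is_forwarded(const Obj* o) {
  return (o->klass_or_fwd & kForwardedTag) != 0;
}

// Overwrite the klass word of the old copy with a tagged forwarding pointer.
inline void forward_to(Obj* from, Obj* to) {
  from->klass_or_fwd = reinterpret_cast<uintptr_t>(to) | kForwardedTag;
}

inline Obj* forwardee(const Obj* o) {
  return reinterpret_cast<Obj*>(o->klass_or_fwd & ~kForwardedTag);
}

// The load a GC would have to intercept: check the tag and, if the object
// has been forwarded, read the Klass* from the forwardee instead.
inline Klass* load_klass(const Obj* o) {
  if (is_forwarded(o)) {
    o = forwardee(o);
  }
  return reinterpret_cast<Klass*>(o->klass_or_fwd);
}
```

The sketch only shows why a klass load on a possibly forwarded object needs the extra check; if no code can ever see a from-space object, the check and the interception become unnecessary.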
From per.liden at oracle.com Fri Apr 12 07:06:40 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 12 Apr 2019 09:06:40 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> References: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> Message-ID: <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> On 4/12/19 8:23 AM, Roman Kennke wrote: > > > On 12 April 2019 07:18:41 CEST, Per Liden wrote: >> Hi Roman, >> >> On 04/11/2019 10:58 PM, Roman Kennke wrote: >>> An upcoming feature in Shenandoah requires that GC can intercept >> loading >>> the Klass* of an object. I'd like to introduce a GC interface for >> that. >> >> Could you please give some more insight into what this feature is and >> why it needs to intercept these loads? > > Oh yeah, apparently it was late last night. ;-) We want to eliminate Shenandoah's extra word for the forwarding pointer and stick it in the Klass* word for forwarded objects, with a little bit of encoding to distinguish a forward pointer from a Klass*. Therefore we need to see Klass* loads to check that encoding and possibly load the Klass* from the forwardee instead. Maybe I'm missing something, but didn't you just switch to having a to-space invariant? What runtime code can get hold of an oop pointing into from-space and then try to load its klass? cheers, Per > > Roman > >> thanks, >> Per >> >>> This change covers the runtime part. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8222281 >>> Webrev: >>> http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ >>> >>> The change takes the 3 variants of oopDesc::klass() and funnels them >>> through the Access API to be intercepted by the GC in any way it >> wants. >>> Behaviour- and performance-wise it should be identical (assuming the >>> compiler can make sense of the Access API).
>>> >>> I see that there might be an opportunity here to make the if >>> (UseCompressedClassPointers) check pre-resolved, but I don't know how >> to >>> do that. I also don't really see how that is supposed to work for >> loads >>> and stores wrt UseCompressedOops either: in order to select the >> proper >>> functions on first call, and subsequently go through that selected >>> function, it would have to be done in the *_init() functions. But I >>> don't see any selection code there. Instead, it seems to be in >>> PreRuntimeDispatch? I must be missing something. I left the selection >> in >>> the raw implementation. Maybe we want to sort this out? >>> >>> >>> I must say that I fought with myself whether or not I should add to >> the >>> madness that is the Access API. What should have been a >> 1-line-addition >>> to an API (e.g. BarrierSet) turned out to become: >>> >>> 7 files changed, 86 insertions(+), 20 deletions(-) >>> >>> and two days of work. And god forbid we ever have to change or even >> fix >>> anything there. >>> >>> I was about to just not do it in Access API at all, but somewhere >> else >>> instead, but then I did not want to introduce a schism there, so I >> bit >>> the bullet. Maybe we should consider to turn this into a proper C++ >>> interface instead? This is just unmaintainable madness. >>> >>> >>> Testing: tier1 fine. Will submit into jdk/submit shortly, works with >> the >>> Shenandoah prototype that I have here (iow, the API is good) >>> >>> Can I please get a review? 
>>> >>> Thanks, >>> Roman >>> > From rkennke at redhat.com Fri Apr 12 07:30:20 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 12 Apr 2019 09:30:20 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> References: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> Message-ID: <3c30019cf42d6c689e6ff3ff580e3cdb97b14065.camel@redhat.com> > > per.liden at oracle.com>: > > > Hi Roman, > > > > > > On 04/11/2019 10:58 PM, Roman Kennke wrote: > > > > An upcoming feature in Shenandoah requires that GC can > > > > intercept > > > loading > > > > the Klass* of an object. I'd like to introduce a GC interface > > > > for > > > that. > > > > > > Could you please give some more insight into what this feature is > > > and > > > why it needs to intercept these loads? > > > > Oh yeah, apparently it was late last night. ;-) We want to > > eliminate Shenandoah's extra word for the forwarding pointer and > > stick it in the Klass* word for forwarded objects, with a little > > bit of encoding to distinguish a forward pointer from a Klass*. > > Therefore we need to see Klass* loads to check that encoding and > > possibly load the Klass* from the forwardee instead. > > Maybe I'm missing something, but didn't you just switch to having a > to-space invariant? What runtime code can get hold of a an oop > pointing > into from-space and then try load its klass? Ha! Right.. The prototype actually predates the to-space-invariant change, and then I only 'adapted' it to the new situation without giving it much thought, but you're right, we might not actually need any GC interface changes. That should simplify everything, and perhaps speed it up too. I withdraw the RFR for now. Thanks, Roman > cheers, > Per > > > Roman > > > > > thanks, > > > Per > > > > > > > This change covers the runtime part. 
> > > > > > > > Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8222281 > > > > Webrev: > > > > http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ > > > > > > > > The change takes the 3 variants of oopDesc::klass() and funnels > > > > them > > > > through the Access API to be intercepted by the GC in any way > > > > it > > > wants. > > > > Behaviour- and performance-wise it should be identical > > > > (assuming the > > > > compiler can make sense of the Access API). > > > > > > > > I see that there might be an opportunity here to make the if > > > > (UseCompressedClassPointers) check pre-resolved, but I don't > > > > know how > > > to > > > > do that. I also don't really see how that is supposed to work > > > > for > > > loads > > > > and stores wrt UseCompressedOops either: in order to select the > > > proper > > > > functions on first call, and subsequently go through that > > > > selected > > > > function, it would have to be done in the *_init() functions. > > > > But I > > > > don't see any selection code there. Instead, it seems to be in > > > > PreRuntimeDispatch? I must be missing something. I left the > > > > selection > > > in > > > > the raw implementation. Maybe we want to sort this out? > > > > > > > > > > > > I must say that I fought with myself whether or not I should > > > > add to > > > the > > > > madness that is the Access API. What should have been a > > > 1-line-addition > > > > to an API (e.g. BarrierSet) turned out to become: > > > > > > > > 7 files changed, 86 insertions(+), 20 deletions(-) > > > > > > > > and two days of work. And god forbid we ever have to change or > > > > even > > > fix > > > > anything there. > > > > > > > > I was about to just not do it in Access API at all, but > > > > somewhere > > > else > > > > instead, but then I did not want to introduce a schism there, > > > > so I > > > bit > > > > the bullet. Maybe we should consider to turn this into a proper > > > > C++ > > > > interface instead? 
This is just unmaintainable madness. > > > > > > > > > > > > Testing: tier1 fine. Will submit into jdk/submit shortly, works > > > > with > > > the > > > > Shenandoah prototype that I have here (iow, the API is good) > > > > > > > > Can I please get a review? > > > > > > > > Thanks, > > > > Roman > > > > From david.holmes at oracle.com Fri Apr 12 09:15:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Apr 2019 19:15:08 +1000 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> Message-ID: <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> Hi Robin, Sorry for the delay, I've been mulling over this one ... :) On 11/04/2019 12:02 am, Robin Westberg wrote: > Hi David, > > Thanks for the detailed review! > >> On 9 Apr 2019, at 04:45, David Holmes wrote: >> >> Hi Robin, >> >> On 8/04/2019 10:47 pm, Robin Westberg wrote: >>> Hi again, >>> Here's an updated version where I've moved the naked_short_nanosleep function into the Posix class, to avoid future cross-platform use. (It's still used in the SpinYield and TimedYield implementations though). >> >> But you also changed the existing Windows os::naked_short_sleep to use the WaitableTimer which is a significant change to make. Is this just because it will likely have better resolution than the native Sleep function? >> >> Code using os::naked_short_sleep(1) might be unexpectedly impacted by this if what was a 10ms (or worse) sleep becomes closer to 1ms. PerfDataManager::destroy in particular could be impacted as it's racy to begin with. That's not to say this isn't a good thing to fix, just be aware it may have unexpected consequences. > > You are right, my thinking was that it would be nice to retain the more exact version found in the nanosleep implementation that I was removing.
But I'd be fine with doing that as a separate change and run additional testing on it. I can file a separate RFE to keep track of it. > >>> Full webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.01/ >>> Incremental: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00-01/ >> >> src/hotspot/share/utilities/spinYield.cpp >> >> I'm somewhat dubious about getting rid of non-Posix naked_short_nanosleep and instead adding a win32 ifdef to this code. It kind of defeats the purpose of the os abstraction layer - albeit that Windows can't really do a nanosleep. >> >> Why did you get rid of the sleep_ns parameter and hardwire it to 1000? A configurable sleep time would be a feature of this utility. > > The original implementation of SpinYield had the sleep hardwired to 1 ms os::naked_short_sleep - when os::naked_short_nanosleep was introduced this parameter was added as well, with a default value of 1000. But there's no code that actually sets the parameter, and it's a bit misleading that the parameter accepts nanoseconds when that cannot be acted upon on Windows. So I figured that it would be better to remove the parameter but retain the existing behavior. But this should probably be revisited when the fate of TimedYield has been decided.. I evaluate the abstraction and API that is being provided and even if the initial user doesn't want anything but the default value, the ability to set the value makes perfect sense for that particular abstraction. The fact it can't be acted upon on Windows is unfortunate, but these things should always be specified as "best effort" when we can't have guarantees. >> --- >> >> src/hotspot/share/utilities/timedYield.hpp >> >> 36 // scheduled on the same cpu as the waiter, we will first try local cpu >> 37 // yielding until we reach OS sleep primitive granularity in the waiting >> >> Whether or not the yielding is "cpu local" will depend on the OS and the scheduler in use. I would just refer to OS native yield.
> > The intention is to coerce the scheduler to perform a cpu-local yield if possible - more on that later. > >> 40 jlong _yield_time_ns; >> 41 jlong _sleep_time_ns; >> 42 jlong _max_yield_time_ns; >> 43 jlong _yield_granularity_ns; >> >> We are avoiding using j-types in the VM unless they actually pertain to Java variables. These should be int64_t (to be compatible with javaTimeNanos call). > > Thanks, will change. > >> 52 // Perform next round of delay. >> 53 void wait(); >> >> I think delay() would be a better name than wait(). > > This was inspired by the SpinYield utility, but I certainly wouldn't mind renaming it. Perhaps SpinYield::wait should be renamed as well to keep the symmetry? If it's already in use then wait() is okay. >> --- >> >> src/hotspot/share/utilities/timedYield.cpp >> >> 42 // Phase 1 - local cpu yielding >> 43 if (_yield_time_ns < _max_yield_time_ns) { >> 44 #ifdef WIN32 >> 45 if (SwitchToThread() == 0) { >> 46 // Nothing else is ready to run on this cpu, spin a little >> 47 while (os::javaTimeNanos() - start < _yield_granularity_ns) { >> 48 SpinPause(); >> 49 } >> 50 } >> 51 #else >> 52 os::Posix::naked_short_nanosleep(_yield_granularity_ns); >> 53 #endif >> 54 _yield_time_ns += os::javaTimeNanos() - start; >> 55 return; >> 56 } >> >> I have a few issues with this code. It's breaking the os abstraction layer again by using an OS ifdef and not using os APIs that exist for this very purpose - i.e os::naked_yield(). And for non-Windows it seems quite bizarre the "yielding" part of TimedYield is actually implemented with a sleep and not os::naked_yield! > > The "root" problem here is that the existing os primitives unfortunately do not quite map to the goal of this utility. Let me try to break it down a bit and see if it makes sense: The purpose of TimedYield is to wait for a thread rendezvous, for as short a time as possible. (In this case, waiting for threads to notice that the safepoint poll has been armed).
Ideally, we are aiming for sub-millisecond waiting times here. > > If these threads are scheduled on other cpu's this is pretty simple - we could spin or nanosleep and they would make progress. However - if one or more of these threads are scheduled to run on the current cpu things become interesting. Waiting for the OS scheduler to move threads to different cpu's can take milliseconds - much slower than what is possible to achieve. So, we want to try performing a cpu-local yield at first. This is heading into territory that we only go into if absolutely necessary. We don't want to be coding to our assumed knowledge of what a particular scheduler will do - especially when we don't even know we will be executing on that scheduler. Unless there is very good reason we should stick with the functionality and semantics provided by the OS. > On Windows, this maps reasonably well to os::naked_yield: > > void os::naked_yield() { > // Consider passing back the return value from SwitchToThread(). > SwitchToThread(); > } > > But for example on Linux, there is instead this: > > // Linux CFS scheduler (since 2.6.23) does not guarantee sched_yield(2) will > // actually give up the CPU. Since skip buddy (v2.6.28): > // > // * Sets the yielding task as skip buddy for current CPU's run queue. > // * Picks next from run queue, if empty, picks a skip buddy (can be the yielding task). > // * Clears skip buddies for this run queue (yielding task no longer a skip buddy). > // > // An alternative is calling os::naked_short_nanosleep with a small number to avoid > // getting re-scheduled immediately. > // > void os::naked_yield() { > sched_yield(); > } > > In both cases, we may get rescheduled immediately - on Windows this is indicated in the return value from SwitchToThread, but on Linux we don't know. On Windows, it is then fine to spin a little while as there is nothing else ready to run.
But on Linux, the CFS scheduler penalizes spinning as the runtime counter is increased, which will hurt the waiter when the time comes to perform actual work. So we don't want to spin on a no-op sched_yield, we have to use nanosleep instead. But then we are back to the original problem - the current nanosleep is not what we want to do on Windows in this situation. So why not change nanosleep on Windows: Yielding should always be a hint, not a requirement. Trying to second guess who may be running on which core and what the load may be is not a game we want to play lightly. There are just too many factors out of our control that can change on a different piece of hardware or a different OS release etc. >> If the existing os api's need adjustment or expansion to provide the functionality desired by this code then I would much prefer to see the os API's updated to address that. >> >> That said, given the original problem is that os::naked_short_nanosleep on Windows is too coarse with the use of WaitableTimer why not just replace that with a simple loop of the form: >> >> while (elapsed < sleep_ns) { >> if (SwitchToThread() == 0) { >> SpinPause(); >> elapsed = ... >> } > > So this would actually work fine in this case - but it's probably not what you would expect from a sleep function in the general case. On Linux, you would get control back after the provided nanosecond period even if another thread executed in the meantime. But on Windows, you are potentially giving up your entire timeslice if another thread is ready to run - this would be much worse than plain naked_short_sleep as you may not get control back for another 15 ms or so. Perhaps you have different expectations on what a sleep may do, but I don't expect absolute precision here.
I expect a sleep function to take me off CPU for (ideally**) at least a given amount of time, but have no expectations about the maximum - that depends on a lot of factors about os timers and scheduling that I don't want to have to think about or know about. I don't even assume I go off cpu for all that time as I know there are limits around timer resolution etc and so the OS may in fact do some spinning instead. That's all fine by me. If the sleeping thread doesn't get back on CPU for 15-20ms then so be it, some other thread is getting to run and do what it needs to do. ** Windows returns early from timed-waits in many cases. > That all being said, switching the Windows naked_short_nanosleep to the above implementation would be just fine - but I really think it should be renamed in that case. Perhaps something like os::timed_yield(jlong ns) would make sense? The additional backoff mechanism in ThreadYield can be reverted back to being handled by the safepointing code. The reason I made TimedYield into a separate utility was that it may be useful in other places as well, but such future use can of course be handled separately if the need actually arises. I'd prefer to fix a windows problem, just on windows. I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means). If we need the TimedYield capability in the future then let's revisit that then. Thanks, David ----- > Best regards, > Robin > >> >> ... >> >> Thanks, >> David >> ----- >> >> >>> Best regards, >>> Robin >>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>> >>>> Hi David, >>>> >>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>> >>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>> Hi David, >>>>>> Thanks for taking a look!
>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>> >>>>>>> Hi Robin, >>>>>>> >>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>> Hi all, >>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>>> >>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>> >>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>>> >>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>> >>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it.
So perhaps it's fine to just have them as separate classes. >>>> >>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>> >>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms. >>>> >>>> Best regards, >>>> Robin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Best regards, >>>>>> Robin >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Various benchmarks (specjbb2015, specjvm2008) show better numbers for Windows and no regressions for Linux. >>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>> Testing: tier1 >>>>>>>> Best regards, >>>>>>>> Robin > From david.holmes at oracle.com Fri Apr 12 09:23:39 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Apr 2019 19:23:39 +1000 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> <620e300d-a941-5e55-1814-f37c4b559655@oracle.com> Message-ID: Hi Dan, Thanks for the response. I think I need some gory details here, so I'll take it up with you separately. David On 11/04/2019 3:09 am, Daniel D.
Daugherty wrote: > On 4/10/19 1:15 AM, David Holmes wrote: >> Hi Carsten, Dan, >> >> I'd like to pick up on one topic - a higher-level discussion about the >> timing of the ObjectMonitor lifecycle as they currently are and with >> these changes: >> >> Carsten wrote: >>> I think I was thinking about a cycle where a Java object exhibits >>> a monitor inflation, then deflation, then inflation, then deflation. >>> Each inflation will be with a new monitor. This behavior could >>> increase the number of monitors allocated, especially with my >>> original patch as I recycled monitors only after a safepoint. Now >>> that I think about it again, such a cycle is incredibly unlikely as >>> it would require repeated contention on the java object, yet the >>> monitors must not be busy when the deflator thread comes by. And this >>> scenario has to repeat itself. This all seems pretty unlikely. >> >> So logically every Object has associated with it an ObjectMonitor but >> if we created the ObjectMonitor at the same time as the Object and >> kept it alive while the Object was alive then we would double our >> memory use (if not worse). > > Generally worse. In one of my recent debug sessions on MacOSX with product > bits, I had to figure out sizeof(ObjectMonitor) for memory dumping purposes > and it was 224 bytes. > > >> So we lazily create ObjectMonitors only when we need them: contention, >> Object.wait() use, hashcode use. > > Clarification: hashcode with contention. hashcode by itself does not > require inflation. > > >> We could then leave the ObjectMonitors around as long as the Objects >> are alive, but again this has implications for memory use. >> >> So we deflate idle ObjectMonitors to reclaim memory (though in >> practice it is more complex and we maintain pools of them to speed up >> allocation). >> >> If we aggressively deflate as soon as an ObjectMonitor is idle then we >> risk getting into inflate->deflate->inflate cycles.
The likelihood may >> be low but if you hit this pathology in your code you will probably be >> unhappy about the effects on performance. >> >> So instead, IIUC, we use some measure of "memory pressure" and only >> try to deflate under certain conditions. But I'm unclear exactly what >> those conditions are today, and whether they change with async monitor >> deflation. Can you enlighten me please? > > Without trying to describe the existing trigger mechanisms for monitor > deflation (lots of details), Async Monitor Deflation uses the same > safepoint cleanup trigger points for _initiating_ monitor deflation. > However, unlike safepoint cleanup work which will finish the job during > the current safepoint, Async Monitor Deflation will start after the > safepoint that initiated the monitor deflation, but there is no guarantee > when the ServiceThread or the JavaThreads will finish their deflation work > (maybe not before the next safepoint). > > The v2.00/3-for-jdk13 webrev has the "must be seen twice to deflate" > algorithm in place. So for any given ObjectMonitor, the first time it > is seen by ObjectSynchronizer::deflate_monitor_list_using_JT(), it will > not be deflated even if eligible (allocation state == "New"). The second > time that it is seen by deflate_monitor_list_using_JT() (allocation state > == "Old"), it is eligible for deflation. > > In this round of code review, we are talking about setting the allocation > state to "Old" in inflate() which would make an ObjectMonitor eligible > for deflation in the next round of deflation. > > > One of my tasks is to update the existing comments about ObjectMonitor > life cycle (I'll use another base monitor subsystem subtask); I don't > think they were updated with the monitor list changes so there's a bit > of catch up work to do there. I also need to update them for Async > Monitor Deflation life cycle changes and that will be part of this > project. 
> > Dan > >> >> Thanks, >> David >> >> On 10/04/2019 12:25 pm, Carsten Varming wrote: >>> Hi Dan, >>> >>> On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty < >>> daniel.daugherty at oracle.com> wrote: >>> >>>> On 4/5/19 4:59 PM, Karen Kinnear wrote: >>>> >>>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count >>>> with comments that caller ensured _count <= 0 >>>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >>>> Am I missing something subtle here or should they be the same >>>> guarantees? >>>> >>>> >>>> Here's the code in question: >>>> >>>> src/hotspot/share/runtime/objectMonitor.cpp: >>>> >>>> void ObjectMonitor::EnterI(TRAPS) { >>>> >>>> if (_owner == DEFLATER_MARKER) { >>>> guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); >>>> // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up. >>>> >>>> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { >>>> >>>> if (_owner == DEFLATER_MARKER) { >>>> guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here"); >>>> >>>> >>>> Reading these two guarantee() calls always throws me off stride >>>> because I would have written them like this: >>>> >>>> guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); >>>> >>>> and >>>> >>>> guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here"); >>>> >>>> When rewritten like the above, you have: >>>> >>>> "_count > 0" ... _count <= 0 >>>> >>>> and: >>>> >>>> "_count >= 0" ... "_count < 0" >>>> >>>> which is easier for my brain to read... okay... enough sidebar... >>>> >>> >>> He he.
I have pretty much eliminated > and >= from my written >>> vocabulary. >>> It makes life simpler. Trust me. :) >>> >>> >>>> Short answer: No the guarantees should not be the same. >>>> >>>> Longer answer: EnterI() is called by enter() after enter() has >>>> incremented the _count field to indicate the contended state of >>>> things. So in EnterI(), "_count > 0" is the right check. >>>> ReenterI() is called after wait() has returned (notified or >>>> timed out), and the _count field is not used on reentry ops so >>>> "_count >= 0" is the right check. >>>> >>>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, >>>> there are two places in EnterI() that do this): >>>> >>>> L501: if (_owner == DEFLATER_MARKER) { >>>> // The deflation protocol finished the first part (setting _owner), >>>> // but it failed the second part (making _count negative) and bailed. >>>> // Because we're called from enter() we have at least one contention. >>>> guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); >>>> L504: // Try to acquire monitor. >>>> L505: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>> >>>> L629: if (_owner == DEFLATER_MARKER) { >>>> // The deflation protocol finished the first part (setting _owner), >>>> // but it failed the second part (making _count negative) and bailed. >>>> // Because we're called from enter() we have at least one contention. >>>> guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); >>>> L632: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>> >>>> And I'm going to tweak the ReenterI() code like this: >>>>
L759: if (_owner == DEFLATER_MARKER) { >>>> // The deflation protocol finished the first part (setting _owner), >>>> // but it will observe _waiters != 0 and will bail out. Because we're >>>> // called from wait() we may or may not have any contentions. >>>> guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller"); >>>> L761: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>> >>>> >>>> You didn't ask this, but it is okay that _count is only used to track >>>> contentions in enter()/EnterI() and is not used to track contentions >>>> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is >>>> used by is_busy() to observe the busy state for an ObjectMonitor that >>>> is being wait()'ed for. The _waiters field is decremented after a >>>> waiter has returned from ReenterI() so the _owner field takes over >>>> answering the is_busy() question... >>>> >>>> >>>> 5. I could use a little help with allocation state transitions, >>>> e.g. in deflate_monitor_list_using_JT >>>> you see is_new with object set so you mark it as old so next deflation will check it >>>> >>>> >>>> Here's the code in question: >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp: >>>> >>>> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp, >>>> ObjectMonitor** freeHeadp, >>>> ObjectMonitor** freeTailp, >>>> ObjectMonitor** savedMidInUsep) { >>>> >>>> // Only try to deflate if there is an associated Java object and if >>>> // mid is old (is not newly allocated and is not newly freed). >>>> if (mid->object() != NULL && mid->is_old() && >>>> deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { >>>> // Deflation succeeded so update the in-use list. >>>> >>>> } else { >>>>
// mid is considered in-use if it does not have an associated >>>> // Java object or mid is not old or deflation did not succeed. >>>> // A mid->is_new() node can be seen here when it is freshly returned >>>> // by omAlloc() (and skips the deflation code path). >>>> // A mid->is_old() node can be seen here when deflation failed. >>>> // A mid->is_free() node can be seen here when a fresh node from >>>> // omAlloc() is released by omRelease() due to losing the race >>>> // in inflate(). >>>> >>>> if (mid->object() != NULL && mid->is_new()) { >>>> // mid has an associated Java object and has now been seen >>>> // as newly allocated so mark it as "old". >>>> mid->set_allocation_state(ObjectMonitor::Old); >>>> } >>>> >>>> - why do you set it to old here rather than in inflate once we set values? >>>> >>>> >>>> Inflation is used in quite a few places. If we marked the >>>> ObjectMonitor as "Old" in inflate(), then that would make the >>>> ObjectMonitor available for deflation by deflate_monitor_using_JT() >>>> earlier: >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp: >>>> >>>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >>>> ObjectMonitor** freeHeadp, >>>> ObjectMonitor** freeTailp) { >>>> assert(AsyncDeflateIdleMonitors, "sanity check"); >>>> assert(Thread::current()->is_Java_thread(), "precondition"); >>>> // A newly allocated ObjectMonitor should not be seen here so we >>>> // avoid an endless inflate/deflate cycle. >>>> assert(mid->is_old(), "precondition"); >>>> >>>> >>>> So the idea behind only deflating ObjectMonitors that have reached >>>> allocation state "Old" is to prevent "an endless inflate/deflate >>>> cycle".
>>>> Here's the relevant section from Carsten's JEP: >>>> >>>> To avoid endless inflation / deflation cycles in the prototype, monitor >>>> >>>> deflation is only attempted the second time a monitor is seen by the >>>> >>>> thread marking monitors as deflatable: If the thread (the only thread >>>> >>>> marking monitors as deflatable; might be service thread or some GC >>>> >>>> related thread or even a dedicated thread) sees a monitor in state New, >>>> >>>> then the thread marks the monitor as Old and moves on. So there is >>>> >>>> little interaction between a thread inflating a lock to a monitor and >>>> >>>> the deflating thread, the inflating thread just has to make sure the >>>> >>>> monitor is marked New and this marker is published using appropriate >>>> >>>> barriers. >>>> >>>> >>>> There isn't an explicit example in the JEP of what Carsten was thinking >>>> of with "an endless inflate/deflate cycle". I didn't try to think of >>>> such an example for the OpenJDK wiki either. I simply wrote: >>>> >>> >>> I think I was thinking about a cycle where a Java object exhibits a monitor >>> inflation, then deflation, then inflation, then deflation. Each inflation >>> will be with a new monitor. This behavior could increase the number of >>> monitors allocated, especially with my original patch as I recycled >>> monitors only after a safepoint. Now that I think about it again, such a >>> cycle is incredibly unlikely as it would require repeated contention on the >>> java object, yet the monitors must not be busy when the deflator thread >>> comes by. And this scenario has to repeat itself. This all seems pretty >>> unlikely. >>> >>> ObjectMonitor has a new allocation_state field that supports three >>>> >>>> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >>>> >>>> to ObjectMonitors that have reached the 'Old' state.
When the Async >>>> >>>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >>>> >>>> is changed to the 'Old' state, but is not deflated. This prevents a >>>> >>>> newly allocated ObjectMonitor from being immediately deflated which >>>> >>>> could cause an inflation<->deflation oscillation. >>>> >>>> >>>> So let's think about what might happen if an ObjectMonitor is marked >>>> as "Old" in inflate(). Here's an example use of inflate() in the >>>> "slow enter" code path: >>>> >>>> src/hotspot/share/runtime/synchronizer.cpp: >>>>> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, >>>>> TRAPS) { >>>> >>>> base>>> inflate_cause_monitor_enter)->enter(THREAD); >>>> >>>> new>???? ObjectMonitorHandle omh; >>>> new>???? inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); >>>> new>???? do_loop = !omh.om_ptr()->enter(THREAD); >>>> >>>> In the "base" code, we took the return from inflate() and used it to >>>> call >>>> ObjectMonitor::enter(). If we never changed that bit of code and >>>> inflate() >>>> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() >>>> could >>>> async deflate the ObjectMonitor while we were trying to call enter() on >>>> it... Boom! So we might think that holding off marking an ObjectMonitor >>>> as "Old" can save us... and it can, but not in all cases... :-( >>>> >>>> It is entirely possible that our call to slow_enter() is made on an >>>> ObjectMonitor that's already marked "Old". In that case, our thread >>>> (T-enter) calls inflate() which returns the existing ObjectMonitor* >>>> and we use it to call enter(). If the thread (T-deflate) calling >>>> deflate_monitor_using_JT() does its magic before T-enter sets the >>>> owner field or the count field... Boom! >>>> >>>> The previous paragraph is exactly what motivated the _ref_count field, >>>> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* >>>> parameter to inflate(). 
inflate() calls ObjectMonitorHandle::save_om_ptr() >>>> which increments the ObjectMonitor's ref_count and then checks for async >>>> deflation protocol collisions. If there's a collision, then save_om_ptr() >>>> returns false and the caller (inflate() in this case) has to retry. When >>>> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot >>>> be deflated and is safe until the ObjectMonitorHandle is destroyed. >>>> >>>> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot >>>> deflate the ObjectMonitor in the window after inflate() returns and >>>> before T-enter sets the owner field or increments the count field. But >>>> you know all that already! >>>> >>>> So let's bring this back to having inflate() mark the ObjectMonitor as >>>> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, >>>> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). >>>> T-deflate cannot deflate it due to ref_count > 0. >>>> >>>> Here's another crazy thought... inflate() is the only function that >>>> calls omAlloc(), and omAlloc() is the only function that sets "New". >>>> If we move the setting of "Old" from deflate_monitor_list_using_JT() >>>> to inflate(), then the change from "New" -> "Old" never happens >>>> outside of the inflate() call so why do we need the allocation state? >>>> >>>> Small dose of reality: I've found having the allocation state to be >>>> very helpful when debugging race related crashes. We could make the >>>> allocation state be DEBUG_ONLY, but then what about race debugging of >>>> product bits... sigh... >>>> >>>> >>>> 6. Could you get rid of the new gotos? >>>> >>>> >>>> I believe there is only one left from Carsten's prototype: >>>> >>> >>> You make it sound like I was throwing gotos around left and right. :) If >>> you count continue and break statements, then you might have been right.
>>> >>> I'll break my response here, so we can return to regular structured >>> programming, ;-) >>> Carsten >>> > From goetz.lindenmaier at sap.com Fri Apr 12 10:33:05 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 12 Apr 2019 10:33:05 +0000 Subject: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null. In-Reply-To: References: <7c4b0bc27961471e91195bef9e767226@sap.com> <5c445ea9-24fb-0007-78df-41b94aadde2a@oracle.com> <8d1cc0b0-4a01-4564-73a9-4c635bfbfbaf@oracle.com> <3245ec3cefe2471e8382048164c0ba6b@sap.com> Message-ID: Hi, while waiting for progress on the corresponding JEP, I improved the implementation of generating the NPE message. It now uses a single outputStream. This removes several allocations of temporary data. I also removed TrackingStackSource. The analysis code originally addressed several use cases; for NullPointerExceptions this is not needed. I cleaned some code that is not (really) needed out of bytecodeUtils. I split get_null_pointer_slot() into two methods: get_NPE_null_slot() and print_NPE_failed_action(). This simplifies the implementation, and streamlines it more with the text in the JEP. I print methods using the code added in "8221470: Print methods in exception messages in java-like Syntax.", so it now prints 'void m(int)' instead of 'm(I)V'. I implemented a number of new test cases, and rearranged the test to test the message part of print_NPE_failed_action() and print_NPE_cause() separately. I made sure all bytecodes handled in these methods are covered. Further, I arranged the tests in methods according to the functional properties as discussed in the JEP. http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/07 Best regards, Goetz. > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Donnerstag, 14.
März 2019 21:56 > To: 'Mandy Chung' ; 'Roger Riggs' > > Cc: 'Java Core Libs' ; 'hotspot-runtime-dev at openjdk.java.net' > Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException > describing what is null. > > Hi, > > I had promised to work on a better wording of the messages. > > This I deliver with this webrev: > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05-otherMessages/ > > The test in the webrev is modified to just print messages along with the > code that raised the messages. > > Please have a look at these files with test output contained in the webrev: > Messages with debug information: > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05-otherMessages/output_with_debug_info.txt > Messages without debug information: > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05-otherMessages/output_no_debug_info.txt > > Especially look at the first few messages, they point out the usefulness > of this change. They precisely say what was null in a chain of dereferences. > > Best regards, > Goetz. > > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Wednesday, February 13, 2019 10:09 AM > > To: 'Mandy Chung' ; Roger Riggs > > > > Cc: Java Core Libs ; hotspot-runtime-dev at openjdk.java.net > > Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException > > describing what is null. > > > > Hi Mandy, > > > > Thanks for supporting my intent of adding the message as such! > > I'll start implementing this in Java and come back with a webrev > > in a while. > > > > In parallel, I would like to continue discussing the other > > topics, e.g., the wording of the message. I will probably come up > > with a separate webrev for that. > > > > Best regards, > > Goetz.
> > > > > > > > > -----Original Message----- > > > From: core-libs-dev On Behalf > > > Of Mandy Chung > > > Sent: Tuesday, February 12, 2019 7:32 PM > > > To: Roger Riggs > > > Cc: Java Core Libs ; hotspot-runtime- > > > dev at openjdk.java.net > > > Subject: Re: RFR(L): 8218628: Add detailed message to > > NullPointerException > > > describing what is null. > > > > > > On 2/8/19 11:46 AM, Roger Riggs wrote: > > > > Hi, > > > > > > > > A few higher level issues should be considered, though the details > > > > of the webrev captured my immediate attention. > > > > > > > > Is this the right feature and is this the right level of implementation > > > > (C++/native)? > > > > : > > > > How much of this can be done in Java code with StackWalker and other > > > > java APIs? > > > > It would be a shame to add this much native code if there was a more > > > robust > > > > way to implement it using APIs with more leverage. > > > > > > Improving the NPE message for better diagnosability is helpful while > > > I share the same concern Roger raised. > > > > > > Implementing this feature in Java and the library would be a better > > > choice as this isn't absolutely required to be done in VM in native. > > > > > > NPE keeps a backtrace capturing the method id and bci of each stack > > > frame. One option to explore is to have StackWalker to accept a > > > Throwable object that returns a stream of StackFrame which allows > > > you to get the method and BCI and also code source (I started a > > > prototype for JDK-8189752 some time ago). It can use the bytecode > > > library e.g. ASM to read the bytecode. For NPE message, you can > > > implement a specialized StackFrameTraverser just for building > > > an exception message purpose. 
> > > > > > Mandy From goetz.lindenmaier at sap.com Fri Apr 12 10:44:51 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 12 Apr 2019 10:44:51 +0000 Subject: RFR: 8220715 JEP Draft: Add detailed message to NullPointerException describing what is null Message-ID: Hi, This JEP proposes to enhance the messages of NullPointerExceptions that are thrown if execution of a bytecode fails due to a null reference. https://bugs.openjdk.java.net/browse/JDK-8220715 A corresponding prototype exists and is submitted to the sandbox repo and is available as change 8218628. I already got valuable input on this topic by Mandy: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-March/033115.html Is anyone else interested in commenting on this? Would someone sponsor this JEP? Best regards, Goetz. > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Freitag, 15. März 2019 11:55 > To: 'mark.reinhold at oracle.com' ; > maurizio.cimadamore at oracle.com > Cc: mandy.chung at oracle.com; roger.riggs at oracle.com; core-libs-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException > describing what is null. > > Hi everybody, Mark, > > I followed your advice and created a JEP: > https://bugs.openjdk.java.net/browse/JDK-8220715 > > Please point me to things I need to improve formally, this is my first > JEP. Also feel free to fix the administrative information in the > Jira issue if it is wrong. > > And, naturally, you're welcome to discuss the topic! > > Best regards, > Goetz. > > > -----Original Message----- > > From: mark.reinhold at oracle.com > > Sent: Donnerstag, 14.
März 2019 22:38 > > To: maurizio.cimadamore at oracle.com; Lindenmaier, Goetz > > > > Cc: mandy.chung at oracle.com; roger.riggs at oracle.com; core-libs-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > > Subject: Re: RFR(L): 8218628: Add detailed message to NullPointerException > > describing what is null. > > > > 2019/3/14 8:00:20 -0700, maurizio.cimadamore at oracle.com: > > > I second what Mandy says. > > > > > > First let me start by saying that this enhancement will be a great > > > addition to our platform; back in the days when I was teaching some Java > > > classes at the university, I was very aware of how hard it is to > > > diagnose a NPE for someone novel to Java programming. > > > > Agreed! > > > > > ... > > > > > > I also think that the design space for such an enhancement is non > > > trivial, and would best be explored (and captured!) in a medium that is > > > something other than a patch. ... > > > > Agreed, also. > > > > Goetz -- if, per Mandy's suggestion, you're going to write something > > up using the JEP template, might I suggest that you then submit it as > > an actual JEP? Giving visibility to, and recording, such design-space > > explorations is one of the primary goals of the JEP process. > > > > - Mark From coleen.phillimore at oracle.com Fri Apr 12 11:39:41 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 12 Apr 2019 07:39:41 -0400 Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT In-Reply-To: <0af7792f-a691-4737-d60d-99152860f05a@oracle.com> References: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com> <0af7792f-a691-4737-d60d-99152860f05a@oracle.com> Message-ID: <66832269-a5da-b117-15c2-32ddde02964f@oracle.com> On 4/11/19 7:24 PM, David Holmes wrote: > Hi Coleen, > > I was busy doing some archaeology on this code so didn't notice the > RFR. Glad Dan picked up on the only difference with the "verifiers" in > the LEAF variants.
> > FTR the differences here are historical. JRT was added first and > shortly needed to manifest the current thread directly. IRT were added > later and the "thread" was an implicit argument. But by July 1998 the > two ENTRY macros were the same. The only difference was the verifier > in the LEAF, and some custom variants of each macro that no longer exist. > > Conceptually I've always thought there was a difference in how the > interpreter needed to "enter the runtime" versus the compilers. So the > different macros made that clear. But if the requirements are > essentially identical, and always have been, then the distinction just > confuses things. Right. And as I put it in the bug, there have been some extensions to the JRT entries that every now and then, I think I need for the interpreter, so these distinctions are just confusing. > > I'm not clear what your resolution is here? Just accept that maybe we > lost a verify-gc call as Dan noted? I think there is no actual lost verify call, in that there is no IRT entry coming from native (that I can spot visually without running verification), and I can't figure out why it makes sense to verify that there's no GC, when we've already verified that there is no safepoint. NoGCVerifier::NoGCVerifier(bool verifygc) { _verifygc = verifygc; if (_verifygc) { CollectedHeap* h = Universe::heap(); assert(!h->is_gc_active(), "GC active during NoGCVerifier"); _old_invocations = h->total_collections(); } When is_gc_active is set here: void GenCollectedHeap::do_collection(bool full, ... assert(SafepointSynchronize::is_at_safepoint(), "should be at safepoint"); ... FlagSetting fl(_is_gc_active, true); Except for ZGC, which I can't tell, increment_total_collections also is called at a safepoint. It might be a useful assert if we want to prevent checking for the start of a concurrent collection.
If the thread is in native, it doesn't actually make sense because a thread in native should not have any direct access to oops or metadata. So if the verification is actually lost, which I doubt, that's a good thing. Coleen > > Thanks, > David > > On 12/04/2019 7:06 am, coleen.phillimore at oracle.com wrote: >> >> Dan, Thank you for reviewing. >> >> On 4/11/19 3:02 PM, Daniel D. Daugherty wrote: >>> On 4/11/19 11:39 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Replace IRT entry points with JRT. >>>> >>>> Tested with hs tier1-3 and built zero. And grepped from the right >>>> level directory this time. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev >>> >>> src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp >>> No comments. >>> >>> src/hotspot/cpu/arm/interpreterRT_arm.cpp >>> No comments. >>> >>> src/hotspot/cpu/ppc/interpreterRT_ppc.cpp >>> No comments. >>> >>> src/hotspot/cpu/s390/interpreterRT_s390.cpp >>> No comments. >>> >>> src/hotspot/cpu/sparc/interpreterRT_sparc.cpp >>> No comments. >>> >>> src/hotspot/cpu/x86/interpreterRT_x86_32.cpp >>> No comments. >>> >>> src/hotspot/cpu/x86/interpreterRT_x86_64.cpp >>> No comments. >>> >>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >>> No comments. >>> >>> src/hotspot/cpu/zero/interpreterRT_zero.cpp >>> No comments. >>> >>> src/hotspot/share/interpreter/interpreterRuntime.cpp >>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>> old L435: #define IRT_LEAF(result_type, header) >>> old L438: debug_only(NoSafepointVerifier __nspv(true);) >>> new L432: #define JRT_LEAF(result_type, header) >>> new L435: debug_only(JRTLeafVerifier __jlv;) >>> src/hotspot/share/runtime/interfaceSupport.cpp: >>> >>> JRTLeafVerifier::JRTLeafVerifier() >>> : NoSafepointVerifier(true, JRTLeafVerifier::should_verify_GC()) >>> { >>> } >>> >>>
src/hotspot/share/runtime/safepointVerifiers.hpp: >>> >>> NoSafepointVerifier(bool activated = true, bool verifygc = true ) : >>> NoGCVerifier(verifygc), >>> _activated(activated) { >>> >>> IRT_LEAF creates a NoSafepointVerifier with first ctr param == true >>> and the second ctr param == default true. >>> >>> JRT_LEAF creates a JRTLeafVerifier subclassed on NoSafepointVerifier >>> with first ctr param == true and second ctr param based on >>> JRTLeafVerifier::should_verify_GC() which can return either >>> true or false depending on the calling thread's state. If the >>> thread's state == _thread_in_Java, then the return == true. >>> If the thread's state == _thread_in_native, then the return == false. >>> >>> As long as all the IRT_LEAF uses are thread state == _thread_in_Java >>> then this is an equivalent change. >>> >>> I found these uses of IRT_LEAF: >>> >>> SharedRuntime::fixup_callers_callsite() >> >> This is called from the c2i adapter so would be thread_in_java. >> >>> InterpreterRuntime::bcp_to_di() >>> InterpreterRuntime::verify_mdp() >>> InterpreterRuntime::interpreter_contains() >>> InterpreterRuntime::popframe_move_outgoing_args() >>> InterpreterRuntime::trace_bytecode() >>> >> >> I assume these are thread_in_java too, since they have access to >> metadata (except interpreter_contains, whose callers have access to >> metadata). This would be dangerous for in_native threads to touch >> metadata directly. >> >>> I have not checked to see if these IRT_LEAF functions are >>> ever called from thread state == _thread_in_native locations, >>> but if they are, then we will no longer 'verifygc' with the >>> JRT_LEAF switch.
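Editorial sketch of the IRT_LEAF versus JRT_LEAF difference Dan describes above. The *Sketch types and the ThreadState enum are hypothetical simplifications: the real HotSpot classes read the current thread's state themselves and carry debug-only checking logic, and only the two thread states named in the review are modeled here. The point is just how the second constructor argument ends up true or false.

```cpp
#include <cassert>

// Hypothetical: only the two states mentioned in the review are modeled.
enum class ThreadState { InJava, InNative };

// Stand-in for NoGCVerifier: records whether GC verification is requested.
struct NoGCVerifierSketch {
  bool verify_gc;
  explicit NoGCVerifierSketch(bool verifygc = true) : verify_gc(verifygc) {}
};

// Stand-in for NoSafepointVerifier(bool activated = true, bool verifygc = true).
struct NoSafepointVerifierSketch : NoGCVerifierSketch {
  bool activated;
  explicit NoSafepointVerifierSketch(bool is_activated = true, bool verifygc = true)
      : NoGCVerifierSketch(verifygc), activated(is_activated) {}
};

// Stand-in for JRTLeafVerifier: the second argument depends on thread state.
struct JRTLeafVerifierSketch : NoSafepointVerifierSketch {
  // Assumption: the state is passed in explicitly instead of being read
  // from the current thread as the real code does.
  static bool should_verify_GC(ThreadState state) {
    return state == ThreadState::InJava;  // in native: GC is allowed, so false
  }
  explicit JRTLeafVerifierSketch(ThreadState state)
      : NoSafepointVerifierSketch(true, should_verify_GC(state)) {}
};
```

Under these assumptions the old IRT_LEAF form corresponds to NoSafepointVerifierSketch(true), which always verifies GC, while the JRT_LEAF form matches it only for threads in the Java state.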
>>> >> It seems like if one of these IRT calls was from native the >> >> NoSafepointVerifier( true /* no safepoints */, false /* don't verify GC */) might be a fix to these methods. Because of the comment: >> >> case _thread_in_native: >> // A native thread is not subject to safepoints. >> // Even while it is in a leaf routine, GC is ok >> return false; >> >> But I think it's the case that the behavior is the same, or more >> consistent if one of these is a native LEAF call that had IRT_LEAF. >> Trying to resolve these subtle differences is the trouble with mostly >> duplicated code :( >> >> Thanks! >> Coleen >> >>> src/hotspot/share/runtime/sharedRuntime.cpp >>> No comments. >>> >>> >>> Your call on what to do about the difference that I found between >>> IRT_LEAF and JRT_LEAF. We could be losing a 'verifygc' check here, >>> but... >>> >>> Dan >>> >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222297 >>>> >>>> Thanks, >>>> Coleen >>> >> From daniel.daugherty at oracle.com Fri Apr 12 13:31:00 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 12 Apr 2019 09:31:00 -0400 Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT In-Reply-To: <66832269-a5da-b117-15c2-32ddde02964f@oracle.com> References: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com> <0af7792f-a691-4737-d60d-99152860f05a@oracle.com> <66832269-a5da-b117-15c2-32ddde02964f@oracle.com> Message-ID: <398a9a38-d56f-0983-2493-cf6d2b738407@oracle.com> > > I'm not clear what your resolution is here? Just accept that maybe > we lost a verify-gc call as Dan noted? > > I think there is no actual lost verify call, in that there is no IRT > entry coming from native (that I can spot visually without running > verification), I could not spot any IRT_LEAF coming from native either. I'm good with pushing this change and keeping an eye open for any anomalies... Thumbs up!
Dan On 4/12/19 7:39 AM, coleen.phillimore at oracle.com wrote: > > > On 4/11/19 7:24 PM, David Holmes wrote: >> Hi Coleen, >> >> I was busy doing some archaeology on this code so didn't notice the >> RFR. Glad Dan picked up on the only difference with the "verifiers" >> in the LEAF variants. >> >> FTR the differences here are historical. JRT was added first and >> shortly needed to manifest the current thread directly. IRT were >> added later and the "thread" was an implicit argument. But by July >> 1998 the two ENTRY macros were the same. The only difference was the >> verifier in the LEAF, and some custom variants of each macro that no >> longer exist. >> >> Conceptually I've always thought there was a difference in how the >> interpreter needed to "enter the runtime" versus the compilers. So >> the different macros made that clear. But if the requirements are >> essentially identical, and always have been, then the distinction >> just confuses things. > > Right.? And as I put it in the bug, there have been some extensions to > the JRT entries that every now and then, I think I need for the > interpreter, so these distinctions are just confusing. >> >> I'm not clear what your resolution is here? Just accept that maybe we >> lost a verify-gc call as Dan noted? > > I think there is no actual lost verify call, in that there is no IRT > entry coming from native (that I can spot visually without running > verification), and I can't figure why out it's makes sense to verify > that there's no GC, when we've already verified that there is no > safepoint. > > NoGCVerifier::NoGCVerifier(bool verifygc) { > ? _verifygc = verifygc; > ? if (_verifygc) { > ??? CollectedHeap* h = Universe::heap(); > ??? assert(!h->is_gc_active(), "GC active during NoGCVerifier"); > ??? _old_invocations = h->total_collections(); > ? } > > When is_gc_active is set here: > > void GenCollectedHeap::do_collection(bool?????????? full, > ... > ? 
assert(SafepointSynchronize::is_at_safepoint(), "should be at > safepoint"); > ... > ? FlagSetting fl(_is_gc_active, true); > > Except for ZGC, which I can't tell, increment_total_collections also > is called at a safepoint. > > It might be a useful assert if we want to prevent checking for the > start of a concurrent collection.? If the thread is in native, it > doesn't actually make sense because in native should not have any > direct access to oops or metadata.? So if the verification is actually > lost, which I doubt, that's a good thing. > > Coleen > >> >> Thanks, >> David >> >> On 12/04/2019 7:06 am, coleen.phillimore at oracle.com wrote: >>> >>> Dan, Thank you for reviewing. >>> >>> On 4/11/19 3:02 PM, Daniel D. Daugherty wrote: >>>> On 4/11/19 11:39 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Replace IRT entry points with JRT. >>>>> >>>>> Tested with hs tier1-3 and built zero.? And grepped from the right >>>>> level directory this time. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222297.01/webrev >>>> >>>> src/hotspot/cpu/aarch64/interpreterRT_aarch64.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/arm/interpreterRT_arm.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/ppc/interpreterRT_ppc.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/s390/interpreterRT_s390.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/sparc/interpreterRT_sparc.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/x86/interpreterRT_x86_32.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/x86/interpreterRT_x86_64.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/cpu/zero/interpreterRT_zero.cpp >>>> ??? No comments. >>>> >>>> src/hotspot/share/interpreter/interpreterRuntime.cpp >>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>> ??? old L435: #define IRT_LEAF(result_type, header) >>>> ??? old L438: ??? 
debug_only(NoSafepointVerifier __nspv(true);) >>>> ??? new L432: #define JRT_LEAF(result_type, header) >>>> ??? new L435: ? debug_only(JRTLeafVerifier __jlv;) >>>> ??????? src/hotspot/share/runtime/interfaceSupport.cpp: >>>> >>>> ?? ?? ??? JRTLeafVerifier::JRTLeafVerifier() >>>> ? ? ? ? ? ? : NoSafepointVerifier(true, >>>> JRTLeafVerifier::should_verify_GC()) >>>> ? ? ????? { >>>> ? ? ????? } >>>> >>>> ??????? src/hotspot/share/runtime/safepointVerifiers.hpp: >>>> >>>> ????????? NoSafepointVerifier(bool activated = true, bool verifygc >>>> = true ) : >>>> ? ? ? ? ? ? NoGCVerifier(verifygc), >>>> ? ? ? ? ? ? _activated(activated) { >>>> >>>> ??????? IRT_LEAF creates a NoSafepointVerifier with first ctr param >>>> == true >>>> ??????? and the second ctr param == default true. >>>> >>>> ??????? JRT_LEAF creates a JRTLeafVerifier subclassed on >>>> NoSafepointVerifier >>>> ??????? with first ctr param == true and second ctr param based on >>>> ??????? JRTLeafVerifier::should_verify_GC() which can return either >>>> ??????? true or false depending on the calling thread's state. If the >>>> ??????? thread's state == _thread_in_Java, then the return == true. >>>> ??????? If the thread's state == _thread_in_native, then the return >>>> == false. >>>> >>>> ??????? As long as all the IRT_LEAF uses are thread state == >>>> _thread_in_Java >>>> ??????? then this is an equivalent change. >>>> >>>> ??????? I found these uses of IRT_LEAF: >>>> >>>> ??? ?? ?? SharedRuntime::fixup_callers_callsite() >>> >>> This is called from the c2i adapter so would be thread_in_java. >>> >>>> ??? InterpreterRuntime::bcp_to_di() >>>> ?? ?? ??? InterpreterRuntime::verify_mdp() >>>> ?? ?? ??? InterpreterRuntime::interpreter_contains() >>>> ?? ?? ??? InterpreterRuntime::popframe_move_outgoing_args() >>>> ?? ?? ??? 
InterpreterRuntime::trace_bytecode() >>>> >>> >>> I assume these are thread_in_java too, since they have access to >>> metadata (except interpreter_contains, whose callers have access to >>> metadata).? This would be dangerous for in_native threads to touch >>> metadata directly. >>> >>>> I have not checked to see if these IRT_LEAF functions are >>>> ??????? ever called from thread state == _thread_in_native locations, >>>> ??????? but if they are, then we will no longer 'verifygc' with the >>>> ??????? JRT_LEAF switch. >>>> >>> It seems like if one of these IRT calls was from native the >>> >>> NoSafepointVerifier( true /* no safepoints */, false /* don't verify >>> GC */) might be a fix to these methods.? Because of the comment: >>> >>> ?? case _thread_in_native: >>> ???? // A native thread is not subject to safepoints. >>> ???? // Even while it is in a leaf routine, GC is ok >>> ???? return false; >>> >>> But I think it's the case that the behavior is the same, or more >>> consistent if one of these is a native LEAF call that had IRT_LEAF. >>> Trying to resolve these subtle differences is the trouble with >>> mostly duplicated code :( >>> >>> Thanks! >>> Coleen >>> >>>> src/hotspot/share/runtime/sharedRuntime.cpp >>>> ??? No comments. >>>> >>>> >>>> Your call on what to do about the difference that I found between >>>> IRT_LEAF and JRT_LEAF. We could be losing a 'verifygc' check here, >>>> but... >>>> >>>> Dan >>>> >>>> >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222297 >>>>> >>>>> Thanks, >>>>> Coleen >>>> >>> > From daniel.daugherty at oracle.com Fri Apr 12 13:35:09 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty)
Date: Fri, 12 Apr 2019 09:35:09 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> <620e300d-a941-5e55-1814-f37c4b559655@oracle.com>
Message-ID: <37f22bd1-a95f-a8f0-d51f-765c07ab5bd4@oracle.com>

Sounds like a plan...

Dan

On 4/12/19 5:23 AM, David Holmes wrote:
> Hi Dan,
>
> Thanks for the response. I think I need some gory details here, so
> I'll take it up with you separately.
>
> David
>
> On 11/04/2019 3:09 am, Daniel D. Daugherty wrote:
>> On 4/10/19 1:15 AM, David Holmes wrote:
>>> Hi Carsten, Dan,
>>>
>>> I'd like to pick up on one topic - a higher-level discussion about
>>> the timing of the ObjectMonitor lifecycle as it currently is and
>>> with these changes:
>>>
>>> Carsten wrote:
>>>> I think I was thinking about a cycle where a Java object exhibits
>>>> a monitor inflation, then deflation, then inflation, then deflation.
>>>> Each inflation will be with a new monitor. This behavior could
>>>> increase the number of monitors allocated, especially with my
>>>> original patch as I recycled monitors only after a safepoint. Now
>>>> that I think about it again, such a cycle is incredibly unlikely as
>>>> it would require repeated contention on the java object, yet the
>>>> monitors must not be busy when the deflator thread comes by. And this
>>>> scenario has to repeat itself. This all seems pretty unlikely.
>>>
>>> So logically every Object has associated with it an ObjectMonitor
>>> but if we created the ObjectMonitor at the same time as the Object
>>> and kept it alive while the Object was alive then we would double
>>> our memory use (if not worse).
>>
>> Generally worse. In one of my recent debug sessions on MacOSX with
>> product bits, I had to figure out sizeof(ObjectMonitor) for memory
>> dumping purposes and it was 224 bytes.
>> >> >>> So we lazily create ObjectMonitors only when we need them: >>> contention, Object.wait() use, hashcode use. >> >> Clarification: hashcode with contention. hashcode by itself does not >> require inflation. >> >> >>> We could then leave the ObjectMonitors around as long as the Objects >>> are alive, but again this has implications for memory use. >>> >>> So we deflate idle ObjectMonitors to reclaim memory (though in >>> practice it is more complex and we maintain pools of them to speed >>> up allocation). >>> >>> If we aggressively deflate as soon as an ObjectMonitor is idle then >>> we risk getting into inflate->deflate->inflate cycles. The >>> likelihood may be low but if you hit this pathology in your code you >>> will probably be unhappy about the effects on performance. >>> >>> So instead, IIUC, we use some measure of "memory pressure" and only >>> try to deflate under certain conditions. But I'm unclear exactly >>> what those conditions are today, and whether they change with async >>> monitor deflation. Can you enlighten me please? >> >> Without trying to describe the existing trigger mechanisms for monitor >> deflation (lots of details), Async Monitor Deflation uses the same >> safepoint cleanup trigger points for _initiating_ monitor deflation. >> However, unlike safepoint cleanup work which will finish the job during >> the current safepoint, Async Monitor Deflation will start after the >> safepoint that initiated the monitor deflation, but there is no >> guarantee >> when the ServiceThread or the JavaThreads will finish their deflation >> work >> (maybe not before the next safepoint). >> >> The v2.00/3-for-jdk13 webrev has the "must be seen twice to deflate" >> algorithm in place. So for any given ObjectMonitor, the first time it >> is seen by ObjectSynchronizer::deflate_monitor_list_using_JT(), it will >> not be deflated even if eligible (allocation state == "New"). 
The second >> time that it is seen by deflate_monitor_list_using_JT() (allocation >> state >> == "Old"), it is eligible for deflation. >> >> In this round of code review, we are talking about setting the >> allocation >> state to "Old" in inflate() which would make an ObjectMonitor eligible >> for deflation in the next round of deflation. >> >> >> One of my tasks is to update the existing comments about ObjectMonitor >> life cycle (I'll use another base monitor subsystem subtask); I don't >> think they were updated with the monitor list changes so there's a bit >> of catch up work to do there. I also need to update them for Async >> Monitor Deflation life cycle changes and that will be part of this >> project. >> >> Dan >> >>> >>> Thanks, >>> David >>> >>> On 10/04/2019 12:25 pm, Carsten Varming wrote: >>>> Hi Dan, >>>> >>>> On Mon, Apr 8, 2019 at 9:04 PM Daniel D. Daugherty < >>>> daniel.daugherty at oracle.com> wrote: >>>> >>>>> On 4/5/19 4:59 PM, Karen Kinnear wrote: >>>>> >>>>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees >>>>> that 0 < >>>>> _count >>>>> with comments that caller ensured _count <= 0 >>>>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >>>>> ? Am I missing something subtle here or should they be the same >>>>> guarantees? >>>>> >>>>> >>>>> Here's the code in question: >>>>> >>>>> src/hotspot/share/runtime/objectMonitor.cpp: >>>>> >>>>> void ObjectMonitor::EnterI(TRAPS) { >>>>> >>>>> ?? if (_owner == DEFLATER_MARKER) { >>>>> ???? guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= >>>>> 0 should >>>>> have been handled by the caller"); >>>>> ???? // Deflater thread tried to lock this monitor, but it failed >>>>> to make >>>>> _count negative and gave up. >>>>> >>>>> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * >>>>> SelfNode) { >>>>> >>>>> ???? if (_owner == DEFLATER_MARKER) { >>>>> ?????? 
guarantee(0 <= _count, "Impossible: _owner == >>>>> DEFLATER_MARKER && >>>>> _count < 0, monitor must not be owned by deflater thread here"); >>>>> >>>>> >>>>> Reading these two guarantee() calls always throws me off stride >>>>> because I would have written them like this: >>>>> >>>>> ???? guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= >>>>> 0 should >>>>> have been handled by the caller"); >>>>> >>>>> and >>>>> >>>>> ?????? guarantee(_count >= 0, "Impossible: _owner == >>>>> DEFLATER_MARKER && >>>>> _count < 0, monitor must not be owned by deflater thread here"); >>>>> >>>>> When rewritten like the above, you have: >>>>> >>>>> ???? "_count > 0" ... _count <= 0 >>>>> >>>>> and: >>>>> >>>>> ???? "_count >= 0" ... "_count < 0" >>>>> >>>>> which is easier for my brain to read... okay... enough sidebar... >>>>> >>>> >>>> He he. I have pretty much eliminated > and >= from my written >>>> vocabulary. >>>> It makes life simpler. Trust me. :) >>>> >>>> >>>>> Short answer: No the guarantees should not be the same. >>>>> >>>>> Longer answer: EnterI() is called by enter() after enter() has >>>>> incremented the _count field to indicate the contended state of >>>>> things. So in EnterI(), "_count > 0" is the right check. >>>>> ReenterI() is called after wait() has returned (notified or >>>>> timedout), and the _count field is not used on reentry ops so >>>>> "_count >= 0" is the right check. >>>>> >>>>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, >>>>> there are two places in EnterI() that do this): >>>>> >>>>> ???? L501:?? if (_owner == DEFLATER_MARKER) { >>>>> ?????????????? // The deflation protocol finished the first part >>>>> (setting >>>>> _owner), >>>>> ?????????????? // but it failed the second part (making _count >>>>> negative) >>>>> and bailed. >>>>> ?????????????? // Because we're called from enter() we have at >>>>> least one >>>>> contention. >>>>> ?????????????? 
guarantee(count > 0, "_owner == DEFLATER_MARKER && >>>>> _count <= >>>>> 0 should have been handled by the caller"); >>>>> ???? L504:???? // Try to acquire monitor. >>>>> ???? L505:???? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>>>> DEFLATER_MARKER) { >>>>> >>>>> ???? L629:???? if (_owner == DEFLATER_MARKER) { >>>>> ???????????????? // The deflation protocol finished the first part >>>>> (setting >>>>> _owner), >>>>> ???????????????? // but it failed the second part (making _count >>>>> negative) >>>>> and bailed. >>>>> ???????????????? // Because we're called from enter() we have at >>>>> least one >>>>> contention. >>>>> ???????????????? guarantee(count> 0 , "_owner == DEFLATER_MARKER >>>>> && _count >>>>> <= 0 should have been handled by the caller"); >>>>> ???? L632:?????? if (Atomic::cmpxchg(Self, &_owner, >>>>> DEFLATER_MARKER) == >>>>> DEFLATER_MARKER) { >>>>> >>>>> And I'm going to tweak the ReenterI() code like this: >>>>> >>>>> ???? L759:???? if (_owner == DEFLATER_MARKER) { >>>>> ???????????????? // The deflation protocol finished the first part >>>>> (setting >>>>> _owner), >>>>> ???????????????? // but it will observe _waiters != 0 and will >>>>> bail out. >>>>> Because we're >>>>> ???????????????? // called from wait() we may or may not have any >>>>> contentions. >>>>> ???????????????? guarantee(count >= 0, "Impossible: _owner == >>>>> DEFLATER_MARKER && _count < 0 should have been handled by the >>>>> caller"); >>>>> ???? L761:?????? if (Atomic::cmpxchg(Self, &_owner, >>>>> DEFLATER_MARKER) == >>>>> DEFLATER_MARKER) { >>>>> >>>>> >>>>> You didn't ask this, but it is okay that _count is only used to track >>>>> contentions in enter()/EnterI() and is not used to track contentions >>>>> in wait()/ReenterI(). For the wait()/ReenterI() code path, >>>>> _waiters is >>>>> used by is_busy() to observe the busy state for an ObjectMonitor that >>>>> is being wait()'ed for. 
The _waiters field is decremented after a >>>>> waiter has returned from ReenterI() so the _owner field takes over >>>>> answering the is_busy() question... >>>>> >>>>> >>>>> 5. I could use a little help with allocation state transitions, >>>>> e.g. in deflate_monitor_list_using_JT >>>>> ?? you see is_new with object set so you mark it as old so next >>>>> deflation >>>>> will check it >>>>> >>>>> >>>>> Here's the code in question: >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp: >>>>> >>>>> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** >>>>> listHeadp, >>>>> ObjectMonitor** >>>>> freeHeadp, >>>>> ObjectMonitor** >>>>> freeTailp, >>>>> ObjectMonitor** >>>>> savedMidInUsep) { >>>>> >>>>> ???? // Only try to deflate if there is an associated Java object >>>>> and if >>>>> ???? // mid is old (is not newly allocated and is not newly freed). >>>>> ???? if (mid->object() != NULL && mid->is_old() && >>>>> ???????? deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { >>>>> ?????? // Deflation succeeded so update the in-use list. >>>>> >>>>> ???? } else { >>>>> ?????? // mid is considered in-use if it does not have an associated >>>>> ?????? // Java object or mid is not old or deflation did not succeed. >>>>> ?????? // A mid->is_new() node can be seen here when it is freshly >>>>> returned >>>>> ?????? // by omAlloc() (and skips the deflation code path). >>>>> ?????? // A mid->is_old() node can be seen here when deflation >>>>> failed. >>>>> ?????? // A mid->is_free() node can be seen here when a fresh node >>>>> from >>>>> ?????? // omAlloc() is released by omRelease() due to losing the race >>>>> ?????? // in inflate(). >>>>> >>>>> ?????? if (mid->object() != NULL && mid->is_new()) { >>>>> ???????? // mid has an associated Java object and has now been seen >>>>> ???????? // as newly allocated so mark it as "old". >>>>> ???????? mid->set_allocation_state(ObjectMonitor::Old); >>>>> ?????? } >>>>> >>>>> ?? 
- why do you set it to old here rather than in inflate once we set >>>>> values? >>>>> >>>>> >>>>> Inflation is used in quite a few places. If we marked the >>>>> ObjectMonitor as "Old" in inflate(), then that would make the >>>>> ObjectMonitor available for deflation by deflate_monitor_using_JT() >>>>> earlier: >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp: >>>>> >>>>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >>>>> ObjectMonitor** >>>>> freeHeadp, >>>>> ObjectMonitor** >>>>> freeTailp) { >>>>> ?? assert(AsyncDeflateIdleMonitors, "sanity check"); >>>>> ?? assert(Thread::current()->is_Java_thread(), "precondition"); >>>>> ?? // A newly allocated ObjectMonitor should not be seen here so we >>>>> ?? // avoid an endless inflate/deflate cycle. >>>>> ?? assert(mid->is_old(), "precondition"); >>>>> >>>>> >>>>> So the idea behind only deflating ObjectMonitors that have reached >>>>> allocation state "Old" is to prevent "an endless inflate/deflate >>>>> cycle". >>>>> Here's the relevant section from Carsten's JEP: >>>>> >>>>> To avoid endless inflation / deflation cycles in the prototype, >>>>> monitor >>>>> >>>>> deflation is only attempted the second time a monitor is seen by the >>>>> >>>>> thread marking monitors as deflatable: If the thread (the only thread >>>>> >>>>> marking monitors as deflatable; might be service thread or some GC >>>>> >>>>> related thread or even a dedicated thread) sees a monitor in state >>>>> New, >>>>> >>>>> then the thread marks the monitor as Old and moves on. So there is >>>>> >>>>> little interaction between a thread inflating a lock to a monitor and >>>>> >>>>> the deflating thread, the inflating thread just has to make sure the >>>>> >>>>> monitor is marked New and this marker is published using appropriate >>>>> >>>>> barriers. >>>>> >>>>> >>>>> There isn't an explicit example in the JEP of what Carsten was >>>>> thinking >>>>> of with "an endless inflate/deflate cycle". 
I didn't try to think of >>>>> such an example for the OpenJDK wiki either. I simple wrote: >>>>> >>>> >>>> I think I was thinking about a cycle where a Java object exhibits a >>>> monitor >>>> inflation, then deflation, then inflation, then deflation. Each >>>> inflation >>>> will be with a new monitor. This behavior could increase the number of >>>> monitors allocated, especially with my original patch as I recycled >>>> monitors only after a safepoint. Now that I think about it again, >>>> such a >>>> cycle is incredible unlikely as it would require repeated >>>> contention on the >>>> java object, yet the monitors must not be busy when the deflator >>>> thread >>>> comes by. And this scenario has to repeat itself. This all seems >>>> pretty >>>> unlikely. >>>> >>>> ObjectMonitor has a new allocation_state field that supports three >>>>> >>>>> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >>>>> >>>>> to ObjectMonitors that have reached the 'Old' state. When the Async >>>>> >>>>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >>>>> >>>>> is changed to the 'Old' state, but is not deflated. This prevents a >>>>> >>>>> newly allocated ObjectMonitor from being immediately deflated which >>>>> >>>>> could cause an inflation<->deflation oscillation. >>>>> >>>>> >>>>> So let's think about what might happen if an ObjectMonitor is marked >>>>> as "Old" in inflate(). Here's an example use of inflate() in the >>>>> "slow enter" code path: >>>>> >>>>> src/hotspot/share/runtime/synchronizer.cpp: >>>>>> void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, >>>>>> TRAPS) { >>>>> >>>>> base>>>> inflate_cause_monitor_enter)->enter(THREAD); >>>>> >>>>> new>???? ObjectMonitorHandle omh; >>>>> new>???? inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); >>>>> new>???? 
do_loop = !omh.om_ptr()->enter(THREAD); >>>>> >>>>> In the "base" code, we took the return from inflate() and used it >>>>> to call >>>>> ObjectMonitor::enter(). If we never changed that bit of code and >>>>> inflate() >>>>> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() >>>>> could >>>>> async deflate the ObjectMonitor while we were trying to call >>>>> enter() on >>>>> it... Boom! So we might think that holding off marking an >>>>> ObjectMonitor >>>>> as "Old" can save us... and it can, but not in all cases... :-( >>>>> >>>>> It is entirely possible that our call to slow_enter() is made on an >>>>> ObjectMonitor that's already marked "Old". In that case, our thread >>>>> (T-enter) calls inflate() which returns the existing ObjectMonitor* >>>>> and we use it to call enter(). If the thread (T-deflate) calling >>>>> deflate_monitor_using_JT() does its magic before T-enter sets the >>>>> owner field or the count field... Boom! >>>>> >>>>> The previous paragraph is exactly what motivated the _ref_count >>>>> field, >>>>> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* >>>>> parameter to inflate(). inflate() calls >>>>> ObjectMonitorHandle::save_om_ptr() >>>>> which increments the ObjectMonitor's ref_count and then checks for >>>>> async >>>>> deflation protocol collisions. If there's a collision, then >>>>> save_om_ptr() >>>>> returns false and the caller (inflate() in this case) has to >>>>> retry. When >>>>> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle >>>>> cannot >>>>> be deflated and is safe until the ObjectMonitorHandle is destroyed. >>>>> >>>>> So by changing T-enter to use an ObjectMonitorHandle, T-deflate >>>>> cannot >>>>> deflate the ObjectMonitor in the window after inflate() returns and >>>>> before T-enter sets the owner field or increments the count field. >>>>> But >>>>> you know all that already! 
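The pinning idea described above (a handle bumps a reference count, and the deflater backs off while the count is positive) can be sketched roughly as follows. This is a toy model under stated assumptions, not HotSpot code: `ToyMonitor`, `ToyMonitorHandle`, `try_deflate()` and the `kDeflated` sentinel are hypothetical names invented for illustration, and only the method names `save_om_ptr()` and `om_ptr()` are borrowed from the message so the sketch is easier to map back to it. The real protocol involves more fields (owner, count, waiters) than are modelled here.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sentinel: a large negative ref_count marks a deflated monitor,
// so stray increments from racing threads still leave the value negative.
constexpr int kDeflated = -1000000;

// Hypothetical stand-in for ObjectMonitor; only the reference count is modelled.
struct ToyMonitor {
  std::atomic<int> ref_count{0};
};

// Hypothetical stand-in for ObjectMonitorHandle: pins a monitor while alive.
class ToyMonitorHandle {
 public:
  ToyMonitorHandle() = default;
  ToyMonitorHandle(const ToyMonitorHandle&) = delete;
  ToyMonitorHandle& operator=(const ToyMonitorHandle&) = delete;

  // Dropping the handle releases the pin, making deflation possible again.
  ~ToyMonitorHandle() {
    if (om_ != nullptr) om_->ref_count.fetch_sub(1);
  }

  // Increment first, then check for a collision with deflation.
  // Returns false if the monitor was already deflated; the caller must retry
  // (in this toy, with a different monitor).
  bool save_om_ptr(ToyMonitor* om) {
    assert(om_ == nullptr && "toy handle pins at most one monitor");
    if (om->ref_count.fetch_add(1) < 0) {
      om->ref_count.fetch_sub(1);  // undo our increment, leave it deflated
      return false;
    }
    om_ = om;
    return true;
  }

  ToyMonitor* om_ptr() const { return om_; }

 private:
  ToyMonitor* om_ = nullptr;
};

// Deflater side: succeeds only if no handle currently pins the monitor.
inline bool try_deflate(ToyMonitor* om) {
  int expected = 0;
  return om->ref_count.compare_exchange_strong(expected, kDeflated);
}
```

In this toy, a thread that holds a `ToyMonitorHandle` can use the monitor without worrying that `try_deflate()` succeeds underneath it, which is the window between `inflate()` returning and `enter()` taking ownership that the message is concerned with.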
>>>>>
>>>>> So let's bring this back to having inflate() mark the
>>>>> ObjectMonitor as "Old"... Since inflate() returns an ObjectMonitor
>>>>> with the ref_count > 0, it doesn't matter if the ObjectMonitor is
>>>>> marked as "Old" in inflate(). T-deflate cannot deflate it due to
>>>>> ref_count > 0.
>>>>>
>>>>> Here's another crazy thought... inflate() is the only function that
>>>>> calls omAlloc(), and omAlloc() is the only function that sets "New".
>>>>> If we move the setting of "Old" from deflate_monitor_list_using_JT()
>>>>> to inflate(), then the change from "New" -> "Old" never happens
>>>>> outside of the inflate() call so why do we need the allocation state?
>>>>>
>>>>> Small dose of reality: I've found having the allocation state to be
>>>>> very helpful when debugging race related crashes. We could make the
>>>>> allocation state be DEBUG_ONLY, but then what about race debugging of
>>>>> product bits... sigh...
>>>>>
>>>>>
>>>>> 6. Could you get rid of the new goto's?
>>>>>
>>>>>
>>>>> I believe there is only one left from Carsten's prototype:
>>>>>
>>>>
>>>> You make it sound like I was throwing gotos around left and right. :)
>>>> If you count continue and break statements, then you might have been
>>>> right.
>>>>
>>>> I'll break my response here, so we can return to regular structured
>>>> programming, ;-)
>>>> Carsten
>>>>
>>

From coleen.phillimore at oracle.com Fri Apr 12 13:59:17 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 12 Apr 2019 09:59:17 -0400
Subject: RFR (XS) 8222297: IRT_ENTRY/IRT_LEAF etc are the same as JRT
In-Reply-To: <398a9a38-d56f-0983-2493-cf6d2b738407@oracle.com>
References: <584edb17-c310-5440-087e-c507e4dd4875@oracle.com> <0af7792f-a691-4737-d60d-99152860f05a@oracle.com> <66832269-a5da-b117-15c2-32ddde02964f@oracle.com> <398a9a38-d56f-0983-2493-cf6d2b738407@oracle.com>
Message-ID:

Thanks Dan!
Coleen

On 4/12/19 9:31 AM, Daniel D.
Daugherty wrote:
>> > I'm not clear what your resolution is here? Just accept that maybe
>> we lost a verify-gc call as Dan noted?
>>
>> I think there is no actual lost verify call, in that there is no IRT
>> entry coming from native (that I can spot visually without running
>> verification),
>
> I could not spot any IRT_LEAF coming from native either.
>
> I'm good with pushing this change and keeping an eye open for any
> anomalies...
>
> Thumbs up!
>
> Dan
>>>>> >>>>> >>>>> Your call on what to do about the difference that I found between >>>>> IRT_LEAF and JRT_LEAF. We could be losing a 'verifygc' check here, >>>>> but... >>>>> >>>>> Dan >>>>> >>>>> >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222297 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>> >> > From rkennke at redhat.com Fri Apr 12 15:13:51 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 12 Apr 2019 17:13:51 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> References: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> Message-ID: <9b8a0644ae253a4225f879077ef5f7b791c26746.camel@redhat.com> Alright, I am experimenting with this. It turns out that there are a *lot* of paths that call into oopDesc::klass() which need extra handling, all of which come out of the GC. Example is oopDesc::size() and oopDesc::oop_iterate() and various others. I am wondering if a simple BarrierSet::obj_klass() might be useful/feasible for such cases? I can work around some of those occurances by calling variants that take a Klass* parameter instead, and resolve that internally in the GC. But not always. For example, I can call oopDesc::size_given_klass() instead. However, that blows up later because objArrayKlass::obj_size() is asserting klass()- >is_objArrayKlass() or something like that later... Ideas? Roman > On 4/12/19 8:23 AM, Roman Kennke wrote: > > > > Am 12. April 2019 07:18:41 MESZ schrieb Per Liden < > > per.liden at oracle.com>: > > > Hi Roman, > > > > > > On 04/11/2019 10:58 PM, Roman Kennke wrote: > > > > An upcoming feature in Shenandoah requires that GC can > > > > intercept > > > loading > > > > the Klass* of an object. I'd like to introduce a GC interface > > > > for > > > that. > > > > > > Could you please give some more insight into what this feature is > > > and > > > why it needs to intercept these loads? 
> > > > Oh yeah, apparently it was late last night. ;-) We want to > > eliminate Shenandoah's extra word for the forwarding pointer and > > stick it in the Klass* word for forwarded objects, with a little > > bit of encoding to distinguish a forward pointer from a Klass*. > > Therefore we need to see Klass* loads to check that encoding and > > possibly load the Klass* from the forwardee instead. > > Maybe I'm missing something, but didn't you just switch to having a > to-space invariant? What runtime code can get hold of a an oop > pointing > into from-space and then try load its klass? > > cheers, > Per > > > Roman > > > > > thanks, > > > Per > > > > > > > This change covers the runtime part. > > > > > > > > Bug: > > > > https://bugs.openjdk.java.net/browse/JDK-8222281 > > > > Webrev: > > > > http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ > > > > > > > > The change takes the 3 variants of oopDesc::klass() and funnels > > > > them > > > > through the Access API to be intercepted by the GC in any way > > > > it > > > wants. > > > > Behaviour- and performance-wise it should be identical > > > > (assuming the > > > > compiler can make sense of the Access API). > > > > > > > > I see that there might be an opportunity here to make the if > > > > (UseCompressedClassPointers) check pre-resolved, but I don't > > > > know how > > > to > > > > do that. I also don't really see how that is supposed to work > > > > for > > > loads > > > > and stores wrt UseCompressedOops either: in order to select the > > > proper > > > > functions on first call, and subsequently go through that > > > > selected > > > > function, it would have to be done in the *_init() functions. > > > > But I > > > > don't see any selection code there. Instead, it seems to be in > > > > PreRuntimeDispatch? I must be missing something. I left the > > > > selection > > > in > > > > the raw implementation. Maybe we want to sort this out? 
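
The idea of storing a forwarding pointer in the Klass* word, as described in the reply above, might look roughly like the following. This is a minimal sketch under stated assumptions: the low-bit tag, the `Obj` and `Klass` structs and all function names are hypothetical, and the actual Shenandoah encoding is not given in this thread.

```cpp
#include <cassert>
#include <cstdint>

struct Klass { int id; };

// One header word that holds either a Klass* or a tagged forwarding pointer.
struct Obj {
  uintptr_t klass_word;
};

// Assumed encoding: pointers are at least 2-byte aligned, so the lowest bit
// is free and can mark "this word is a forwarding pointer".
const uintptr_t FORWARDED_TAG = 1;

inline void set_klass(Obj* o, Klass* k) {
  o->klass_word = reinterpret_cast<uintptr_t>(k);
}

inline void forward_to(Obj* from, Obj* to) {
  from->klass_word = reinterpret_cast<uintptr_t>(to) | FORWARDED_TAG;
}

inline bool is_forwarded(const Obj* o) {
  return (o->klass_word & FORWARDED_TAG) != 0;
}

inline Obj* forwardee(const Obj* o) {
  return reinterpret_cast<Obj*>(o->klass_word & ~FORWARDED_TAG);
}

// The load that the GC would need to intercept: if the word is a forwarding
// pointer, read the Klass* from the forwardee instead.
inline Klass* load_klass(const Obj* o) {
  if (is_forwarded(o)) {
    o = forwardee(o);
  }
  return reinterpret_cast<Klass*>(o->klass_word);
}
```

The sketch shows why a plain read of the word is no longer enough once an object is forwarded: every Klass* load has to check the tag first.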
> > > > > > > > > > > > I must say that I fought with myself whether or not I should > > > > add to > > > the > > > > madness that is the Access API. What should have been a > > > 1-line-addition > > > > to an API (e.g. BarrierSet) turned out to become: > > > > > > > > 7 files changed, 86 insertions(+), 20 deletions(-) > > > > > > > > and two days of work. And god forbid we ever have to change or > > > > even > > > fix > > > > anything there. > > > > > > > > I was about to just not do it in Access API at all, but > > > > somewhere > > > else > > > > instead, but then I did not want to introduce a schism there, > > > > so I > > > bit > > > > the bullet. Maybe we should consider to turn this into a proper > > > > C++ > > > > interface instead? This is just unmaintainable madness. > > > > > > > > > > > > Testing: tier1 fine. Will submit into jdk/submit shortly, works > > > > with > > > the > > > > Shenandoah prototype that I have here (iow, the API is good) > > > > > > > > Can I please get a review? > > > > > > > > Thanks, > > > > Roman > > > > From mikhailo.seledtsov at oracle.com Fri Apr 12 17:25:37 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Fri, 12 Apr 2019 10:25:37 -0700 Subject: RFR(S): JDK-8222299: [TESTBUG] move hotspot container tests to hotspot/containers In-Reply-To: <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> References: <9548b55e-4745-98f4-d5bd-1bbee0b1d8c8@oracle.com> <13c0cc36-0bd0-f8cd-45ba-f5e3893cc5a9@oracle.com> <8702daac-092c-4236-f9c1-0f3ba1b6267e@oracle.com> Message-ID: David, Igor, Jie, ? Thank you for review. I did not hear any additional feedback therefore I will go ahead and integrate v01: http://cr.openjdk.java.net/~mseledtsov/8222299.01/ Thank you, Misha On 4/11/19 6:59 PM, mikhailo.seledtsov at oracle.com wrote: > David, Jie, thank you for reviews. 
> > Here is the webrev with update to the docs (html and .md): > > http://cr.openjdk.java.net/~mseledtsov/8222299.01/ > > > Thank you, > > Misha > > > On 4/11/19 6:47 PM, mikhailo.seledtsov at oracle.com wrote: >> Thank you Jie, >> >> Good catch. I will update the docs. >> >> >> Misha >> >> >> On 4/11/19 5:57 PM, Jie Fu wrote: >>> Hi Misha, >>> >>> It might be better to update the test doc[1][2] together. >>> Thanks. >>> >>> Best regards, >>> Jie >>> >>> [1] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l384 >>> >>> [2] >>> http://hg.openjdk.java.net/jdk/jdk/file/a2795025f417/doc/testing.md#l389 >>> >>> >>> >>> On 2019/4/12 ??6:46, mikhailo.seledtsov at oracle.com wrote: >>>> Please review this change that moves HotSpot container tests into >>>> their own directory under test/hotspot/jtreg and >>>> creates hotspot_containers test group. Since container tests >>>> require specially setup/configured environment, it is best to group >>>> them into their own group, >>>> so they could be executed in properly configured environment. The >>>> details of this change, such as new location were >>>> recommended by David and Igor during earlier public review for this >>>> issue (8222299: [TESTBUG] Docker tests should be excluded from >>>> hotspot_runtime group). >>>> >>>> Also, as part of this review, Igor recommended to use jtreg.skipped >>>> exception if docker image build fails, which I agreed to >>>> and implemented. >>>> >>>> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222299 >>>> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222299.00/ >>>> ??? Testing: >>>> ?????? Ran the affected tests on a machine configured for Docker >>>> testing. >>>> ?????? 
Ran them two ways, via specifying the directory as well as
>>>> by using a newly created group - PASS
>>>>
>>>>
>>>> Thank you,
>>>> Misha
>>>>
>>>
>>
>

From erik.gahlin at oracle.com Fri Apr 12 17:45:34 2019
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Fri, 12 Apr 2019 19:45:34 +0200
Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
In-Reply-To:
References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com>
 <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com>
 <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com>
 <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com>
Message-ID: <5CB0CEBE.5000400@oracle.com>

On 2019-04-10 22:03, gerard ziemski wrote:
>
>
> On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> I noticed that events are only emitted if we are able to take the
>>>> resize lock. Can this be fixed? What prevents us from always
>>>> getting the data? That's how other periodic events work and losing
>>>> data sometimes may lead to subtle bugs that hard to understand and
>>>> replicate in systems that rely on the information. Could we retry
>>>> on a failure?
>>> Good observation. If the resize lock is taken, then it's not likely
>>> that whoever owns it will be done soon, so retrying is most likely
>>> not going to succeed right away. Is it OK to tie up JFR periodic
>>> thread for some time? If so, how long?
There is no general upper limit for periodic events.

If we need to wait for a safepoint, we need to do it. That said, events
that can induce significant latencies or CPU overhead (even in
pathological cases) are off in default.jfc and only enabled in
profile.jfc, or not at all.

As I understand it, the events themselves don't cause latencies and the
tables are not expanded that often, so I think it would be okay to emit
them. If you think otherwise, I would try to scan concurrently, even if
it means we are slightly off.
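
The two options discussed above (skip the event when a resize is in progress, or scan concurrently and accept a slightly stale result) can be contrasted in a small model. This is a sketch under assumptions only: `Table`, the `resizing` flag standing in for the resize lock, and both counting functions are hypothetical and not taken from the HotSpot hashtable code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

const std::size_t BUCKETS = 8;

// Hypothetical table: per-bucket entry counts plus a flag that stands in
// for the resize lock.
struct Table {
  std::atomic<bool> resizing;
  std::atomic<std::size_t> bucket_sizes[BUCKETS];
  Table() : resizing(false) {
    for (auto& b : bucket_sizes) b.store(0);
  }
};

inline std::size_t sum_buckets(Table& t) {
  std::size_t total = 0;
  for (auto& b : t.bucket_sizes) total += b.load(std::memory_order_relaxed);
  return total;
}

// Option 1: only report when the "lock" can be taken. Returns false, and
// therefore produces no event, while a resize is in progress.
inline bool count_if_not_resizing(Table& t, std::size_t* out) {
  bool expected = false;
  if (!t.resizing.compare_exchange_strong(expected, true)) return false;
  *out = sum_buckets(t);
  t.resizing.store(false);
  return true;
}

// Option 2: always report. The sum may be slightly off if buckets change
// during the scan, but an event is produced on every period.
inline std::size_t count_concurrently(Table& t) {
  return sum_buckets(t);
}
```

Option 1 is exact but can miss a period, while option 2 never misses a period but may be approximate, which is the trade-off raised in the message above.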
>>> >>> >>> If the lock is taken, then it means that someone is scanning through >>> the entire table, or the table is being resized. Either way, we're >>> not loosing data, but are just temporarily blind - I don't see a >>> problem here for a long running apps, they will start receiving >>> events eventually (which happen every 10 sec by default) A user can set period "everyChunk" which means events are guaranteed to be in the recording. I think we should try to avoid breaking that contract. When event streaming is in place, we can implement requestable events where a user can demand an event programmatically from Java. If they sometimes don't get an event, it will break their code in a subtle way. Thanks Erik >> >> Robbin was talking about allowing scanning the table while resizing, >> ie. not having the resize_lock, if we can accept that there might be >> some entries double counted. > > Yes, we could do that - are you suggesting that this is what we should > do? Personally, I think I'd prefer not to emit the event at all, > rather than emit one that might be wrong (that's exactly what we do > currently for jcmd print statistics). > > Erik, Robbin, do you have a preference here? 
> > > cheers > From gerard.ziemski at oracle.com Fri Apr 12 20:13:22 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Fri, 12 Apr 2019 15:13:22 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CB0CEBE.5000400@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> <5CB0CEBE.5000400@oracle.com> Message-ID: <72c5aeb5-d65c-33d0-fc95-5e469316478f@oracle.com> On 4/12/19 12:45 PM, Erik Gahlin wrote: > On 2019-04-10 22:03, gerard ziemski wrote: >> >> >> On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> I noticed that events are only emitted if we are able to take the >>>>> resize lock. Can this be fixed? What prevents us from always >>>>> getting the data? That's how other periodic events work and losing >>>>> data sometimes may lead to subtle bugs that hard to understand and >>>>> replicate in systems that rely on the information. Could we retry >>>>> on a failure? >>>> Good observation. If the resize lock is taken, then it's not likely >>>> that whoever owns it will be done soon, so retrying is most likely >>>> not going to succeed right away. Is it OK to tie up JFR periodic >>>> thread for some time? If so, how long? > There is no general upper limit for periodic events. > > If we need to wait for a safepoint, we need to do it. That said, > events that can induce significant latencies or CPU overhead (even in > pathological cases) are off in default.jfc and only enabled in > profile.jfr, or not at all. > > As I understand it, the events themselves don't cause latencies and > the tables are not expanded that often, so I think it would be okay to > emit them.? If you think otherwise, I would try to scan concurrently, > even if it means we are slightly off. 
>
>>>>
>>>>
>>>> If the lock is taken, then it means that someone is scanning
>>>> through the entire table, or the table is being resized. Either
>>>> way, we're not loosing data, but are just temporarily blind - I
>>>> don't see a problem here for a long running apps, they will start
>>>> receiving events eventually (which happen every 10 sec by default)
> A user can set period "everyChunk" which means events are guaranteed
> to be in the recording.
>
> I think we should try to avoid breaking that contract. When event
> streaming is in place, we can implement requestable events where a
> user can demand an event programmatically from Java. If they sometimes
> don't get an event, it will break their code in a subtle way.

No problem, I removed the resize_lock around the JFR table statistics,
so we might get slightly incorrect stats every now and then, but we
will be emitting the events on schedule:

http://cr.openjdk.java.net/~gziemski/8185525_rev7

Last question: what is the recommended way to programmatically tell if
JFR is ON? I'm wondering whether I should collect the add/remove rates
for the tables only if JFR is ON. As it is right now, we collect them
always. It's just an atomic increment, but still, it's work only JFR
events need.


cheers

From per.liden at oracle.com Mon Apr 15 05:58:51 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 15 Apr 2019 07:58:51 +0200
Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range
Message-ID: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com>

When testing JDK-8222145 (Add -XX:SoftMaxHeapSize flag), I ran into
this issue.

-XX:+PrintFlagsRanges prints incorrect value range for flags that are
associated with a constraint function. Instead of printing an empty
range "[ ... ]" it prints the default value range for the type. It
should do the opposite.

The fix in jvmFlagRangeList.cpp is trivial. However, it has the side
effect that some options that were previously not tested are now
tested, and vice versa.
As a result I had to exclude testing the max range of two problematic
options (ActiveProcessorCount and G1PeriodicGCInterval). It might also
be the case that some of the previously excluded options no longer need
to be excluded, but I didn't want to fiddle with that in this patch. If
this is the case, I suggest that is fixed in a follow-up RFE.

Testing: tier1-3 on all Oracle platforms, tier1-7 on Linux/x86_64

Bug: https://bugs.openjdk.java.net/browse/JDK-8222460
Webrev: http://cr.openjdk.java.net/~pliden/8222460/webrev.0

/Per

From per.liden at oracle.com Mon Apr 15 06:17:32 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 15 Apr 2019 08:17:32 +0200
Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part
In-Reply-To: <9b8a0644ae253a4225f879077ef5f7b791c26746.camel@redhat.com>
References: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com>
 <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com>
 <9b8a0644ae253a4225f879077ef5f7b791c26746.camel@redhat.com>
Message-ID: <0a9d1386-5594-dfe6-f842-dc0001ac2f72@oracle.com>

Hi Roman,

On 04/12/2019 05:13 PM, Roman Kennke wrote:
> Alright, I am experimenting with this. It turns out that there are a
> *lot* of paths that call into oopDesc::klass() which need extra
> handling, all of which come out of the GC. Example is oopDesc::size()
> and oopDesc::oop_iterate() and various others.

It's still not quite clear to me why you need this. Put another way, I
think you should really try to design this such that you don't need
this. Doing things like size() or oop_iterate() on from-space objects
sounds like there's an abstraction/load-barrier missing somewhere.

I can understand if the GC sometimes needs to fiddle with from-space
objects, but that should be a Shenandoah internal thing and shouldn't
have to leak out into BarrierSet, etc.

Btw, once your conversion to to-space invariant is complete (is it?),
I'm hoping we can start cleaning out all the stuff we no longer need
from BarrierSet and the Access API.
cheers, Per > > I am wondering if a simple BarrierSet::obj_klass() might be > useful/feasible for such cases? I can work around some of those > occurances by calling variants that take a Klass* parameter instead, > and resolve that internally in the GC. But not always. For example, I > can call oopDesc::size_given_klass() instead. However, that blows up > later because objArrayKlass::obj_size() is asserting klass()- >> is_objArrayKlass() or something like that later... > > Ideas? > > Roman > >> On 4/12/19 8:23 AM, Roman Kennke wrote: >>> >>> Am 12. April 2019 07:18:41 MESZ schrieb Per Liden < >>> per.liden at oracle.com>: >>>> Hi Roman, >>>> >>>> On 04/11/2019 10:58 PM, Roman Kennke wrote: >>>>> An upcoming feature in Shenandoah requires that GC can >>>>> intercept >>>> loading >>>>> the Klass* of an object. I'd like to introduce a GC interface >>>>> for >>>> that. >>>> >>>> Could you please give some more insight into what this feature is >>>> and >>>> why it needs to intercept these loads? >>> >>> Oh yeah, apparently it was late last night. ;-) We want to >>> eliminate Shenandoah's extra word for the forwarding pointer and >>> stick it in the Klass* word for forwarded objects, with a little >>> bit of encoding to distinguish a forward pointer from a Klass*. >>> Therefore we need to see Klass* loads to check that encoding and >>> possibly load the Klass* from the forwardee instead. >> >> Maybe I'm missing something, but didn't you just switch to having a >> to-space invariant? What runtime code can get hold of a an oop >> pointing >> into from-space and then try load its klass? >> >> cheers, >> Per >> >>> Roman >>> >>>> thanks, >>>> Per >>>> >>>>> This change covers the runtime part. 
>>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8222281 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ >>>>> >>>>> The change takes the 3 variants of oopDesc::klass() and funnels >>>>> them >>>>> through the Access API to be intercepted by the GC in any way >>>>> it >>>> wants. >>>>> Behaviour- and performance-wise it should be identical >>>>> (assuming the >>>>> compiler can make sense of the Access API). >>>>> >>>>> I see that there might be an opportunity here to make the if >>>>> (UseCompressedClassPointers) check pre-resolved, but I don't >>>>> know how >>>> to >>>>> do that. I also don't really see how that is supposed to work >>>>> for >>>> loads >>>>> and stores wrt UseCompressedOops either: in order to select the >>>> proper >>>>> functions on first call, and subsequently go through that >>>>> selected >>>>> function, it would have to be done in the *_init() functions. >>>>> But I >>>>> don't see any selection code there. Instead, it seems to be in >>>>> PreRuntimeDispatch? I must be missing something. I left the >>>>> selection >>>> in >>>>> the raw implementation. Maybe we want to sort this out? >>>>> >>>>> >>>>> I must say that I fought with myself whether or not I should >>>>> add to >>>> the >>>>> madness that is the Access API. What should have been a >>>> 1-line-addition >>>>> to an API (e.g. BarrierSet) turned out to become: >>>>> >>>>> 7 files changed, 86 insertions(+), 20 deletions(-) >>>>> >>>>> and two days of work. And god forbid we ever have to change or >>>>> even >>>> fix >>>>> anything there. >>>>> >>>>> I was about to just not do it in Access API at all, but >>>>> somewhere >>>> else >>>>> instead, but then I did not want to introduce a schism there, >>>>> so I >>>> bit >>>>> the bullet. Maybe we should consider to turn this into a proper >>>>> C++ >>>>> interface instead? This is just unmaintainable madness. >>>>> >>>>> >>>>> Testing: tier1 fine. 
Will submit into jdk/submit shortly, works >>>>> with >>>> the >>>>> Shenandoah prototype that I have here (iow, the API is good) >>>>> >>>>> Can I please get a review? >>>>> >>>>> Thanks, >>>>> Roman >>>>> From rkennke at redhat.com Mon Apr 15 07:44:11 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 15 Apr 2019 09:44:11 +0200 Subject: RFR: JDK-8222281: GC interface for load-klass: runtime part In-Reply-To: <0a9d1386-5594-dfe6-f842-dc0001ac2f72@oracle.com> References: <789EF88B-3E6E-4455-8335-96185BAECA07@redhat.com> <67a0953d-7d74-b315-f571-a8516a9a8776@oracle.com> <9b8a0644ae253a4225f879077ef5f7b791c26746.camel@redhat.com> <0a9d1386-5594-dfe6-f842-dc0001ac2f72@oracle.com> Message-ID: Am 15.04.19 um 08:17 schrieb Per Liden: > Hi Roman, > > On 04/12/2019 05:13 PM, Roman Kennke wrote: >> Alright, I am experimenting with this. It turns out that there are a >> *lot* of paths that call into oopDesc::klass() which need extra >> handling, all of which come out of the GC. Example is oopDesc::size() >> and oopDesc::oop_iterate() and various others. > > It's till not quite clear to me why you need this. Put another way, I > think you should really try to design this such that you don't need > this. Doing things like size() or oop_iterate() on from-space objects > sounds like there's an abstraction/load-barrier missing somewhere. It's GC code: e.g. when I want to evacuate an object, I need to know its size, and we're racing with other threads to do the same, possibly. Said that, I've come up with a way to handle it nicely, and actually improves some code paths. Stay tuned! Roman > I can understand if the GC sometimes needs to fiddle with from-space > objects, but that should be a Shenandoah internal thing and shouldn't > have to leak out into BarrierSet, etc. > > Btw, once your conversion to to-space invariant is complete (is it?), > I'm hoping we can start to cleaning out all the stuff we no longer > need from BarrierSet and the Access API. 
> > cheers, > Per > >> >> I am wondering if a simple BarrierSet::obj_klass() might be >> useful/feasible for such cases? I can work around some of those >> occurances by calling variants that take a Klass* parameter instead, >> and resolve that internally in the GC. But not always. For example, I >> can call oopDesc::size_given_klass() instead. However, that blows up >> later because objArrayKlass::obj_size() is asserting klass()- >>> is_objArrayKlass() or something like that later... >> >> Ideas? >> >> Roman >> >>> On 4/12/19 8:23 AM, Roman Kennke wrote: >>>> >>>> Am 12. April 2019 07:18:41 MESZ schrieb Per Liden < >>>> per.liden at oracle.com>: >>>>> Hi Roman, >>>>> >>>>> On 04/11/2019 10:58 PM, Roman Kennke wrote: >>>>>> An upcoming feature in Shenandoah requires that GC can >>>>>> intercept >>>>> loading >>>>>> the Klass* of an object. I'd like to introduce a GC interface >>>>>> for >>>>> that. >>>>> >>>>> Could you please give some more insight into what this feature is >>>>> and >>>>> why it needs to intercept these loads? >>>> >>>> Oh yeah, apparently it was late last night. ;-) We want to >>>> eliminate Shenandoah's extra word for the forwarding pointer and >>>> stick it in the Klass* word for forwarded objects, with a little >>>> bit of encoding to distinguish a forward pointer from a Klass*. >>>> Therefore we need to see Klass* loads to check that encoding and >>>> possibly load the Klass* from the forwardee instead. >>> >>> Maybe I'm missing something, but didn't you just switch to having a >>> to-space invariant? What runtime code can get hold of a an oop >>> pointing >>> into from-space and then try load its klass? >>> >>> cheers, >>> Per >>> >>>> Roman >>>> >>>>> thanks, >>>>> Per >>>>> >>>>>> This change covers the runtime part. 
>>>>>> >>>>>> Bug: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8222281 >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8222281/webrev.00/ >>>>>> >>>>>> The change takes the 3 variants of oopDesc::klass() and funnels >>>>>> them >>>>>> through the Access API to be intercepted by the GC in any way >>>>>> it >>>>> wants. >>>>>> Behaviour- and performance-wise it should be identical >>>>>> (assuming the >>>>>> compiler can make sense of the Access API). >>>>>> >>>>>> I see that there might be an opportunity here to make the if >>>>>> (UseCompressedClassPointers) check pre-resolved, but I don't >>>>>> know how >>>>> to >>>>>> do that. I also don't really see how that is supposed to work >>>>>> for >>>>> loads >>>>>> and stores wrt UseCompressedOops either: in order to select the >>>>> proper >>>>>> functions on first call, and subsequently go through that >>>>>> selected >>>>>> function, it would have to be done in the *_init() functions. >>>>>> But I >>>>>> don't see any selection code there. Instead, it seems to be in >>>>>> PreRuntimeDispatch? I must be missing something. I left the >>>>>> selection >>>>> in >>>>>> the raw implementation. Maybe we want to sort this out? >>>>>> >>>>>> >>>>>> I must say that I fought with myself whether or not I should >>>>>> add to >>>>> the >>>>>> madness that is the Access API. What should have been a >>>>> 1-line-addition >>>>>> to an API (e.g. BarrierSet) turned out to become: >>>>>> >>>>>> ??? 7 files changed, 86 insertions(+), 20 deletions(-) >>>>>> >>>>>> and two days of work. And god forbid we ever have to change or >>>>>> even >>>>> fix >>>>>> anything there. >>>>>> >>>>>> I was about to just not do it in Access API at all, but >>>>>> somewhere >>>>> else >>>>>> instead, but then I did not want to introduce a schism there, >>>>>> so I >>>>> bit >>>>>> the bullet. Maybe we should consider to turn this into a proper >>>>>> C++ >>>>>> interface instead? This is just unmaintainable madness. 
>>>>>>
>>>>>>
>>>>>> Testing: tier1 fine. Will submit into jdk/submit shortly, works
>>>>>> with
>>>>> the
>>>>>> Shenandoah prototype that I have here (iow, the API is good)
>>>>>>
>>>>>> Can I please get a review?
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>

From robbin.ehn at oracle.com Mon Apr 15 08:58:34 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 15 Apr 2019 10:58:34 +0200
Subject: RFR(s): 8218147: make_walkable asserts on multiple calls
In-Reply-To: <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com>
References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com>
 <587d1afe-9f86-f801-92de-31711b711034@oracle.com>
 <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com>
Message-ID:

Hi, please review.

After re-examining this issue:
Threads in native must always have their stack walkable.
The JFR sampler should never need to make a stack walkable (for a
native sample).

I managed to reproduce this reliably locally, with changes to the JFR
sampler and hundreds of threads running code similar to that in the bug
(looping, creating an array with a negative size).

I found a place where we don't properly look at the suspend flags.
The Java thread can thus escape native and make its stack unwalkable,
and later it tries to make it walkable at the same time as the JFR
sampler.

By removing some kind of fast check and instead always calling
check_safepoint_and_suspend_for_native_trans (which has the JFR native
trans suspend check), I can no longer reproduce.
And it passes t1-5.

Code:
http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/
Issue:
https://bugs.openjdk.java.net/browse/JDK-8218147

Thanks, Robbin

On 4/5/19 5:43 PM, Robbin Ehn wrote:
> Hi Dean,
>
> Sorry, I missed this mail.
> Yes we can do that.
> Ignore my other mail, I'll update.
>
> Thanks, Robbin
>
>
> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST)
>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote:
>>>
>>>>>
>>>>> If it's already set, should we check that _last_Java_pc matches the
>>
>>>>> new value?
>>>>
>>>> We manually set the pc in several places, so if it's set, it's not
>>>> certain that
>>>> it should be the same as in last sp.
>>>> I can't distinguish between the cases.
>>>>
>>>
>>> If we get pc from sp[-1] then it should match, but you're right, we
>>> sometimes get pc from somewhere else.
>>
>> How about if we combine the !walkable check and the
>> capture_last_Java_pc() logic into a single method?
>> Then we can do something like:
>>
>>     if (!walkable()) {
>>         address pc = (address)_last_Java_sp[-1];
>>         address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL);
>>         assert(a == NULL || a == pc, "unexpected PC %p", a);
>>     }
>>
>> dl

From robbin.ehn at oracle.com Mon Apr 15 09:04:37 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 15 Apr 2019 11:04:37 +0200
Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths
Message-ID: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com>

Hi all, please review.

Removing some dead code.

Code:
http://cr.openjdk.java.net/~rehn/8222327/webrev/
Issue:
https://bugs.openjdk.java.net/browse/JDK-8222327

Passes t1-5.

Thanks, Robbin

From claes.redestad at oracle.com Mon Apr 15 09:40:23 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 15 Apr 2019 11:40:23 +0200
Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths
In-Reply-To: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com>
References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com>
Message-ID: <39592868-25d6-5882-c89c-65af3559e005@oracle.com>

Nice cleanup!

Seems you could trivially do the same for _stackSize_offset

Thanks!

/Claes

On 2019-04-15 11:04, Robbin Ehn wrote:
> Hi all, please review.
>
> Removing some dead code.
>
> Code:
> http://cr.openjdk.java.net/~rehn/8222327/webrev/
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8222327
>
> Passes t1-5.
> > Thanks, Robbin From david.holmes at oracle.com Mon Apr 15 11:28:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Apr 2019 21:28:34 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> Message-ID: <58a83122-9881-b7b4-d6f0-51b5f92e919c@oracle.com> Hi Robbin, On 15/04/2019 7:04 pm, Robbin Ehn wrote: > Hi all, please review. > > Removing some dead code. You haven't deleted the fields in javaClasses.hpp. Otherwise looks fine. Thanks, David > Code: > http://cr.openjdk.java.net/~rehn/8222327/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222327 > > Passes t1-5. > > Thanks, Robbin From david.holmes at oracle.com Mon Apr 15 11:29:56 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Apr 2019 21:29:56 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <58a83122-9881-b7b4-d6f0-51b5f92e919c@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <58a83122-9881-b7b4-d6f0-51b5f92e919c@oracle.com> Message-ID: <50cf3e85-5fe9-afd0-3f18-c1effa2d1f91@oracle.com> On 15/04/2019 9:28 pm, David Holmes wrote: > Hi Robbin, > > On 15/04/2019 7:04 pm, Robbin Ehn wrote: >> Hi all, please review. >> >> Removing some dead code. > > You haven't deleted the fields in javaClasses.hpp. Doh! Please ignore that :) David > Otherwise looks fine. > > Thanks, > David > >> Code: >> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222327 >> >> Passes t1-5. 
>> >> Thanks, Robbin From david.holmes at oracle.com Mon Apr 15 11:34:10 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Apr 2019 21:34:10 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <39592868-25d6-5882-c89c-65af3559e005@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <39592868-25d6-5882-c89c-65af3559e005@oracle.com> Message-ID: <5a823b4c-eeab-36e3-99e9-f8ffe23ec492@oracle.com> On 15/04/2019 7:40 pm, Claes Redestad wrote: > Nice cleanup! > > Seems you could trivially do the same for _stackSize_offset Seems any of the checks for _foo_offset > 0 can be removed as all fields must always be present. David > Thanks! > > /Claes > > On 2019-04-15 11:04, Robbin Ehn wrote: >> Hi all, please review. >> >> Removing some dead code. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222327 >> >> Passes t1-5. >> >> Thanks, Robbin From gerard.ziemski at oracle.com Mon Apr 15 15:29:29 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Mon, 15 Apr 2019 10:29:29 -0500 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> Message-ID: <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> On 4/15/19 12:58 AM, Per Liden wrote: > When testing JDK-8222145 (Add -XX:SoftMaxHeapSize flag), I ran into > this issue. > > -XX:+PrintFlagsRanges prints incorrect value range for flags that are > associated with a constraint function. Instead of printing an empty > range "[ ...?????????????????????????? ]" is prints the default value > range for the type. It should do the opposite. > > The fix in jvmFlagRangeList.cpp is trivial. However, it has the side > effect that some options that were previously not tested are now > tested, and vice versa. 
As a result I had to exclude testing the max > range of two problematic options (ActiveProcessorCount and > G1PeriodicGCInterval). It might also be the cases that some of the > previously excluded options no longer need to be excluded, but I > didn't want to fiddle with that in this patch. If this is the case, I > suggest that is fixed in a follow up RFE. > > Testing: tier1-3 in all Oracle platforms, tier1-7 on Linux/x86_64 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8222460 > Webrev: http://cr.openjdk.java.net/~pliden/8222460/webrev.0 > > /Per > hi Per, I'm sorry that this caused an issue for you. I think the original idea here was that for those flags with a constraint, we wanted to show the possible range, from which the constraint will further restrict the final value. However, that is tricky to test without exposing the constraint function, as evidenced by the exclusion list in the test. For those flags without range and constraint, the implicit range is the max range of the flag's type, so the idea here was that such flag was "untestable" for practical purposes, so we print an empty range. I believe that a better fix here might be to print an empty range in both cases. The cleanup of the exclusion flags in the test (and removal of default print range functions, if not needed anymore) can be handled by a followup issue. cheers From per.liden at oracle.com Mon Apr 15 15:58:41 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 15 Apr 2019 17:58:41 +0200 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> Message-ID: Hi Gerard, On 04/15/2019 05:29 PM, gerard ziemski wrote: > > > On 4/15/19 12:58 AM, Per Liden wrote: >> When testing JDK-8222145 (Add -XX:SoftMaxHeapSize flag), I ran into >> this issue. 
>> >> -XX:+PrintFlagsRanges prints incorrect value range for flags that are >> associated with a constraint function. Instead of printing an empty >> range "[ ... ]" is prints the default value >> range for the type. It should do the opposite. >> >> The fix in jvmFlagRangeList.cpp is trivial. However, it has the side >> effect that some options that were previously not tested are now >> tested, and vice versa. As a result I had to exclude testing the max >> range of two problematic options (ActiveProcessorCount and >> G1PeriodicGCInterval). It might also be the cases that some of the >> previously excluded options no longer need to be excluded, but I >> didn't want to fiddle with that in this patch. If this is the case, I >> suggest that is fixed in a follow up RFE. >> >> Testing: tier1-3 in all Oracle platforms, tier1-7 on Linux/x86_64 >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8222460 >> Webrev: http://cr.openjdk.java.net/~pliden/8222460/webrev.0 >> >> /Per >> > > hi Per, > > I'm sorry that this caused an issue for you. No problem. > > I think the original idea here was that for those flags with a > constraint, we wanted to show the possible range, from which the > constraint will further restrict the final value. However, that is > tricky to test without exposing the constraint function, as evidenced by > the exclusion list in the test. > > For those flags without range and constraint, the implicit range is the > max range of the flag's type, so the idea here was that such flag was > "untestable" for practical purposes, so we print an empty range. But that's not very useful to an actual user, who want's to know what the range is (even if every value allowed by the type is valid). We can't expect a user to know what the range of a specific type is on every platform. > > I believe that a better fix here might be to print an empty range in > both cases. But why print an empty range when the range is well known? 
We don't have to make -XX:+PrintFlagsRanges dumber than it needs to be. The only time we don't know the range is what there's a constraint function associated with the flag. Frankly, to me this looks like the original intent of this code, but a simple mistake snuck in which inverted the if-else statement. cheers, Per > > The cleanup of the exclusion flags in the test (and removal of default > print range functions, if not needed anymore) can be handled by a > followup issue. > > > cheers > From karen.kinnear at oracle.com Mon Apr 15 16:17:08 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Mon, 15 Apr 2019 12:17:08 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Message-ID: <4B563AE0-2C52-4349-B59E-A52103A8B570@oracle.com> Dan, Sorry to be so slow to get back to you, > On Apr 8, 2019, at 9:04 PM, Daniel D. Daugherty wrote: > > On 4/5/19 4:59 PM, Karen Kinnear wrote: >> Dan, >> >> Some more minor comments from reading the code: > > Thanks for the additional comments. I'm gathering changes for the next > round of code review (CR1) so these will be resolved in that round... > > More below... > > >> 1. Could you add comments to markOop.hpp about >> the use in the displaced_mark_word of is_marked to prevent any users of is_marked >> here from needing to have that information saved/restored? > > I _think_ I know what you're looking for here... Perhaps this: > > src/hotspot/share/oops/markOop.hpp: > // ObjectMonitor::install_displaced_markword_in_object() uses > // is_marked() on ObjectMonitor::_header as part of the restoration > // protocol for an object's header. In this usage, the mark bit is > // only ever set (and cleared) on the ObjectMonitor::_header field. 
> bool is_marked() const { > return (mask_bits(value(), lock_mask_in_place) == marked_value); > } Yes - thank you. Potentially prevents future overlaps. > > >> 2. In objectMonitor.hpp >> in is_busy you clarify the difference in use between _count (which I think you may be changing >> to _contended) and _ref_count. Could you possibly also comment where you declare them? > > I'll do the rename of _count -> _contentions in a subtask of 8153224 > like the other cleanups of the monitor subsystem. > > Here's the comment in question: > > src/hotspot/share/runtime/objectMonitor.hpp: > intptr_t is_busy() const { > // TODO-FIXME: merge _count and _waiters. > // TODO-FIXME: assert _owner == null implies _recursions = 0 > // TODO-FIXME: assert _WaitSet != null implies _count > 0 > // We do not include _ref_count in the is_busy() check because > // _ref_count is for indicating that the ObjectMonitor* is in > // use which is orthogonal to whether the ObjectMonitor itself > // is in use for a locking operation. > return _count|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList); > } > > I don't think this comment clarifies _count vs. _ref_count. > I added the last four lines of the comment and their purpose is to > describe why _ref_count isn't used by is_busy(). The TODO-FIXME lines > need to be revisited since (at least) the third one is wrong. > > Here's the existing comment for _ref_count: > > volatile jint _ref_count; // ref count for ObjectMonitor* > > Here's the existing comment for _count: > > volatile jint _count; // reference count to prevent reclamation/deflation > // at stop-the-world time. See ObjectSynchronizer::deflate_monitor(). > // _count is approximately |_WaitSet| + |_EntryList| > > And here's what I proposed to change it to in my reply to your design > review notes: > > volatile jint _contentions; // Number of active contentions in enter(). It is used by is_busy() > // along with other fields to determine if an ObjectMonitor can be > // deflated. 
See ObjectSynchronizer::deflate_monitor(). > > I think we're good here with the proposed change of comment (and the > rename) for the _contentions field along with existing comment for > _ref_count and the existing comment for is_busy(). I may delete the > third TODO-FIXME line as part of the next cleanup. Thank you for the rename and comment > > >> 3. clear_using_JT: would it make sense to have an assertion that _owner is either null or DEFLATER_MARKER? > > We could add something like: > > assert(_owner == NULL || > (AsyncDeflateIdleMonitors && _owner == DEFLATER_MARKER), > "Fatal logic error in ObjectMonitor owner!"); > > and that will catch any races in async monitor deflation where the > _owner field is set to a monitor owner value (stack addr or thread*). > For monitor deflation at a safepoint, the non-NULL _owner field is > caught in clear() (which calls clear_using_JT()). That covers my concern. > > >> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count >> with comments that caller ensured _count <= 0 >> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >> ? Am I missing something subtle here or should they be the same guarantees? > > Here's the code in question: > > src/hotspot/share/runtime/objectMonitor.cpp: > > void ObjectMonitor::EnterI(TRAPS) { > > if (_owner == DEFLATER_MARKER) { > guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); > // Deflater thread tried to lock this monitor, but it failed to make _count negative and gave up. 
> > void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { > > if (_owner == DEFLATER_MARKER) { > guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here"); > > > Reading these two guarantee() calls always throws me off stride > because I would have written them like this: > > guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); > > and > > guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0, monitor must not be owned by deflater thread here"); > > When rewritten like the above, you have: > > "_count > 0" ... _count <= 0 > > and: > > "_count >= 0" ... "_count < 0" > > which is easier for my brain to read... okay... enough sidebar... > > Short answer: No the guarantees should not be the same. > > Longer answer: EnterI() is called by enter() after enter() has > incremented the _count field to indicate the contended state of > things. So in EnterI(), "_count > 0" is the right check. > ReenterI() is called after wait() has returned (notified or > timedout), and the _count field is not used on reentry ops so > "_count >= 0" is the right check. Thank you for walking this through. Wait also can cause a call to enter() -> EnterI(), but in that case _waiters was set while the monitor was still owned, so we should never get to the logic where EnterI sees DEFLATER_MARKER. > > I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, > there are two places in EnterI() that do this): > > L501: if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part (setting _owner), > // but it failed the second part (making _count negative) and bailed. > // Because we're called from enter() we have at least one contention. > guarantee(count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); > L504: // Try to acquire monitor. 
> L505: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { > > L629: if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part (setting _owner), > // but it failed the second part (making _count negative) and bailed. > // Because we're called from enter() we have at least one contention. > guarantee(count> 0 , "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller"); > L632: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { > > And I'm going to tweak the ReenterI() code like this: > > L759: if (_owner == DEFLATER_MARKER) { > // The deflation protocol finished the first part (setting _owner), > // but it will observe _waiters != 0 and will bail out. Because we're > // called from wait() we may or may not have any contentions. > guarantee(count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller"); > L761: if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { > > > You didn't ask this, but it is okay that _count is only used to track > contentions in enter()/EnterI() and is not used to track contentions > in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is > used by is_busy() to observe the busy state for an ObjectMonitor that > is being wait()'ed for. The _waiters field is decremented after a > waiter has returned from ReenterI() so the _owner field takes over > answering the is_busy() question? Yes - that was my confusion and I had not walked it through carefully enough. And thank you - the guarantees are easier to read this way. > > >> 5. I could use a little help with allocation state transitions, >> e.g. 
in deflate_monitor_list_using_JT >> you see is_new with object set so you mark it as old so next deflation will check it > > Here's the code in question: > > src/hotspot/share/runtime/synchronizer.cpp: > > int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp, > ObjectMonitor** freeHeadp, > ObjectMonitor** freeTailp, > ObjectMonitor** savedMidInUsep) { > > // Only try to deflate if there is an associated Java object and if > // mid is old (is not newly allocated and is not newly freed). > if (mid->object() != NULL && mid->is_old() && > deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { > // Deflation succeeded so update the in-use list. > > } else { > // mid is considered in-use if it does not have an associated > // Java object or mid is not old or deflation did not succeed. > // A mid->is_new() node can be seen here when it is freshly returned > // by omAlloc() (and skips the deflation code path). > // A mid->is_old() node can be seen here when deflation failed. > // A mid->is_free() node can be seen here when a fresh node from > // omAlloc() is released by omRelease() due to losing the race > // in inflate(). > > if (mid->object() != NULL && mid->is_new()) { > // mid has an associated Java object and has now been seen > // as newly allocated so mark it as "old". > mid->set_allocation_state(ObjectMonitor::Old); > } > >> - why do you set it to old here rather than in inflate once we set values? > > Inflation is used in quite a few places. 
If we marked the > ObjectMonitor as "Old" in inflate(), then that would make the > ObjectMonitor available for deflation by deflate_monitor_using_JT() > earlier: > > src/hotspot/share/runtime/synchronizer.cpp: >> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >> ObjectMonitor** freeHeadp, >> ObjectMonitor** freeTailp) { >> assert(AsyncDeflateIdleMonitors, "sanity check"); >> assert(Thread::current()->is_Java_thread(), "precondition"); >> // A newly allocated ObjectMonitor should not be seen here so we >> // avoid an endless inflate/deflate cycle. >> assert(mid->is_old(), "precondition"); > > So the idea behind only deflating ObjectMonitors that have reached > allocation state "Old" is to prevent "an endless inflate/deflate cycle". > Here's the relevant section from Carsten's JEP: > >> To avoid endless inflation / deflation cycles in the prototype, monitor >> deflation is only attempted the second time a monitor is seen by the >> thread marking monitors as deflatable: If the thread (the only thread >> marking monitors as deflatable; might be service thread or some GC >> related thread or even a dedicated thread) sees a monitor in state New, >> then the thread marks the monitor as Old and moves on. So there is >> little interaction between a thread inflating a lock to a monitor and >> the deflating thread, the inflating thread just has to make sure the >> monitor is marked New and this marker is published using appropriate >> barriers. > > There isn't an explicit example in the JEP of what Carsten was thinking > of with "an endless inflate/deflate cycle". I didn't try to think of > such an example for the OpenJDK wiki either. I simple wrote: > >> ObjectMonitor has a new allocation_state field that supports three >> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >> to ObjectMonitors that have reached the 'Old' state. 
When the Async >> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >> is changed to the 'Old' state, but is not deflated. This prevents a >> newly allocated ObjectMonitor from being immediately deflated which >> could cause an inflation<->deflation oscillation. > > So let's think about what might happen if an ObjectMonitor is marked > as "Old" in inflate(). Here's an example use of inflate() in the > "slow enter" code path: > > src/hotspot/share/runtime/synchronizer.cpp: > > void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) { > > base< inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD); > > new> ObjectMonitorHandle omh; > new> inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); > new> do_loop = !omh.om_ptr()->enter(THREAD); > > In the "base" code, we took the return from inflate() and used it to call > ObjectMonitor::enter(). If we never changed that bit of code and inflate() > marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could > async deflate the ObjectMonitor while we were trying to call enter() on > it... Boom! So we might think that holding off marking an ObjectMonitor > as "Old" can save us... and it can, but not in all cases... :-( > > It is entirely possible that our call to slow_enter() is made on an > ObjectMonitor that's already marked "Old". In that case, our thread > (T-enter) calls inflate() which returns the existing ObjectMonitor* > and we use it to call enter(). If the thread (T-deflate) calling > deflate_monitor_using_JT() does its magic before T-enter sets the > owner field or the count field... Boom! > > The previous paragraph is exactly what motivated the _ref_count field, > the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* > parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr() > which increments the ObjectMonitor's ref_count and then checks for async > deflation protocol collisions. 
If there's a collision, then save_om_ptr() > returns false and the caller (inflate() in this case) has to retry. When > inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot > be deflated and is safe until the ObjectMonitorHandle is destroyed. > > So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot > deflate the ObjectMonitor in the window after inflate() returns and > before T-enter sets the owner field or increments the count field. But > you know all that already! > So let's bring this back to having inflate() mark the ObjectMonitor as > "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, > it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). > T-deflate cannot deflate it due to ref_count > 0. > > Here's another crazy thought... inflate() is the only function that > calls omAlloc(), and omAlloc() is the only function that sets "New". > If we move the setting of "Old" from deflate_monitor_list_using_JT() > to inflate(), then the change from "New" -> "Old" never happens > outside of the inflate() call so why do we need the allocation state? That was my next question. > > Small dose of reality: I've found having the allocation state to be > very helpful when debugging race related crashes. We could make the > allocation state be DEBUG_ONLY, but then what about race debugging of > product bits... sigh... > > >> 6. Could you get rid of the new goto's? > > I believe there is only one left from Carsten's prototype: > > src/hotspot/share/runtime/synchronizer.cpp: > >> intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) { > >> } else if (mark->has_monitor()) { >> ObjectMonitorHandle omh; >> if (!omh.save_om_ptr(obj, mark)) { >> // Lost a race with async deflation so try again. >> assert(AsyncDeflateIdleMonitors, "sanity check"); >> goto Retry; >> } > > I can change FastHashCode() to use the same "while (do_loop)" as the > other code that needs to do retries... > Thank you. > >> 7.
On the updated wiki for the hash race example: >> Racing Threads: "T-hash is about to inc the ref_count field" >> actually - T-hash just did - ref_count == 1 - so maybe change middle values > > Actually, we're talking about the set up for the race and the > diagram shows "ref_count == 1" and should show "ref_count == 0". > So I have fixed that on the "Racing Threads" diagram. > > In the following "T-deflate Wins" and "T-hash Wins" diagrams, > "ref_count == 1" is shown in both initial race results ObjectMonitor > box. In "T-deflate Wins", it shows ref_count being restored to > 0 in the second ObjectMonitor box. > > Thanks for catching this error. I've fixed it on the wiki. Thanks. > > >> >> 8. There is an old comment in FastHashCode >> that >> // WARNING: >> // The displaced header is strictly immutable. >> // It can NOT be changed in ANY cases. >> >> I presume that only applies to the displaced header for a stack lock - could you >> possibly update that while you are in the code? > > Here's the whole comment: > >> // WARNING: >> // The displaced header is strictly immutable. >> // It can NOT be changed in ANY cases. So we have >> // to inflate the header into heavyweight monitor >> // even the current thread owns the lock. The reason >> // is the BasicLock (stack slot) will be asynchronously >> // read by other threads during the inflate() function. >> // Any change to stack may not propagate to other threads >> // correctly. > > That comment applies to the displaced header that's in the BasicLock > on the thread's stack and it definitely needs some cleaning up > independent of the Async Monitor Deflation project. Thank you. > > >> Also in FastHashCode >> // The only update to the header in the monitor (outside GC) >> // is install the hash code. If someone add new usage of >> // displaced header, please update this code >> Can you update that comment as well? I know you've already updated the code logic. > > I'll revisit that comment as well.
I believe Carsten updated it in > his prototype, but I backed out that change when I simplified > the hashcode stuff due to ObjectMonitorHandles/ref_count. Thank you. And if you are revisiting this to potentially call install_displaced_markword then this would change yet again. > > >> So I walked the logic for the hashcode interactions - I didn't find any holes. Thank you for walking most of it in email/wiki. >> In particular, inflate does the save_om_ptr dance to inc_ref_count, so this code above will >> be called while preventing async deflation. > > Right. > > >> 9. install_displaced_markword_in_object >> What happens if the cas_set_mark fails? > > Here's the code in question: > > src/hotspot/share/runtime/objectMonitor.cpp: > >> void ObjectMonitor::install_displaced_markword_in_object() { > >> if (dmw->is_marked()) { >> // The dmw copy is marked which means a hash was not set by a racing >> // thread. Clear the mark from the copy in preparation for possible >> // restoration from this thread. >> assert(dmw->hash() == 0, "must be 0: hash=" INTPTR_FORMAT, dmw->hash()); >> dmw = dmw->set_unmarked(); >> } >> assert(dmw->is_neutral(), "must be a neutral markword"); >> >> oop const obj = (oop) object(); >> // Install displaced markword if object markword still points to this >> // monitor. Both the mutator trying to enter() and the thread deflating >> // the monitor will reach this point, but only one can win. >> // Note: If a mutator won the cmpxchg() race above and installed a hash >> // in _header, then the updated dmw contains that hash and we'll install >> // it in the object's markword here. >> obj->cas_set_mark(dmw, markOopDesc::encode(this)); > > We don't check the return from cas_set_mark() here intentionally. > If we have just T-enter and T-deflate racing through this code, > then after the "if (dmw->is_marked()) {" block, both threads > will have the same 'dmw' value.
One thread will set it and the > other thread will fail to set it, but we don't care because both > threads wanted to set the same value... As a result of the > cas_set_mark() call in both threads, both threads will see the > same value in the object's header (if they happen to look). > > I talk about this in the "Either Wins the Second Race" sub-section > on the wiki. Yes for the current model of only these two callers, neither modifying the dmw other than to set/clear the is_marked() bit. If we extend to FastHashCode - this gets trickier. > > >> I get that today this handles the race with enter and deflate_monitor_using_JT. If we remove >> the call from enter, is the expectation that we've blocked all others who did not set is_marked themselves? >> If we remove the call from enter would it make sense to ensure that the cas_set_mark succeeds here? > > If we remove the install_displaced_markword_in_object() call from enter(), > then I don't think we need install_displaced_markword_in_object() at > all and can restore the object's header with: > > // Restore the header back to obj > obj->release_set_mark(mid->header()); > > just like ObjectSynchronizer::deflate_monitor(). The question is > whether we think install_displaced_markword_in_object() buys us > something other than "help" in restoring the object's header. Yes - that was the question - it adds complexity. Does it help threads make progress? Thinking about this more and reading Carsten's reply - I think it does - especially since seeing DEFLATER_MARKER can make the checker loop, but in the meantime someone else can acquire the monitor - so that might be a non-trivial spin time. > > >> 10. Is there any benefit in a bit of stress testing with something like a temporary flag that deflates in >> omAlloc each time it is called? > > Maybe? :-) Something like DeflateAsyncMonitorsALot? Can you elaborate > on your thinking a bit? Just if you thought it might help shake multi-thread timing testing.
If you think it is all shaken out already - no need. > > >> Looking forward to the performance runs as well as the latency numbers. > > I posted the SPECjbb2015 numbers from this past weekend earlier today. > Rather disappointing on my T7600's... Neutral on my MacMini... Thank you. And Claus' response was to be sure they are meaningful before deep-diving - so reproduce multiple times kind of thing. > > When you say "latency numbers", what do you mean? Do you mean how long > ObjectMonitors that could be deflated are kept inflated? Or do you mean > something else? The latency I was thinking about was the reduced time during safepoint cleanup - which is one of the goals of this exercise. > > I think I've responded to everything. Please let me know if I missed > something? You got it all. many thanks, Karen > > Dan > > > >> >> thanks, >> Karen >> >>> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty wrote: >>> >>> Filed: >>> >>> JDK-8222034 Thread-SMR functions should be updated to remove work around >>> https://bugs.openjdk.java.net/browse/JDK-8222034 >>> >>> Martin and Robbin, please check it out and make sure that I captured >>> things correctly... >>> >>> Dan >>> >>> >>> >>> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: >>>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>>> Hi everybody, >>>>> >>>>>> I think this was fixed with: >>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>>>> You should get a leading sync and trailing one with the default conservative >>>>>> model and thus get proper memory ordering. >>>>>> Martin, am I correct? >>>>> Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter). >>>>> I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced.
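[Editor's sketch] The point made above, that a counter increment simulated with a compare-and-swap retry loop could be replaced by a direct atomic increment once the backend gives strong enough ordering, can be illustrated roughly as follows. This is only an analogy in standard C++: std::atomic stands in for HotSpot's Atomic:: API, memory_order_conservative is not modeled, and the function names inc_with_cas_loop and inc_direct are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Work-around shape: load the counter, then retry a compare-and-swap
// of (sample -> sample + 1) until it succeeds.
inline void inc_with_cas_loop(std::atomic<long>& counter) {
  long sample = counter.load(std::memory_order_acquire);
  // On failure compare_exchange_weak refreshes `sample` with the
  // current value, so the loop body can stay empty.
  while (!counter.compare_exchange_weak(sample, sample + 1,
                                        std::memory_order_seq_cst)) {
  }
}

// Replacement shape: a single read-modify-write with sequentially
// consistent ordering.
inline void inc_direct(std::atomic<long>& counter) {
  counter.fetch_add(1, std::memory_order_seq_cst);
}
```

Both functions leave the counter in the same state; the difference is only in how the ordering guarantee is obtained, which is why the loop becomes redundant when the plain increment is already strongly ordered.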
>>>> >>>> Okay so we need a new bug to update these two Thread-SMR functions: >>>> >>>> src/hotspot/share/runtime/threadSMR.cpp: >>>> >>>> void ThreadsList::dec_nested_handle_cnt() { >>>> // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec >>>> // backend on PPC does not yet conform to these requirements. Therefore >>>> // the decrement is simulated with an Atomic::sub(1, &addr). >>>> // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism >>>> // is not generally safe to use. >>>> Atomic::sub(1, &_nested_handle_cnt); >>>> } >>>> >>>> void ThreadsList::inc_nested_handle_cnt() { >>>> // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>>> // backend on PPC does not yet conform to these requirements. Therefore >>>> // the increment is simulated with a load phi; cas phi + 1; loop. >>>> // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism >>>> // is not generally safe to use. >>>> intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>> for (;;) { >>>> if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) { >>>> return; >>>> } else { >>>> sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>> } >>>> } >>>> } >>>> >>>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>>> sure we're all in agreement. Once we decide that Thread-SMR's >>>> functions look like, I'll adapt my Async Monitor Deflation >>>> functions... >>>> >>>> Dan >>>> >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn > >>>>> Sent: Freitag, 5. April 2019 14:07 >>>>> To: daniel.daugherty at oracle.com ; hotspot-runtime-dev at openjdk.java.net ; Carsten Varming >; Roman Kennke >; Doerr, Martin > >>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>>> >>>>> Hi Dan, >>>>> >>>>> (Martin there is question for you last in this email) >>>>> >>>>> After first pass I did not find any real issues. 
>>>>> Considering what you had to work with, it looks good! >>>>> >>>>> #1 >>>>> There are some assert which are redundant (to me at least) like: >>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>> L445 >>>>> if (!dmw->is_marked() && dmw->hash() == 0) { >>>>> // This dmw is neutral and has not yet started the restoration >>>>> // protocol so we mark a copy of the dmw to begin the protocol. >>>>> markOop marked_dmw = dmw->set_marked(); >>>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>>> "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>>> marked_dmw->is_marked(), marked_dmw->hash()); >>>>> >>>>> That assert is basically a test that set_marked worked? >>>>> >>>>> L505 >>>>> if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>>> assert(_succ != Self, "invariant"); >>>>> assert(_owner == Self, "invariant"); >>>>> >>>>> Assert on _owner checks that our cmpxchg is not broken? >>>>> >>>>> I think it's easier to read the code if some on the most obvious asserts are >>>>> removed. Maybe comments instead. >>>>> >>>>> #2 >>>>> Not your doing but I think we should remove TRAPS/Thread * Self and use >>>>> JavaThread* instead. >>>>> E.g. so we can change: >>>>> void ObjectMonitor::EnterI(TRAPS) { >>>>> Thread * const Self = THREAD; >>>>> assert(Self->is_Java_thread(), "invariant"); >>>>> assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); >>>>> >>>>> to: >>>>> >>>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>>> assert(Self->thread_state() == _thread_blocked, "invariant"); >>>>> >>>>> #3 >>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>>> 164 inline void ObjectMonitor::inc_ref_count() { >>>>> 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>>>> 166 // backend on PPC does not yet conform to these requirements. Therefore >>>>> 167 // the increment is simulated with a load phi; cas phi + 1; loop. 
>>>>> 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors >>>>> 169 // is not safe. >>>>> >>>>> I think was fixed with: >>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>>> You should get a leading sync and trailing one with the default conservative >>>>> model and thus get proper memory ordering. >>>>> Martin, I'm I correct? >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >>>>>> Greetings, >>>>>> >>>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>>>>> >>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>> >>>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>> >>>>>> Here's the webrev URL: >>>>>> >>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>>> >>>>>> Here's a link to Carsten's original webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>>> >>>>>> Earlier versions of this patch have been through several rounds of >>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>>> Roman for their preliminary code review comments. A very special >>>>>> thanks to Robbin and Roman for building and testing the patch in >>>>>> their own environments (including specJBB2015). >>>>>> >>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>>>> and slowdebug). 
Earlier versions have run my monitor inflation stress >>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>>> fastdebug and slowdebug). >>>>>> >>>>>> All of the testing done on earlier versions will be redone on the >>>>>> latest version of the patch. >>>>>> >>>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>>> >>>>>> Dan >>>>>> >>>>>> P.S. >>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>>>> to characterize/analyze this failure for more than a week now. At >>>>>> this point I'm convinced that Async Monitor Deflation is aggravating >>>>>> an existing bug. However, I plan to have a better handle on that >>>>>> failure before these bits are pushed to the jdk/jdk repo. >>>> >>>> >>> >> > From karen.kinnear at oracle.com Mon Apr 15 16:20:28 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Mon, 15 Apr 2019 12:20:28 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Message-ID: <27DD6D28-0C81-4E7D-BA65-C21798F1786F@oracle.com> Carsten, Thank you for the quick responses. > On Apr 5, 2019, at 11:01 PM, Carsten Varming wrote: > > Dear Karen, > > Please see inline answers. > > On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear > wrote: > > 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count > with comments that caller ensured _count <= 0 > In ReenterI: guarantee 0 <= _count, with comment not _count < 0 > ? Am I missing something subtle here or should they be the same guarantees? > > In ::enter _count is incremented when the thread is trying to acquire the monitor and decremented after the monitor has been acquired. The 0 < _count assertion is between those two point in the code. 
> A thread acquiring a monitor and then calling wait will increment _count and then decrement _count as part of acquiring the monitor, thus _count can be 0 by the time the thread calls wait and when ReenterI is called.

Yes, thank you. Wait/ReenterI uses the waiters count to prevent deflation.

>
> 9. install_displaced_markword_in_object
> What happens if the cas_set_mark fails?
> I get that today this handles the race with enter and deflate_monitor_using_JT. If we remove
> the call from enter, is the expectation that we've blocked all others who did not set is_marked themselves?
> If we remove the call from enter would it make sense to ensure that the cas_set_mark succeeds here?
>
> I designed my original patch such that no thread would ever wait for the deflating thread to finish deflating a monitor. If you remove install_displaced_markword_in_object from enter, then the entering thread can end up busy waiting by continuously reading the monitor pointer from the object mark word and then realizing that the monitor is being deflated and it should retry by going back to reading the object mark word. This bad behavior is completely avoided by calling install_displaced_markword_in_object.
>
> In my original patch no thread would ever wait for a deflating thread to finish. This property got lost in FastHashCode as that function evolved since I wrote my patch, but I think this property is worth preserving where possible. It might even be worth looking at FastHashCode to see if we can re-establish this property.

Got it. So the model was that all threads that detected the deflating thread starting to operate on this monitor could make progress rather than retry loops. And because enter can claim a monitor that has started deflating but not yet fully claimed it - this could be a non-trivial wait. Makes sense - thank you for the explanation.

thanks,
Karen

>
> I hope this helps.
>
> Best,
> Carsten
>
>> On Apr 5, 2019, at 12:10 PM, Daniel D.
Daugherty > wrote: >> >> Filed: >> >> JDK-8222034 Thread-SMR functions should be updated to remove work around >> https://bugs.openjdk.java.net/browse/JDK-8222034 >> >> Martin and Robbin, please check it out and make sure that I captured >> things correctly... >> >> Dan >> >> >> >> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: >>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>> Hi everybody, >>>> >>>>> I think was fixed with: >>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>>> You should get a leading sync and trailing one with the default conservative >>>>> model and thus get proper memory ordering. >>>>> Martin, I'm I correct? >>>> Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter). >>>> I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced. >>> >>> Okay so we need a new bug to update these two Thread-SMR functions: >>> >>> src/hotspot/share/runtime/threadSMR.cpp: >>> >>> void ThreadsList::dec_nested_handle_cnt() { >>> // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec >>> // backend on PPC does not yet conform to these requirements. Therefore >>> // the decrement is simulated with an Atomic::sub(1, &addr). >>> // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism >>> // is not generally safe to use. >>> Atomic::sub(1, &_nested_handle_cnt); >>> } >>> >>> void ThreadsList::inc_nested_handle_cnt() { >>> // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>> // backend on PPC does not yet conform to these requirements. Therefore >>> // the increment is simulated with a load phi; cas phi + 1; loop. >>> // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism >>> // is not generally safe to use. 
>>> intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> for (;;) { >>> if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) { >>> return; >>> } else { >>> sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>> } >>> } >>> } >>> >>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>> sure we're all in agreement. Once we decide that Thread-SMR's >>> functions look like, I'll adapt my Async Monitor Deflation >>> functions... >>> >>> Dan >>> >>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> -----Original Message----- >>>> From: Robbin Ehn > >>>> Sent: Freitag, 5. April 2019 14:07 >>>> To: daniel.daugherty at oracle.com ; hotspot-runtime-dev at openjdk.java.net ; Carsten Varming >; Roman Kennke >; Doerr, Martin > >>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>> >>>> Hi Dan, >>>> >>>> (Martin there is question for you last in this email) >>>> >>>> After first pass I did not find any real issues. >>>> Considering what you had to work with, it looks good! >>>> >>>> #1 >>>> There are some assert which are redundant (to me at least) like: >>>> src/hotspot/share/runtime/objectMonitor.cpp >>>> L445 >>>> if (!dmw->is_marked() && dmw->hash() == 0) { >>>> // This dmw is neutral and has not yet started the restoration >>>> // protocol so we mark a copy of the dmw to begin the protocol. >>>> markOop marked_dmw = dmw->set_marked(); >>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>> "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>> marked_dmw->is_marked(), marked_dmw->hash()); >>>> >>>> That assert is basically a test that set_marked worked? >>>> >>>> L505 >>>> if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>> assert(_succ != Self, "invariant"); >>>> assert(_owner == Self, "invariant"); >>>> >>>> Assert on _owner checks that our cmpxchg is not broken? 
>>>> >>>> I think it's easier to read the code if some on the most obvious asserts are >>>> removed. Maybe comments instead. >>>> >>>> #2 >>>> Not your doing but I think we should remove TRAPS/Thread * Self and use >>>> JavaThread* instead. >>>> E.g. so we can change: >>>> void ObjectMonitor::EnterI(TRAPS) { >>>> Thread * const Self = THREAD; >>>> assert(Self->is_Java_thread(), "invariant"); >>>> assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); >>>> >>>> to: >>>> >>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>> assert(Self->thread_state() == _thread_blocked, "invariant"); >>>> >>>> #3 >>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>> 164 inline void ObjectMonitor::inc_ref_count() { >>>> 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>>> 166 // backend on PPC does not yet conform to these requirements. Therefore >>>> 167 // the increment is simulated with a load phi; cas phi + 1; loop. >>>> 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors >>>> 169 // is not safe. >>>> >>>> I think was fixed with: >>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>> You should get a leading sync and trailing one with the default conservative >>>> model and thus get proper memory ordering. >>>> Martin, I'm I correct? >>>> >>>> Thanks, Robbin >>>> >>>> On 3/24/19 2:57 PM, Daniel D. 
Daugherty wrote: >>>>> Greetings, >>>>> >>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>>>> >>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>> >>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>> >>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>> >>>>> Here's the webrev URL: >>>>> >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>> >>>>> Here's a link to Carsten's original webrev: >>>>> >>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>> >>>>> Earlier versions of this patch have been through several rounds of >>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>> Roman for their preliminary code review comments. A very special >>>>> thanks to Robbin and Roman for building and testing the patch in >>>>> their own environments (including specJBB2015). >>>>> >>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>>> and slowdebug). Earlier versions have run my monitor inflation stress >>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>> fastdebug and slowdebug). >>>>> >>>>> All of the testing done on earlier versions will be redone on the >>>>> latest version of the patch. >>>>> >>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>> >>>>> Dan >>>>> >>>>> P.S. >>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>>> to characterize/analyze this failure for more than a week now. 
>>>>> At this point I'm convinced that Async Monitor Deflation is aggravating
>>>>> an existing bug. However, I plan to have a better handle on that
>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>
>>>
>>
>

From daniel.daugherty at oracle.com Mon Apr 15 17:12:29 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 15 Apr 2019 13:12:29 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To: <0f4ab494-57d1-d202-5ffa-f2416031c5ff@oracle.com>
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <1015f61b-71a3-3c81-09a2-159d16c0b24b@oracle.com> <4e262c9a-6a21-b213-27f9-d8e59c27ba84@oracle.com> <0f4ab494-57d1-d202-5ffa-f2416031c5ff@oracle.com>
Message-ID: <97361f14-8d04-22f6-abe4-7e2b667424b3@oracle.com>

Just tracking to make sure I made all the changes...

On 3/29/19 3:18 PM, Daniel D. Daugherty wrote:
> On 3/29/19 2:48 PM, Carsten Varming wrote:
>>
>>>> I am trying to figure out when the global list of monitors,
>>>> i.e., the monitors from dead threads, are deflated after a
>>>> System.gc request. It looks like
>>>> ObjectSynchronizer::do_safepoint_work should be responsible
>>>> for this, but it calls deflate_idle_monitors if
>>>> !AsyncDeflateIdleMonitors only. Perhaps I am missing something.
>>>
>>> So System.gc() results in call to JVM_GC() here:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/src/hotspot/share/prims/jvm.cpp.udiff.html
>>>
>>>     The addition here is:
>>>
>>>     > ObjectSynchronizer::set_is_cleanup_requested(true);
>>>
>>> which sets a flag. For per-thread lists:
>>>
>>>  void ObjectSynchronizer::deflate_thread_local_monitors(Thread* thread, DeflateMonitorCounters* counters) {
>>>    assert(SafepointSynchronize::is_at_safepoint(), "must be at safepoint");
>>>
>>> +  if (AsyncDeflateIdleMonitors) {
>>> +    // Nothing to do when idle ObjectMonitors are deflated using a
>>> +    // JavaThread unless a special cleanup has been requested.
>>> +    if (!is_cleanup_requested()) {
>>> +      return;
>>> +    }
>>> +  }
>>>
>>> we normally bail if AsyncDeflateIdleMonitors, but we do not
>>> if is_cleanup_requested() is true so that takes care of
>>> per-thread lists for a System.gc() call.
>>>
>>> So here's deflate_idle_monitors() which handles the global
>>> lists:
>>>
>>> 2007 void ObjectSynchronizer::deflate_idle_monitors(DeflateMonitorCounters* counters) {
>>> 2008   assert(!AsyncDeflateIdleMonitors, "sanity check");
>>>
>>> so clearly that function can't be called (currently).
>>>
>>> Stepping back for a second, a System.gc() will result in
>>> a safepoint which will result in this call:
>>>
>>>      if (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) {
>>>        const char* name = "deflating global idle monitors";
>>>        EventSafepointCleanupTask event;
>>>        TraceTime timer(name, TRACETIME_LOG(Info, safepoint, cleanup));
>>> -      ObjectSynchronizer::deflate_idle_monitors(_counters);
>>> +      // AsyncDeflateIdleMonitors only uses DeflateMonitorCounters
>>> +      // when a special cleanup has been requested.
>>> +      // Note: This logging output will include global idle monitor
>>> +      // elapsed times, but not global idle monitor deflation count.
>>> +      ObjectSynchronizer::do_safepoint_work(!AsyncDeflateIdleMonitors ? _counters : NULL);
>>>
>>> do_safepoint_work() will do the following:
>>>
>>> 1691   _gOmShouldDeflateIdleMonitors = true;
>>> 1692   MutexLockerEx ml(Service_lock, Mutex::_no_safepoint_check_flag);
>>> 1693   Service_lock->notify_all();
>>>
>>> which will cause the ServiceThread to deflate global idle monitors
>>> _sometime_ after the safepoint is complete.
>>>
>>> So a System.gc() only causes deflate_thread_local_monitors() to
>>> deflate per-thread idle monitors at the safepoint. The global
>>> idle monitors are only handled by the ServiceThread.
>>>
>>> Looking back through my notes when I added is_cleanup_requested
>>> support, I only had a test case that had a monitor on a per-thread
>>> list that needed to be deflated when System.gc() is called.
>>>
>>> I could change things so that deflate_idle_monitors() is also
>>> called when is_cleanup_requested is true. That way a System.gc()
>>> results in all monitor lists being cleaned at a safepoint which
>>> is more consistent.
>>>
>>> Carsten, would you like me to make this change?
>>>
>>>
>>> I think it would be best to call deflate_idle_monitors() after a
>>> System.gc() call before the actual garbage collection. If I
>>> remember correctly, the monitors are GC roots, so not deflating
>>> idle monitors would keep garbage alive contrary to the intended
>>> meaning of System.gc().
>>
>> I made the change to call deflate_idle_monitors() when is_cleanup_requested
>> is true a couple of days ago and I've been running it through some testing
>> (so far with no new issues).
>>
>>
>> Nice.
>
> This will be in the next round of code review.

Done.

>
>>>> Speaking of optimizations, it sure would be nice if little
>>>> changes to java threads could be combined and performed on
>>>> the way out of the safepoint in one go instead of having
>>>> lots of iterations of the thread list in various places.
>>>> Some people have thousands of threads and each traversal of
>>>> the thread list hurts.
>>>
>>> Do you have a specific example in mind?
>>>
>>>
>>> No concrete example for a public mailing list. :(. But do notice
>>> that independent tasks that require traversals of the thread
>>> list are already fused in ParallelSPCleanupThreadClosure.
>>> If you made deflate_thread_local_monitors
>>> set jt->omShouldDeflateIdleMonitors to true, then you wouldn't
>>> need to iterate over all java threads in do_safepoint_work.
>>
>> I think I see what you mean...
>> so when ParallelSPCleanupThreadClosure::do_thread() calls deflate_thread_local_monitors():
>>
>> 2250   if (AsyncDeflateIdleMonitors) {
>> 2251     // Nothing to do when idle ObjectMonitors are deflated using a
>> 2252     // JavaThread unless a special cleanup has been requested.
>>
>> Replace L2251-2 with:
>>          // Mark the JavaThread for idle monitor cleanup unless a
>>          // special cleanup has been requested.
>> 2253     if (!is_cleanup_requested()) {
>>
>> Add these three lines:
>>            if (thread->omInUseCount > 0) {
>>              // This JavaThread is using monitors so mark it.
>>              thread->omShouldDeflateIdleMonitors = true;
>>            }
>> 2254       return;
>> 2255     }
>>
>> That will allow this block to go away:
>>
>> 1695   // Request deflation of per-thread idle monitors by each JavaThread:
>> 1696   for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) {
>> 1697     if (jt->omInUseCount > 0) {
>> 1698       // This JavaThread is using monitors so check it.
>> 1699       jt->omShouldDeflateIdleMonitors = true;
>> 1700     }
>> 1701   }
>>
>> Please let me know if I understand what you meant...
>>
>>
>> This is exactly what I meant.
>
> Good. This will be in the next round of code review

Done.

Okay, I believe I've made all the changes for this set of comments and
replies...

Dan

From daniel.daugherty at oracle.com Mon Apr 15 17:13:37 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 15 Apr 2019 13:13:37 -0400
Subject: Async Monitor Deflation design review (2019.04.02)
In-Reply-To:
References: <416D33B4-3457-46DF-8E92-003782BD5A5F@oracle.com>
Message-ID: <13ccb1f4-220c-31d3-ad33-2ee2d7af390d@oracle.com>

Just tracking to make sure I made all the changes...

I filed:

    JDK-8222295 more baseline cleanups from Async Monitor Deflation project
    https://bugs.openjdk.java.net/browse/JDK-8222295

to track this round of baseline cleanups.

On 4/5/19 9:48 AM, Daniel D. Daugherty wrote:
> On 4/3/19 5:12 PM, Karen Kinnear wrote:
>
>
> src/hotspot/share/runtime/objectMonitor.hpp:
>
>>   volatile jint  _count;            // reference count to prevent reclamation/deflation
>>                                     // at stop-the-world time. See ObjectSynchronizer::deflate_monitor().
>>                                     // _count is approximately |_WaitSet| + |_EntryList|
>
> The above comment in objectMonitor.hpp is from the base system and it
> is partially misleading and partially wrong. This comment should be
> updated in the base system to something like this:
>
> // Number of active contentions in enter(). It is used by is_busy()
> // along with other fields to determine if an ObjectMonitor can be
> // deflated. See ObjectSynchronizer::deflate_monitor().

Done via 8222295.

> This part: "_count is approximately |_WaitSet| + |_EntryList|" is wrong
> since _WaitSet is associated with Wait and Notify/NotifyAll operations
> and not Enter operations.
>
> I'm also thinking of renaming '_count' -> '_contentions'. See below.
>
>
>>   static int count_offset_in_bytes()       { return offset_of(ObjectMonitor, _count); }
>
> This function appears to be unused and should be removed.

Done via 8222295.

> src/hotspot/share/runtime/objectMonitor.inline.hpp
>
>> inline jint ObjectMonitor::count() const {
>>   return _count;
>> }
>
> The count() getter appears to be unused and should be deleted.

Done via 8222295. There was also a 'set_count()' declared in
objectMonitor.hpp, but never defined anywhere so I also removed it.

>
>>   if (Atomic::cmpxchg(DEFLATER_MARKER, &mid->_owner, (void*)NULL) == NULL) {
>
> Hmmm... that code pre-dates Atomic::replace_if_null()... should
> switch to that...

Done.

>
>> Good to add comments/assertions/guarantees
>> to future proof.
>
> I agree. I'll look at adding comments and guarantee() calls to cover
> our assumptions when the object's header/dmw is stored into the
> ObjectMonitor's _header field.
Here's the existing code that saves the 'dmw' into ObjectMonitor::_header:

>       markOop dmw = mark->displaced_mark_helper();
>       assert(dmw->is_neutral(), "invariant");
>
>       // Setup monitor fields to proper values -- prepare the monitor
>       m->set_header(dmw);

is_neutral() is defined as:

>   bool is_neutral()  const { return (mask_bits(value(), biased_lock_mask_in_place) == unlocked_value); }

is_marked() is defined as:

  bool is_marked()   const {
    return (mask_bits(value(), lock_mask_in_place) == marked_value);
  }

and:

>         lock_bits                = 2,
>  enum { lock_mask                = right_n_bits(lock_bits),
>         lock_mask_in_place       = lock_mask << lock_shift,
>         biased_lock_bits         = 1,
>         biased_lock_mask         = right_n_bits(lock_bits + biased_lock_bits),
>         biased_lock_mask_in_place= biased_lock_mask << lock_shift,
>         unlocked_value           = 1,
>         marked_value             = 3,

So the location of the mark bits (2 bits) overlaps with the location of
biased_lock_mask (3 bits) which means that the is_neutral() check will
catch if the mark bits are set:

  marked_value != unlocked_value

So here's the change for the code path that converts the stack lock into
an inflated ObjectMonitor:

-      assert(dmw->is_neutral(), "invariant");
+      // Catch if the object's header is not neutral (not locked and
+      // not marked is what we care about here).
+      assert(dmw->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)dmw));

Same assert, but better diagnostics.

So here's the change for the code path that converts a neutral header
into an inflated ObjectMonitor:

-    assert(mark->is_neutral(), "invariant");
+    // Catch if the object's header is not neutral (not locked and
+    // not marked is what we care about here).
+    assert(mark->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)mark));

Same assert, but better diagnostics.

Done via 8222295.
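
For anyone who wants to convince themselves of the mask overlap argument above without building HotSpot, here is a minimal standalone sketch. It is not the HotSpot code: the bit widths and the two values are copied from the excerpt above, but lock_shift is not shown there and is assumed to be 0, right_n_bits() is a local stand-in, the predicates are free functions rather than markOopDesc members, and make_header() is a hypothetical helper that only exists for this illustration.

```cpp
#include <cassert>
#include <cstdint>

using std::uintptr_t;

// Bit widths and values copied from the markOop.hpp excerpt quoted above.
constexpr uintptr_t lock_bits        = 2;
constexpr uintptr_t biased_lock_bits = 1;
constexpr uintptr_t unlocked_value   = 1;
constexpr uintptr_t marked_value     = 3;

// ASSUMPTION: lock_shift is not shown in the excerpt; 0 is assumed here so
// that the lock bits are the lowest bits of the header word.
constexpr uintptr_t lock_shift = 0;

// Local stand-in for right_n_bits(): a mask with the low n bits set.
constexpr uintptr_t right_n_bits(uintptr_t n) {
  return (uintptr_t(1) << n) - 1;
}

constexpr uintptr_t lock_mask_in_place =
    right_n_bits(lock_bits) << lock_shift;
constexpr uintptr_t biased_lock_mask_in_place =
    right_n_bits(lock_bits + biased_lock_bits) << lock_shift;

// Free-function versions of the two predicates, operating on a raw word.
constexpr bool is_neutral(uintptr_t header) {
  return (header & biased_lock_mask_in_place) == unlocked_value;
}

constexpr bool is_marked(uintptr_t header) {
  return (header & lock_mask_in_place) == marked_value;
}

// Hypothetical helper: build a header word from arbitrary upper bits
// (hash, age, ...) and the low three lock/bias bits.
constexpr uintptr_t make_header(uintptr_t upper_bits, uintptr_t low_bits) {
  return (upper_bits << (lock_bits + biased_lock_bits)) | low_bits;
}

// The 3-bit biased lock mask fully covers the 2-bit lock mask, so a word
// whose lock bits equal marked_value can never compare equal to
// unlocked_value under the wider mask.
static_assert((biased_lock_mask_in_place & lock_mask_in_place) ==
                  lock_mask_in_place,
              "biased lock mask must cover the lock bits");
static_assert(!is_neutral(make_header(0x1234, marked_value)),
              "a marked header is never neutral");
```

Under those assumptions, a header that passes is_neutral() cannot also be marked, which is why the single is_neutral() assert is enough to catch a marked dmw.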
For the Async Monitor Deflation project, we'll change those from assert()
to guarantee() for now.

>  453     // There can be three different racers trying to update the _header
>  454     // field and the return dmw value will tell us what cleanup needs
>  455     // to be done (if any) after the race winner:
>  456     //   1)  A mutator trying to install a hash in the object.
>  457     //       Note: That mutator is not executing this code, but it is
>  458     //       trying to update the _header field.
>  459     //       If winner: dmw will contain the hash and be unmarked
>
> I no longer think my port has race #1 due to the ref_count changes.
> I have to go back and verify the location of that race in Carsten's
> port and then verify my port again.

Here's Carsten's code that handled the race between T-deflate and T-hash
(L813-822 and L830-4):

> 803   // Inflate the monitor to set hash code
> 804   monitor = ObjectSynchronizer::inflate(Self, obj, inflate_cause_hash_code);
> 805   // Load displaced header and check it has hash code
> 806   mark = monitor->header();
> 807   assert(mark->is_neutral() || mark->hash() == 0 && mark->is_marked(), "invariant");
> 808   hash = mark->hash();
> 809   if (hash == 0) {
> 810     hash = get_next_hash(Self, obj);
> 811     temp = mark->set_unmarked()->copy_set_hash(hash); // merge hash code into header
> 812     assert(temp->is_neutral(), "invariant");
> 813     if (mark->is_marked()) {
> 814       // Monitor is being deflated. Try installing mark word with hash code into obj.
> 815       markOop monitor_mark = markOopDesc::encode(monitor);
> 816       if (obj->cas_set_mark(temp, monitor_mark) == monitor_mark) {
> 817         return hash;
> 818       } else {
> 819         // Somebody else installed a new mark word in obj. Start over. We are making progress,
> 820         // as the new mark word is not a pointer to monitor.
> 821         goto Retry;
> 822       }
> 823     }
> 824     test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
> 825     if (test != mark) {
> 826       // The only update to the header in the monitor (outside GC) is install
> 827       // the hash code or mark the header to signal that the monitor is being
> 828       // deflated. If someone add new usage of displaced header, please update
> 829       // this code.
> 830       if (test->is_marked()) {
> 831         // Monitor is being deflated. Make progress by starting over.
> 832         assert(test->hash() == 0, "invariant");
> 833         goto Retry;
> 834       }
> 835       hash = test->hash();
> 836       assert(test->is_neutral(), "invariant");
> 837       assert(hash != 0, "Trivial unexpected object/monitor header usage.");
> 838     }
> 839   }
> 840   // We finally get the hash
> 841   return hash;
> 842 }

And in my port, the same race is here (L810):

> 808   // Inflate the monitor to set hash code
> 809   ObjectMonitorHandle omh;
> 810   inflate(&omh, Self, obj, inflate_cause_hash_code);
> 811   monitor = omh.om_ptr();
> 812   // Load displaced header and check it has hash code
> 813   mark = monitor->header();
> 814   assert(mark->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)mark));
> 815   hash = mark->hash();
> 816   if (hash == 0) {
> 817     hash = get_next_hash(Self, obj);
> 818     temp = mark->copy_set_hash(hash); // merge hash code into header
> 819     assert(temp->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)temp));
> 820     test = Atomic::cmpxchg(temp, monitor->header_addr(), mark);
> 821     if (test != mark) {
> 822       // The only update to the header in the monitor (outside GC)
> 823       // is install the hash code.
>           // If someone add new usage of
> 824       // displaced header, please update this code
> 825       hash = test->hash();
> 826       assert(test->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)test));
> 827       assert(hash != 0, "Trivial unexpected object/monitor header usage.");
> 828     }
> 829   }
> 830   // We finally get the hash
> 831   return hash;
> 832 }

All of the racing in my port is done inside inflate() with the call to
save_om_ptr() so the remainder of the hashcode update is back to the
baseline code.

>  480   // Note: If a mutator won the cmpxchg() race above and installed a hash
>  481   // in _header, then the updated dmw contains that hash and we'll install
>  482   // it in the object's markword here.
>
> The comment on L480-481 does not quite make sense. A mutator trying
> to install a hashcode does not call this function and if we already
> have a hashcode in the initial 'dmw', then we won't call cmpxchg()
> above.
>
> However, it is possible for our two callers to
> install_displaced_markword_in_object() to have two different
> initial 'dmw' values for the same object. For example:
>
> - T-deflate can have a 'dmw' without a hashcode.
> - T-enter can have a 'dmw' with a hashcode if a hashcode was saved in
>   the ObjectMonitor's header/dmw after T-deflate grabbed its initial
>   'dmw' value and before T-enter grabbed its initial 'dmw' value.
>
> Because T-deflate's initial 'dmw' does not have a hashcode, it will
> go thru the restoration protocol but this:
>
>
>  464     dmw = (markOop) Atomic::cmpxchg(marked_dmw, &_header, dmw);
>
> will not update the ObjectMonitor's header/dmw field because T-deflate's
> initial 'dmw' value no longer matches the ObjectMonitor's current
> header/dmw field which now has a hashcode. The return value from
> cmpxchg() will be the ObjectMonitor's current header/dmw value including
> the hashcode so both T-deflate and T-enter will be racing to set the
> object's header to the 'dmw' with the hashcode.
>
>
>  483   obj->cas_set_mark(dmw, markOopDesc::encode(this));
>  484 }
>
>
> So to make this incredibly long hashcode story short:
>
>   If a hashcode was set in the ObjectMonitor's header/dmw field, then
>   it will be restored to the object's header by either T-enter's or
>   T-deflate's install_displaced_markword_in_object()
>
>
> Side note: I could do a set of diagrams for two different calls to
> install_displaced_markword_in_object() where T-enter has the newly set
> hashcode and T-deflate does not have the hashcode to show that it will
> resolve right. However, if we drop install_displaced_markword_in_object()
> from T-enter as mentioned below, then we don't have to do that because
> we no longer have that race.

The comment on L480-2 needs tweaking, but I don't think the scenario that
I posited above can actually happen. So I made a complete pass through
install_displaced_markword_in_object() and updated comments and assertions.
Pretty much a complete rewrite of comments and assertions so it will
need to be re-reviewed carefully.

Okay, I believe I've made all the changes for this set of comments and
replies...

Dan

From daniel.daugherty at oracle.com Mon Apr 15 17:13:57 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 15 Apr 2019 13:13:57 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID:

Just tracking to make sure I made all the changes...

On 4/8/19 9:04 PM, Daniel D. Daugherty wrote:
> On 4/5/19 4:59 PM, Karen Kinnear wrote:
>> 1. Could you add comments to markOop.hpp about
>> the use in the displaced_mark_word of is_marked to prevent any users of is_marked
>> here from needing to have that information saved/restored?
>
> I _think_ I know what you're looking for here... Perhaps this:
>
> src/hotspot/share/oops/markOop.hpp:
>
// ObjectMonitor::install_displaced_markword_in_object() uses > ? // is_marked() on ObjectMonitor::_header as part of the restoration > ? // protocol for an object's header. In this usage, the mark bit is > ? // only ever set (and cleared) on the ObjectMonitor::_header field. > ? bool is_marked()?? const { > ??? return (mask_bits(value(), lock_mask_in_place) == marked_value); > ? } Done. > >> 2. In objectMonitor.hpp >> ? in is_busy you clarify the difference in use between _count (which >> I think you may be changing >> to _contended) and _ref_count. Could you possibly also comment where >> you declare them? > > I'll do the rename of _count -> _contentions in a subtask of 8153224 > like the other cleanups of the monitor subsystem. Done via 8222295. > Here's the comment in question: > > src/hotspot/share/runtime/objectMonitor.hpp: > ? intptr_t is_busy() const { > ??? // TODO-FIXME: merge _count and _waiters. > ??? // TODO-FIXME: assert _owner == null implies _recursions = 0 > ??? // TODO-FIXME: assert _WaitSet != null implies _count > 0 > ??? // We do not include _ref_count in the is_busy() check because > ??? // _ref_count is for indicating that the ObjectMonitor* is in > ??? // use which is orthogonal to whether the ObjectMonitor itself > ??? // is in use for a locking operation. > ??? return > _count|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList); > ? } > > I don't think this comment clarifies _count vs. _ref_count. > I added the last four lines of the comment and their purpose is to > describe why _ref_count isn't used by is_busy(). The TODO-FIXME lines > need to be revisited since (at least) the third one is wrong. > > Here's the existing comment for _ref_count: > > ? volatile jint _ref_count;???????? // ref count for ObjectMonitor* > > Here's the existing comment for _count: > > ? volatile jint? _count;??????????? // reference count to prevent > reclamation/deflation > ??????????????????????????????????? // at stop-the-world time. 
See > ObjectSynchronizer::deflate_monitor(). > ??????????????????????????????????? // _count is approximately > |_WaitSet| + |_EntryList| > > And here's what I proposed to change it to in my reply to your design > review notes: > > ? volatile jint? _contentions;????? // Number of active contentions in > enter(). It is used by is_busy() > ??????????????????????????????????? // along with other fields to > determine if an ObjectMonitor can be > ??????????????????????????????????? // deflated. See > ObjectSynchronizer::deflate_monitor(). Done via 8222295. > I think we're good here with the proposed change of comment (and the > rename) for the _contentions field along with existing comment for > _ref_count and the existing comment for is_busy(). I may delete the > third TODO-FIXME line as part of the next cleanup. I deleted the first and the third TODO-FIXME lines. We're never going to merge _count and _waiters; we have APIs that need those returned independently of each other. Done via 8222295. > > >> 3. clear_using_JT: would it make sense to have an assertion that >> ?_owner is either null or DEFLATER_MARKER? > > We could add something like: > > ? assert(_owner == NULL || > ???????? (AsyncDeflateIdleMonitors && _owner == DEFLATER_MARKER), > ???????? "Fatal logic error in ObjectMonitor owner!"); > > and that will catch any races in async monitor deflation where the > _owner field is set to a monitor owner value (stack addr or thread*). > For monitor deflation at a safepoint, the non-NULL _owner field is > caught in clear() (which calls clear_using_JT()). Done. For now I've added: ? if (AsyncDeflateIdleMonitors) { ??? guarantee(_owner == NULL || _owner == DEFLATER_MARKER, ????????????? "Fatal logic error in ObjectMonitor owner!: owner=" ????????????? INTPTR_FORMAT, p2i(_owner)); ??? guarantee(_contentions <= 0, ????????????? "Fatal logic error in ObjectMonitor contentions!: contentions=%d", ????????????? _contentions); ? } > >> 4. 
In EnterI: if _owner == DEFLATER_MARKER there are guarantees that >> 0 < _count >> with comments that caller ensured _count <= 0 >> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >> ? Am I missing something subtle here or should they be the same >> guarantees? > > Here's the code in question: > > src/hotspot/share/runtime/objectMonitor.cpp: > > void ObjectMonitor::EnterI(TRAPS) { > > ? if (_owner == DEFLATER_MARKER) { > ??? guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 > should have been handled by the caller"); > ??? // Deflater thread tried to lock this monitor, but it failed to > make _count negative and gave up. > > void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { > > ??? if (_owner == DEFLATER_MARKER) { > ????? guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER && > _count < 0, monitor must not be owned by deflater thread here"); > > > Reading these two guarantee() calls always throws me off stride > because I would have written them like this: > > ??? guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 > should have been handled by the caller"); > > and > > ????? guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && > _count < 0, monitor must not be owned by deflater thread here"); > > When rewritten like the above, you have: > > ??? "_count > 0" ... _count <= 0 > > and: > > ??? "_count >= 0" ... "_count < 0" > > which is easier for my brain to read... okay... enough sidebar... > > Short answer: No the guarantees should not be the same. > > Longer answer: EnterI() is called by enter() after enter() has > incremented the _count field to indicate the contended state of > things. So in EnterI(), "_count > 0" is the right check. > ReenterI() is called after wait() has returned (notified or > timedout), and the _count field is not used on reentry ops so > "_count >= 0" is the right check. 
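The difference between the two guarantees can be sketched with a toy model. This is only an illustration under assumed names (ToyMonitor, enter_slow_path_count() and reenter_after_wait_count() are hypothetical and are not HotSpot code): the enter() path bumps the contention counter before the slow path runs, so the slow path must see a value greater than zero, while the reentry path after wait() never touches the counter and may legitimately see zero.

```cpp
#include <atomic>
#include <cassert>

// Toy stand-in for ObjectMonitor; every name here is hypothetical.
struct ToyMonitor {
  std::atomic<int> contentions{0};   // plays the role of _count/_contentions

  // Shape of enter() -> EnterI(): the counter is bumped first, so the
  // slow path always observes at least one contention.
  int enter_slow_path_count() {
    contentions.fetch_add(1);        // done by the caller of the slow path
    int seen = contentions.load();   // what an EnterI()-style check observes
    assert(seen > 0);                // enter() path: strictly positive
    contentions.fetch_sub(1);        // dropped once the monitor is acquired
    return seen;
  }

  // Shape of wait() -> ReenterI(): the counter is not used on reentry,
  // so zero is a legal value and only negative values are impossible.
  int reenter_after_wait_count() {
    int seen = contentions.load();
    assert(seen >= 0);
    return seen;
  }
};
```

With a single thread and no other contenders, the enter-style path observes 1 and the reentry-style path observes 0, which is why the two checks cannot be the same.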
> > I'm going to tweak the ObjectMonitor::EnterI() code like this (yes, > there are two places in EnterI() that do this): > > ??? L501: ? if (_owner == DEFLATER_MARKER) { > ??? ? ?? ?? ? // The deflation protocol finished the first part > (setting _owner), > ??? ? ?? ? ?? // but it failed the second part (making _count > negative) and bailed. > ??? ? ? ? ? ? // Because we're called from enter() we have at least > one contention. > ??? ? ? ? ??? guarantee(count > 0, "_owner == DEFLATER_MARKER && > _count <= 0 should have been handled by the caller"); > ??? L504: ??? // Try to acquire monitor. > ??? L505: ??? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == > DEFLATER_MARKER) { Done. > ??? L629: ??? if (_owner == DEFLATER_MARKER) { > ????? ?? ?????? // The deflation protocol finished the first part > (setting _owner), > ????? ?? ?????? // but it failed the second part (making _count > negative) and bailed. > ????? ?? ?????? // Because we're called from enter() we have at least > one contention. > ?? ???????????? guarantee(count> 0 , "_owner == DEFLATER_MARKER && > _count <= 0 should have been handled by the caller"); > ??? L632: ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == > DEFLATER_MARKER) { Done. > And I'm going to tweak the ReenterI() code like this: > > ??? L759: ??? if (_owner == DEFLATER_MARKER) { > ??????????????? // The deflation protocol finished the first part > (setting _owner), > ??????????????? // but it will observe _waiters != 0 and will bail > out. Because we're > ??????????????? // called from wait() we may or may not have any > contentions. > ? ? ? ? ? ????? guarantee(count >= 0, "Impossible: _owner == > DEFLATER_MARKER && _count < 0 should have been handled by the caller"); > ??? L761: ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == > DEFLATER_MARKER) { Done. > >> 5. I could use a little help with allocation state transitions, >> e.g. in deflate_monitor_list_using_JT >> ? 
you see is_new with object set so you mark it as old so next >> deflation will check it > > Here's the code in question: > > src/hotspot/share/runtime/synchronizer.cpp: > > int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** > listHeadp, > ObjectMonitor** freeHeadp, > ObjectMonitor** freeTailp, > ObjectMonitor** savedMidInUsep) { > > ??? // Only try to deflate if there is an associated Java object and if > ??? // mid is old (is not newly allocated and is not newly freed). > ??? if (mid->object() != NULL && mid->is_old() && > ??????? deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) { > ????? // Deflation succeeded so update the in-use list. > > ??? } else { > ????? // mid is considered in-use if it does not have an associated > ????? // Java object or mid is not old or deflation did not succeed. > ????? // A mid->is_new() node can be seen here when it is freshly returned > ????? // by omAlloc() (and skips the deflation code path). > ????? // A mid->is_old() node can be seen here when deflation failed. > ????? // A mid->is_free() node can be seen here when a fresh node from > ????? // omAlloc() is released by omRelease() due to losing the race > ????? // in inflate(). > > ????? if (mid->object() != NULL && mid->is_new()) { > ??????? // mid has an associated Java object and has now been seen > ??????? // as newly allocated so mark it as "old". > ??????? mid->set_allocation_state(ObjectMonitor::Old); > ????? } > >> ? - why do you set it to old here rather than in inflate once we set >> values? > > Inflation is used in quite a few places. If we marked the > ObjectMonitor as "Old" in inflate(), then that would make the > ObjectMonitor available for deflation by deflate_monitor_using_JT() > earlier: > > src/hotspot/share/runtime/synchronizer.cpp: >> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >> ObjectMonitor** freeHeadp, >> ObjectMonitor** freeTailp) { >> ? assert(AsyncDeflateIdleMonitors, "sanity check"); >> ? 
assert(Thread::current()->is_Java_thread(), "precondition"); >>   // A newly allocated ObjectMonitor should not be seen here so we >>   // avoid an endless inflate/deflate cycle. >>   assert(mid->is_old(), "precondition"); > > So the idea behind only deflating ObjectMonitors that have reached > allocation state "Old" is to prevent "an endless inflate/deflate cycle". > Here's the relevant section from Carsten's JEP: > >> To avoid endless inflation / deflation cycles in the prototype, monitor >> deflation is only attempted the second time a monitor is seen by the >> thread marking monitors as deflatable: If the thread (the only thread >> marking monitors as deflatable; might be service thread or some GC >> related thread or even a dedicated thread) sees a monitor in state New, >> then the thread marks the monitor as Old and moves on. So there is >> little interaction between a thread inflating a lock to a monitor and >> the deflating thread, the inflating thread just has to make sure the >> monitor is marked New and this marker is published using appropriate >> barriers. > > There isn't an explicit example in the JEP of what Carsten was thinking > of with "an endless inflate/deflate cycle". I didn't try to think of > such an example for the OpenJDK wiki either. I simply wrote: > >> ObjectMonitor has a new allocation_state field that supports three >> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied >> to ObjectMonitors that have reached the 'Old' state. When the Async >> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it >> is changed to the 'Old' state, but is not deflated. This prevents a >> newly allocated ObjectMonitor from being immediately deflated which >> could cause an inflation<->deflation oscillation. > > So let's think about what might happen if an ObjectMonitor is marked > as "Old" in inflate(). 
Here's an example use of inflate() in the > "slow enter" code path: > > src/hotspot/share/runtime/synchronizer.cpp: > > void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, > TRAPS) { > > base< ?? inflate(THREAD, obj(), > inflate_cause_monitor_enter)->enter(THREAD); > > new>? ?? ObjectMonitorHandle omh; > new>? ?? inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter); > new>? ?? do_loop = !omh.om_ptr()->enter(THREAD); > > In the "base" code, we took the return from inflate() and used it to call > ObjectMonitor::enter(). If we never changed that bit of code and inflate() > marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could > async deflate the ObjectMonitor while we were trying to call enter() on > it... Boom! So we might think that holding off marking an ObjectMonitor > as "Old" can save us... and it can, but not in all cases... :-( > > It is entirely possible that our call to slow_enter() is made on an > ObjectMonitor that's already marked "Old". In that case, our thread > (T-enter) calls inflate() which returns the existing ObjectMonitor* > and we use it to call enter(). If the thread (T-deflate) calling > deflate_monitor_using_JT() does its magic before T-enter sets the > owner field or the count field... Boom! > > The previous paragraph is exactly what motivated the _ref_count field, > the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* > parameter to inflate(). inflate() calls ObjectMonitorHandle::save_om_ptr() > which increments the ObjectMonitor's ref_count and then checks for async > deflation protocol collisions. If there's a collision, then save_om_ptr() > returns false and the caller (inflate() in this case) has to retry. When > inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot > be deflated and is safe until the ObjectMonitorHandle is destroyed. 
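The handle/ref_count idea described above can be sketched as a small RAII guard. This is a simplified illustration, not the HotSpot implementation: ToyMonitor, ToyMonitorHandle, the being_deflated flag and try_deflate() are all hypothetical stand-ins, and the deflater side in particular is an assumption made only so the sketch is self-contained. The handle increments the ref count first and then checks for a collision; the deflater announces itself first and then checks the ref count, so at least one side always notices the other.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for ObjectMonitor.
struct ToyMonitor {
  std::atomic<int>  ref_count{0};          // plays the role of _ref_count
  std::atomic<bool> being_deflated{false}; // stands in for the deflation protocol state
};

// Hypothetical stand-in for ObjectMonitorHandle.
class ToyMonitorHandle {
  ToyMonitor* _om = nullptr;
 public:
  ToyMonitorHandle() = default;
  ToyMonitorHandle(const ToyMonitorHandle&) = delete;
  ToyMonitorHandle& operator=(const ToyMonitorHandle&) = delete;

  // Dropping the handle releases the reference, making deflation possible again.
  ~ToyMonitorHandle() { if (_om != nullptr) _om->ref_count.fetch_sub(1); }

  // Increment first, then check for a collision with the deflater.
  // Returns false if the caller lost the race and has to retry.
  bool save_om_ptr(ToyMonitor* om) {
    om->ref_count.fetch_add(1);
    if (om->being_deflated.load()) {
      om->ref_count.fetch_sub(1);
      return false;
    }
    _om = om;
    return true;
  }

  ToyMonitor* om_ptr() const { return _om; }
};

// Assumed deflater side: announce, then bail out if anyone holds a reference.
bool try_deflate(ToyMonitor* om) {
  om->being_deflated.store(true);
  if (om->ref_count.load() != 0) {
    om->being_deflated.store(false);
    return false;
  }
  return true;
}
```

While a handle is alive, try_deflate() fails; once the handle goes out of scope the ref count drops back to zero and deflation can proceed, after which a new save_om_ptr() call reports the collision so the caller can retry.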
> > So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot > deflate the ObjectMonitor in the window after inflate() returns and > before T-enter sets the owner field or increments the count field. But > you know all that already! > > So let's bring this back to having inflate() mark the ObjectMonitor as > "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, > it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). > T-deflate cannot deflate it due to ref_count > 0. I moved the marking as "Old" into inflate() and deleted the block in deflate_monitor_list_using_JT() that converted "New" into "Old" (when there was an object ref). This means that inflated ObjectMonitors will be eligible for deflation as soon as the ref_count for the ObjectMonitor* goes to zero in the inflate() caller (when ObjectMonitorHandle is destroyed). > >> 6. Could you get rid of the new goto's? > > I believe there is only one left from Carsten's prototype: > > src/hotspot/share/runtime/synchronizer.cpp: > >> intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) { > >>   } else if (mark->has_monitor()) { >>     ObjectMonitorHandle omh; >>     if (!omh.save_om_ptr(obj, mark)) { >>       // Lost a race with async deflation so try again. >>       assert(AsyncDeflateIdleMonitors, "sanity check"); >>       goto Retry; >>     } > > I can change FastHashCode() to use the same "while (do_loop)" as the > other code that needs to do retries... Done. Changed it to a "while (true)" loop. > >> >> 8. There is an old comment in FastHashCode >> that >>  // WARNING: >>       //   The displaced header is strictly immutable. >>       // It can NOT be changed in ANY cases. >> >> I presume that only applies to the displaced header for a stack lock >> - could you >> possibly update that while you are in the code? > > Here's the whole comment: > >>     // WARNING: >>     //   The displaced header is strictly immutable. >>     
// It can NOT be changed in ANY cases. So we have >>     // to inflate the header into heavyweight monitor >>     // even the current thread owns the lock. The reason >>     // is the BasicLock (stack slot) will be asynchronously >>     // read by other threads during the inflate() function. >>     // Any change to stack may not propagate to other threads >>     // correctly. > > That comment applies to the displaced header that's in the BasicLock > on the thread's stack and it definitely needs some cleaning up > independent of the Async Monitor Deflation project. Done via 8222295. > >> Also in FastHashCode >> // The only update to the header in the monitor (outside GC) >> 823 // is install the hash code. If someone add new usage of >> 824 // displaced header, please update this code >> Can you update that comment as well? I know you've already updated >> the code logic. > > I'll revisit that comment as well. I believe Carsten updated it in > his prototype, but I backed out that change when I simplified > the hashcode stuff due to ObjectMonitorHandles/ref_count. Updated the comment in the baseline. I've also updated "hash code" to "hashcode" and "hash_code" to "hashcode" so we have just one form. Done via 8222295:         // The only update to the ObjectMonitor's header/dmw field         // is to merge in the hash code. If someone adds a new usage         // of the header/dmw field, please update this code. Added the following paragraph for Async Monitor Deflation:         // ObjectMonitor::install_displaced_markword_in_object()         // does mark the header/dmw field as part of async monitor         // deflation, but that protocol cannot happen now due to         // the ObjectMonitorHandle above. The existing assert()'s will catch any errant marks:         assert(test->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)test)); Okay, I believe I've made all the changes for this set of comments and replies... 
Dan From daniel.daugherty at oracle.com Mon Apr 15 17:15:00 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 15 Apr 2019 13:15:00 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <6b4195d3-ad6c-c598-3e9f-d35621f68bb5@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <6b4195d3-ad6c-c598-3e9f-d35621f68bb5@oracle.com> Message-ID: Just tracking to make sure I made all the changes... On 4/5/19 11:54 AM, Daniel D. Daugherty wrote: > On 4/5/19 8:07 AM, Robbin Ehn wrote: > >> #1 >> There are some assert which are redundant (to me at least) like: >> src/hotspot/share/runtime/objectMonitor.cpp >> L445 >> ? if (!dmw->is_marked() && dmw->hash() == 0) { >> ??? // This dmw is neutral and has not yet started the restoration >> ??? // protocol so we mark a copy of the dmw to begin the protocol. >> ??? markOop marked_dmw = dmw->set_marked(); >> ??? assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >> ?????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >> ?????????? marked_dmw->is_marked(), marked_dmw->hash()); >> >> That assert is basically a test that set_marked worked? > > Yeah... that's a little paranoid... will take care of that. Done. install_displaced_markword_in_object() comments and asserts have been reworked quite a bit. > >> L505 >> ??? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >> DEFLATER_MARKER) { >> ????? assert(_succ != Self, "invariant"); >> ????? assert(_owner == Self, "invariant"); >> >> Assert on _owner checks that our cmpxchg is not broken? > > Also a little paranoid... will take care of that... Deleted "assert(_owner == Self, "invariant")". > >> I think it's easier to read the code if some on the most obvious >> asserts are removed. Maybe comments instead. > > I'll take a pass at the end of the CR1 round and look at each of the new > asserts. 
If they seem too paranoid, I'll drop them. Will make this pass before I send out the next webrev... > >> #2 >> Not your doing but I think we should remove TRAPS/Thread * Self and >> use JavaThread* instead. >> E.g. so we can change: >> void ObjectMonitor::EnterI(TRAPS) { >>   Thread * const Self = THREAD; >>   assert(Self->is_Java_thread(), "invariant"); >>   assert(((JavaThread *) Self)->thread_state() == _thread_blocked, >> "invariant"); >> >> to: >> >> void ObjectMonitor::EnterI(JavaThread* Self) { >>   assert(Self->thread_state() == _thread_blocked, "invariant"); > > I'd rather not make that change as part of this project. I'm likely > to do another cleanup subtask related to the _count field discussion > from the design review. I could see looking at TRAPS then... > > >> #3 >> src/hotspot/share/runtime/objectMonitor.inline.hpp >>  164 inline void ObjectMonitor::inc_ref_count() { >>  165   // The increment needs to be MO_SEQ_CST. At the moment, the >> Atomic::inc >>  166   // backend on PPC does not yet conform to these requirements. >> Therefore >>  167   // the increment is simulated with a load phi; cas phi + 1; loop. >>  168   // Without this MO_SEQ_CST Atomic::inc simulation, >> AsyncDeflateIdleMonitors >>  169   // is not safe. >> >> I think this was fixed with: >> 8202080: Introduce ordering semantics for Atomic::add/inc and other >> RMW atomics >> You should get a leading sync and trailing one with the default >> conservative model and thus get proper memory ordering. >> Martin, am I correct? > > So what code are you saying we can switch to for this project? The code that I copied from Thread-SMR has been fixed via:     JDK-8222034 Thread-SMR functions should be updated to remove work around https://bugs.openjdk.java.net/browse/JDK-8222034 I've updated ObjectMonitor::dec_ref_count() and inc_ref_count() to use the simpler Atomic::dec() and Atomic::inc(). Okay, I believe I've made all the changes for this set of comments and replies... 
Dan From daniel.daugherty at oracle.com Mon Apr 15 17:15:11 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 15 Apr 2019 13:15:11 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> Message-ID: Just tracking to make sure I made all the changes... On 4/9/19 11:53 AM, Daniel D. Daugherty wrote: > Hi Carsten, > > Thanks for responding to Karen's code review comments. > > Karen, > > I have a query for you down at the end of my reply... > > More below... > > On 4/5/19 11:01 PM, Carsten Varming wrote: >> Dear Karen, >> >> Please see inline answers. >> >> On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear >> > wrote: >> >> >> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees >> that 0 < _count >> with comments that caller ensured _count <= 0 >> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >> ? Am I missing something subtle here or should they be the same >> guarantees? >> >> >> In ::enter _count is incremented when the thread is trying to acquire >> the monitor and decremented after the monitor has been acquired. The >> 0 < _count assertion is between those two point in the code. A thread >> acquiring a monitor and then calling wait will increment _count and >> then decrement _count as part of acquiring the monitor, thus _count >> can be 0 by the time the thread calls wait and when ReenterI is called. > > I had a similar answer and I'm planning to tweak the comments > and the guarantees a bit in the next round of code review (CR1); > please see my reply to Karen's CR for the proposed changes. Done. > >> >> 9. install_displaced_markword_in_object >> What happens if the cas_set_mark fails? >> I get that today this handles the race with enter and >> deflate_monitor_using_JT. 
If we remove >> the call from enter, is the expectation that we?ve blocked all >> others who did not set is_marked themselves? >> If we remove the call from enter would it make sense to ensure >> that the cas_set_mark succeeds here? >> >> >> I designed my original patch such that no thread would ever wait for >> the the deflating thread to finish deflating a monitor. If you remove >> install_displaced_markword_in_object from enter, then the entering >> thread can end up busy waiting by continuously reading the monitor >> pointer from the object mark word and then realizing that the monitor >> is being deflated and it should retry by going back to reading the >> object mark word. This bad behavior is completely avoided by calling >> install_displaced_markword_in_object. > > Here's the code in question: > > src/hotspot/share/runtime/objectMonitor.cpp: > >> bool ObjectMonitor::enter(TRAPS) { > >> ? // Prevent deflation. See ObjectSynchronizer::deflate_monitor() and >> is_busy(). >> ? // Ensure the object-monitor relationship remains stable while >> there's contention. >> ? const jint count = Atomic::add(1, &_count); >> ? if (count <= 0 && _owner == DEFLATER_MARKER) { >> ??? // Async deflation in progress. Help deflater thread install >> ??? // the mark word (in case deflater thread is slow). >> ??? install_displaced_markword_in_object(); >> ??? Self->_Stalled = 0; >> ??? return false;? // Caller should retry. Never mind about _count as >> this monitor has been deflated. >> ? } > > Our thread (T-enter) observes that the ObjectMonitor is being deflated > by T-deflate, calls install_displaced_markword_in_object() and returns > false to the caller which causes a retry. > > Restoring the header/dmw from the ObjectMonitor to the object's header > here isn't needed for correctness so it could be dropped (and would > simplify the code). 
Your counterpoint is if we drop the call, then > T-enter could do retry after retry if T-deflate is slow to get to its > install_displaced_markword_in_object() call. > > If T-enter calls install_displaced_markword_in_object(), then T-enter > will do a single retry because the object T-enter is trying to lock > will no longer have an ObjectMonitor. Okay I finally grok it... > > I think we need to clarify the comment a bit: > > > ? if (count <= 0 && _owner == DEFLATER_MARKER) { > >? ?? // Async deflation is in progress. Attempt to restore the > >???? // header/dmw to the object's header so that we only retry once > >???? // if the deflater thread happens to be slow. > >? ?? install_displaced_markword_in_object(); Done. > >> In my original patch no thread would ever wait for a deflating thread >> to finish. This property got lost in FastHashCode as that function >> evolved since I wrote my patch, but I think this property is worth >> preserving where possible. It might even be worth looking at >> FastHashCode to see if we can re-establish this property. > > Async Monitor Deflation causes races with FastHashCode() when the target > object has an existing ObjectMonitor. Here's the base code: > >> 768 } else if (mark->has_monitor()) { >> 769 monitor = mark->monitor(); >> 770 temp = monitor->header(); >> 771 assert(temp->is_neutral(), "invariant"); >> 772 hash = temp->hash(); >> 773 if (hash) { >> 774 return hash; >> 775 } >> 776 // Skip to the following code to reduce code size > > The 'monitor' fetched on L769 is unstable due to Async Monitor > Deflation and can cause an incorrect hash value to be returned. > The solution is to protect the ObjectMonitor*: > >> 775 } else if (mark->has_monitor()) { >> 776 ObjectMonitorHandle omh; >> 777 if (!omh.save_om_ptr(obj, mark)) { >> 778 // Lost a race with async deflation so try again. 
>> 779 assert(AsyncDeflateIdleMonitors, "sanity check"); >> 780 goto Retry; >> 781 } >> 782 monitor = omh.om_ptr(); >> 783 temp = monitor->header(); >> 784 assert(temp->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)temp)); >> 785 hash = temp->hash(); >> 786 if (hash != 0) { >> 787 return hash; >> 788 } >> 789 // Skip to the following code to reduce code size > > where L776-L782 handle the protection duty and possible retry. So we > have to protect the ObjectMonitor*, but, like enter(), we could call > install_displaced_markword_in_object() when we retry which would limit > T-hash to a single retry. Done in save_om_ptr(). > ObjectSynchronizer::inflate() has a similar collision and retry issue: > >> 1456 // CASE: inflated >> 1457 if (mark->has_monitor()) { >> 1458 if (!omh_p->save_om_ptr(object, mark)) { >> 1459 // Lost a race with async deflation so try again. >> 1460 assert(AsyncDeflateIdleMonitors, "sanity check"); >> 1461 continue; >> 1462 } > > In this situation, inflate() discovers that the object already has an > ObjectMonitor; the object may not have had one when inflate() was > called, but it has one now. That particular race predates this project. > > In any case, inflate() wants to return a stable ObjectMonitor* in the > ObjectMonitorHandle, but if save_om_ptr() returns false, then inflate() > has to retry. The only reason for save_om_ptr() to return false is due > to a collision with Async Monitor Deflation. Like enter, we could call > install_displaced_markword_in_object() when we retry which would limit > inflate() to a single retry. Done in save_om_ptr(). > Okay, I've evolved from thinking we could simplify the code by dropping > install_displaced_markword_in_object() to thinking that I understand > what install_displaced_markword_in_object() brings to the party. And now > I'm proposing that we add 2 more install_displaced_markword_in_object() > calls to limit retries on two more code paths. 
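The "help the deflater so we only retry once" argument can be sketched with a toy header word. Everything below is hypothetical (ToyMonitor, ToyObject, encode(), decode(), help_restore_header() and retries_until_stable() are illustrative names, and the low-bit tagging scheme is an assumption, not the real mark word layout). A thread that finds the object still pointing at a monitor that is being deflated restores the saved header itself with a compare-and-swap, so its next read of the header no longer sees the dying monitor.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical toy types; a header with the low bit set "points at a monitor".
struct ToyMonitor {
  std::uintptr_t    saved_mark;        // stands in for the header/dmw field
  std::atomic<bool> deflating{false};  // stands in for _owner == DEFLATER_MARKER
  explicit ToyMonitor(std::uintptr_t m) : saved_mark(m) {}
};

struct ToyObject {
  std::atomic<std::uintptr_t> header{0};
};

inline std::uintptr_t encode(ToyMonitor* m) {
  return reinterpret_cast<std::uintptr_t>(m) | 1u;
}

inline ToyMonitor* decode(std::uintptr_t h) {
  return reinterpret_cast<ToyMonitor*>(h & ~std::uintptr_t(1));
}

// Restore the saved mark if the header still points at this monitor.
// Several threads may call this; at most one compare-and-swap succeeds.
inline void help_restore_header(ToyObject& obj, ToyMonitor* m) {
  std::uintptr_t expected = encode(m);
  obj.header.compare_exchange_strong(expected, m->saved_mark);
}

// Returns how many retries were needed before the header was usable.
inline int retries_until_stable(ToyObject& obj) {
  int retries = 0;
  for (;;) {
    std::uintptr_t h = obj.header.load();
    if ((h & 1u) == 0) return retries;         // neutral header, no monitor
    ToyMonitor* m = decode(h);
    if (!m->deflating.load()) return retries;  // live monitor, safe to use
    help_restore_header(obj, m);               // collision: help, then retry
    ++retries;
  }
}
```

Without the helping step, the loop would spin until the deflating thread got around to restoring the header; with it, the number of retries in this toy model is bounded by one per deflated monitor encountered.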
So install_displaced_markword_in_object() is now called from: ? ObjectMonitor::enter()???????????????????????? - no change ? ObjectMonitorHandle::save_om_ptr()???????????? - on _owner == DEFLATER_MARKER detection ? ObjectSynchronizer::deflate_monitor_using_JT() - no change Because of the new call in save_om_ptr(), calls to inflate() also pick up the benefit of at most one retry. I've double checked the new loops and the existing loops and I think all are covered. Okay, I believe I've made all the changes for this set of comments and replies... Dan From gerard.ziemski at oracle.com Mon Apr 15 17:16:03 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Mon, 15 Apr 2019 12:16:03 -0500 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> Message-ID: <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com> On 4/15/19 10:58 AM, Per Liden wrote: >> >> I think the original idea here was that for those flags with a >> constraint, we wanted to show the possible range, from which the >> constraint will further restrict the final value. However, that is >> tricky to test without exposing the constraint function, as evidenced >> by the exclusion list in the test. >> >> For those flags without range and constraint, the implicit range is >> the max range of the flag's type, so the idea here was that such flag >> was "untestable" for practical purposes, so we print an empty range. > > But that's not very useful to an actual user, who want's to know what > the range is (even if every value allowed by the type is valid). We > can't expect a user to know what the range of a specific type is on > every platform. > >> >> I believe that a better fix here might be to print an empty range in >> both cases. > > But why print an empty range when the range is well known? We don't > have to make -XX:+PrintFlagsRanges dumber than it needs to be. 
The > only time we don't know the range is when there's a constraint > function associated with the flag. > > Frankly, to me this looks like the original intent of this code, but a > simple mistake snuck in which inverted the if-else statement. I was wrong in my initial reply. There are flags with both range and constraint - the entire point of printing the range to the user is to help define valid values, and as you say "We can't expect a user to know what the range of a specific type is on every platform", which applies to both cases. If the test has an issue with the flag, then the test itself needs to be fixed, by excluding the troublesome flag. I believe that we should print out the default range in both cases (done in a followup) and we should modify the test to exclude hard to test (troublesome) flags as needed. Can't we just exclude "SoftMaxHeapSize" from the test here? cheers From daniel.daugherty at oracle.com Mon Apr 15 17:40:31 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 15 Apr 2019 13:40:31 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <4B563AE0-2C52-4349-B59E-A52103A8B570@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> <4B563AE0-2C52-4349-B59E-A52103A8B570@oracle.com> Message-ID: <03e895a5-cd5c-7ce9-dfc6-48a198d079aa@oracle.com> On 4/15/19 12:17 PM, Karen Kinnear wrote: > Dan, > Sorry to be so slow to get back to you, Just when I think I've caught up on all the email threads... :-) No worries... I'm still testing my prelim CR1 changes... > >> On Apr 8, 2019, at 9:04 PM, Daniel D. Daugherty >> > wrote: >> >> On 4/5/19 4:59 PM, Karen Kinnear wrote: >>> Dan, >>> >>> Some more minor comments from reading the code: >> >> Thanks for the additional comments. I'm gathering changes for the next >> round of code review (CR1) so these will be resolved in that round... 
>> >> More below... >> >> >>> 1. Could you add comments to markOop.hpp about >>> the use in the displaced_mark_word of is_marked to prevent any users >>> of is_marked >>> here from needing to have that information saved/restored? >> >> I _think_ I know what you're looking for here... Perhaps this: >> >> src/hotspot/share/oops/markOop.hpp: >> ? // ObjectMonitor::install_displaced_markword_in_object() uses >> ? // is_marked() on ObjectMonitor::_header as part of the restoration >> ? // protocol for an object's header. In this usage, the mark bit is >> ? // only ever set (and cleared) on the ObjectMonitor::_header field. >> ? bool is_marked()?? const { >> ??? return (mask_bits(value(), lock_mask_in_place) == marked_value); >> ? } > Yes - thank you. Potentially prevents future overlaps. >> >> >>> 2. In objectMonitor.hpp >>> ? in is_busy you clarify the difference in use between _count (which >>> I think you may be changing >>> to _contended) and _ref_count. Could you possibly also comment where >>> you declare them? >> >> I'll do the rename of _count -> _contentions in a subtask of 8153224 >> like the other cleanups of the monitor subsystem. >> >> Here's the comment in question: >> >> src/hotspot/share/runtime/objectMonitor.hpp: >> ? intptr_t is_busy() const { >> ??? // TODO-FIXME: merge _count and _waiters. >> ??? // TODO-FIXME: assert _owner == null implies _recursions = 0 >> ??? // TODO-FIXME: assert _WaitSet != null implies _count > 0 >> ??? // We do not include _ref_count in the is_busy() check because >> ??? // _ref_count is for indicating that the ObjectMonitor* is in >> ??? // use which is orthogonal to whether the ObjectMonitor itself >> ??? // is in use for a locking operation. >> ??? return >> _count|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList); >> ? } >> >> I don't think this comment clarifies _count vs. _ref_count. >> I added the last four lines of the comment and their purpose is to >> describe why _ref_count isn't used by is_busy(). 
The TODO-FIXME lines >> need to be revisited since (at least) the third one is wrong. >> >> Here's the existing comment for _ref_count: >> >> ? volatile jint _ref_count;???????? // ref count for ObjectMonitor* >> >> Here's the existing comment for _count: >> >> ? volatile jint? _count;??????????? // reference count to prevent >> reclamation/deflation >> ??????????????????????????????????? // at stop-the-world time. See >> ObjectSynchronizer::deflate_monitor(). >> ??????????????????????????????????? // _count is approximately >> |_WaitSet| + |_EntryList| >> >> And here's what I proposed to change it to in my reply to your design >> review notes: >> >> ? volatile jint? _contentions;????? // Number of active contentions >> in enter(). It is used by is_busy() >> ??????????????????????????????????? // along with other fields to >> determine if an ObjectMonitor can be >> ??????????????????????????????????? // deflated. See >> ObjectSynchronizer::deflate_monitor(). >> >> I think we're good here with the proposed change of comment (and the >> rename) for the _contentions field along with existing comment for >> _ref_count and the existing comment for is_busy(). I may delete the >> third TODO-FIXME line as part of the next cleanup. > Thank you for the rename and comment >> >> >>> 3. clear_using_JT: would it make sense to have an assertion that >>> ?_owner is either null or DEFLATER_MARKER? >> >> We could add something like: >> >> ? assert(_owner == NULL || >> ???????? (AsyncDeflateIdleMonitors && _owner == DEFLATER_MARKER), >> ???????? "Fatal logic error in ObjectMonitor owner!"); >> >> and that will catch any races in async monitor deflation where the >> _owner field is set to a monitor owner value (stack addr or thread*). >> For monitor deflation at a safepoint, the non-NULL _owner field is >> caught in clear() (which calls clear_using_JT()). > That covers my concern. >> >> >>> 4. 
In EnterI: if _owner == DEFLATER_MARKER there are guarantees that >>> 0 < _count >>> with comments that caller ensured _count <= 0 >>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >>> ? Am I missing something subtle here or should they be the same >>> guarantees? >> >> Here's the code in question: >> >> src/hotspot/share/runtime/objectMonitor.cpp: >> >> void ObjectMonitor::EnterI(TRAPS) { >> >> ? if (_owner == DEFLATER_MARKER) { >> ??? guarantee(0 < _count, "_owner == DEFLATER_MARKER && _count <= 0 >> should have been handled by the caller"); >> ??? // Deflater thread tried to lock this monitor, but it failed to >> make _count negative and gave up. >> >> void ObjectMonitor::ReenterI(Thread * Self, ObjectWaiter * SelfNode) { >> >> ??? if (_owner == DEFLATER_MARKER) { >> ????? guarantee(0 <= _count, "Impossible: _owner == DEFLATER_MARKER >> && _count < 0, monitor must not be owned by deflater thread here"); >> >> >> Reading these two guarantee() calls always throws me off stride >> because I would have written them like this: >> >> ??? guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 >> should have been handled by the caller"); >> >> and >> >> ????? guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER >> && _count < 0, monitor must not be owned by deflater thread here"); >> >> When rewritten like the above, you have: >> >> ??? "_count > 0" ... _count <= 0 >> >> and: >> >> ??? "_count >= 0" ... "_count < 0" >> >> which is easier for my brain to read... okay... enough sidebar... >> >> Short answer: No the guarantees should not be the same. >> >> Longer answer: EnterI() is called by enter() after enter() has >> incremented the _count field to indicate the contended state of >> things. So in EnterI(), "_count > 0" is the right check. >> ReenterI() is called after wait() has returned (notified or >> timedout), and the _count field is not used on reentry ops so >> "_count >= 0" is the right check. 
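The difference between the two guarantees above can be pictured with a small model. This is only a sketch under stated assumptions: ToyMonitor, count_seen_by_enter_slow_path() and count_seen_by_reenter() are hypothetical names that do not exist in HotSpot, and std::atomic stands in for HotSpot's own Atomic class. It shows that a slow path reached through enter() sees a positive count, while re-entry after wait() may see zero because the waiters field carries the busy state instead.

```cpp
#include <atomic>
#include <cassert>

// Toy model only: ToyMonitor and its members are hypothetical names,
// not HotSpot's ObjectMonitor.
struct ToyMonitor {
  std::atomic<int> count{0};    // stands in for _count (contentions in enter())
  std::atomic<int> waiters{0};  // stands in for _waiters

  // Models enter() -> EnterI(): the caller increments count before the
  // slow path runs, so the slow path observes count > 0.
  int count_seen_by_enter_slow_path() {
    count.fetch_add(1);
    int seen = count.load();
    count.fetch_sub(1);
    return seen;
  }

  // Models wait() -> ReenterI(): count is not touched on this path, so
  // only count >= 0 can be relied on; waiters marks the monitor as busy.
  int count_seen_by_reenter() {
    waiters.fetch_add(1);
    int seen = count.load();
    waiters.fetch_sub(1);
    return seen;
  }

  // Simplified busy check over the two modeled fields only.
  bool is_busy() const {
    return count.load() != 0 || waiters.load() != 0;
  }
};
```

In this model the enter path would justify a "count > 0" check and the re-entry path only a "count >= 0" check, which mirrors the reasoning in the reply above.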
> Thank you for walking this through.
> Wait also can cause a call to enter() -> EnterI(), but in that case
> _waiters was set while the monitor was still owned, so we should never
> get to the logic where EnterI sees DEFLATER_MARKER.

Correct. So in the "wait() -> enter() -> EnterI()" chain we won't get to
that new guarantee() because of the _waiters value.

I think we're in agreement that the EnterI() and ReenterI() guarantees
are different and that's okay... :-)

>>
>> I'm going to tweak the ObjectMonitor::EnterI() code like this (yes,
>> there are two places in EnterI() that do this):
>>
>>     L501:   if (_owner == DEFLATER_MARKER) {
>>               // The deflation protocol finished the first part (setting _owner),
>>               // but it failed the second part (making _count negative) and bailed.
>>               // Because we're called from enter() we have at least one contention.
>>               guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>>     L504:     // Try to acquire monitor.
>>     L505:     if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>
>>     L629:     if (_owner == DEFLATER_MARKER) {
>>                 // The deflation protocol finished the first part (setting _owner),
>>                 // but it failed the second part (making _count negative) and bailed.
>>                 // Because we're called from enter() we have at least one contention.
>>                 guarantee(_count > 0, "_owner == DEFLATER_MARKER && _count <= 0 should have been handled by the caller");
>>     L632:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>
>> And I'm going to tweak the ReenterI() code like this:
>>
>>     L759:     if (_owner == DEFLATER_MARKER) {
>>                 // The deflation protocol finished the first part (setting _owner),
>>                 // but it will observe _waiters != 0 and will bail out. Because we're
>>                 // called from wait() we may or may not have any contentions.
>>                 guarantee(_count >= 0, "Impossible: _owner == DEFLATER_MARKER && _count < 0 should have been handled by the caller");
>>     L761:       if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) {
>>
>>
>> You didn't ask this, but it is okay that _count is only used to track
>> contentions in enter()/EnterI() and is not used to track contentions
>> in wait()/ReenterI(). For the wait()/ReenterI() code path, _waiters is
>> used by is_busy() to observe the busy state for an ObjectMonitor that
>> is being wait()'ed for. The _waiters field is decremented after a
>> waiter has returned from ReenterI() so the _owner field takes over
>> answering the is_busy() question?
> Yes - that was my confusion and I had not walked it through carefully
> enough.
> And thank you - the guarantees are easier to read this way.

Thanks.

>>
>>
>>> 5. I could use a little help with allocation state transitions,
>>> e.g. in deflate_monitor_list_using_JT
>>>   you see is_new with object set so you mark it as old so next
>>> deflation will check it
>>
>> Here's the code in question:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>> int ObjectSynchronizer::deflate_monitor_list_using_JT(ObjectMonitor** listHeadp,
>>                                                       ObjectMonitor** freeHeadp,
>>                                                       ObjectMonitor** freeTailp,
>>                                                       ObjectMonitor** savedMidInUsep) {
>>
>>     // Only try to deflate if there is an associated Java object and if
>>     // mid is old (is not newly allocated and is not newly freed).
>>     if (mid->object() != NULL && mid->is_old() &&
>>         deflate_monitor_using_JT(mid, freeHeadp, freeTailp)) {
>>       // Deflation succeeded so update the in-use list.
>>
>>     } else {
>>       // mid is considered in-use if it does not have an associated
>>       // Java object or mid is not old or deflation did not succeed.
// A mid->is_new() node can be seen here when it is freshly >> returned >> ????? // by omAlloc() (and skips the deflation code path). >> ????? // A mid->is_old() node can be seen here when deflation failed. >> ????? // A mid->is_free() node can be seen here when a fresh node from >> ????? // omAlloc() is released by omRelease() due to losing the race >> ????? // in inflate(). >> >> ????? if (mid->object() != NULL && mid->is_new()) { >> ??????? // mid has an associated Java object and has now been seen >> ??????? // as newly allocated so mark it as "old". >> mid->set_allocation_state(ObjectMonitor::Old); >> ????? } >> >>> ? - why do you set it to old here rather than in inflate once we set >>> values? >> >> Inflation is used in quite a few places. If we marked the >> ObjectMonitor as "Old" in inflate(), then that would make the >> ObjectMonitor available for deflation by deflate_monitor_using_JT() >> earlier: >> >> src/hotspot/share/runtime/synchronizer.cpp: >>> bool ObjectSynchronizer::deflate_monitor_using_JT(ObjectMonitor* mid, >>> ObjectMonitor** freeHeadp, >>> ObjectMonitor** freeTailp) { >>> ? assert(AsyncDeflateIdleMonitors, "sanity check"); >>> ? assert(Thread::current()->is_Java_thread(), "precondition"); >>> ? // A newly allocated ObjectMonitor should not be seen here so we >>> ? // avoid an endless inflate/deflate cycle. >>> ? assert(mid->is_old(), "precondition"); >> >> So the idea behind only deflating ObjectMonitors that have reached >> allocation state "Old" is to prevent "an endless inflate/deflate cycle". 
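The "only deflate on the second sighting" rule can be sketched in isolation. This is a minimal model, assuming a single deflater thread and ignoring all of the real locking and list handling; ToyMonitor and visit() are hypothetical names and are not part of HotSpot. A monitor first seen as New is promoted to Old and skipped, and only a later pass treats it as a deflation candidate.

```cpp
#include <cassert>

// Hypothetical model of the three allocation states discussed above.
enum class AllocationState { Free, New, Old };

struct ToyMonitor {
  AllocationState state = AllocationState::New;  // fresh from allocation
  bool has_object = true;                        // associated Java object present
};

// One visit by a hypothetical deflater pass.
// Returns true only when the monitor is eligible for a deflation attempt.
bool visit(ToyMonitor& mid) {
  if (mid.has_object && mid.state == AllocationState::Old) {
    return true;  // already seen on an earlier pass
  }
  if (mid.has_object && mid.state == AllocationState::New) {
    // First sighting: promote to Old and defer deflation to a later pass.
    mid.state = AllocationState::Old;
  }
  return false;   // New (just promoted) or Free monitors are skipped
}
```

With this shape a freshly inflated monitor survives at least one full deflater pass, which is the property the JEP text relies on to avoid an inflate/deflate oscillation.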
>> Here's the relevant section from Carsten's JEP:
>>
>>> To avoid endless inflation / deflation cycles in the prototype, monitor
>>> deflation is only attempted the second time a monitor is seen by the
>>> thread marking monitors as deflatable: If the thread (the only thread
>>> marking monitors as deflatable; might be service thread or some GC
>>> related thread or even a dedicated thread) sees a monitor in state New,
>>> then the thread marks the monitor as Old and moves on. So there is
>>> little interaction between a thread inflating a lock to a monitor and
>>> the deflating thread, the inflating thread just has to make sure the
>>> monitor is marked New and this marker is published using appropriate
>>> barriers.
>>
>> There isn't an explicit example in the JEP of what Carsten was thinking
>> of with "an endless inflate/deflate cycle". I didn't try to think of
>> such an example for the OpenJDK wiki either. I simply wrote:
>>
>>> ObjectMonitor has a new allocation_state field that supports three
>>> states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied
>>> to ObjectMonitors that have reached the 'Old' state. When the Async
>>> Monitor Deflation code sees an ObjectMonitor in the 'New' state, it
>>> is changed to the 'Old' state, but is not deflated. This prevents a
>>> newly allocated ObjectMonitor from being immediately deflated which
>>> could cause an inflation<->deflation oscillation.
>>
>> So let's think about what might happen if an ObjectMonitor is marked
>> as "Old" in inflate(). Here's an example use of inflate() in the
>> "slow enter" code path:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>> > void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
>>
>> base<    inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
>>
>> new>     ObjectMonitorHandle omh;
>> new>     inflate(&omh, THREAD, obj(), inflate_cause_monitor_enter);
>> new>     
do_loop = !omh.om_ptr()->enter(THREAD); >> >> In the "base" code, we took the return from inflate() and used it to call >> ObjectMonitor::enter(). If we never changed that bit of code and >> inflate() >> marked the ObjectMonitor as "Old", then deflate_monitor_using_JT() could >> async deflate the ObjectMonitor while we were trying to call enter() on >> it... Boom! So we might think that holding off marking an ObjectMonitor >> as "Old" can save us... and it can, but not in all cases... :-( >> >> It is entirely possible that our call to slow_enter() is made on an >> ObjectMonitor that's already marked "Old". In that case, our thread >> (T-enter) calls inflate() which returns the existing ObjectMonitor* >> and we use it to call enter(). If the thread (T-deflate) calling >> deflate_monitor_using_JT() does its magic before T-enter sets the >> owner field or the count field... Boom! >> >> The previous paragraph is exactly what motivated the _ref_count field, >> the ObjectMonitorHandle helper, and adding an ObjectMonitorHandle* >> parameter to inflate(). inflate() calls >> ObjectMonitorHandle::save_om_ptr() >> which increments the ObjectMonitor's ref_count and then checks for async >> deflation protocol collisions. If there's a collision, then save_om_ptr() >> returns false and the caller (inflate() in this case) has to retry. When >> inflate() returns, the ObjectMonitor in the ObjectMonitorHandle cannot >> be deflated and is safe until the ObjectMonitorHandle is destroyed. >> >> So by changing T-enter to use an ObjectMonitorHandle, T-deflate cannot >> deflate the ObjectMonitor in the window after inflate() returns and >> before T-enter sets the owner field or increments the count field. But >> you know all that already! >> So let's bring this back to having inflate() mark the ObjectMonitor as >> "Old"... Since inflate() returns an ObjectMonitor with the ref_count > 0, >> it doesn't matter if the ObjectMonitor is marked as "Old" in inflate(). 
>> T-deflate cannot deflate it due to ref_count > 0. >> >> Here's another crazy thought... inflate() is the only function that >> calls omAlloc(), and omAlloc() is the only function that sets "New". >> If we move the setting of "Old" from deflate_monitor_list_using_JT() >> to inflate(), then the change from "New" -> "Old" never happens >> outside of the inflate() call so why do we need the allocation state? > That was my next question. I'm leaving in allocation state for now, but I moved when it's set to "Old". We'll see what the testing shakes out... I have some failures right now, but I don't think they are related... but I'm not done hunting yet... >> >> Small dose of reality: I've found having the allocation state to be >> very helpful when debugging race related crashes. We could make the >> allocation state be DEBUG_ONLY, but then what about race debugging of >> product bits... sigh... >> >> >>> 6. Could you get rid of the new goto?s? >> >> I believe there is only one left from Carsten's prototype: >> >> src/hotspot/share/runtime/synchronizer.cpp: >> >>> intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) { >> >>> ? } else if (mark->has_monitor()) { >>> ??? ObjectMonitorHandle omh; >>> ??? if (!omh.save_om_ptr(obj, mark)) { >>> ????? // Lost a race with async deflation so try again. >>> ????? assert(AsyncDeflateIdleMonitors, "sanity check"); >>> ????? goto Retry; >>> ??? } >> >> I can change FastHashCode() to use the same "while (do_loop)" as the >> other code that needs to do retries... >> > Thank you. >> >>> 7. On the updated wiki for the hash race example: >>> Racing Threads: ?T-hash is about to inc the ref_count field? >>> actually - T-hash just did - ref_count == 1 - so maybe change middle >>> values >> >> Actually, we're talking about the set up for the race and the >> diagram shows "ref_count == 1" and should show "ref_count == 0". >> So I have fixed that on the "Racing Threads" diagram. 
>>
>> In the following "T-deflate Wins" and "T-hash Wins" diagrams,
>> "ref_count == 1" is shown in both initial race results ObjectMonitor
>> box. In "T-deflate Wins", it shows ref_count being restored to
>> 0 in the second ObjectMonitor box.
>>
>> Thanks for catching this error. I've fixed it on the wiki.
> Thanks.
>>
>>
>>>
>>> 8. There is an old comment in FastHashCode that
>>>     // WARNING:
>>>     //   The displaced header is strictly immutable.
>>>     // It can NOT be changed in ANY cases.
>>>
>>> I presume that only applies to the displaced header for a stack lock
>>> - could you possibly update that while you are in the code?
>>
>> Here's the whole comment:
>>
>>>     // WARNING:
>>>     //   The displaced header is strictly immutable.
>>>     // It can NOT be changed in ANY cases. So we have
>>>     // to inflate the header into heavyweight monitor
>>>     // even the current thread owns the lock. The reason
>>>     // is the BasicLock (stack slot) will be asynchronously
>>>     // read by other threads during the inflate() function.
>>>     // Any change to stack may not propagate to other threads
>>>     // correctly.
>>
>> That comment applies to the displaced header that's in the BasicLock
>> on the thread's stack and it definitely needs some cleaning up
>> independent of the Async Monitor Deflation project.
> Thank you.
>>
>>
>>> Also in FastHashCode
>>>     // The only update to the header in the monitor (outside GC)
>>>     // is install the hash code. If someone add new usage of
>>>     // displaced header, please update this code
>>> Can you update that comment as well? I know you've already updated
>>> the code logic.
>>
>> I'll revisit that comment as well. I believe Carsten updated it in
>> his prototype, but then I backed out that change when I simplified
>> the hashcode stuff due to ObjectMonitorHandles/ref_count.
> Thank you. And if you are revisiting this to potentially call
> install_displaced_markword then this would change yet again.
>> >> >>> So I walked the logic for the hashcode interactions - I didn?t find >>> any holes. Thank you for walking most of it in email/wiki. >>> In particular, inflate does the save_om_ptr dance to inc_ref_count, >>> so this code above will >>> be called while preventing async deflation. >> >> Right. >> >> >>> 9. install_displaced_markword_in_object >>> What happens if the cas_set_mark fails? >> >> Here's the code in question: >> >> src/hotspot/share/runtime/objectMonitor.cpp: >> >>> void ObjectMonitor::install_displaced_markword_in_object() { >> >>> ? if (dmw->is_marked()) { >>> ??? // The dmw copy is marked which means a hash was not set by a racing >>> ??? // thread. Clear the mark from the copy in preparation for possible >>> ??? // restoration from this thread. >>> ??? assert(dmw->hash() == 0, "must be 0: hash=" INTPTR_FORMAT, >>> dmw->hash()); >>> ??? dmw = dmw->set_unmarked(); >>> ? } >>> ? assert(dmw->is_neutral(), "must be a neutral markword"); >>> >>> ? oop const obj = (oop) object(); >>> ? // Install displaced markword if object markword still points to this >>> ? // monitor. Both the mutator trying to enter() and the thread >>> deflating >>> ? // the monitor will reach this point, but only one can win. >>> ? // Note: If a mutator won the cmpxchg() race above and installed a >>> hash >>> ? // in _header, then the updated dmw contains that hash and we'll >>> install >>> ? // it in the object's markword here. >>> ? obj->cas_set_mark(dmw, markOopDesc::encode(this)); >> >> We don't check the return from cas_set_mark() here intentionally. >> If we have just T-enter and T-deflate racing through this code, >> then after the "if (dmw->is_marked()) {" block, both threads >> will have the same 'dmw' value. One thread will set it and the >> other thread will fail to set it, but we don't care because both >> threads wanted to set the same value... 
As a result of the >> cas_set_mark() call in both threads, both threads will see the >> same value in the object's header (if they happen to look). >> >> I talk about this in the "Either Wins the Second Race" sub-section >> on the wiki. > Yes for the current model of only these two callers, neither modifying > the dmw other than > set/clear is_marked) bit. If we extend to FastHashCode - this gets > trickier. We'll have to see if the CR1 version of the comments and code makes more sense... :-) >> >> >>> I get that today this handles the race with enter and >>> deflate_monitor_using_JT. If we remove >>> the call from enter, is the expectation that we?ve blocked all >>> others who did not set is_marked themselves? >>> If we remove the call from enter would it make sense to ensure that >>> the cas_set_mark succeeds here? >> >> If we remove the install_displaced_markword_in_object() call from >> enter(), >> then I don't think we need install_displaced_markword_in_object() at >> all and can restore the object's header with: >> >> ??? // Restore the header back to obj >> ??? obj->release_set_mark(mid->header()); >> >> just like ObjectSynchronizer::deflate_monitor(). The question is >> whether we think install_displaced_markword_in_object() buys us >> something other than "help" in restoring the object's header. > Yes - that was the question - it adds complexity. Does it help threads > make progress. > Thinking about this more and reading Carsten?s reply - I think it does > - especially > since seeing DEFLATION_MARKER can make the checker loop, but in the > meantime someone > else can acquire the monitor - so that might be a non-trivial spin time. Yup. I'm convinced that keeping install_displaced_markword_in_object() is a good thing now; I just have to shake out the test results... >> >> >>> 10. Is there any benefit in a bit of stress testing with something >>> like a temporary flag that deflates in >>> mAlloc each time it is called? >> >> Maybe? 
:-) Something like DeflateAsyncMonitorsALot? Can you elaborate
>> on your thinking a bit?
> Just if you thought it might help shake multi-thread timing testing.
> If you think it is all shaken out already - no need.

I'll check it out... :-)

>>
>>
>>> Looking forward to the performance runs as well as the latency numbers.
>>
>> I posted the SPECjbb2015 numbers from this past weekend earlier today.
>> Rather disappointing on my T7600's... Neutral on my MacMini?
> Thank you. And Claes' response was to be sure they are meaningful
> before deep-diving - so reproduce multiple times kind of thing.

Yup. And the changes I'm making in CR1 will change the deflation rate
(for the better) so what I've measured will no longer be valid...

>>
>> When you say "latency numbers", what do you mean? Do you mean how long
>> ObjectMonitors that could be deflated are kept inflated? Or do you mean
>> something else?
> latency I was thinking about was the reduced time during safepoint
> cleanup - which is one of the goals of this exercise.

Ahhh... that part is easy. It's the increase in number of safepoints
that I need to mull on, but that'll change with the earlier "Old" setting
so I have to remeasure anyway...

>>
>> I think I've responded to everything. Please let me know if I missed
>> something?
> You got it all.

Good. Thanks for the sanity check.

Dan

>
> many thanks,
> Karen
>>
>> Dan
>>
>>
>>
>>>
>>> thanks,
>>> Karen
>>>
>>>> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty wrote:
>>>>
>>>> Filed:
>>>>
>>>>     JDK-8222034 Thread-SMR functions should be updated to remove work around
>>>> https://bugs.openjdk.java.net/browse/JDK-8222034
>>>>
>>>> Martin and Robbin, please check it out and make sure that I captured
>>>> things correctly...
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 4/5/19 12:01 PM, Daniel D.
Daugherty wrote: >>>>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>>>> Hi everybody, >>>>>> >>>>>>> I think was fixed with: >>>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and >>>>>>> other RMW atomics >>>>>>> You should get a leading sync and trailing one with the default >>>>>>> conservative >>>>>>> model and thus get proper memory ordering. >>>>>>> Martin, I'm I correct? >>>>>> Exactly. Thanks for pointing this out. PPC uses the strongest >>>>>> possible ordering semantics with memory_order_conservative >>>>>> (default parameter). >>>>>> I've seen that comment about PPC in "void >>>>>> ThreadsList::inc_nested_handle_cnt()". This function could get >>>>>> replaced. >>>>> >>>>> Okay so we need a new bug to update these two Thread-SMR functions: >>>>> >>>>> src/hotspot/share/runtime/threadSMR.cpp: >>>>> >>>>> void ThreadsList::dec_nested_handle_cnt() { >>>>> ? // The decrement needs to be MO_ACQ_REL. At the moment, the >>>>> Atomic::dec >>>>> ? // backend on PPC does not yet conform to these requirements. >>>>> Therefore >>>>> ? // the decrement is simulated with an Atomic::sub(1, &addr). >>>>> ? // Without this MO_ACQ_REL Atomic::dec simulation, the nested >>>>> SMR mechanism >>>>> ? // is not generally safe to use. >>>>> ? Atomic::sub(1, &_nested_handle_cnt); >>>>> } >>>>> >>>>> void ThreadsList::inc_nested_handle_cnt() { >>>>> ? // The increment needs to be MO_SEQ_CST. At the moment, the >>>>> Atomic::inc >>>>> ? // backend on PPC does not yet conform to these requirements. >>>>> Therefore >>>>> ? // the increment is simulated with a load phi; cas phi + 1; loop. >>>>> ? // Without this MO_SEQ_CST Atomic::inc simulation, the nested >>>>> SMR mechanism >>>>> ? // is not generally safe to use. >>>>> ? intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>>> ? for (;;) { >>>>> ??? if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) >>>>> == sample) { >>>>> ????? return; >>>>> ??? } else { >>>>> ????? 
sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>>> ??? } >>>>> ? } >>>>> } >>>>> >>>>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>>>> sure we're all in agreement. Once we decide that Thread-SMR's >>>>> functions look like, I'll adapt my Async Monitor Deflation >>>>> functions... >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: Robbin Ehn >>>>> > >>>>>> Sent: Freitag, 5. April 2019 14:07 >>>>>> To: daniel.daugherty at oracle.com >>>>>> ; >>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>> ; Carsten Varming >>>>>> >; Roman Kennke >>>>>> >; Doerr, Martin >>>>>> > >>>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>>>> >>>>>> Hi Dan, >>>>>> >>>>>> (Martin there is question for you last in this email) >>>>>> >>>>>> After first pass I did not find any real issues. >>>>>> Considering what you had to work with, it looks good! >>>>>> >>>>>> #1 >>>>>> There are some assert which are redundant (to me at least) like: >>>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>>> L445 >>>>>> ??? if (!dmw->is_marked() && dmw->hash() == 0) { >>>>>> ????? // This dmw is neutral and has not yet started the restoration >>>>>> ????? // protocol so we mark a copy of the dmw to begin the protocol. >>>>>> ????? markOop marked_dmw = dmw->set_marked(); >>>>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>>>> ???????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>>>> marked_dmw->is_marked(), marked_dmw->hash()); >>>>>> >>>>>> That assert is basically a test that set_marked worked? >>>>>> >>>>>> L505 >>>>>> ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>>>>> DEFLATER_MARKER) { >>>>>> ??????? assert(_succ != Self, "invariant"); >>>>>> ??????? assert(_owner == Self, "invariant"); >>>>>> >>>>>> Assert on _owner checks that our cmpxchg is not broken? 
>>>>>>
>>>>>> I think it's easier to read the code if some of the most obvious
>>>>>> asserts are removed. Maybe comments instead.
>>>>>>
>>>>>> #2
>>>>>> Not your doing but I think we should remove TRAPS/Thread * Self
>>>>>> and use JavaThread* instead.
>>>>>> E.g. so we can change:
>>>>>> void ObjectMonitor::EnterI(TRAPS) {
>>>>>>     Thread * const Self = THREAD;
>>>>>>     assert(Self->is_Java_thread(), "invariant");
>>>>>>     assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant");
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> void ObjectMonitor::EnterI(JavaThread* Self) {
>>>>>>     assert(Self->thread_state() == _thread_blocked, "invariant");
>>>>>>
>>>>>> #3
>>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>>>> inline void ObjectMonitor::inc_ref_count() {
>>>>>>   // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc
>>>>>>   // backend on PPC does not yet conform to these requirements. Therefore
>>>>>>   // the increment is simulated with a load phi; cas phi + 1; loop.
>>>>>>   // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors
>>>>>>   // is not safe.
>>>>>>
>>>>>> I think this was fixed with:
>>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and
>>>>>> other RMW atomics
>>>>>> You should get a leading sync and trailing one with the default
>>>>>> conservative model and thus get proper memory ordering.
>>>>>> Martin, am I correct?
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> Welcome to the OpenJDK review thread for my port of Carsten's
>>>>>>> work on:
>>>>>>>
>>>>>>>      
JDK-8153224 Monitor deflation prolong safepoints >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>>>> >>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>> Here's the webrev URL: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>>>> >>>>>>> Here's a link to Carsten's original webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>>>> >>>>>>> Earlier versions of this patch have been through several rounds of >>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>>>> Roman for their preliminary code review comments. A very special >>>>>>> thanks to Robbin and Roman for building and testing the patch in >>>>>>> their own environments (including specJBB2015). >>>>>>> >>>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>>>> (product, fastdebug, slowdebug).Earlier versions have run >>>>>>> Kitchensink >>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>>>> fastdebug >>>>>>> and slowdebug). Earlier versions have run my monitor inflation >>>>>>> stress >>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>>>> fastdebug and slowdebug). >>>>>>> >>>>>>> All of the testing done on earlier versions will be redone on the >>>>>>> latest version of the patch. >>>>>>> >>>>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> P.S. >>>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>>>>> to characterize/analyze this failure for more than a week now. 
At
>>>>>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>>>>>> an existing bug. However, I plan to have a better handle on that
>>>>>>> failure before these bits are pushed to the jdk/jdk repo.
>>>>>
>>>>>
>>>>
>>>
>>
>
From daniel.daugherty at oracle.com Mon Apr 15 17:42:43 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 15 Apr 2019 13:42:43 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <5b0d2152-e336-675b-5c89-45636596a279@oracle.com>
Message-ID: <12c116b3-1ba5-43d8-f77f-6f03f0e3f34c@oracle.com>

Claes,

Thanks for the pointers on SPECjbb2015...

On 4/11/19 5:43 AM, Claes Redestad wrote:
> Hi Dan,
>
> critical-jOPS in SPECjbb2015 is designed to be sensitive to regressions
> in latency of the benchmark operations, sometimes to a fault. So keep in
> mind that what you're seeing could very well be attributed to noise.
> Such as an accidental result of pauses happening - by chance or design -
> when the benchmark is critically assessing latency SLAs.
>
> Correlating safepoint pauses across the benchmark run with benchmark
> logs might inform if there's an increase in spikes/latencies during
> sensitive phases of the benchmark that could contribute to a sustained
> critJOPS regression.
>
> Sample questions to answer:
> - is the time spent deflating more spread out before/after?
> - is there indication of back-to-back safepoints happening?

We're making non-trivial changes in the CR1 round so I'll have to redo all the testing (stress and SPECjbb2015). Thanks for giving me things to think about... I'll likely have more questions down the road.

Dan

>
> Thanks!
>
> /Claes
>
> On 2019-04-10 21:24, Daniel D. Daugherty wrote:
>> So I've been analyzing monitorinflation logs from SPECjbb2015 runs. It
>> takes about 45 minutes for a SPECjbb2015 run to finish on my Linux box.
>>
>> In the baseline bits:
>>   
Total deflating time: 0.9314706 secs.
>>    Total deflating count: 2582566
>>
>> In the v2.00 bits:
>>    Total deflating time: 1.5767698 secs.
>>    Total deflating count: 2505602
>>
>> Yes, that is 1 second in 45 minutes for the baseline and 1.6 seconds
>> in 45 minutes for the v2.00 bits. That strongly indicates that the
>> mechanics of async monitor deflation are not the cause of the 4.5%
>> slowdown in SPECjbb2015. It must be something else...
>>
>> I'm looking at safepoint stats next...
>>
>> Dan
>>
>>
>> On 4/8/19 12:55 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I took the last repo that I ran through Mach5 tier[1-8] testing and did
>>> 10 SPECjbb2015 runs on the 'release' version of those bits. I also did
>>> 10 SPECjbb2015 runs on the 'release' version of the baseline bits.
>>>
>>> Baseline: jdk-13+13
>>> Exp:      v2.00 (8153224-webrev/3-for-jdk13) plus
>>>           special-cleanup-for-global-in-use-list
>>>
>>> Linux-X64 Machine:
>>>   - Ubuntu 16.04, Dell T7600, 64GB RAM
>>>   - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads
>>>
>>> MacOSX Machine:
>>>   - MacOS 10.13.6, Mac Mini, mid 2011, 16GB RAM
>>>   - 2 GHz Intel Core i7 (I7-2635QM), 1 CPU x 4 cores x 2 threads
>>>
>>> Solaris-X64 Machine:
>>>   - Solaris 11.2 SRU5.5, Dell T7600, 64GB RAM
>>>   - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads
>>>
>>> Average Results for Each OS
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>
>>>        23838.00   22446.90  20738.80        6166.70  Linux-X64 base
>>>        23838.00   22279.40  20262.00        5891.50  Linux-X64 exp
>>>
>>>         5841.80    4885.00   4764.00        1495.10  MacOSX base
>>>         5621.00    4701.00   4778.00        1492.10  MacOSX exp
>>>
>>>        16125.20   13852.30  12780.50        2791.90 Solaris-X64 base
>>>        15788.70   13861.40  
12551.40        2665.90 Solaris-X64 exp
>>>
>>> I'm new to SPECjbb2015 so I don't know what "hbIR" and "jOPS" are yet.
>>> Based on a bit of googling so far, it appears that for critical-jOPS,
>>> higher is better:
>>>
>>> - Linux-X64 exp critical-jOPS is ~4.5% lower than Linux-X64 base
>>> - MacOSX base and MacOSX exp critical-jOPS are almost identical
>>> - Solaris-X64 exp critical-jOPS is ~4.5% lower than Solaris-X64 base
>>>
>>> I have not tried to research or analyze the other columns yet.
>>>
>>> The results for each of the 10 runs are shown below.
>>>
>>> Dan
>>>
>>>
>>>
>>> Linux-X64 Runs
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>           23838      22719     19070           6515  SPECjbb2015.Lin-X64.base.01
>>>           23838      21642     20262           5591  SPECjbb2015.Lin-X64.base.02
>>>           23838      23108     20262           6508  SPECjbb2015.Lin-X64.base.03
>>>           23838      21730     21454           6235  SPECjbb2015.Lin-X64.base.04
>>>           23838      22220     21454           6028  SPECjbb2015.Lin-X64.base.05
>>>           23838      22543     20262           5996  SPECjbb2015.Lin-X64.base.06
>>>           23838      23014     21454           6192  SPECjbb2015.Lin-X64.base.07
>>>           23838      22543     21454           5889  SPECjbb2015.Lin-X64.base.08
>>>           23838      22750     20262           6038  SPECjbb2015.Lin-X64.base.09
>>>           23838      22200     21454           6675  SPECjbb2015.Lin-X64.base.10
>>> ---------------  ---------  --------  -------------  --------
>>>        23838.00   22446.90  20738.80        6166.70  average of values
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>           23838      21422     20262           
6329  SPECjbb2015.Lin-X64.exp.01
>>>           23838      22543     19070           6351  SPECjbb2015.Lin-X64.exp.02
>>>           23838      22100     20262           5005  SPECjbb2015.Lin-X64.exp.03
>>>           23838      22543     20262           5881  SPECjbb2015.Lin-X64.exp.04
>>>           23838      23170     20262           5938  SPECjbb2015.Lin-X64.exp.05
>>>           23838      22543     20262           5744  SPECjbb2015.Lin-X64.exp.06
>>>           23838      22100     20262           5482  SPECjbb2015.Lin-X64.exp.07
>>>           23838      22543     20262           6213  SPECjbb2015.Lin-X64.exp.08
>>>           23838      22100     21454           5637  SPECjbb2015.Lin-X64.exp.09
>>>           23838      21730     20262           6335  SPECjbb2015.Lin-X64.exp.10
>>> ---------------  ---------  --------  -------------  --------
>>>        23838.00   22279.40  20262.00        5891.50  average of values
>>>
>>>
>>> MacOSX Runs
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>            6725       5621      4708           1543  SPECjbb2015.MacOSX.base.01
>>>            5621       4701      4778           1326  SPECjbb2015.MacOSX.base.02
>>>            6725       5621      4708           1475  SPECjbb2015.MacOSX.base.03
>>>            5621       4701      4778           1372  SPECjbb2015.MacOSX.base.04
>>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.05
>>>            5621       4701      4778           1471  SPECjbb2015.MacOSX.base.06
>>>            5621       4701      4778           1430  SPECjbb2015.MacOSX.base.07
>>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.base.08
>>>            5621       4701      4778           1581  SPECjbb2015.MacOSX.base.09
>>>            5621       4701      4778           
1633  SPECjbb2015.MacOSX.base.10
>>> ---------------  ---------  --------  -------------  --------
>>>         5841.80    4885.00   4764.00        1495.10  average of values
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>            5621       4701      4778           1566  SPECjbb2015.MacOSX.exp.01
>>>            5621       4701      4778           1430  SPECjbb2015.MacOSX.exp.02
>>>            5621       4701      4778           1530  SPECjbb2015.MacOSX.exp.03
>>>            5621       4701      4778           1304  SPECjbb2015.MacOSX.exp.04
>>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.05
>>>            5621       4701      4778           1460  SPECjbb2015.MacOSX.exp.06
>>>            5621       4701      4778           1638  SPECjbb2015.MacOSX.exp.07
>>>            5621       4701      4778           1471  SPECjbb2015.MacOSX.exp.08
>>>            5621       4701      4778           1402  SPECjbb2015.MacOSX.exp.09
>>>            5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.10
>>> ---------------  ---------  --------  -------------  --------
>>>         5621.00    4701.00   4778.00        1492.10  average of values
>>>
>>>
>>> Solaris-X64 Runs
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>           16584      13957     13267           2332  SPECjbb2015.Sol-X64.base.01
>>>           16584      13837     13267           3123  SPECjbb2015.Sol-X64.base.02
>>>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.03
>>>           16584      13837     12438           2667  SPECjbb2015.Sol-X64.base.04
>>>           14743      14210     12532           2920  SPECjbb2015.Sol-X64.base.05
>>>           
16584      13837     12438           3534  SPECjbb2015.Sol-X64.base.06
>>>           13837      13497     12453           2226  SPECjbb2015.Sol-X64.base.07
>>>           16584      13837     12438           2265  SPECjbb2015.Sol-X64.base.08
>>>           16584      13837     13267           2853  SPECjbb2015.Sol-X64.base.09
>>>           16584      13837     12438           3146  SPECjbb2015.Sol-X64.base.10
>>> ---------------  ---------  --------  -------------  --------
>>>        16125.20   13852.30  12780.50        2791.90  average of values
>>>
>>>      hbIR           hbIR
>>> (max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
>>> ---------------  ---------  --------  -------------  --------
>>>           16584      13837     12438           2073  SPECjbb2015.Sol-X64.exp.01
>>>           16584      14353     13267           2667  SPECjbb2015.Sol-X64.exp.02
>>>           16584      13837     12438           2349  SPECjbb2015.Sol-X64.exp.03
>>>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.04
>>>           13981      13832     12583           3241  SPECjbb2015.Sol-X64.exp.05
>>>           13837      13575     12453           2621  SPECjbb2015.Sol-X64.exp.06
>>>           13981      13832     12583           2768  SPECjbb2015.Sol-X64.exp.07
>>>           16584      13837     12438           3000  SPECjbb2015.Sol-X64.exp.08
>>>           16584      13837     12438           2952  SPECjbb2015.Sol-X64.exp.09
>>>           16584      13837     12438           2494  SPECjbb2015.Sol-X64.exp.10
>>> ---------------  ---------  --------  -------------  --------
>>>        15788.70   13861.40  12551.40        2665.90  average of values
>>>
>>>
>>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>>
>>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>>    
https://bugs.openjdk.java.net/browse/JDK-8153224 >>>> >>>> Here's a link to the OpenJDK wiki that describes my port: >>>> >>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>> >>>> Here's the webrev URL: >>>> >>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>> >>>> Here's a link to Carsten's original webrev: >>>> >>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>> >>>> Earlier versions of this patch have been through several rounds of >>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>> Roman for their preliminary code review comments. A very special >>>> thanks to Robbin and Roman for building and testing the patch in >>>> their own environments (including specJBB2015). >>>> >>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>> Oracle's usual set of platforms. Earlier versions have been run >>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>> and slowdebug). Earlier versions have run my monitor inflation stress >>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>> fastdebug and slowdebug). >>>> >>>> All of the testing done on earlier versions will be redone on the >>>> latest version of the patch. >>>> >>>> Thanks, in advance, for any questions, comments or suggestions. >>>> >>>> Dan >>>> >>>> P.S. >>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>> is currently failing in -Xcomp mode on Win* only. I've been trying >>>> to characterize/analyze this failure for more than a week now. At >>>> this point I'm convinced that Async Monitor Deflation is aggravating >>>> an existing bug. However, I plan to have a better handle on that >>>> failure before these bits are pushed to the jdk/jdk repo. 
>>>>
>>>
>>>
>>
From mikhailo.seledtsov at oracle.com Mon Apr 15 18:58:30 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Mon, 15 Apr 2019 11:58:30 -0700
Subject: RFR(T): 8222501: [TESTBUG] Docker support is always set to true in jtreg-ext/requires/VMProps.java
Message-ID: <7cf6ae6d-e87c-ce8e-7642-e1207675ebe2@oracle.com>

Changes to this file were integrated by accident when adding new tests for JFR+Containers. Could you, please, review this anti-delta for this file?

--- a/test/jtreg-ext/requires/VMProps.java
+++ b/test/jtreg-ext/requires/VMProps.java
@@ -425,7 +425,7 @@
      * @return true if docker is supported in a given environment
      */
     protected String dockerSupport() {
-        boolean isSupported = true;
+        boolean isSupported = false;
         if (Platform.isLinux()) {
            // currently docker testing is only supported for Linux,
            // on certain platforms

Thank you,
Misha

From karen.kinnear at oracle.com Mon Apr 15 19:19:21 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 15 Apr 2019 15:19:21 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com>
Message-ID: <07183297-EA2A-4C3D-BADA-AFDF43FA8565@oracle.com>

Dan,

Thank you for delving into this

> On Apr 9, 2019, at 11:53 AM, Daniel D. Daugherty wrote:
>
> Hi Carsten,
>
> Thanks for responding to Karen's code review comments.
>
> Karen,
>
> I have a query for you down at the end of my reply...
>
> More below...
>
> On 4/5/19 11:01 PM, Carsten Varming wrote:
>> Dear Karen,
>>
>> Please see inline answers.
>>
>> On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear wrote:
>>
>> 4.
In EnterI: if _owner == DEFLATER_MARKER there are guarantees that 0 < _count
>> with comments that caller ensured _count <= 0
>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0
>> Am I missing something subtle here or should they be the same guarantees?
>>
>> In ::enter _count is incremented when the thread is trying to acquire the monitor and decremented after the monitor has been acquired. The 0 < _count assertion is between those two points in the code. A thread acquiring a monitor and then calling wait will increment _count and then decrement _count as part of acquiring the monitor, thus _count can be 0 by the time the thread calls wait and when ReenterI is called.
>
> I had a similar answer and I'm planning to tweak the comments
> and the guarantees a bit in the next round of code review (CR1);
> please see my reply to Karen's CR for the proposed changes.

Thanks again.

>
>
>>
>> 9. install_displaced_markword_in_object
>> What happens if the cas_set_mark fails?
>> I get that today this handles the race with enter and deflate_monitor_using_JT. If we remove
>> the call from enter, is the expectation that we've blocked all others who did not set is_marked themselves?
>> If we remove the call from enter would it make sense to ensure that the cas_set_mark succeeds here?
>>
>> I designed my original patch such that no thread would ever wait for the deflating thread to finish deflating a monitor. If you remove install_displaced_markword_in_object from enter, then the entering thread can end up busy waiting by continuously reading the monitor pointer from the object mark word and then realizing that the monitor is being deflated and it should retry by going back to reading the object mark word. This bad behavior is completely avoided by calling install_displaced_markword_in_object.
>
> Here's the code in question:
>
> src/hotspot/share/runtime/objectMonitor.cpp:
>
>> bool ObjectMonitor::enter(TRAPS) {
>
>> // Prevent deflation.
See ObjectSynchronizer::deflate_monitor() and is_busy(). >> // Ensure the object-monitor relationship remains stable while there's contention. >> const jint count = Atomic::add(1, &_count); >> if (count <= 0 && _owner == DEFLATER_MARKER) { >> // Async deflation in progress. Help deflater thread install >> // the mark word (in case deflater thread is slow). >> install_displaced_markword_in_object(); >> Self->_Stalled = 0; >> return false; // Caller should retry. Never mind about _count as this monitor has been deflated. >> } > > Our thread (T-enter) observes that the ObjectMonitor is being deflated > by T-deflate, calls install_displaced_markword_in_object() and returns > false to the caller which causes a retry. > > Restoring the header/dmw from the ObjectMonitor to the object's header > here isn't needed for correctness so it could be dropped (and would > simplify the code). Your counterpoint is if we drop the call, then > T-enter could do retry after retry if T-deflate is slow to get to its > install_displaced_markword_in_object() call. > > If T-enter calls install_displaced_markword_in_object(), then T-enter > will do a single retry because the object T-enter is trying to lock > will no longer have an ObjectMonitor. Okay I finally grok it... > > I think we need to clarify the comment a bit: > > > if (count <= 0 && _owner == DEFLATER_MARKER) { > > // Async deflation is in progress. Attempt to restore the > > // header/dmw to the object's header so that we only retry once > > // if the deflater thread happens to be slow. > > install_displaced_markword_in_object(); > > >> In my original patch no thread would ever wait for a deflating thread to finish. This property got lost in FastHashCode as that function evolved since I wrote my patch, but I think this property is worth preserving where possible. It might even be worth looking at FastHashCode to see if we can re-establish this property. 
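Editorial sketch of the single-retry argument above. This is a toy model using std::atomic, not HotSpot code: the namespace toy, the types Object and Monitor, the constants, and the functions begin_deflation(), install_dmw() and enter() are all hypothetical stand-ins, and the real ObjectMonitor protocol has more states than shown here. It only illustrates why a contending thread that helps restore the header needs at most one retry even when the deflater is slow.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy model only: every name below is an assumption made for illustration.
namespace toy {

constexpr int kNoOwner = 0;
constexpr int kDeflater = -1;           // stands in for DEFLATER_MARKER
constexpr int kCountFloor = -1000000;   // stands in for a large negative _count
constexpr std::uintptr_t kMonitorTag = 0x2;

struct Object {
  std::atomic<std::uintptr_t> mark{0};  // neutral header, or a tagged Monitor*
};

struct Monitor {
  Object* object = nullptr;
  std::uintptr_t dmw = 0;               // displaced mark word (tag bit clear)
  std::atomic<int> owner{kNoOwner};
  std::atomic<int> count{0};
};

inline std::uintptr_t encode(Monitor* m) {
  return reinterpret_cast<std::uintptr_t>(m) | kMonitorTag;
}

inline bool has_monitor(std::uintptr_t mark) {
  return (mark & kMonitorTag) != 0;
}

// Callable by the deflater and by any contender. The CAS succeeds for at
// most one caller; a failed CAS means the header was already restored.
inline void install_dmw(Monitor* m) {
  assert(m->object != nullptr);
  std::uintptr_t expected = encode(m);
  m->object->mark.compare_exchange_strong(expected, m->dmw);
}

// Deflater side: claim an idle monitor. In this model the deflater then
// "stalls" before restoring the header, mimicking a slow deflater thread.
inline bool begin_deflation(Monitor* m) {
  int idle = kNoOwner;
  if (!m->owner.compare_exchange_strong(idle, kDeflater)) return false;
  int zero = 0;
  if (!m->count.compare_exchange_strong(zero, kCountFloor)) {
    m->owner.store(kNoOwner);           // contenders appeared; back out
    return false;
  }
  return true;
}

enum class EnterResult { kEntered, kBusy, kRetry };

inline EnterResult enter(Monitor* m, int self) {
  const int count = m->count.fetch_add(1) + 1;
  if (count <= 0 && m->owner.load() == kDeflater) {
    // Deflation in progress: restore the header ourselves so that the
    // caller's retry finds an object without a monitor.
    install_dmw(m);
    return EnterResult::kRetry;
  }
  int idle = kNoOwner;
  const bool won = m->owner.compare_exchange_strong(idle, self);
  m->count.fetch_sub(1);                // toy simplification: no queueing
  return won ? EnterResult::kEntered : EnterResult::kBusy;
}

}  // namespace toy
```

In this model, once enter() returns kRetry the object's mark word no longer carries the monitor tag, so the caller's second look at the object cannot land on the same deflating monitor.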
> > Async Monitor Deflation causes races with FastHashCode() when the target > object has an existing ObjectMonitor. Here's the base code: > >> 768 } else if (mark->has_monitor()) { >> 769 monitor = mark->monitor(); >> 770 temp = monitor->header(); >> 771 assert(temp->is_neutral(), "invariant"); >> 772 hash = temp->hash(); >> 773 if (hash) { >> 774 return hash; >> 775 } >> 776 // Skip to the following code to reduce code size > > The 'monitor' fetched on L769 is unstable due to Async Monitor > Deflation and can cause an incorrect hash value to be returned. > The solution is to protect the ObjectMonitor*: > >> 775 } else if (mark->has_monitor()) { >> 776 ObjectMonitorHandle omh; >> 777 if (!omh.save_om_ptr(obj, mark)) { >> 778 // Lost a race with async deflation so try again. >> 779 assert(AsyncDeflateIdleMonitors, "sanity check"); >> 780 goto Retry; >> 781 } >> 782 monitor = omh.om_ptr(); >> 783 temp = monitor->header(); >> 784 assert(temp->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)temp)); >> 785 hash = temp->hash(); >> 786 if (hash != 0) { >> 787 return hash; >> 788 } >> 789 // Skip to the following code to reduce code size > > where L776-L782 handle the protection duty and possible retry. So we > have to protect the ObjectMonitor*, but, like enter(), we could call > install_displaced_markword_in_object() when we retry which would limit > T-hash to a single retry. > > ObjectSynchronizer::inflate() has a similar collision and retry issue: > >> 1456 // CASE: inflated >> 1457 if (mark->has_monitor()) { >> 1458 if (!omh_p->save_om_ptr(object, mark)) { >> 1459 // Lost a race with async deflation so try again. >> 1460 assert(AsyncDeflateIdleMonitors, "sanity check"); >> 1461 continue; >> 1462 } > > In this situation, inflate() discovers that the object already has an > ObjectMonitor; the object may not have had one when inflate() was > called, but it has one now. That particular race predates this project. 
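Editorial sketch of the protect-then-retry pattern described above for FastHashCode() and inflate(). Again this is a toy model and not the webrev's code: toy::MonitorHandle, its save() method, read_header(), and the ref_count field are hypothetical simplifications of ObjectMonitorHandle and save_om_ptr(), and the helping CAS on the retry path reflects the proposal in this thread rather than confirmed behavior. The model also assumes monitors are never freed, so touching a monitor that is being deflated is safe.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy model only: every name below is an assumption made for illustration.
namespace toy {

constexpr int kNoOwner = 0;
constexpr int kDeflater = -1;           // stands in for DEFLATER_MARKER
constexpr int kRefFloor = -1000000;     // ref_count value while deflating
constexpr std::uintptr_t kMonitorTag = 0x2;

struct Object {
  std::atomic<std::uintptr_t> mark{0};  // neutral header, or a tagged Monitor*
};

struct Monitor {
  std::uintptr_t header = 0;            // saved header (carries the hash)
  std::atomic<int> owner{kNoOwner};
  std::atomic<int> ref_count{0};
};

inline std::uintptr_t encode(Monitor* m) {
  return reinterpret_cast<std::uintptr_t>(m) | kMonitorTag;
}

inline Monitor* decode(std::uintptr_t mark) {
  return reinterpret_cast<Monitor*>(mark & ~kMonitorTag);
}

// RAII guard: while it holds a monitor, ref_count > 0 keeps deflation away.
class MonitorHandle {
 public:
  ~MonitorHandle() {
    if (om_ != nullptr) om_->ref_count.fetch_sub(1);
  }

  // Returns false if the race with deflation was lost; caller must retry.
  bool save(Object& obj, std::uintptr_t mark) {
    Monitor* m = decode(mark);
    m->ref_count.fetch_add(1);
    const bool deflating =
        m->owner.load() == kDeflater && m->ref_count.load() <= 0;
    if (deflating || obj.mark.load() != mark) {
      m->ref_count.fetch_sub(1);
      return false;
    }
    om_ = m;
    return true;
  }

  Monitor* get() const { return om_; }

 private:
  Monitor* om_ = nullptr;
};

// Reads the object's header, counting how many retries were needed.
inline std::uintptr_t read_header(Object& obj, int* retries) {
  for (;;) {
    const std::uintptr_t mark = obj.mark.load();
    if ((mark & kMonitorTag) == 0) return mark;  // header lives in the object
    MonitorHandle handle;
    if (!handle.save(obj, mark)) {
      // Lost the race: help restore the header so the next pass succeeds.
      std::uintptr_t expected = mark;
      obj.mark.compare_exchange_strong(expected, decode(mark)->header);
      ++*retries;
      continue;
    }
    assert(handle.get() != nullptr);
    return handle.get()->header;        // stable while the handle is live
  }
}

}  // namespace toy
```

Without the helping CAS on the retry path, the loop in this model would keep re-reading the same tagged mark word until the deflater finished, which is the repeated-retry behavior the thread is trying to avoid.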
>
> In any case, inflate() wants to return a stable ObjectMonitor* in the
> ObjectMonitorHandle, but if save_om_ptr() returns false, then inflate()
> has to retry. The only reason for save_om_ptr() to return false is due
> to a collision with Async Monitor Deflation. Like enter, we could call
> install_displaced_markword_in_object() when we retry which would limit
> inflate() to a single retry.
>
>
> Okay, I've evolved from thinking we could simplify the code by dropping
> install_displaced_markword_in_object() to thinking that I understand
> what install_displaced_markword_in_object() brings to the party. And now
> I'm proposing that we add 2 more install_displaced_markword_in_object()
> calls to limit retries on two more code paths.
>
> Karen, are you convinced that install_displaced_markword_in_object() is
> useful?

Dan - between Carsten's and your explanation - I get that
1) there is value in being able to make forward progress on the object itself sooner, once we know that DEFLATER_MARKER is trying to deflate it
2) since an enter() can CAS an _owner after DEFLATER_MARKER set but before _count -MAX_JINT, then anyone else waiting for deflation to finish could wait a while until they actually get a turn
3) If the other callers were to use install_dmw... then they could have shorter retry cycles (although with the enter, and now potentially other equivalent callers - more recontenders).

So I see the benefit in others trying this approach as well - to free the object from the not-currently-used inflated monitor and retry.

You may have already sent out a new webrev with that approach - I have not worked through in my head how that changes the details, since there are now more players, with different gates.

Thanks for walking us through this,
Karen

> Dan
>
>
>
>>
>> I hope this helps.
>>
>> Best,
>> Carsten
>>
>>> On Apr 5, 2019, at 12:10 PM, Daniel D.
Daugherty > wrote: >>> >>> Filed: >>> >>> JDK-8222034 Thread-SMR functions should be updated to remove work around >>> https://bugs.openjdk.java.net/browse/JDK-8222034 >>> >>> Martin and Robbin, please check it out and make sure that I captured >>> things correctly... >>> >>> Dan >>> >>> >>> >>> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: >>>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>>> Hi everybody, >>>>> >>>>>> I think was fixed with: >>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>>>> You should get a leading sync and trailing one with the default conservative >>>>>> model and thus get proper memory ordering. >>>>>> Martin, I'm I correct? >>>>> Exactly. Thanks for pointing this out. PPC uses the strongest possible ordering semantics with memory_order_conservative (default parameter). >>>>> I've seen that comment about PPC in "void ThreadsList::inc_nested_handle_cnt()". This function could get replaced. >>>> >>>> Okay so we need a new bug to update these two Thread-SMR functions: >>>> >>>> src/hotspot/share/runtime/threadSMR.cpp: >>>> >>>> void ThreadsList::dec_nested_handle_cnt() { >>>> // The decrement needs to be MO_ACQ_REL. At the moment, the Atomic::dec >>>> // backend on PPC does not yet conform to these requirements. Therefore >>>> // the decrement is simulated with an Atomic::sub(1, &addr). >>>> // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR mechanism >>>> // is not generally safe to use. >>>> Atomic::sub(1, &_nested_handle_cnt); >>>> } >>>> >>>> void ThreadsList::inc_nested_handle_cnt() { >>>> // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>>> // backend on PPC does not yet conform to these requirements. Therefore >>>> // the increment is simulated with a load phi; cas phi + 1; loop. >>>> // Without this MO_SEQ_CST Atomic::inc simulation, the nested SMR mechanism >>>> // is not generally safe to use. 
>>>> intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>> for (;;) { >>>> if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, sample) == sample) { >>>> return; >>>> } else { >>>> sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>> } >>>> } >>>> } >>>> >>>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>>> sure we're all in agreement. Once we decide that Thread-SMR's >>>> functions look like, I'll adapt my Async Monitor Deflation >>>> functions... >>>> >>>> Dan >>>> >>>> >>>>> >>>>> Best regards, >>>>> Martin >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Robbin Ehn > >>>>> Sent: Freitag, 5. April 2019 14:07 >>>>> To: daniel.daugherty at oracle.com ; hotspot-runtime-dev at openjdk.java.net ; Carsten Varming >; Roman Kennke >; Doerr, Martin > >>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>>> >>>>> Hi Dan, >>>>> >>>>> (Martin there is question for you last in this email) >>>>> >>>>> After first pass I did not find any real issues. >>>>> Considering what you had to work with, it looks good! >>>>> >>>>> #1 >>>>> There are some assert which are redundant (to me at least) like: >>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>> L445 >>>>> if (!dmw->is_marked() && dmw->hash() == 0) { >>>>> // This dmw is neutral and has not yet started the restoration >>>>> // protocol so we mark a copy of the dmw to begin the protocol. >>>>> markOop marked_dmw = dmw->set_marked(); >>>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>>> "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>>> marked_dmw->is_marked(), marked_dmw->hash()); >>>>> >>>>> That assert is basically a test that set_marked worked? >>>>> >>>>> L505 >>>>> if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == DEFLATER_MARKER) { >>>>> assert(_succ != Self, "invariant"); >>>>> assert(_owner == Self, "invariant"); >>>>> >>>>> Assert on _owner checks that our cmpxchg is not broken? 
>>>>> >>>>> I think it's easier to read the code if some on the most obvious asserts are >>>>> removed. Maybe comments instead. >>>>> >>>>> #2 >>>>> Not your doing but I think we should remove TRAPS/Thread * Self and use >>>>> JavaThread* instead. >>>>> E.g. so we can change: >>>>> void ObjectMonitor::EnterI(TRAPS) { >>>>> Thread * const Self = THREAD; >>>>> assert(Self->is_Java_thread(), "invariant"); >>>>> assert(((JavaThread *) Self)->thread_state() == _thread_blocked, "invariant"); >>>>> >>>>> to: >>>>> >>>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>>> assert(Self->thread_state() == _thread_blocked, "invariant"); >>>>> >>>>> #3 >>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>>> 164 inline void ObjectMonitor::inc_ref_count() { >>>>> 165 // The increment needs to be MO_SEQ_CST. At the moment, the Atomic::inc >>>>> 166 // backend on PPC does not yet conform to these requirements. Therefore >>>>> 167 // the increment is simulated with a load phi; cas phi + 1; loop. >>>>> 168 // Without this MO_SEQ_CST Atomic::inc simulation, AsyncDeflateIdleMonitors >>>>> 169 // is not safe. >>>>> >>>>> I think was fixed with: >>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and other RMW atomics >>>>> You should get a leading sync and trailing one with the default conservative >>>>> model and thus get proper memory ordering. >>>>> Martin, I'm I correct? >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 3/24/19 2:57 PM, Daniel D. 
Daugherty wrote: >>>>>> Greetings, >>>>>> >>>>>> Welcome to the OpenJDK review thread for my port of Carsten's work on: >>>>>> >>>>>> JDK-8153224 Monitor deflation prolong safepoints >>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>> >>>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>>> >>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>> >>>>>> Here's the webrev URL: >>>>>> >>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>>> >>>>>> Here's a link to Carsten's original webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>>> >>>>>> Earlier versions of this patch have been through several rounds of >>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>>> Roman for their preliminary code review comments. A very special >>>>>> thanks to Robbin and Roman for building and testing the patch in >>>>>> their own environments (including specJBB2015). >>>>>> >>>>>> This version of the patch has been thru Mach5 tier[1-8] testing on >>>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>>> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >>>>>> and slowdebug). Earlier versions have run my monitor inflation stress >>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>>> fastdebug and slowdebug). >>>>>> >>>>>> All of the testing done on earlier versions will be redone on the >>>>>> latest version of the patch. >>>>>> >>>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>>> >>>>>> Dan >>>>>> >>>>>> P.S. >>>>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>>> is currently failing in -Xcomp mode on Win* only. 
I've been trying >>>>>> to characterize/analyze this failure for more than a week now. At >>>>>> this point I'm convinced that Async Monitor Deflation is aggravating >>>>>> an existing bug. However, I plan to have a better handle on that >>>>>> failure before these bits are pushed to the jdk/jdk repo. >>>> >>>> >>> >> > From daniel.daugherty at oracle.com Mon Apr 15 19:25:01 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 15 Apr 2019 15:25:01 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints In-Reply-To: <07183297-EA2A-4C3D-BADA-AFDF43FA8565@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <83c71dae-613e-151d-bdef-1dbfcfe64ab6@oracle.com> <155ba478-b02d-2ff2-505f-1ca25edcbea4@oracle.com> <07183297-EA2A-4C3D-BADA-AFDF43FA8565@oracle.com> Message-ID: <5fa2d954-0b7a-1fc4-db21-665a29bcd720@oracle.com> On 4/15/19 3:19 PM, Karen Kinnear wrote: > Dan, > > Thank you for delving into this > >> On Apr 9, 2019, at 11:53 AM, Daniel D. Daugherty >> > wrote: >> >> Hi Carsten, >> >> Thanks for responding to Karen's code review comments. >> >> Karen, >> >> I have a query for you down at the end of my reply... >> >> More below... >> >> On 4/5/19 11:01 PM, Carsten Varming wrote: >>> Dear Karen, >>> >>> Please see inline answers. >>> >>> On Fri, Apr 5, 2019 at 4:59 PM Karen Kinnear >>> > wrote: >>> >>> >>> 4. In EnterI: if _owner == DEFLATER_MARKER there are guarantees >>> that 0 < _count >>> with comments that caller ensured _count <= 0 >>> In ReenterI: guarantee 0 <= _count, with comment not _count < 0 >>> ? Am I missing something subtle here or should they be the same >>> guarantees? >>> >>> >>> In ::enter _count is incremented when the thread is trying to >>> acquire the monitor and decremented after the monitor has been >>> acquired. The 0 < _count assertion is between those two point in the >>> code. 
A thread acquiring a monitor and then calling wait will >>> increment _count and then decrement _count as part of acquiring the >>> monitor, thus _count can be 0 by the time the thread calls wait and >>> when ReenterI is called. >> >> I had a similar answer and I'm planning to tweak the comments >> and the guarantees a bit in the next round of code review (CR1); >> please see my reply to Karen's CR for the proposed changes. > Thanks again. >> >> >>> >>> 9. install_displaced_markword_in_object >>> What happens if the cas_set_mark fails? >>> I get that today this handles the race with enter and >>> deflate_monitor_using_JT. If we remove >>> the call from enter, is the expectation that we?ve blocked all >>> others who did not set is_marked themselves? >>> If we remove the call from enter would it make sense to ensure >>> that the cas_set_mark succeeds here? >>> >>> >>> I designed my original patch such that no thread would ever wait for >>> the the deflating thread to finish deflating a monitor. If you >>> remove install_displaced_markword_in_object from enter, then the >>> entering thread can end up busy waiting by continuously reading the >>> monitor pointer from the object mark word and then realizing that >>> the monitor is being deflated and it should retry by going back to >>> reading the object mark word. This bad behavior is completely >>> avoided by calling install_displaced_markword_in_object. >> >> Here's the code in question: >> >> src/hotspot/share/runtime/objectMonitor.cpp: >> >>> bool ObjectMonitor::enter(TRAPS) { >> >>> ? // Prevent deflation. See ObjectSynchronizer::deflate_monitor() >>> and is_busy(). >>> ? // Ensure the object-monitor relationship remains stable while >>> there's contention. >>> ? const jint count = Atomic::add(1, &_count); >>> ? if (count <= 0 && _owner == DEFLATER_MARKER) { >>> ??? // Async deflation in progress. Help deflater thread install >>> ??? // the mark word (in case deflater thread is slow). >>> ??? 
install_displaced_markword_in_object(); >>> ??? Self->_Stalled = 0; >>> ??? return false;? // Caller should retry. Never mind about _count >>> as this monitor has been deflated. >>> ? } >> >> Our thread (T-enter) observes that the ObjectMonitor is being deflated >> by T-deflate, calls install_displaced_markword_in_object() and returns >> false to the caller which causes a retry. >> >> Restoring the header/dmw from the ObjectMonitor to the object's header >> here isn't needed for correctness so it could be dropped (and would >> simplify the code). Your counterpoint is if we drop the call, then >> T-enter could do retry after retry if T-deflate is slow to get to its >> install_displaced_markword_in_object() call. >> >> If T-enter calls install_displaced_markword_in_object(), then T-enter >> will do a single retry because the object T-enter is trying to lock >> will no longer have an ObjectMonitor. Okay I finally grok it... >> >> I think we need to clarify the comment a bit: >> >> > ? if (count <= 0 && _owner == DEFLATER_MARKER) { >> >? ?? // Async deflation is in progress. Attempt to restore the >> >???? // header/dmw to the object's header so that we only retry once >> >???? // if the deflater thread happens to be slow. >> >? ?? install_displaced_markword_in_object(); >> >> >>> In my original patch no thread would ever wait for a deflating >>> thread to finish. This property got lost in FastHashCode as that >>> function evolved since I wrote my patch, but I think this property >>> is worth preserving where possible. It might even be worth looking >>> at FastHashCode to see if we can re-establish this property. >> >> Async Monitor Deflation causes races with FastHashCode() when the target >> object has an existing ObjectMonitor. 
Here's the base code: >> >>> 768 } else if (mark->has_monitor()) { >>> 769 monitor = mark->monitor(); >>> 770 temp = monitor->header(); >>> 771 assert(temp->is_neutral(), "invariant"); >>> 772 hash = temp->hash(); >>> 773 if (hash) { >>> 774 return hash; >>> 775 } >>> 776 // Skip to the following code to reduce code size >> >> The 'monitor' fetched on L769 is unstable due to Async Monitor >> Deflation and can cause an incorrect hash value to be returned. >> The solution is to protect the ObjectMonitor*: >> >>> 775 } else if (mark->has_monitor()) { >>> 776 ObjectMonitorHandle omh; >>> 777 if (!omh.save_om_ptr(obj, mark)) { >>> 778 // Lost a race with async deflation so try again. >>> 779 assert(AsyncDeflateIdleMonitors, "sanity check"); >>> 780 goto Retry; >>> 781 } >>> 782 monitor = omh.om_ptr(); >>> 783 temp = monitor->header(); >>> 784 assert(temp->is_neutral(), "invariant: header=" INTPTR_FORMAT, p2i((address)temp)); >>> 785 hash = temp->hash(); >>> 786 if (hash != 0) { >>> 787 return hash; >>> 788 } >>> 789 // Skip to the following code to reduce code size >> >> where L776-L782 handle the protection duty and possible retry. So we >> have to protect the ObjectMonitor*, but, like enter(), we could call >> install_displaced_markword_in_object() when we retry which would limit >> T-hash to a single retry. > >> >> ObjectSynchronizer::inflate() has a similar collision and retry issue: >> >>> 1456 // CASE: inflated >>> 1457 if (mark->has_monitor()) { >>> 1458 if (!omh_p->save_om_ptr(object, mark)) { >>> 1459 // Lost a race with async deflation so try again. >>> 1460 assert(AsyncDeflateIdleMonitors, "sanity check"); >>> 1461 continue; >>> 1462 } >> >> In this situation, inflate() discovers that the object already has an >> ObjectMonitor; the object may not have had one when inflate() was >> called, but it has one now. That particular race predates this project. 
>> >> In any case, inflate() wants to return a stable ObjectMonitor* in the >> ObjectMonitorHandle, but if save_om_ptr() returns false, then inflate() >> has to retry. The only reason for save_om_ptr() to return false is due >> to a collision with Async Monitor Deflation. Like enter, we could call >> install_displaced_markword_in_object() when we retry which would limit >> inflate() to a single retry. >> >> >> Okay, I've evolved from thinking we could simplify the code by dropping >> install_displaced_markword_in_object() to thinking that I understand >> what install_displaced_markword_in_object() brings to the party. And now >> I'm proposing that we add 2 more install_displaced_markword_in_object() >> calls to limit retries on two more code paths. >> >> Karen, are you convinced that install_displaced_markword_in_object() is >> useful? > > Dan - between Carsten?s and your explanation - I get that > 1) there is value in being able to make forward progress on the object > itself sooner, once we know > that DEFLATE_MARKER is trying to deflate it > 2) since an enter() can CAS an _owner after DEFLATER_MARKER set but > before _count -MAX_JINT, > then anyone else waiting for deflation to finish could wait a while > until they actually get a turn > 3) If the other callers were to use install_dmw? then they could have > shorter retry cycles (although > with the enter, and now potentially other equivalent callers - more > recontenders). > > So I see the benefit in others trying this approach as well - to free > the object from the not-currently-used > inflated monitor and retry. Glad we're all on the same page. > You may have already sent out a new webrev with that approach - I have > not worked through in my > head how that changes the details, since there are now more players, > with different gates. I have not sent out CR1 yet. I'm still testing. I had some new test failures on Friday. 
It looks like adding more install_displaced_markword_in_object() calls has revealed yet another race. I'm testing a fix now. Once I have stable bits again, I'll start the CR1 cycle... Dan > > thanks for walking us through this, > Karen >> Dan >> >> >> >>> >>> I hope this helps. >>> >>> Best, >>> Carsten >>> >>>> On Apr 5, 2019, at 12:10 PM, Daniel D. Daugherty >>>> >>> > wrote: >>>> >>>> Filed: >>>> >>>> ??? JDK-8222034 Thread-SMR functions should be updated to >>>> remove work around >>>> https://bugs.openjdk.java.net/browse/JDK-8222034 >>>> >>>> Martin and Robbin, please check it out and make sure that I >>>> captured >>>> things correctly... >>>> >>>> Dan >>>> >>>> >>>> >>>> On 4/5/19 12:01 PM, Daniel D. Daugherty wrote: >>>>> On 4/5/19 8:37 AM, Doerr, Martin wrote: >>>>>> Hi everybody, >>>>>> >>>>>>> I think was fixed with: >>>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc >>>>>>> and other RMW atomics >>>>>>> You should get a leading sync and trailing one with the >>>>>>> default conservative >>>>>>> model and thus get proper memory ordering. >>>>>>> Martin, I'm I correct? >>>>>> Exactly. Thanks for pointing this out. PPC uses the strongest >>>>>> possible ordering semantics with memory_order_conservative >>>>>> (default parameter). >>>>>> I've seen that comment about PPC in "void >>>>>> ThreadsList::inc_nested_handle_cnt()". This function could >>>>>> get replaced. >>>>> >>>>> Okay so we need a new bug to update these two Thread-SMR >>>>> functions: >>>>> >>>>> src/hotspot/share/runtime/threadSMR.cpp: >>>>> >>>>> void ThreadsList::dec_nested_handle_cnt() { >>>>> ? // The decrement needs to be MO_ACQ_REL. At the moment, the >>>>> Atomic::dec >>>>> ? // backend on PPC does not yet conform to these >>>>> requirements. Therefore >>>>> ? // the decrement is simulated with an Atomic::sub(1, &addr). >>>>> ? // Without this MO_ACQ_REL Atomic::dec simulation, the >>>>> nested SMR mechanism >>>>> ? // is not generally safe to use. >>>>> ? 
Atomic::sub(1, &_nested_handle_cnt); >>>>> } >>>>> >>>>> void ThreadsList::inc_nested_handle_cnt() { >>>>> ? // The increment needs to be MO_SEQ_CST. At the moment, the >>>>> Atomic::inc >>>>> ? // backend on PPC does not yet conform to these >>>>> requirements. Therefore >>>>> ? // the increment is simulated with a load phi; cas phi + 1; >>>>> loop. >>>>> ? // Without this MO_SEQ_CST Atomic::inc simulation, the >>>>> nested SMR mechanism >>>>> ? // is not generally safe to use. >>>>> ? intx sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>>> ? for (;;) { >>>>> ??? if (Atomic::cmpxchg(sample + 1, &_nested_handle_cnt, >>>>> sample) == sample) { >>>>> ????? return; >>>>> ??? } else { >>>>> ????? sample = OrderAccess::load_acquire(&_nested_handle_cnt); >>>>> ??? } >>>>> ? } >>>>> } >>>>> >>>>> I'll file a new bug, loop in Robbin, Erik O and Martin, and make >>>>> sure we're all in agreement. Once we decide that Thread-SMR's >>>>> functions look like, I'll adapt my Async Monitor Deflation >>>>> functions... >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> Martin >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: Robbin Ehn >>>>> > >>>>>> Sent: Freitag, 5. April 2019 14:07 >>>>>> To: daniel.daugherty at oracle.com >>>>>> ; >>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>> ; Carsten >>>>>> Varming >; Roman >>>>>> Kennke >; >>>>>> Doerr, Martin >>>>> > >>>>>> Subject: Re: RFR(L) 8153224 Monitor deflation prolong safepoints >>>>>> >>>>>> Hi Dan, >>>>>> >>>>>> (Martin there is question for you last in this email) >>>>>> >>>>>> After first pass I did not find any real issues. >>>>>> Considering what you had to work with, it looks good! >>>>>> >>>>>> #1 >>>>>> There are some assert which are redundant (to me at least) like: >>>>>> src/hotspot/share/runtime/objectMonitor.cpp >>>>>> L445 >>>>>> ??? if (!dmw->is_marked() && dmw->hash() == 0) { >>>>>> ????? // This dmw is neutral and has not yet started the >>>>>> restoration >>>>>> ????? 
// protocol so we mark a copy of the dmw to begin the >>>>>> protocol. >>>>>> ????? markOop marked_dmw = dmw->set_marked(); >>>>>> assert(marked_dmw->is_marked() && marked_dmw->hash() == 0, >>>>>> ???????????? "sanity_check: is_marked=%d, hash=" INTPTR_FORMAT, >>>>>> marked_dmw->is_marked(), marked_dmw->hash()); >>>>>> >>>>>> That assert is basically a test that set_marked worked? >>>>>> >>>>>> L505 >>>>>> ????? if (Atomic::cmpxchg(Self, &_owner, DEFLATER_MARKER) == >>>>>> DEFLATER_MARKER) { >>>>>> ??????? assert(_succ != Self, "invariant"); >>>>>> ??????? assert(_owner == Self, "invariant"); >>>>>> >>>>>> Assert on _owner checks that our cmpxchg is not broken? >>>>>> >>>>>> I think it's easier to read the code if some on the most >>>>>> obvious asserts are >>>>>> removed. Maybe comments instead. >>>>>> >>>>>> #2 >>>>>> Not your doing but I think we should remove TRAPS/Thread * >>>>>> Self and use >>>>>> JavaThread* instead. >>>>>> E.g. so we can change: >>>>>> void ObjectMonitor::EnterI(TRAPS) { >>>>>> ??? Thread * const Self = THREAD; >>>>>> assert(Self->is_Java_thread(), "invariant"); >>>>>> ??? assert(((JavaThread *) Self)->thread_state() == >>>>>> _thread_blocked, "invariant"); >>>>>> >>>>>> to: >>>>>> >>>>>> void ObjectMonitor::EnterI(JavaThread* Self) { >>>>>> assert(Self->thread_state() == _thread_blocked, "invariant"); >>>>>> >>>>>> #3 >>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp >>>>>> ?? 164 inline void ObjectMonitor::inc_ref_count() { >>>>>> ?? 165?? // The increment needs to be MO_SEQ_CST. At the >>>>>> moment, the Atomic::inc >>>>>> ?? 166?? // backend on PPC does not yet conform to these >>>>>> requirements. Therefore >>>>>> ?? 167?? // the increment is simulated with a load phi; cas >>>>>> phi + 1; loop. >>>>>> ?? 168?? // Without this MO_SEQ_CST Atomic::inc simulation, >>>>>> AsyncDeflateIdleMonitors >>>>>> ?? 169?? // is not safe. 
>>>>>> >>>>>> I think was fixed with: >>>>>> 8202080: Introduce ordering semantics for Atomic::add/inc and >>>>>> other RMW atomics >>>>>> You should get a leading sync and trailing one with the >>>>>> default conservative >>>>>> model and thus get proper memory ordering. >>>>>> Martin, I'm I correct? >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> On 3/24/19 2:57 PM, Daniel D. Daugherty wrote: >>>>>>> Greetings, >>>>>>> >>>>>>> Welcome to the OpenJDK review thread for my port of >>>>>>> Carsten's work on: >>>>>>> >>>>>>> ? ??? JDK-8153224 Monitor deflation prolong safepoints >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153224 >>>>>>> >>>>>>> Here's a link to the OpenJDK wiki that describes my port: >>>>>>> >>>>>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >>>>>>> >>>>>>> Here's the webrev URL: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >>>>>>> >>>>>>> Here's a link to Carsten's original webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >>>>>>> >>>>>>> Earlier versions of this patch have been through several >>>>>>> rounds of >>>>>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and >>>>>>> Roman for their preliminary code review comments. A very special >>>>>>> thanks to Robbin and Roman for building and testing the patch in >>>>>>> their own environments (including specJBB2015). >>>>>>> >>>>>>> This version of the patch has been thru Mach5 tier[1-8] >>>>>>> testing on >>>>>>> Oracle's usual set of platforms. Earlier versions have been run >>>>>>> through my stress kit on my Linux-X64 and Solaris-X64 servers >>>>>>> (product, fastdebug, slowdebug).Earlier versions have run >>>>>>> Kitchensink >>>>>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >>>>>>> fastdebug >>>>>>> and slowdebug). 
Earlier versions have run my monitor >>>>>>> inflation stress >>>>>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 >>>>>>> (product, >>>>>>> fastdebug and slowdebug). >>>>>>> >>>>>>> All of the testing done on earlier versions will be redone >>>>>>> on the >>>>>>> latest version of the patch. >>>>>>> >>>>>>> Thanks, in advance, for any questions, comments or suggestions. >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> P.S. >>>>>>> One subtest in >>>>>>> gc/g1/humongousObjects/TestHumongousClassLoader.java >>>>>>> is currently failing in -Xcomp mode on Win* only. I've been >>>>>>> trying >>>>>>> to characterize/analyze this failure for more than a week >>>>>>> now. At >>>>>>> this point I'm convinced that Async Monitor Deflation is >>>>>>> aggravating >>>>>>> an existing bug. However, I plan to have a better handle on that >>>>>>> failure before these bits are pushed to the jdk/jdk repo. >>>>> >>>>> >>>> >>> >> > From daniel.daugherty at oracle.com Mon Apr 15 19:28:41 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 15 Apr 2019 15:28:41 -0400 Subject: RFR(T): 8222501: [TESTBUG] Docker support is always set to true in jtreg-ext/requires/VMProps.java In-Reply-To: <7cf6ae6d-e87c-ce8e-7642-e1207675ebe2@oracle.com> References: <7cf6ae6d-e87c-ce8e-7642-e1207675ebe2@oracle.com> Message-ID: <5525ea2d-8462-fc9a-91cf-71f330a2b6ed@oracle.com> Thumbs up. I agree that this change is trivial. Dan On 4/15/19 2:58 PM, mikhailo.seledtsov at oracle.com wrote: > Changes to this file were integrated by accident when adding new tests > for JFR+Containers. > > Could you, please, review this anti-delta for this file? > > --- a/test/jtreg-ext/requires/VMProps.java > +++ b/test/jtreg-ext/requires/VMProps.java > @@ -425,7 +425,7 @@ > ????? * @return true if docker is supported in a given environment > ????? */ > ???? protected String dockerSupport() { > -??????? boolean isSupported = true; > +??????? boolean isSupported = false; > ???????? 
if (Platform.isLinux()) {
>             // currently docker testing is only supported for Linux,
>             // on certain platforms
>
>
> Thank you,
>
> Misha
>

From mikhailo.seledtsov at oracle.com Mon Apr 15 19:32:36 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Mon, 15 Apr 2019 12:32:36 -0700
Subject: RFR(T): 8222501: [TESTBUG] Docker support is always set to true in jtreg-ext/requires/VMProps.java
In-Reply-To: <5525ea2d-8462-fc9a-91cf-71f330a2b6ed@oracle.com>
References: <7cf6ae6d-e87c-ce8e-7642-e1207675ebe2@oracle.com> <5525ea2d-8462-fc9a-91cf-71f330a2b6ed@oracle.com>
Message-ID:

Thank you,

Misha

On 4/15/19 12:28 PM, Daniel D. Daugherty wrote:
> Thumbs up. I agree that this change is trivial.
>
> Dan
>
>
> On 4/15/19 2:58 PM, mikhailo.seledtsov at oracle.com wrote:
>> Changes to this file were integrated by accident when adding new
>> tests for JFR+Containers.
>>
>> Could you, please, review this anti-delta for this file?
>>
>> --- a/test/jtreg-ext/requires/VMProps.java
>> +++ b/test/jtreg-ext/requires/VMProps.java
>> @@ -425,7 +425,7 @@
>>       * @return true if docker is supported in a given environment
>>       */
>>      protected String dockerSupport() {
>> -        boolean isSupported = true;
>> +        boolean isSupported = false;
>>          if (Platform.isLinux()) {
>>             // currently docker testing is only supported for Linux,
>>
// on certain platforms
>>
>>
>> Thank you,
>>
>> Misha
>>
>

From per.liden at oracle.com Mon Apr 15 19:40:43 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 15 Apr 2019 21:40:43 +0200
Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range
In-Reply-To: <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com>
References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com>
Message-ID: <0ce0349a-df09-441c-6172-1caa2828df2f@oracle.com>

On 04/15/2019 07:16 PM, gerard ziemski wrote:
> On 4/15/19 10:58 AM, Per Liden wrote:
>>>
>>> I think the original idea here was that for those flags with a
>>> constraint, we wanted to show the possible range, from which the
>>> constraint will further restrict the final value. However, that is
>>> tricky to test without exposing the constraint function, as evidenced
>>> by the exclusion list in the test.
>>>
>>> For those flags without range and constraint, the implicit range is
>>> the max range of the flag's type, so the idea here was that such flag
>>> was "untestable" for practical purposes, so we print an empty range.
>>
>> But that's not very useful to an actual user, who wants to know what
>> the range is (even if every value allowed by the type is valid). We
>> can't expect a user to know what the range of a specific type is on
>> every platform.
>>
>>>
>>> I believe that a better fix here might be to print an empty range in
>>> both cases.
>>
>> But why print an empty range when the range is well known? We don't
>> have to make -XX:+PrintFlagsRanges dumber than it needs to be. The
>> only time we don't know the range is when there's a constraint
>> function associated with the flag.
>>
>> Frankly, to me this looks like the original intent of this code, but a
>> simple mistake snuck in which inverted the if-else statement.
>
> I was wrong in my initial reply.
>
> There are flags with both range and constraint - the entire point of
> printing the range to the user is to help define valid values, and as
> you say "We can't expect a user to know what the range of a specific
> type is on every platform", which applies to both cases.
>
> If the test has an issue with the flag, then the test itself needs to be
> fixed, by excluding the troublesome flag.
>
> I believe that we should print out the default range in both cases (done
> in a followup) and we should modify the test to exclude hard to test
> (troublesome) flags as needed.

But for flags with constraint functions we have no way of knowing the
valid range, right? So, the question becomes, is it better for a user to
get no information (we print an empty range) instead of the wrong
information (we print the default range)? I tend to lean toward printing
no information.

A third alternative would be printing the default range for the type, but
also adding something in the printout to indicate that there might be
other constraints too.

>
> Can't we just exclude "SoftMaxHeapSize" from the test here?

Sure, I've updated that patch for JDK-8222145 "Add -XX:SoftMaxHeapSize
flag" to exclude it from testing. Webrev here:

http://cr.openjdk.java.net/~pliden/8222145/webrev.2

Looks ok?

I'll withdraw the review for JDK-8222460, and re-assign the bug to you,
and you can update it as you see fit, ok?
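
To make the third alternative mentioned above concrete (default range plus a
hint that a constraint may apply), here is a minimal sketch. It is not HotSpot
code: the FlagInfo record, the format_range() helper, and the
"(further constrained)" marker are all hypothetical names chosen for
illustration, and the flag is assumed to be a 64-bit signed integer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical description of one integer flag; NOT the HotSpot JVMFlag class.
struct FlagInfo {
  const char* name;
  bool        has_range;       // was an explicit range declared?
  int64_t     lo;              // declared lower bound (ignored if !has_range)
  int64_t     hi;              // declared upper bound (ignored if !has_range)
  bool        has_constraint;  // is a constraint function registered?
};

// Hypothetical helper: build the text of the range column for one flag.
std::string format_range(const FlagInfo& f) {
  // Without a declared range, fall back to the full range of the type.
  const int64_t lo = f.has_range ? f.lo : std::numeric_limits<int64_t>::min();
  const int64_t hi = f.has_range ? f.hi : std::numeric_limits<int64_t>::max();

  std::ostringstream os;
  os << "[ " << lo << " ... " << hi << " ]";
  if (f.has_constraint) {
    // The printed range is only an outer bound; the constraint function
    // may reject values inside it.
    os << " (further constrained)";
  }
  return os.str();
}
```

With this shape, a flag that has both a range and a constraint still shows its
declared range, and the marker tells the user that not every value in that
range is guaranteed to be accepted.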
cheers, Per From david.holmes at oracle.com Mon Apr 15 20:23:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 06:23:36 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <5a823b4c-eeab-36e3-99e9-f8ffe23ec492@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <39592868-25d6-5882-c89c-65af3559e005@oracle.com> <5a823b4c-eeab-36e3-99e9-f8ffe23ec492@oracle.com> Message-ID: <6098cff6-ce54-6301-edf8-2812477c14cb@oracle.com> On 15/04/2019 9:34 pm, David Holmes wrote: > On 15/04/2019 7:40 pm, Claes Redestad wrote: >> Nice cleanup! >> >> Seems you could trivially do the same for _stackSize_offset > > Seems any of the checks for _foo_offset > 0 can be removed as all fields > must always be present. Except you'd have to be careful about access prior to initialization - for all of these "offset" fields. David > David > >> Thanks! >> >> /Claes >> >> On 2019-04-15 11:04, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Removing some dead code. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>> >>> Passes t1-5. >>> >>> Thanks, Robbin From david.holmes at oracle.com Tue Apr 16 02:00:58 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 12:00:58 +1000 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> Message-ID: Hi Robbin, On 15/04/2019 6:58 pm, Robbin Ehn wrote: > Hi, please review. > > After reexamine this issue: > Threads in native must always have their stack walkable. > JFR sampler should never need to make a stack walkable (for native sample). 
> > I manage to locally reproduce reliable with changes to JFR sampler and > having > hundreds of threads running similar code as the in the bug. > (Looping creating an array with negative size.) > > I found a place where we don't proper look at the suspend flags. > The java thread can thus escape native and make it's stack unwalkable > and later > it tries to make it walkable at the same time as the JFR sampler. > > By removing some kind of fast check and instead always call the > check_safepoint_and_suspend_for_native_trans I can no longer reproduce. Sorry but I can't see how this can fix anything: - if (SafepointMechanism::should_block(thread) || thread->is_suspend_after_native()) { JavaThread::check_safepoint_and_suspend_for_native_trans(thread); - } All you are doing is changing the timing of the race between the thread re-entering the VM/Java and the request for a suspend or safepoint. If there is a race between the sampler logic acting on the thread, and the thread acting on itself then that race has to be precluded somehow. Thanks, David ----- > (which have the JFR native trans suspend check) > And it passes t1-5. > > Code: > http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8218147 > > Thanks, Robbin > > On 4/5/19 5:43 PM, Robbin Ehn wrote: >> Hi Dean, >> >> Sorry, I missed this mail. >> Yes we can do that. >> Ignore my other mail, I'll update. >> >> Thanks, Robbin >> >> >> dean.long at oracle.com skrev: (5 april 2019 09:22:24 CEST) >>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>> >>>>>> >>>>>> If it's already set, should we check that _last_Java_pc matches the >>> >>>>>> new value? >>>>> >>>>> We manually set the pc in several places, so if it's set, it's not >>>>> certain that >>>>> it should be the same as in last sp. >>>>> I can't distinguish between the cases. >>>>> >>>> >>>> If we get pc from sp[-1] then it should match, but you're right, we >>>> sometimes get pc from somewhere else. 
>>> >>> How about if we combine the !walkable check and the >>> capture_last_Java_pc() logic into a single method? >>> Then we can do something like: >>> >>> ???? if (!walkable()) { >>> ???????? address pc = (address)_last_Java_sp[-1]; >>> ???????? address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>> ???????? assert(a == NULL || a == pc, "unexpected PC %p", a); >>> ???? } >>> >>> dl From robbin.ehn at oracle.com Tue Apr 16 06:51:17 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 08:51:17 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> Message-ID: Hi David, On 4/16/19 4:00 AM, David Holmes wrote: > Hi Robbin, > > On 15/04/2019 6:58 pm, Robbin Ehn wrote: >> Hi, please review. >> >> After reexamine this issue: >> Threads in native must always have their stack walkable. >> JFR sampler should never need to make a stack walkable (for native sample). >> >> I manage to locally reproduce reliable with changes to JFR sampler and having >> hundreds of threads running similar code as the in the bug. >> (Looping creating an array with negative size.) >> >> I found a place where we don't proper look at the suspend flags. >> The java thread can thus escape native and make it's stack unwalkable and later >> it tries to make it walkable at the same time as the JFR sampler. >> >> By removing some kind of fast check and instead always call the >> check_safepoint_and_suspend_for_native_trans I can no longer reproduce. > > Sorry but I can't see how this can fix anything: > > -???? if (SafepointMechanism::should_block(thread) || > thread->is_suspend_after_native()) { > ??????? JavaThread::check_safepoint_and_suspend_for_native_trans(thread); > -???? 
} > In check_safepoint_and_suspend_for_native_trans we check _trace_flag and stop with macro: JFR_ONLY(SUSPEND_THREAD_CONDITIONAL(thread);) This method does not do the right thing: bool is_suspend_after_native() const { return (_suspend_flags & (_external_suspend | _deopt_suspend)) != 0; } If we want to keep the out-of-line double checking, this is an alternative: diff -r 51211b2d6514 src/hotspot/share/runtime/thread.hpp --- a/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:38:32 2019 +0200 +++ b/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:44:21 2019 +0200 @@ -1417,3 +1417,3 @@ bool is_suspend_after_native() const { - return (_suspend_flags & (_external_suspend | _deopt_suspend)) != 0; + return (_suspend_flags & (_external_suspend | _deopt_suspend | _trace_flag)) != 0; } So we are missing checking that bit completely in this transition code. Thanks, Robbin > All you are doing is changing the timing of the race between the thread > re-entering the VM/Java and the request for a suspend or safepoint. > > If there is a race between the sampler logic acting on the thread, and the > thread acting on itself then that race has to be precluded somehow. > > Thanks, > David > ----- > >> (which have the JFR native trans suspend check) >> And it passes t1-5. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218147 >> >> Thanks, Robbin >> >> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>> Hi Dean, >>> >>> Sorry, I missed this mail. >>> Yes we can do that. >>> Ignore my other mail, I'll update. >>> >>> Thanks, Robbin >>> >>> >>> dean.long at oracle.com skrev: (5 april 2019 09:22:24 CEST) >>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>> >>>>>>> >>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>> >>>>>>> new value? >>>>>> >>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>> certain that >>>>>> it should be the same as in last sp. 
>>>>>> I can't distinguish between the cases. >>>>>> >>>>> >>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>> sometimes get pc from somewhere else. >>>> >>>> How about if we combine the !walkable check and the >>>> capture_last_Java_pc() logic into a single method? >>>> Then we can do something like: >>>> >>>> ???? if (!walkable()) { >>>> ???????? address pc = (address)_last_Java_sp[-1]; >>>> ???????? address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>> ???????? assert(a == NULL || a == pc, "unexpected PC %p", a); >>>> ???? } >>>> >>>> dl From david.holmes at oracle.com Tue Apr 16 07:27:19 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 17:27:19 +1000 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> Message-ID: <3d0d841d-3bcf-ce3c-1d44-096c9aff2e72@oracle.com> On 16/04/2019 4:51 pm, Robbin Ehn wrote: > Hi David, > > On 4/16/19 4:00 AM, David Holmes wrote: >> Hi Robbin, >> >> On 15/04/2019 6:58 pm, Robbin Ehn wrote: >>> Hi, please review. >>> >>> After reexamine this issue: >>> Threads in native must always have their stack walkable. >>> JFR sampler should never need to make a stack walkable (for native >>> sample). >>> >>> I manage to locally reproduce reliable with changes to JFR sampler >>> and having >>> hundreds of threads running similar code as the in the bug. >>> (Looping creating an array with negative size.) >>> >>> I found a place where we don't proper look at the suspend flags. >>> The java thread can thus escape native and make it's stack unwalkable >>> and later >>> it tries to make it walkable at the same time as the JFR sampler. >>> >>> By removing some kind of fast check and instead always call the >>> check_safepoint_and_suspend_for_native_trans I can no longer reproduce. 
>>
>> Sorry but I can't see how this can fix anything:
>>
>> -     if (SafepointMechanism::should_block(thread) || thread->is_suspend_after_native()) {
>>         JavaThread::check_safepoint_and_suspend_for_native_trans(thread);
>> -     }
>>
>
> In check_safepoint_and_suspend_for_native_trans we check _trace_flag and
> stop with macro:
> JFR_ONLY(SUSPEND_THREAD_CONDITIONAL(thread);)

Ah I see - needed to expand that macro.

> This method does not do the right thing:
> bool is_suspend_after_native() const {
>   return (_suspend_flags & (_external_suspend | _deopt_suspend)) != 0;
> }
>
> If we want to keep the out-of-line double checking, this is an alternative:
> diff -r 51211b2d6514 src/hotspot/share/runtime/thread.hpp
> --- a/src/hotspot/share/runtime/thread.hpp    Tue Apr 16 08:38:32 2019 +0200
> +++ b/src/hotspot/share/runtime/thread.hpp    Tue Apr 16 08:44:21 2019 +0200
> @@ -1417,3 +1417,3 @@
>    bool is_suspend_after_native() const {
> -    return (_suspend_flags & (_external_suspend | _deopt_suspend)) != 0;
> +    return (_suspend_flags & (_external_suspend | _deopt_suspend | _trace_flag)) != 0;
>    }

Right. Should that use JFR_ONLY?

I was a little concerned that thread->set_trace_flag() may not ensure
visibility of the flag update, but then realized it should be covered by
the fence:

static inline void transition_from_native(JavaThread *thread, JavaThreadState to) {
  // Change to transition state and ensure it is seen by the VM thread.
  thread->set_thread_state_fence(_thread_in_native_trans);

and the comment should say:

// Change to transition state and ensure it is seen by other threads,
// and we will see any _suspend_flag changes below.

However it also seems to me that in
JfrThreadSampleClosure::do_sample_thread we need a storeload() barrier
after:

  thread->set_trace_flag();

to ensure it's not reordered with the reads of thread->thread_state()?

Thanks,
David
-----

> So we are missing checking that bit completely in this transition code.
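
The effect of the missing bit can be shown with a small standalone sketch. It
assumes illustrative bit values and free functions (the real flags and the
real predicate live in thread.hpp and may differ); the names
is_suspend_after_native_old and is_suspend_after_native_new are hypothetical
and only contrast the two masks from the diff above.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit assignments only; the actual values are defined in
// thread.hpp and are not reproduced here.
enum SuspendFlag : uint32_t {
  external_suspend = 1u << 0,
  deopt_suspend    = 1u << 1,
  trace_flag       = 1u << 2
};

// Fast-path predicate with the original mask: the trace bit is not part
// of the mask, so a pending JFR sample request goes unnoticed.
inline bool is_suspend_after_native_old(uint32_t flags) {
  return (flags & (external_suspend | deopt_suspend)) != 0;
}

// Predicate with the mask from the alternative patch: the trace bit also
// sends the thread down the slow path.
inline bool is_suspend_after_native_new(uint32_t flags) {
  return (flags & (external_suspend | deopt_suspend | trace_flag)) != 0;
}
```

With only the trace bit set, the old predicate returns false, so the
transition code would skip check_safepoint_and_suspend_for_native_trans() and
the thread could leave native while the sampler is still looking at it; the
new predicate returns true for the same input.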
> > Thanks, Robbin > > >> All you are doing is changing the timing of the race between the >> thread re-entering the VM/Java and the request for a suspend or >> safepoint. >> >> If there is a race between the sampler logic acting on the thread, and >> the thread acting on itself then that race has to be precluded somehow. >> >> Thanks, >> David >> ----- >> >>> (which have the JFR native trans suspend check) >>> And it passes t1-5. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>> >>> Thanks, Robbin >>> >>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>> Hi Dean, >>>> >>>> Sorry, I missed this mail. >>>> Yes we can do that. >>>> Ignore my other mail, I'll update. >>>> >>>> Thanks, Robbin >>>> >>>> >>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>> >>>>>>>> >>>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>> >>>>>>>> new value? >>>>>>> >>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>> certain that >>>>>>> it should be the same as in last sp. >>>>>>> I can't distinguish between the cases. >>>>>>> >>>>>> >>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>> sometimes get pc from somewhere else. >>>>> >>>>> How about if we combine the !walkable check and the >>>>> capture_last_Java_pc() logic into a single method? >>>>> Then we can do something like: >>>>> >>>>> if (!walkable()) { >>>>> address pc = (address)_last_Java_sp[-1]; >>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>>
} >>>>> >>>>> dl From robbin.ehn at oracle.com Tue Apr 16 07:56:42 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 09:56:42 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: <3d0d841d-3bcf-ce3c-1d44-096c9aff2e72@oracle.com> References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <3d0d841d-3bcf-ce3c-1d44-096c9aff2e72@oracle.com> Message-ID: Hi David, *truncated* On 4/16/19 9:27 AM, David Holmes wrote: >> If we want to keep the out-of-line double checking, this is an alternative: >> diff -r 51211b2d6514 src/hotspot/share/runtime/thread.hpp >> --- a/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:38:32 2019 +0200 >> +++ b/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:44:21 2019 +0200 >> @@ -1417,3 +1417,3 @@ >> bool is_suspend_after_native() const { >> - return (_suspend_flags & (_external_suspend | _deopt_suspend)) != 0; >> + return (_suspend_flags & (_external_suspend | _deopt_suspend | >> _trace_flag)) != 0; >> } > So you prefer this patch? > Right. Should that use JFR_ONLY? _trace_flag is always present, so we don't need it. And I'm not sure how to get that macro into that method in a nice way? > > I was a little concerned that thread->set_trace_flag() may not ensure visibility > of the flag update, but then realized it should be covered by the fence: > > static inline void transition_from_native(JavaThread *thread, JavaThreadState > to) { > // Change to transition state and ensure it is seen by the VM thread. > thread->set_thread_state_fence(_thread_in_native_trans); > > and the comment should say: > > // Change to transition state and ensure it is seen by other thread, > // and we will see any _suspend_flag changes below. > > However it also seems to me that in JfrThreadSampleClosure::do_sample_thread we > need a storeload() barrier after: > >
thread->set_trace_flag(); > > to ensure its not reordered with the reads of thread->thread_state() ? Setting/clearing suspend flags is always done with Atomic::cmpxchg, since there can be multiple threads manipulating the bit pattern. I can add a comment about it. Thanks, Robbin > > Thanks, > David > ----- > > >> So we are missing checking that bit completely in this transition code. >> >> Thanks, Robbin >> >> >>> All you are doing is changing the timing of the race between the thread >>> re-entering the VM/Java and the request for a suspend or safepoint. >>> >>> If there is a race between the sampler logic acting on the thread, and the >>> thread acting on itself then that race has to be precluded somehow. >>> >>> Thanks, >>> David >>> ----- >>> >>>> (which have the JFR native trans suspend check) >>>> And it passes t1-5. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>>> >>>> Thanks, Robbin >>>> >>>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>>> Hi Dean, >>>>> >>>>> Sorry, I missed this mail. >>>>> Yes we can do that. >>>>> Ignore my other mail, I'll update. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> >>>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>>> >>>>>>>>> >>>>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>> >>>>>>>>> new value? >>>>>>>> >>>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>>> certain that >>>>>>>> it should be the same as in last sp. >>>>>>>> I can't distinguish between the cases. >>>>>>>> >>>>>>> >>>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>>> sometimes get pc from somewhere else. >>>>>> >>>>>> How about if we combine the !walkable check and the >>>>>> capture_last_Java_pc() logic into a single method? >>>>>> Then we can do something like: >>>>>> >>>>>>
if (!walkable()) { >>>>>> address pc = (address)_last_Java_sp[-1]; >>>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>>> } >>>>>> >>>>>> dl From robbin.ehn at oracle.com Tue Apr 16 08:17:12 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 10:17:12 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> Message-ID: Hi Claes and David, Here is v2: http://cr.openjdk.java.net/~rehn/8222327/v2/ Inc: http://cr.openjdk.java.net/~rehn/8222327/v2/inc/ Passed t1-2. Thanks, Robbin On 4/15/19 11:04 AM, Robbin Ehn wrote: > Hi all, please review. > > Removing some dead code. > > Code: > http://cr.openjdk.java.net/~rehn/8222327/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222327 > > Passes t1-5. > > Thanks, Robbin From robbin.ehn at oracle.com Tue Apr 16 08:18:16 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 10:18:16 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <39592868-25d6-5882-c89c-65af3559e005@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <39592868-25d6-5882-c89c-65af3559e005@oracle.com> Message-ID: Hi Claes, On 4/15/19 11:40 AM, Claes Redestad wrote: > Nice cleanup! Thanks! > > Seems you could trivially do the same for _stackSize_offset Sent out a v2. Thanks, Robbin > > Thanks! > > /Claes > > On 2019-04-15 11:04, Robbin Ehn wrote: >> Hi all, please review. >> >> Removing some dead code. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222327 >> >> Passes t1-5.
>> >> Thanks, Robbin From david.holmes at oracle.com Tue Apr 16 08:21:44 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 18:21:44 +1000 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <3d0d841d-3bcf-ce3c-1d44-096c9aff2e72@oracle.com> Message-ID: <40b529d0-aeca-24b8-636a-518b7445b593@oracle.com> On 16/04/2019 5:56 pm, Robbin Ehn wrote: > Hi David, > > *truncated* > > On 4/16/19 9:27 AM, David Holmes wrote: >>> If we want to keep the out-of-line double checking, this is an >>> alternative: >>> diff -r 51211b2d6514 src/hotspot/share/runtime/thread.hpp >>> --- a/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:38:32 >>> 2019 +0200 >>> +++ b/src/hotspot/share/runtime/thread.hpp Tue Apr 16 08:44:21 >>> 2019 +0200 >>> @@ -1417,3 +1417,3 @@ >>> bool is_suspend_after_native() const { >>> - return (_suspend_flags & (_external_suspend | _deopt_suspend)) >>> != 0; >>> + return (_suspend_flags & (_external_suspend | _deopt_suspend | >>> _trace_flag)) != 0; >>> } >> > > So you prefer this patch? Yes > >> Right. Should that use JFR_ONLY? > > _trace_flag is always present, so we don't need it. Okay ... not clear what will set it other than JFR ... > And I'm not sure how to get that macro into that method in a nice way? Define nice ;-) return (_suspend_flags & (_external_suspend | _deopt_suspend JFR_ONLY(| _trace_flag))) != 0; >> >> I was a little concerned that thread->set_trace_flag() may not ensure >> visibility of the flag update, but then realized it should be covered >> by the fence: >> >> static inline void transition_from_native(JavaThread *thread, >> JavaThreadState to) { >> // Change to transition state and ensure it is seen by the VM >> thread. >>
thread->set_thread_state_fence(_thread_in_native_trans); >> >> and the comment should say: >> >> // Change to transition state and ensure it is seen by other thread, >> // and we will see any _suspend_flag changes below. >> >> However it also seems to me that in >> JfrThreadSampleClosure::do_sample_thread we need a storeload() barrier >> after: >> >> thread->set_trace_flag(); >> >> to ensure its not reordered with the reads of thread->thread_state() ? > > Setting/clearing suspend flags is always done with Atomic::cmpxchg, > since there can be multiple threads manipulating the bit pattern. > I can add a comment about it. Missed that - thanks. David ----- > Thanks, Robbin > >> >> Thanks, >> David >> ----- >> >> >>> So we are missing checking that bit completely in this transition code. >>> >>> Thanks, Robbin >>> >>> >>>> All you are doing is changing the timing of the race between the >>>> thread re-entering the VM/Java and the request for a suspend or >>>> safepoint. >>>> >>>> If there is a race between the sampler logic acting on the thread, >>>> and the thread acting on itself then that race has to be precluded >>>> somehow. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> (which have the JFR native trans suspend check) >>>>> And it passes t1-5. >>>>> >>>>> Code: >>>>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>>>> Hi Dean, >>>>>> >>>>>> Sorry, I missed this mail. >>>>>> Yes we can do that. >>>>>> Ignore my other mail, I'll update. >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> >>>>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>>>> >>>>>>>>>> >>>>>>>>>> If it's already set, should we check that _last_Java_pc >>>>>>>>>> matches the >>>>>>> >>>>>>>>>> new value?
>>>>>>>>> >>>>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>>>> certain that >>>>>>>>> it should be the same as in last sp. >>>>>>>>> I can't distinguish between the cases. >>>>>>>>> >>>>>>>> >>>>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>>>> sometimes get pc from somewhere else. >>>>>>> >>>>>>> How about if we combine the !walkable check and the >>>>>>> capture_last_Java_pc() logic into a single method? >>>>>>> Then we can do something like: >>>>>>> >>>>>>> if (!walkable()) { >>>>>>> address pc = (address)_last_Java_sp[-1]; >>>>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>>>> } >>>>>>> >>>>>>> dl From robbin.ehn at oracle.com Tue Apr 16 08:23:46 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 10:23:46 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <6098cff6-ce54-6301-edf8-2812477c14cb@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <39592868-25d6-5882-c89c-65af3559e005@oracle.com> <5a823b4c-eeab-36e3-99e9-f8ffe23ec492@oracle.com> <6098cff6-ce54-6301-edf8-2812477c14cb@oracle.com> Message-ID: <066cc7b7-9795-53a8-6086-5bc834f63cf0@oracle.com> Hi David, On 4/15/19 10:23 PM, David Holmes wrote: > On 15/04/2019 9:34 pm, David Holmes wrote: >> On 15/04/2019 7:40 pm, Claes Redestad wrote: >>> Nice cleanup! >>> >>> Seems you could trivially do the same for _stackSize_offset >> >> Seems any of the checks for _foo_offset > 0 can be removed as all fields must >> always be present. > > Except you'd have to be careful about access prior to initialization - for all > of these "offset" fields. Today we would return e.g. NULL or 0 which presumably show up when asking for e.g. thread id. Most of them today do not have any checking and I saw no issue in testing.
So I don't think that is problem, otherwise we should have problem today setting thread status could be ignored. I sent out a v2. Thanks, Robbin > > David > >> David >> >>> Thanks! >>> >>> /Claes >>> >>> On 2019-04-15 11:04, Robbin Ehn wrote: >>>> Hi all, please review. >>>> >>>> Removing some dead code. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>>> >>>> Passes t1-5. >>>> >>>> Thanks, Robbin From david.holmes at oracle.com Tue Apr 16 08:27:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 18:27:31 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> Message-ID: <5fbea75e-f398-9e53-2d62-056cd69b4b44@oracle.com> Hi Robbin, On 16/04/2019 6:17 pm, Robbin Ehn wrote: > Hi Claes and David, > > Here is v2: > http://cr.openjdk.java.net/~rehn/8222327/v2/ This is missing any check that the offsets are always initialized before use. Can you guarantee none of these will be used in such a case? If so asserts at least would be good. oop java_lang_Thread::park_blocker(oop java_thread) { ! assert(JDK_Version::current().supports_thread_park_blocker(), ! "Must support parkBlocker field"); This assert can be removed as can JDK_Version::supports_thread_park_blocker() and possibly other stuff, as they also represent pre-1.5 code paths. Thanks, David ----- > Inc: > http://cr.openjdk.java.net/~rehn/8222327/v2/inc/ > > Passed t1-2. > > Thanks, Robbin > > On 4/15/19 11:04 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Removing some dead code. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222327 >> >> Passes t1-5. 
>> >> Thanks, Robbin From david.holmes at oracle.com Tue Apr 16 08:31:43 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 18:31:43 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <066cc7b7-9795-53a8-6086-5bc834f63cf0@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <39592868-25d6-5882-c89c-65af3559e005@oracle.com> <5a823b4c-eeab-36e3-99e9-f8ffe23ec492@oracle.com> <6098cff6-ce54-6301-edf8-2812477c14cb@oracle.com> <066cc7b7-9795-53a8-6086-5bc834f63cf0@oracle.com> Message-ID: Sorry didn't see this before reviewing v2 ... On 16/04/2019 6:23 pm, Robbin Ehn wrote: > Hi David, > > On 4/15/19 10:23 PM, David Holmes wrote: >> On 15/04/2019 9:34 pm, David Holmes wrote: >>> On 15/04/2019 7:40 pm, Claes Redestad wrote: >>>> Nice cleanup! >>>> >>>> Seems you could trivially do the same for _stackSize_offset >>> >>> Seems any of the checks for _foo_offset > 0 can be removed as all >>> fields must always be present. >> >> Except you'd have to be careful about access prior to initialization - >> for all of these "offset" fields. > > Today we would return e.g. NULL or 0 which presumably show up when > asking for e.g. thread id. Most of them today do not have any checking > and I saw no issue in testing. So I don't think that is problem, > otherwise we should have problem today setting thread status could be > ignored. If the offset is not initialized you will now crash (most likely) instead of getting a zero etc. This would only be a potential issue during VM initialization, and possibly only on initialization failures which makes this very hard to test. Thanks, David > I sent out a v2. > > Thanks, Robbin > >> >> David >> >>> David >>> >>>> Thanks! >>>> >>>> /Claes >>>> >>>> On 2019-04-15 11:04, Robbin Ehn wrote: >>>>> Hi all, please review. >>>>> >>>>> Removing some dead code. 
>>>>> >>>>> Code: >>>>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>>>> >>>>> Passes t1-5. >>>>> >>>>> Thanks, Robbin From robbin.ehn at oracle.com Tue Apr 16 08:50:23 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 10:50:23 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <5fbea75e-f398-9e53-2d62-056cd69b4b44@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <5fbea75e-f398-9e53-2d62-056cd69b4b44@oracle.com> Message-ID: Hi David, On 4/16/19 10:27 AM, David Holmes wrote: > Hi Robbin, > > On 16/04/2019 6:17 pm, Robbin Ehn wrote: >> Hi Claes and David, >> >> Here is v2: >> http://cr.openjdk.java.net/~rehn/8222327/v2/ > > This is missing any check that the offsets are always initialized before use. > Can you guarantee none of these will be used in such a case? If so asserts at > least would be good. The other ones don't have assert. bool java_lang_Thread::is_daemon(oop java_thread) { return java_thread->bool_field(_daemon_offset) != 0; } void java_lang_Thread::set_daemon(oop java_thread) { java_thread->bool_field_put(_daemon_offset, true); } > > oop java_lang_Thread::park_blocker(oop java_thread) { > ! assert(JDK_Version::current().supports_thread_park_blocker(), > ! "Must support parkBlocker field"); > > This assert can be removed as can JDK_Version::supports_thread_park_blocker() > and possibly other stuff, as they also represent pre-1.5 code paths. Yes, I know, but this is part of the JDK_Version API, which means I get a load of collateral changes. And I did not want to do cleanup JDK_Version in this patch. From other mail: >If the offset is not initialized you will now crash (most likely) instead of >getting a zero etc.
This would only be a potential issue during VM >initialization, and possibly only on initialization failures which makes this >very hard to test. But the other ones don't have assert so that is the case today. Thanks, Robbin > > Thanks, > David > ----- > >> Inc: >> http://cr.openjdk.java.net/~rehn/8222327/v2/inc/ >> >> Passed t1-2. >> >> Thanks, Robbin >> >> On 4/15/19 11:04 AM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Removing some dead code. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>> >>> Passes t1-5. >>> >>> Thanks, Robbin From robbin.ehn at oracle.com Tue Apr 16 08:59:51 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 10:59:51 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: <40b529d0-aeca-24b8-636a-518b7445b593@oracle.com> References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <3d0d841d-3bcf-ce3c-1d44-096c9aff2e72@oracle.com> <40b529d0-aeca-24b8-636a-518b7445b593@oracle.com> Message-ID: <6f53161b-479b-6dff-5ead-18aaa4170bf3@oracle.com> Hi David, >> And I'm not sure how to get that macro into that method in a nice way? > > Define nice ;-) > > return (_suspend_flags & (_external_suspend | _deopt_suspend JFR_ONLY(| > _trace_flag))) != 0; Sure! I'll re-test and sent out a v4, thanks! /Robbin > >>> >>> I was a little concerned that thread->set_trace_flag() may not ensure >>> visibility of the flag update, but then realized it should be covered by the >>> fence: >>> >>> static inline void transition_from_native(JavaThread *thread, >>> JavaThreadState to) { >>> // Change to transition state and ensure it is seen by the VM thread. >>>
thread->set_thread_state_fence(_thread_in_native_trans); >>> >>> and the comment should say: >>> >>> // Change to transition state and ensure it is seen by other thread, >>> // and we will see any _suspend_flag changes below. >>> >>> However it also seems to me that in JfrThreadSampleClosure::do_sample_thread >>> we need a storeload() barrier after: >>> >>> thread->set_trace_flag(); >>> >>> to ensure its not reordered with the reads of thread->thread_state() ? >> >> Setting/clearing suspend flags is always done with Atomic::cmpxchg, since >> there can be multiple threads manipulating the bit pattern. >> I can add a comment about it. > > Missed that - thanks. > > David > ----- > >> Thanks, Robbin >> >>> >>> Thanks, >>> David >>> ----- >>> >>> >>>> So we are missing checking that bit completely in this transition code. >>>> >>>> Thanks, Robbin >>>> >>>> >>>>> All you are doing is changing the timing of the race between the thread >>>>> re-entering the VM/Java and the request for a suspend or safepoint. >>>>> >>>>> If there is a race between the sampler logic acting on the thread, and the >>>>> thread acting on itself then that race has to be precluded somehow. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> (which have the JFR native trans suspend check) >>>>>> And it passes t1-5. >>>>>> >>>>>> Code: >>>>>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>>>>> Issue: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>>>>> Hi Dean, >>>>>>> >>>>>>> Sorry, I missed this mail. >>>>>>> Yes we can do that. >>>>>>> Ignore my other mail, I'll update. >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>> >>>>>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>>>> >>>>>>>>>>> new value?
>>>>>>>>>> >>>>>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>>>>> certain that >>>>>>>>>> it should be the same as in last sp. >>>>>>>>>> I can't distinguish between the cases. >>>>>>>>>> >>>>>>>>> >>>>>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>>>>> sometimes get pc from somewhere else. >>>>>>>> >>>>>>>> How about if we combine the !walkable check and the >>>>>>>> capture_last_Java_pc() logic into a single method? >>>>>>>> Then we can do something like: >>>>>>>> >>>>>>>> if (!walkable()) { >>>>>>>> address pc = (address)_last_Java_sp[-1]; >>>>>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>>>>> } >>>>>>>> >>>>>>>> dl From david.holmes at oracle.com Tue Apr 16 10:42:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 20:42:36 +1000 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <5fbea75e-f398-9e53-2d62-056cd69b4b44@oracle.com> Message-ID: <7fcfb983-f5ad-ac61-bceb-f2e714d884b0@oracle.com> On 16/04/2019 6:50 pm, Robbin Ehn wrote: > Hi David, > > On 4/16/19 10:27 AM, David Holmes wrote: >> Hi Robbin, >> >> On 16/04/2019 6:17 pm, Robbin Ehn wrote: >>> Hi Claes and David, >>> >>> Here is v2: >>> http://cr.openjdk.java.net/~rehn/8222327/v2/ >> >> This is missing any check that the offsets are always initialized >> before use. Can you guarantee none of these will be used in such a >> case? If so asserts at least would be good. > > The other ones don't have assert. > > bool java_lang_Thread::is_daemon(oop java_thread) { > return java_thread->bool_field(_daemon_offset) != 0; > } > > > void java_lang_Thread::set_daemon(oop java_thread) { > java_thread->bool_field_put(_daemon_offset, true); > } > >> >>
oop java_lang_Thread::park_blocker(oop java_thread) { >> ! assert(JDK_Version::current().supports_thread_park_blocker(), >> ! "Must support parkBlocker field"); >> >> This assert can be removed as can >> JDK_Version::supports_thread_park_blocker() and possibly other stuff, >> as they also represent pre-1.5 code paths. > > Yes, I know, but this is part of the JDK_Version API, which means I get > a load of collateral changes. And I did not want to do cleanup > JDK_Version in this patch. I thought you were on a crusade to get rid of pre JDK 1.5 code ;-) Okay we can leave this to another cleanup. > From other mail: > >If the offset is not initialized you will now crash (most likely) > instead of >getting a zero etc. This would only be a potential issue > during VM >initialization, and possibly only on initialization failures > which makes this >very hard to test. > > But the other ones don't have assert so that is the case today. So they could potentially both be wrong. As it is this code is initialized via init_globals() before any Java Thread object can be created, so there is no issue with access. Thanks, David > Thanks, Robbin > > >> >> Thanks, >> David >> ----- >> >>> Inc: >>> http://cr.openjdk.java.net/~rehn/8222327/v2/inc/ >>> >>> Passed t1-2. >>> >>> Thanks, Robbin >>> >>> On 4/15/19 11:04 AM, Robbin Ehn wrote: >>>> Hi all, please review. >>>> >>>> Removing some dead code. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>>> >>>> Passes t1-5.
>>>> >>>> Thanks, Robbin From robbin.ehn at oracle.com Tue Apr 16 12:01:36 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 14:01:36 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <7fcfb983-f5ad-ac61-bceb-f2e714d884b0@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <5fbea75e-f398-9e53-2d62-056cd69b4b44@oracle.com> <7fcfb983-f5ad-ac61-bceb-f2e714d884b0@oracle.com> Message-ID: <4bf491f7-4ebf-c8c5-9490-7dfcf22a836a@oracle.com> Hi David, On 4/16/19 12:42 PM, David Holmes wrote: > > I thought you were on a crusade to get rid of pre JDK 1.5 code ;-) Okay we can > leave this to another cleanup. :) > >> From other mail: >> >If the offset is not initialized you will now crash (most likely) instead of >> >getting a zero etc. This would only be a potential issue during VM >> >initialization, and possibly only on initialization failures which makes this >> >very hard to test. >> >> But the other ones don't have assert so that is the case today. > > So they could potentially both be wrong. > > As it is this code is initialized via init_globals() before any Java Thread > object can be created, so there is no issue with access. Good, thanks. /Robbin > > Thanks, > David > >> Thanks, Robbin >> >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> Inc: >>>> http://cr.openjdk.java.net/~rehn/8222327/v2/inc/ >>>> >>>> Passed t1-2. >>>> >>>> Thanks, Robbin >>>> >>>> On 4/15/19 11:04 AM, Robbin Ehn wrote: >>>>> Hi all, please review. >>>>> >>>>> Removing some dead code. >>>>> >>>>> Code: >>>>> http://cr.openjdk.java.net/~rehn/8222327/webrev/ >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8222327 >>>>> >>>>> Passes t1-5.
>>>>> >>>>> Thanks, Robbin From robbin.ehn at oracle.com Tue Apr 16 12:06:41 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 14:06:41 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> Message-ID: <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> Hi, here is v4. http://cr.openjdk.java.net/~rehn/8218147/v4/webrev/index.html Re-prod test and t1-t2. Thanks, Robbin On 4/15/19 10:58 AM, Robbin Ehn wrote: > Hi, please review. > > After reexamine this issue: > Threads in native must always have their stack walkable. > JFR sampler should never need to make a stack walkable (for native sample). > > I manage to locally reproduce reliable with changes to JFR sampler and having > hundreds of threads running similar code as the in the bug. > (Looping creating an array with negative size.) > > I found a place where we don't proper look at the suspend flags. > The java thread can thus escape native and make it's stack unwalkable and later > it tries to make it walkable at the same time as the JFR sampler. > > By removing some kind of fast check and instead always call the > check_safepoint_and_suspend_for_native_trans I can no longer reproduce. > (which have the JFR native trans suspend check) > And it passes t1-5. > > Code: > http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8218147 > > Thanks, Robbin > > On 4/5/19 5:43 PM, Robbin Ehn wrote: >> Hi Dean, >> >> Sorry, I missed this mail. >> Yes we can do that. >> Ignore my other mail, I'll update. >> >> Thanks, Robbin >> >> >> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>> >>>>>> >>>>>> If it's already set, should we check that _last_Java_pc matches the >>> >>>>>> new value?
>>>>> >>>>> We manually set the pc in several places, so if it's set, it's not >>>>> certain that >>>>> it should be the same as in last sp. >>>>> I can't distinguish between the cases. >>>>> >>>> >>>> If we get pc from sp[-1] then it should match, but you're right, we >>>> sometimes get pc from somewhere else. >>> >>> How about if we combine the !walkable check and the >>> capture_last_Java_pc() logic into a single method? >>> Then we can do something like: >>> >>> if (!walkable()) { >>> address pc = (address)_last_Java_sp[-1]; >>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>> } >>> >>> dl From david.holmes at oracle.com Tue Apr 16 13:00:04 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2019 23:00:04 +1000 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> Message-ID: Looks good to me. Thanks, David On 16/04/2019 10:06 pm, Robbin Ehn wrote: > Hi, here is v4. > > http://cr.openjdk.java.net/~rehn/8218147/v4/webrev/index.html > > Re-prod test and t1-t2. > > Thanks, Robbin > > On 4/15/19 10:58 AM, Robbin Ehn wrote: >> Hi, please review. >> >> After reexamine this issue: >> Threads in native must always have their stack walkable. >> JFR sampler should never need to make a stack walkable (for native >> sample). >> >> I manage to locally reproduce reliable with changes to JFR sampler and >> having >> hundreds of threads running similar code as the in the bug. >> (Looping creating an array with negative size.) >> >> I found a place where we don't proper look at the suspend flags.
>> The Java thread can thus escape native and make its stack unwalkable, >> and later >> it tries to make it walkable at the same time as the JFR sampler. >> >> By removing some kind of fast check and instead always calling >> check_safepoint_and_suspend_for_native_trans (which has the JFR native trans suspend check), >> I can no longer reproduce. >> And it passes t1-5. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218147 >> >> Thanks, Robbin >> >> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>> Hi Dean, >>> >>> Sorry, I missed this mail. >>> Yes we can do that. >>> Ignore my other mail, I'll update. >>> >>> Thanks, Robbin >>> >>> >>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>> >>>>>>> >>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>>> new value? >>>>>> >>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>> certain that >>>>>> it should be the same as in last sp. >>>>>> I can't distinguish between the cases. >>>>>> >>>>> >>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>> sometimes get pc from somewhere else. >>>> >>>> How about if we combine the !walkable check and the >>>> capture_last_Java_pc() logic into a single method? >>>> Then we can do something like: >>>> >>>> if (!walkable()) { >>>> address pc = (address)_last_Java_sp[-1]; >>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>> } >>>> >>>> dl From daniel.daugherty at oracle.com Tue Apr 16 13:22:08 2019 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Tue, 16 Apr 2019 09:22:08 -0400 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> Message-ID: <2799bfbb-ae0d-12f9-35f7-a0392a6b941e@oracle.com> On 4/16/19 8:06 AM, Robbin Ehn wrote: > Hi, here is v4. > > http://cr.openjdk.java.net/~rehn/8218147/v4/webrev/index.html src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp L362: thread->set_trace_flag(); // Provides StoreLoad, needed to keep read of thread state not floating up. Typo: s/not floating/from floating/ src/hotspot/share/runtime/thread.hpp No comments. Thumbs up! Dan > > Ran the repro test and t1-t2. > > Thanks, Robbin > > On 4/15/19 10:58 AM, Robbin Ehn wrote: >> Hi, please review. >> >> After reexamining this issue: >> Threads in native must always have their stack walkable. >> The JFR sampler should never need to make a stack walkable (for a native >> sample). >> >> I managed to reproduce this reliably locally, with changes to the JFR sampler >> and >> hundreds of threads running code similar to that in the bug. >> (Looping, creating an array with negative size.) >> >> I found a place where we don't properly look at the suspend flags. >> The Java thread can thus escape native and make its stack unwalkable, >> and later >> it tries to make it walkable at the same time as the JFR sampler. >> >> By removing some kind of fast check and instead always calling >> check_safepoint_and_suspend_for_native_trans (which has the JFR native trans suspend check), >> I can no longer reproduce. >> And it passes t1-5.
>> >> Code: >> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218147 >> >> Thanks, Robbin >> >> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>> Hi Dean, >>> >>> Sorry, I missed this mail. >>> Yes we can do that. >>> Ignore my other mail, I'll update. >>> >>> Thanks, Robbin >>> >>> >>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>> >>>>>>> >>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>>> new value? >>>>>> >>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>> certain that >>>>>> it should be the same as in last sp. >>>>>> I can't distinguish between the cases. >>>>>> >>>>> >>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>> sometimes get pc from somewhere else. >>>> >>>> How about if we combine the !walkable check and the >>>> capture_last_Java_pc() logic into a single method? >>>> Then we can do something like: >>>> >>>> if (!walkable()) { >>>> address pc = (address)_last_Java_sp[-1]; >>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>> } >>>> >>>> dl From robbin.ehn at oracle.com Tue Apr 16 13:30:30 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 15:30:30 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> Message-ID: <5a8c7b88-bda7-ad73-3c2a-ef390a814de4@oracle.com> Thanks David, Robbin On 4/16/19 3:00 PM, David Holmes wrote: > Looks good to me. > > Thanks, > David > > On 16/04/2019 10:06 pm, Robbin Ehn wrote: >> Hi, here is v4.
>> >> http://cr.openjdk.java.net/~rehn/8218147/v4/webrev/index.html >> >> Ran the repro test and t1-t2. >> >> Thanks, Robbin >> >> On 4/15/19 10:58 AM, Robbin Ehn wrote: >>> Hi, please review. >>> >>> After reexamining this issue: >>> Threads in native must always have their stack walkable. >>> The JFR sampler should never need to make a stack walkable (for a native sample). >>> >>> I managed to reproduce this reliably locally, with changes to the JFR sampler and >>> hundreds of threads running code similar to that in the bug. >>> (Looping, creating an array with negative size.) >>> >>> I found a place where we don't properly look at the suspend flags. >>> The Java thread can thus escape native and make its stack unwalkable, and later >>> it tries to make it walkable at the same time as the JFR sampler. >>> >>> By removing some kind of fast check and instead always calling >>> check_safepoint_and_suspend_for_native_trans (which has the JFR native trans suspend check), >>> I can no longer reproduce. >>> And it passes t1-5. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>> >>> Thanks, Robbin >>> >>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>> Hi Dean, >>>> >>>> Sorry, I missed this mail. >>>> Yes we can do that. >>>> Ignore my other mail, I'll update. >>>> >>>> Thanks, Robbin >>>> >>>> >>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>> >>>>>>>> >>>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>>>> new value? >>>>>>> >>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>> certain that >>>>>>> it should be the same as in last sp. >>>>>>> I can't distinguish between the cases. >>>>>>> >>>>>> >>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>> sometimes get pc from somewhere else.
>>>>> >>>>> How about if we combine the !walkable check and the >>>>> capture_last_Java_pc() logic into a single method? >>>>> Then we can do something like: >>>>> >>>>> if (!walkable()) { >>>>> address pc = (address)_last_Java_sp[-1]; >>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>> } >>>>> >>>>> dl From robbin.ehn at oracle.com Tue Apr 16 13:31:23 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 16 Apr 2019 15:31:23 +0200 Subject: RFR(s): 8218147: make_walkable asserts on multiple calls In-Reply-To: <2799bfbb-ae0d-12f9-35f7-a0392a6b941e@oracle.com> References: <48b0d1d0-3b12-e08d-f606-bf96a8445bf8@oracle.com> <587d1afe-9f86-f801-92de-31711b711034@oracle.com> <24A2B80F-D3D4-486A-9E29-8D48D49DF030@oracle.com> <53a81521-1630-802f-ce9e-b597b62f6ac5@oracle.com> <2799bfbb-ae0d-12f9-35f7-a0392a6b941e@oracle.com> Message-ID: Hi Dan, On 4/16/19 3:22 PM, Daniel D. Daugherty wrote: > On 4/16/19 8:06 AM, Robbin Ehn wrote: >> Hi, here is v4. >> >> http://cr.openjdk.java.net/~rehn/8218147/v4/webrev/index.html > > src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp > L362: thread->set_trace_flag(); // Provides StoreLoad, needed to keep > read of thread state not floating up. > Typo: s/not floating/from floating/ Fixed! > > src/hotspot/share/runtime/thread.hpp > No comments. > > Thumbs up! > Thanks Dan! /Robbin > Dan > > >> >> Ran the repro test and t1-t2. >> >> Thanks, Robbin >> >> On 4/15/19 10:58 AM, Robbin Ehn wrote: >>> Hi, please review. >>> >>> After reexamining this issue: >>> Threads in native must always have their stack walkable. >>> The JFR sampler should never need to make a stack walkable (for a native sample). >>> >>> I managed to reproduce this reliably locally, with changes to the JFR sampler and >>> hundreds of threads running code similar to that in the bug. >>> (Looping, creating an array with negative size.)
>>> >>> I found a place where we don't properly look at the suspend flags. >>> The Java thread can thus escape native and make its stack unwalkable, and later >>> it tries to make it walkable at the same time as the JFR sampler. >>> >>> By removing some kind of fast check and instead always calling >>> check_safepoint_and_suspend_for_native_trans (which has the JFR native trans suspend check), >>> I can no longer reproduce. >>> And it passes t1-5. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218147/v3/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218147 >>> >>> Thanks, Robbin >>> >>> On 4/5/19 5:43 PM, Robbin Ehn wrote: >>>> Hi Dean, >>>> >>>> Sorry, I missed this mail. >>>> Yes we can do that. >>>> Ignore my other mail, I'll update. >>>> >>>> Thanks, Robbin >>>> >>>> >>>> dean.long at oracle.com wrote: (5 April 2019 09:22:24 CEST) >>>>> On 4/4/19 5:16 PM, dean.long at oracle.com wrote: >>>>>> >>>>>>>> >>>>>>>> If it's already set, should we check that _last_Java_pc matches the >>>>>>>> new value? >>>>>>> >>>>>>> We manually set the pc in several places, so if it's set, it's not >>>>>>> certain that >>>>>>> it should be the same as in last sp. >>>>>>> I can't distinguish between the cases. >>>>>>> >>>>>> >>>>>> If we get pc from sp[-1] then it should match, but you're right, we >>>>>> sometimes get pc from somewhere else. >>>>> >>>>> How about if we combine the !walkable check and the >>>>> capture_last_Java_pc() logic into a single method? >>>>> Then we can do something like: >>>>> >>>>> if (!walkable()) { >>>>> address pc = (address)_last_Java_sp[-1]; >>>>> address a = Atomic::cmpxchg(pc, &_last_Java_pc, NULL); >>>>> assert(a == NULL || a == pc, "unexpected PC %p", a); >>>>>
} >>>>> >>>>> dl > From claes.redestad at oracle.com Tue Apr 16 14:05:31 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 16 Apr 2019 16:05:31 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> Message-ID: <38785c09-604f-1d3f-48f6-db647730fad1@oracle.com> On 2019-04-16 10:17, Robbin Ehn wrote: > http://cr.openjdk.java.net/~rehn/8222327/v2/ Looks good to me! /Claes From coleen.phillimore at oracle.com Tue Apr 16 14:19:35 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 16 Apr 2019 10:19:35 -0400 Subject: RFR (T) 8220743: [TESTBUG] Review Runtime tests recently migrated from JDK subdirs Message-ID: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> Summary: removed tests that will not find bugs in current code base. See bug for more details. Tested with mach5 hs tier1-3, to check for artifacts in test scripts. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8220743.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8220743 Thanks, Coleen From lois.foltan at oracle.com Tue Apr 16 14:21:48 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 16 Apr 2019 10:21:48 -0400 Subject: RFR (T) 8220743: [TESTBUG] Review Runtime tests recently migrated from JDK subdirs In-Reply-To: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> References: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> Message-ID: <316e161f-4546-de03-dd16-3c93a44db115@oracle.com> Looks good & trivial. Lois On 4/16/2019 10:19 AM, coleen.phillimore at oracle.com wrote: > Summary: removed tests that will not find bugs in current code base. > > See bug for more details. Tested with mach5 hs tier1-3, to check for > artifacts in test scripts.
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8220743.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8220743 > > Thanks, > Coleen From harold.seigel at oracle.com Tue Apr 16 14:24:32 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Tue, 16 Apr 2019 10:24:32 -0400 Subject: RFR (T) 8220743: [TESTBUG] Review Runtime tests recently migrated from JDK subdirs In-Reply-To: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> References: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> Message-ID: <3d01de13-d32d-1eb2-a4a6-53dfa828b904@oracle.com> Looks trivial and good. Thanks, Harold On 4/16/2019 10:19 AM, coleen.phillimore at oracle.com wrote: > Summary: removed tests that will not find bugs in current code base. > > See bug for more details. Tested with mach5 hs tier1-3, to check for > artifacts in test scripts. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8220743.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8220743 > > Thanks, > Coleen From coleen.phillimore at oracle.com Tue Apr 16 14:24:51 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 16 Apr 2019 10:24:51 -0400 Subject: RFR (T) 8220743: [TESTBUG] Review Runtime tests recently migrated from JDK subdirs In-Reply-To: <316e161f-4546-de03-dd16-3c93a44db115@oracle.com> References: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> <316e161f-4546-de03-dd16-3c93a44db115@oracle.com> Message-ID: <70005c76-29ec-f65d-789c-ca22c38e0f2c@oracle.com> Thank you for the quick review, Lois! Coleen On 4/16/19 10:21 AM, Lois Foltan wrote: > Looks good & trivial. > Lois > > On 4/16/2019 10:19 AM, coleen.phillimore at oracle.com wrote: >> Summary: removed tests that will not find bugs in current code base. >> >> See bug for more details. Tested with mach5 hs tier1-3, to check for >> artifacts in test scripts.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8220743.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8220743 >> >> Thanks, >> Coleen > From gerard.ziemski at oracle.com Tue Apr 16 14:26:33 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Tue, 16 Apr 2019 09:26:33 -0500 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: <0ce0349a-df09-441c-6172-1caa2828df2f@oracle.com> References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com> <0ce0349a-df09-441c-6172-1caa2828df2f@oracle.com> Message-ID: <4b3cdf62-25ac-8cbe-8a01-1c37064a7ffa@oracle.com> On 4/15/19 2:40 PM, Per Liden wrote: >> >> I was wrong in my initial reply. >> >> There are flags with both range and constraint - the entire point of >> printing the range to the user is to help define valid values, and as >> you say "We can't expect a user to know what the range of a specific >> type is on every platform", which applies to both cases. >> >> If the test has an issue with the flag, then the test itself needs to >> be fixed, by excluding the troublesome flag. >> >> I believe that we should print out the default range in both cases >> (done in a followup) and we should modify the test to exclude hard to >> test (troublesome) flags as needed. > > But for flags with constrains functions we have no way of knowing the > valid range, right? So, the question becomes, is it better for a user > to get no information (we an empty range) instead of the wrong > information (we print the default range)? I tend to lean toward > printing no information. A third alternative would be printing the > default range for the type, but also add something in the printout to > indicate that there might be other constraints too. I was thinking about doing something along this line. > >> >> Can't we just exclude "SoftMaxHeapSize" from the test here? 
> > Sure, I've updated that patch for JDK-8222145 "Add -XX:SoftMaxHeapSize > flag" to exclude it from testing. Webrev here: > > http://cr.openjdk.java.net/~pliden/8222145/webrev.2 > > Looks ok? Looks ok, but why make "allWriteableOptions" a static field just to access it in "excludeTestRange()"? Couldn't we simply pass "allWriteableOptions" as an argument to "excludeTestRange()"?

    private static void excludeTestRange(List<JVMOption> allWriteableOptions, String optionName) {
        for (JVMOption option: allWriteableOptions) {
            if (option.getName().equals(optionName)) {
                option.excludeTestMinRange();
                option.excludeTestMaxRange();
                break;
            }
        }
    }

I don't need to see another webrev for that change. > I'll withdraw the review for JDK-8222460, and re-assign the bug to > you, and you can update it as you see fit, ok? I filed JDK-8222531 as a followup. cheers From coleen.phillimore at oracle.com Tue Apr 16 14:35:19 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 16 Apr 2019 10:35:19 -0400 Subject: RFR (T) 8220743: [TESTBUG] Review Runtime tests recently migrated from JDK subdirs In-Reply-To: <3d01de13-d32d-1eb2-a4a6-53dfa828b904@oracle.com> References: <1906de9d-bf4b-e757-84d4-31521edd0b5f@oracle.com> <3d01de13-d32d-1eb2-a4a6-53dfa828b904@oracle.com> Message-ID: <9cdb4ba7-8046-ede5-2035-f971783003a0@oracle.com> Thank you for the review, Harold! Coleen On 4/16/19 10:24 AM, Harold Seigel wrote: > Looks trivial and good. > > Thanks, Harold > > On 4/16/2019 10:19 AM, coleen.phillimore at oracle.com wrote: >> Summary: removed tests that will not find bugs in current code base. >> >> See bug for more details. Tested with mach5 hs tier1-3, to check for >> artifacts in test scripts.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8220743.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8220743 >> >> Thanks, >> Coleen From per.liden at oracle.com Tue Apr 16 14:45:10 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 16 Apr 2019 16:45:10 +0200 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: <4b3cdf62-25ac-8cbe-8a01-1c37064a7ffa@oracle.com> References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com> <0ce0349a-df09-441c-6172-1caa2828df2f@oracle.com> <4b3cdf62-25ac-8cbe-8a01-1c37064a7ffa@oracle.com> Message-ID: On 04/16/2019 04:26 PM, gerard ziemski wrote: > > > On 4/15/19 2:40 PM, Per Liden wrote: >>> >>> I was wrong in my initial reply. >>> >>> There are flags with both range and constraint - the entire point of >>> printing the range to the user is to help define valid values, and as >>> you say "We can't expect a user to know what the range of a specific >>> type is on every platform", which applies to both cases. >>> >>> If the test has an issue with the flag, then the test itself needs to >>> be fixed, by excluding the troublesome flag. >>> >>> I believe that we should print out the default range in both cases >>> (done in a followup) and we should modify the test to exclude hard to >>> test (troublesome) flags as needed. >> >> But for flags with constrains functions we have no way of knowing the >> valid range, right? So, the question becomes, is it better for a user >> to get no information (we an empty range) instead of the wrong >> information (we print the default range)? I tend to lean toward >> printing no information. A third alternative would be printing the >> default range for the type, but also add something in the printout to >> indicate that there might be other constraints too. > > I was thinking about doing something along this line. 
> > >> >>> >>> Can't we just exclude "SoftMaxHeapSize" from the test here? >> >> Sure, I've updated that patch for JDK-8222145 "Add -XX:SoftMaxHeapSize >> flag" to exclude it from testing. Webrev here: >> >> http://cr.openjdk.java.net/~pliden/8222145/webrev.2 >> >> Looks ok? > > Looks ok, but why make "allWriteableOptions" static field just to access Ok, thanks. > it in "excludeTestRange()"? Couldn't we simply pass > "allWriteableOptions" as an argument to "excludeTestRange()"? In principle I agree with you, but I'm just following the style in which these tests are written, see test/hotspot/jtreg/runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java. If we want to use a different style for these tests we should fix all of them, and I didn't want to do that in this patch. Perhaps something for your followup? cheers, Per > > private static void excludeTestRange(List<JVMOption> allWriteableOptions, String optionName) { > for (JVMOption option: allWriteableOptions) { > if (option.getName().equals(optionName)) { > option.excludeTestMinRange(); > option.excludeTestMaxRange(); > break; > } > } > } > > I don't need to see another webrev for that change. > > >> I'll withdraw the review for JDK-8222460, and re-assign the bug to >> you, and you can update it as you see fit, ok? > > I filed JDK-8222531 as a followup.
> > > cheers > > From gerard.ziemski at oracle.com Tue Apr 16 16:08:55 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Tue, 16 Apr 2019 11:08:55 -0500 Subject: RFR: 8222460: -XX:+PrintFlagsRanges prints incorrect range In-Reply-To: References: <9781e40b-f22c-1b8f-198f-525ffbda9a47@oracle.com> <7830b677-141f-7d1f-8b61-dd1094c072e7@oracle.com> <79c45c20-e113-04c8-1567-2d0b589a7844@oracle.com> <0ce0349a-df09-441c-6172-1caa2828df2f@oracle.com> <4b3cdf62-25ac-8cbe-8a01-1c37064a7ffa@oracle.com> Message-ID: <66a40d05-589d-58c5-8abd-54efadb075ea@oracle.com> On 4/16/19 9:45 AM, Per Liden wrote: > > On 04/16/2019 04:26 PM, gerard ziemski wrote: >> >> >> On 4/15/19 2:40 PM, Per Liden wrote: >>>> >>>> I was wrong in my initial reply. >>>> >>>> There are flags with both range and constraint - the entire point >>>> of printing the range to the user is to help define valid values, >>>> and as you say "We can't expect a user to know what the range of a >>>> specific type is on every platform", which applies to both cases. >>>> >>>> If the test has an issue with the flag, then the test itself needs >>>> to be fixed, by excluding the troublesome flag. >>>> >>>> I believe that we should print out the default range in both cases >>>> (done in a followup) and we should modify the test to exclude hard >>>> to test (troublesome) flags as needed. >>> >>> But for flags with constrains functions we have no way of knowing >>> the valid range, right? So, the question becomes, is it better for a >>> user to get no information (we an empty range) instead of the wrong >>> information (we print the default range)? I tend to lean toward >>> printing no information. A third alternative would be printing the >>> default range for the type, but also add something in the printout >>> to indicate that there might be other constraints too. >> >> I was thinking about doing something along this line. >> >> >>> >>>> >>>> Can't we just exclude "SoftMaxHeapSize" from the test here? 
>>> >>> Sure, I've updated that patch for JDK-8222145 "Add >>> -XX:SoftMaxHeapSize flag" to exclude it from testing. Webrev here: >>> >>> http://cr.openjdk.java.net/~pliden/8222145/webrev.2 >>> >>> Looks ok? >> >> Looks ok, but why make "allWriteableOptions" static field just to access > > Ok, thanks. > >> it in "excludeTestRange()"? Couldn't we simply pass >> "allWriteableOptions" as an argument to excludeTestRange()"? > > In principle I agree with you, but I'm just following the style in > which these tests are written, see > test/hotspot/jtreg/runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java. > If we want to use a different style for these tests we should fix all > of them, and I didn't want to do that in this patch. Perhaps something > for your followup? Sounds good. cheers From rkennke at redhat.com Tue Apr 16 19:58:28 2019 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 16 Apr 2019 21:58:28 +0200 Subject: RFR: JDK-8222545: Safe klass asserts Message-ID: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> Various code paths in oopDesc, Klass and their subclasses assert something that fetches the object's _klass field. With upcoming Shenandoah's changes this is not always safe and requires an additional indirection. The trouble here is that we can, for example, call Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of oopDesc::oop_iterate() which would call oopDesc::klass() on its own, which would be racy on some GC internal call paths, but we can't (currently) control some calls to klass() further down the call stack (all in asserts). We'd also like a way to ensure that non-GC calls to klass() are sane. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8222545 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ Testing: hotspot_gc_shenandoah with and without the prototype, hotspot/tier1 The change introduces only two ASSERT-level GC-interfaces, and afaict, this with JDK-8222537 will be all that we need for the upcoming elimination of forward pointers in Shenandoah. Notice that one assert in objArrayKlass is strengthened from is_array() to is_objArray(), but that seems only sane in that context. Can I please get reviews? Thanks, Roman From rkennke at redhat.com Tue Apr 16 20:45:42 2019 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 16 Apr 2019 22:45:42 +0200 Subject: RFR: JDK-8222537: Avoid fetching _klass twice in TypeArrayOop::size() Message-ID: <5bad6271-2fb3-875f-d0d9-71bac167f768@redhat.com> Currently, when calling TypeArrayOop::size(), we end up calling klass() twice: once before calling into size_given_klass() and then again before calling TypeArrayOop::object_size(). This is currently only a minor performance nuisance. With upcoming Shenandoah's elimination of forwarding pointer, loading klass like this is not safe anymore, and therefore we only call size_given_klass(), and must avoid calling naked klass() altogether. Bug: https://bugs.openjdk.java.net/browse/JDK-8222537 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8222537/webrev.00/ Testing: hotspot_gc_shenandoah with and without the prototype, hotspot/tier1 Can I please get reviews? Thanks, Roman From stefan.karlsson at oracle.com Tue Apr 16 21:00:02 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 16 Apr 2019 23:00:02 +0200 Subject: RFR: 8222558: Rework ResolvedMethodTable verification Message-ID: <5f81a0a7-757d-26e0-3fb1-18261d42c1a3@oracle.com> Hi all, Please review this patch to rework the ResolvedMethodTable verification. 
https://cr.openjdk.java.net/~stefank/8222558/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8222558 The patch removes the quadratic search for duplicate entries, and adds a linear scan that checks that the vmtargets are non-old methods. The verification is moved to be run when Universe::verify is called. Testing tier1-3. Thanks, StefanK From stefan.karlsson at oracle.com Tue Apr 16 21:24:59 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 16 Apr 2019 23:24:59 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out Message-ID: Hi all, Please review this patch to fix a timeout in the MemberNameLeak test. https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8222550 The test could fail if GCs happened during the setup phase when entries for all generated methods were created. When this happened, the code to grow the table was triggered, which in turn cleaned out all so-far created entries. This put the table in a condition where the grow / cleaning code didn't have to be triggered again. But the test still waited for it to happen. This patch adds all MethodHandles to an ArrayList, so that they are kept alive until it's time for them to be cleaned out. While debugging this timeout I added some extra logging. I've left it in the test in case we ever need to debug it again. Testing: tier1-3 and multiple tier1_runtime runs on osx where the timeouts reproduced.
The patch is applied on top of the patch in: https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html Thanks, StefanK From david.holmes at oracle.com Wed Apr 17 00:00:11 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 Apr 2019 10:00:11 +1000 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> Message-ID: <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> Hi Roman, On 17/04/2019 5:58 am, Roman Kennke wrote: > Various code paths in oopDesc, Klass and their subclasses assert > something that fetches the object's _klass field. With upcoming > Shenandoah's changes this is not always safe and requires an additional > indirection. > > The trouble here is that we can, for example, call > Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of > oopDesc::oop_iterate() which would call oopDesc::klass() on its own, > which would be racy on some GC internal call paths, but we can't > (currently) control some calls to klass() further down the call stack > (all in asserts). > > We'd also like a way to ensure that non-GC calls to klass() are sane. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8222545 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ > Testing: > hotspot_gc_shenandoah with and without the prototype, hotspot/tier1 > > The change introduces only two ASSERT-level GC-interfaces, and afaict, > this with JDK-8222537 will be all that we need for the upcoming > elimination of forward pointers in Shenandoah. Notice that one assert in > objArrayKlass is strengthened from is_array() to is_objArray(), but that > seems only sane in that context. > > Can I please get reviews? This looks very awkward to me. Using: Universe::heap()->safe_klass(obj)->is_objArray_klass() instead of the obvious: obj->is_objArray() is very unintuitive. 
Can this not be handled inside is_objArray (and is_typeArray)? Thanks, David > Thanks, > Roman > From erik.gahlin at oracle.com Wed Apr 17 03:05:51 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Wed, 17 Apr 2019 05:05:51 +0200 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <72c5aeb5-d65c-33d0-fc95-5e469316478f@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> <5CB0CEBE.5000400@oracle.com> <72c5aeb5-d65c-33d0-fc95-5e469316478f@oracle.com> Message-ID: <5CB6980F.2050800@oracle.com> On 2019-04-12 22:13, gerard ziemski wrote: > > > On 4/12/19 12:45 PM, Erik Gahlin wrote: >> On 2019-04-10 22:03, gerard ziemski wrote: >>> >>> >>> On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> I noticed that events are only emitted if we are able to take the >>>>>> resize lock. Can this be fixed? What prevents us from always >>>>>> getting the data? That's how other periodic events work and >>>>>> losing data sometimes may lead to subtle bugs that are hard to >>>>>> understand and replicate in systems that rely on the information. >>>>>> Could we retry on a failure? >>>>> Good observation. If the resize lock is taken, then it's not >>>>> likely that whoever owns it will be done soon, so retrying is most >>>>> likely not going to succeed right away. Is it OK to tie up JFR >>>>> periodic thread for some time? If so, how long? >> There is no general upper limit for periodic events. >> >> If we need to wait for a safepoint, we need to do it. That said, >> events that can induce significant latencies or CPU overhead (even in >> pathological cases) are off in default.jfc and only enabled in >> profile.jfc, or not at all.
>> >> As I understand it, the events themselves don't cause latencies and >> the tables are not expanded that often, so I think it would be okay >> to emit them. If you think otherwise, I would try to scan >> concurrently, even if it means we are slightly off. >> >>>>> >>>>> >>>>> If the lock is taken, then it means that someone is scanning >>>>> through the entire table, or the table is being resized. Either >>>>> way, we're not losing data, but are just temporarily blind - I >>>>> don't see a problem here for long-running apps, they will start >>>>> receiving events eventually (which happen every 10 sec by default) >> A user can set period "everyChunk" which means events are guaranteed >> to be in the recording. >> >> I think we should try to avoid breaking that contract. When event >> streaming is in place, we can implement requestable events where a >> user can demand an event programmatically from Java. If they >> sometimes don't get an event, it will break their code in a subtle way. > > No problem, I removed the resize_lock around the JFR table statistics, > so we might get slightly incorrect stats every now and then, but we > will be emitting the events on schedule: > http://cr.openjdk.java.net/~gziemski/8185525_rev7 Is it sufficient to just remove the lock to make it "work"? I think it could be OK to use stale data, or perhaps count a value twice, but are there other issues that need to be fixed as well? Robbin may have more information on this. An alternative approach would be to use the last known data, if we are not able to take the lock. It would be old, but not out of whack. That said, it would be interesting to have some numbers on what the cost would be to wait for the lock. > > Last question: what is the recommended way to programmatically tell if > JFR is ON? I'm wondering whether I should collect the add/remove rates > for the tables only if JFR is ON. As it is right now, we collect them > always.
It's just an atomic increment, but still, it's work only JFR > events need. You can use the JFR_ONLY macro, if it's not built with JFR. If you want to check if a recording is running, you can use Jfr::is_recording(), but perhaps Jfr::is_enabled() is more accurate/correct if a recording is started/stopped repeatedly? I looked at jfrPeriodic.cpp, and it seems to me that things could be simplified, i.e. template <typename T> static void emit_table_statistics(TableStatistics& statistics) { T event; event.set_bucketCount(statistics._number_of_buckets); ... event.commit(); } Thanks Erik From david.holmes at oracle.com Wed Apr 17 03:58:35 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 Apr 2019 13:58:35 +1000 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: References: Message-ID: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> Hi Stefan, On 17/04/2019 7:24 am, Stefan Karlsson wrote: > Hi all, > > Please review this patch to fix a timeout in the MemberNameLeak test. > > https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8222550 > > The test could fail if GCs happened during the setup phase when entries > for all generated methods were created. When this happened the code to > grow the table was triggered, which in turn cleaned out all so-far > created entries. This put the table in a condition where the grow / > cleaning code didn't have to be triggered again. But the test still > waited for it to happen. This patch adds all MethodHandles to an > ArrayList, so that they are kept alive until it's time for them to be > cleaned out. While debugging this timeout I added some extra logging. > I've left it in the test in case we ever need to debug it again. Fix seems reasonable.
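
The keep-alive idea in the quoted patch summary (hold every MethodHandle in an ArrayList so that a GC during setup cannot treat the handles as dead before the test wants them cleaned out) can be sketched roughly as below. This is only an illustration under assumptions, not code from the webrev: the class `KeepAliveSketch` and its members are hypothetical names, and a single `target` method stands in for the many generated methods the real test uses.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from MemberNameLeak.java.
final class KeepAliveSketch {
    // Strong references: while a handle sits in this list it stays reachable,
    // so a GC during the setup phase cannot collect it.
    static final List<MethodHandle> keepAlive = new ArrayList<>();

    // Stand-in for the generated methods of the real test (assumption).
    static void target() { }

    // Setup phase: create the handles and remember every one of them.
    static void setup(int count) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(void.class);
            for (int i = 0; i < count; i++) {
                keepAlive.add(lookup.findStatic(KeepAliveSketch.class, "target", type));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Cleanup phase: drop the strong references so that later GCs may
    // reclaim the handles.
    static void release() {
        keepAlive.clear();
    }

    public static void main(String[] args) throws Throwable {
        setup(100);
        System.gc();                     // handles are still strongly reachable here
        keepAlive.get(0).invokeExact();  // and still usable
        release();
        System.gc();                     // from now on the handles are collectable
    }
}
```

The point of the design is ordering: the list is only cleared once setup has finished, so any table growth or cleaning triggered by an early GC sees live entries rather than dead ones.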
A couple of comments: 119 "-Xlog:membername+table=trace,gc+verify=debug,gc", 120 "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", I'm assuming you only actually want line 120? Is the log file copied across with the test artifacts in mach5? I'm assuming you're using the file for gc logging so that the normal test .jtr file is not inundated with excessive logging data. Thanks, David ----- > Testing: tier1-3 and multiple tier1_runtime runs on osx where the > timeouts reproduced. > > The patch is applied on top of the patch in: > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html > > > Thanks, > StefanK From stefan.karlsson at oracle.com Wed Apr 17 06:41:53 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 17 Apr 2019 08:41:53 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> Message-ID: On 2019-04-17 05:58, David Holmes wrote: > Hi Stefan, > > On 17/04/2019 7:24 am, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to fix a timeout in the MemberNameLeak test. >> >> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8222550 >> >> The test could fail if GCs happened during the setup phase when >> entries for all generated methods were created. When this happened >> the code to grow the table was triggered, which in turn cleaned out >> all so-far created entries.? This put the table in a condition where >> the grow / cleaning code didn't have to be triggered again. But the >> test still waited for it to happen. This patch adds all MethodHandles >> to an ArrayList, so that they are kept alive until it's time for them >> to be cleaned out. While debugging this timeout I added some extra >> logging. 
I've left it in the test in case we ever need to debug it >> again. > > Fix seems reasonable. Thanks. > A couple of comments: > > 119 "-Xlog:membername+table=trace,gc+verify=debug,gc", > 120 > "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", > > I'm assuming you only actually want line 120? It was a quick and dirty way to get logging from 119 to the outputAnalyzer, and more comprehensive logging from line 120 saved to disk. > > Is the log file copied across with the test artifacts in mach5? Yes. > I'm assuming you're using the file for gc logging so that the normal > test .jtr file is not inundated with excessive logging data. Yes, and because jtreg cuts in the middle of the output of tests with excessive logging. I created a more elaborate version that only logs to files, and performs the verification on those files: https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta https://cr.openjdk.java.net/~stefank/8222550/webrev.02 Thanks, StefanK > > Thanks, > David > ----- > >> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >> timeouts reproduced. >> >> The patch is applied on top of the patch in: >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >> >> >> Thanks, >> StefanK From robbin.ehn at oracle.com Wed Apr 17 07:03:01 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 17 Apr 2019 09:03:01 +0200 Subject: RFR(s): 8222327: java_lang_Thread _thread_status_offset, remove pre 1.5 code paths In-Reply-To: <38785c09-604f-1d3f-48f6-db647730fad1@oracle.com> References: <9f9ea907-7412-c2bd-40fe-3faa8af8e2c5@oracle.com> <38785c09-604f-1d3f-48f6-db647730fad1@oracle.com> Message-ID: <9905e625-e06b-99a3-c600-f138e00456a5@oracle.com> Thanks Claes! /Robbin On 4/16/19 4:05 PM, Claes Redestad wrote: > > > On 2019-04-16 10:17, Robbin Ehn wrote: >> http://cr.openjdk.java.net/~rehn/8222327/v2/ > > Looks good to me!
> > /Claes From robbin.ehn at oracle.com Wed Apr 17 08:35:27 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 17 Apr 2019 10:35:27 +0200 Subject: RFR(s): 8222640: Remove deopt suspend Message-ID: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> Hi all, please consider this change. The code for deopt suspend is no longer needed since today the register window is always flushed when this code executes. Exactly when this code was needed is not clear, entered via duke changeset 1. I did not dig since we no longer have such use case. Webrev: http://cr.openjdk.java.net/~rehn/8222640/webrev/ Issue: https://bugs.openjdk.java.net/browse/JDK-8222640 Passes t1-5. Thanks, Robbin From robbin.ehn at oracle.com Wed Apr 17 10:09:19 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 17 Apr 2019 12:09:19 +0200 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> Message-ID: <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> Adding compiler. /Robbin On 4/17/19 10:35 AM, Robbin Ehn wrote: > Hi all, please consider this change. > > The code for deopt suspend is no longer needed since today the register window > is always flushed when this code executes. Exactly when this code was needed is > not clear, entered via duke changeset 1. I did not dig since we no longer have > such use case. > > Webrev: > http://cr.openjdk.java.net/~rehn/8222640/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222640 > > Passes t1-5. > > Thanks, Robbin From coleen.phillimore at oracle.com Wed Apr 17 11:34:34 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 07:34:34 -0400 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: References: Message-ID: This looks good to me. 
Coleen On 4/16/19 5:24 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to fix a timeout in the MemberNameLeak test. > > https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8222550 > > The test could fail if GCs happened during the setup phase when > entries for all generated methods were created. When this happened the > code to grow the table was triggered, which in turn cleaned out all > so-far created entries.? This put the table in a condition where the > grow / cleaning code didn't have to be triggered again. But the test > still waited for it to happen. This patch adds all MethodHandles to an > ArrayList, so that they are kept alive until it's time for them to be > cleaned out. While debugging this timeout I added some extra logging. > I've left it in the test in case we ever need to debug it again. > > Testing: tier1-3 and multiple tier1_runtime runs on osx where the > timeouts reproduced. > > The patch is applied on top of the patch in: > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html > > > Thanks, > StefanK From coleen.phillimore at oracle.com Wed Apr 17 11:37:45 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 07:37:45 -0400 Subject: RFR: 8222558: Rework ResolvedMethodTable verification In-Reply-To: <5f81a0a7-757d-26e0-3fb1-18261d42c1a3@oracle.com> References: <5f81a0a7-757d-26e0-3fb1-18261d42c1a3@oracle.com> Message-ID: This looks good too. Coleen On 4/16/19 5:00 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to rework the ResolvedMethodTable verification. > > https://cr.openjdk.java.net/~stefank/8222558/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8222558 > > The patch removes the quadratic search for duplicate entries, and adds > a linear scan that checks that the vmtargets are non-old methods. The > verification is moved to be run when Universe::verify is called. 
> > Testing tier1-3. > > Thanks, > StefanK From stefan.karlsson at oracle.com Wed Apr 17 12:11:33 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 17 Apr 2019 14:11:33 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: References: Message-ID: <3e74814c-9e76-b03e-1f9e-42bf859c3821@oracle.com> Thanks Coleen. StefanK On 2019-04-17 13:34, coleen.phillimore at oracle.com wrote: > > This looks good to me. > Coleen > > On 4/16/19 5:24 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to fix a timeout in the MemberNameLeak test. >> >> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8222550 >> >> The test could fail if GCs happened during the setup phase when >> entries for all generated methods were created. When this happened >> the code to grow the table was triggered, which in turn cleaned out >> all so-far created entries.? This put the table in a condition where >> the grow / cleaning code didn't have to be triggered again. But the >> test still waited for it to happen. This patch adds all MethodHandles >> to an ArrayList, so that they are kept alive until it's time for them >> to be cleaned out. While debugging this timeout I added some extra >> logging. I've left it in the test in case we ever need to debug it >> again. >> >> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >> timeouts reproduced. 
>> >> The patch is applied on top of the patch in: >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >> >> >> Thanks, >> StefanK > From stefan.karlsson at oracle.com Wed Apr 17 12:11:47 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 17 Apr 2019 14:11:47 +0200 Subject: RFR: 8222558: Rework ResolvedMethodTable verification In-Reply-To: References: <5f81a0a7-757d-26e0-3fb1-18261d42c1a3@oracle.com> Message-ID: <9b616582-9188-961b-6f9a-7c169ad251ed@oracle.com> Thanks Coleen. StefanK On 2019-04-17 13:37, coleen.phillimore at oracle.com wrote: > This looks good too. > Coleen > > On 4/16/19 5:00 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to rework the ResolvedMethodTable verification. >> >> https://cr.openjdk.java.net/~stefank/8222558/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8222558 >> >> The patch removes the quadratic search for duplicate entries, and >> adds a linear scan that checks that the vmtargets are non-old >> methods. The verification is moved to be run when Universe::verify is >> called. >> >> Testing tier1-3. >> >> Thanks, >> StefanK > From coleen.phillimore at oracle.com Wed Apr 17 12:24:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 08:24:12 -0400 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION Message-ID: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Summary: Give fatal error if CDS loses archive mapping; but map Windows RW because remapping is dangerous. Ioi and I discussed this change and thought it is best.? Windows only maps the CDS archive around 50% time because of ASLR and this retains the startup performance improvements for CDS on windows. Tested with mach5 tier1-3. 
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8222379 Thanks, Coleen From daniel.daugherty at oracle.com Wed Apr 17 13:46:52 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 17 Apr 2019 09:46:52 -0400 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> Message-ID: <6b46fdbf-0dbb-ce9b-047a-fc6d502653e6@oracle.com> On 4/17/19 4:35 AM, Robbin Ehn wrote: > Hi all, please consider this change. > > The code for deopt suspend is no longer needed since today the > register window > is always flushed when this code executes. Exactly when this code was > needed is not clear, entered via duke changeset 1. I did not dig since > we no longer have such use case. > > Webrev: > http://cr.openjdk.java.net/~rehn/8222640/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222640 > > Passes t1-5. > > Thanks, Robbin Since this code was added by the Compiler team, I think you're going to want at least one Compiler team member to chime in on this review... I was going to add a historical comment to your bug, but JBS appears to be down at the moment... This code was added by this delta: $ sp -r1.795.1.1 src/share/vm/runtime/thread.cpp src/share/vm/runtime/SCCS/s.thread.cpp: D 1.795.1.1 06/12/07 10:06:52 sgoldman 2086 2084 00031/00010/04023 MRs: COMMENTS: 6463133 - patchless deopt. Support specialized deopt suspend for register window based machines. Pass registerMap to revoke_bias to prevent redundant stack walks. frames now cache the codeBlob. Looks like 6463133 was not a bug that I was tracking way back then so I don't have an email folder for it. I did find Steve Goldman's push message for it, but the fix for 6463133 is included with four other bug fixes: --------------------------------------------------------- Job ID:
20061207101238.sgoldman.6463133_deopt-M Original workspace: gretch:/disk2/ws/6463133_deopt-M Submitter: sgoldman Archived data: /net/prt-data.east/archives/main/c2_baseline/2006/20061207101238.sgoldman.6463133_deopt-M/ Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2006/20061207101238.sgoldman.6463133_deopt-M/workspace/webrevs/webrev-2006.12.07/index.html Fixed 6463133: Deoptimization should not use code patching Fixed 6490483: Java support for pstack broken Fixed 6490489: java pstack support for server on x86 corrupts stack frame info Fixed 6490492: java support for pstack rarely gets server compiled frames correct. Fixed 6500866: jvm crash running forte stress kit. This converts deoptimization to no longer do patching of code and now only patches return address. This made a rather large change to the frame object so that now a frame always carries along the codeBlob it refers to if it in fact does refer to a codeBlob. This removes lots of redundant CodeCache::find_blob calls. The testing of this fix which obviously changes the way frames look on the stack discovered that both SA and pstack support have been broken. pstack support has been broken for years. As part of the SA changes I found that the fix for: 6252656: Putative invariant for TLABS _start+_size==_end+alignment_reserve() not being maintained didn't properly update SA and I've added that fix. There is a discussion of patchless deopt here http://j2se.sfbay/web/bin/view/HotspotCompilers/PatchlessDeopt which is currently (Dec. 7, 2006) out of date but which I will clean up. One other notable change with this putback is that there is no longer a separate exception handler for each codeblob. Now the exception handler (and the new deopt handler) are stored in the stub (read uncommon code) area. Reviewed by: Tom Fix verified (y/n): yes Verification testing: PRT with various stress options.
NSK and JDI tests with and without stress options. Dan's forte stress tests on sparc/x86/amd64. Lots of various hand testing of SA (in addition to sasanity) pstack, and dtrace. [end sgoldman Thu Dec 7 13:56:55 2006 EDT] sgoldman Mon Dec 11 15:28:48 2006 PDT ------------------------------------- Since this is an integration push from c2_baseline -> main/baseline, I do not have the list of files modified. I would have to find the c2_baseline TeamWare workspace if it still exists. However, I'm not sure it would help much since 6463133 is combined with 4 other fixes... I did a search for all of the files that mention 6463133 in their SCCS history and that list is 83 files long. Ouch. I've attached that list to this email. Dan -------------- next part -------------- src/closed/cpu/ia64/vm/frame_ia64.hpp src/closed/cpu/ia64/vm/globals_ia64.hpp src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp src/cpu/sparc/vm/c1_LIRAssembler_sparc.hpp src/cpu/sparc/vm/frame_sparc.cpp src/cpu/sparc/vm/frame_sparc.hpp src/cpu/sparc/vm/frame_sparc.inline.hpp src/cpu/sparc/vm/globals_sparc.hpp src/cpu/sparc/vm/interpreter_sparc.cpp src/cpu/sparc/vm/nativeInst_sparc.cpp src/cpu/sparc/vm/sharedRuntime_sparc.cpp src/cpu/sparc/vm/sparc.ad src/cpu/x86/vm/c1_LIRAssembler_x86.cpp src/cpu/x86/vm/c1_LIRAssembler_x86.hpp src/cpu/x86/vm/frame_x86.cpp src/cpu/x86/vm/frame_x86.hpp src/cpu/x86/vm/frame_x86.inline.hpp src/cpu/x86/vm/globals_x86.hpp src/cpu/x86/vm/interpreter_x86_32.cpp src/cpu/x86/vm/interpreter_x86_64.cpp src/cpu/x86/vm/sharedRuntime_x86_32.cpp src/cpu/x86/vm/sharedRuntime_x86_64.cpp src/cpu/x86/vm/x86_32.ad src/cpu/x86/vm/x86_64.ad src/os/solaris/dtrace/generateJvmOffsets.cpp src/os/solaris/dtrace/libjvm_db.c src/share/vm/asm/assembler.cpp src/share/vm/asm/assembler.hpp src/share/vm/asm/codeBuffer.cpp src/share/vm/asm/codeBuffer.hpp src/share/vm/c1/c1_Compilation.cpp src/share/vm/c1/c1_Compilation.hpp src/share/vm/c1/c1_FrameMap.cpp src/share/vm/c1/c1_FrameMap.hpp
src/share/vm/c1/c1_LIRAssembler.cpp src/share/vm/c1/c1_LIRAssembler.hpp src/share/vm/c1/c1_Runtime1.cpp src/share/vm/c1/c1_Runtime1.hpp src/share/vm/ci/ciEnv.cpp src/share/vm/ci/ciEnv.hpp src/share/vm/classfile/javaClasses.cpp src/share/vm/code/codeBlob.cpp src/share/vm/code/codeBlob.hpp src/share/vm/code/codeCache.cpp src/share/vm/code/compiledIC.cpp src/share/vm/code/icBuffer.cpp src/share/vm/code/nmethod.cpp src/share/vm/code/nmethod.hpp src/share/vm/code/relocInfo.cpp src/share/vm/code/relocInfo.hpp src/share/vm/compiler/oopMap.cpp src/share/vm/compiler/oopMap.hpp src/share/vm/includeDB_compiler1 src/share/vm/includeDB_compiler2 src/share/vm/includeDB_core src/share/vm/opto/chaitin.cpp src/share/vm/opto/compile.cpp src/share/vm/opto/compile.hpp src/share/vm/opto/graphKit.cpp src/share/vm/opto/locknode.cpp src/share/vm/opto/locknode.hpp src/share/vm/opto/matcher.cpp src/share/vm/opto/output.cpp src/share/vm/opto/runtime.cpp src/share/vm/prims/forte.cpp src/share/vm/runtime/deoptimization.cpp src/share/vm/runtime/deoptimization.hpp src/share/vm/runtime/fprofiler.cpp src/share/vm/runtime/frame.cpp src/share/vm/runtime/frame.hpp src/share/vm/runtime/globals.hpp src/share/vm/runtime/interfaceSupport.cpp src/share/vm/runtime/rframe.cpp src/share/vm/runtime/safepoint.cpp src/share/vm/runtime/sharedRuntime.cpp src/share/vm/runtime/thread.cpp src/share/vm/runtime/thread.hpp src/share/vm/runtime/vframe_hp.cpp src/share/vm/runtime/vframe_hp.hpp src/share/vm/runtime/vframe.cpp src/share/vm/runtime/vframe.hpp src/share/vm/runtime/vm_operations.cpp src/share/vm/runtime/vmStructs.cpp From robin.westberg at oracle.com Wed Apr 17 13:55:49 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Wed, 17 Apr 2019 15:55:49 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> 
<7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> Message-ID: Hi David, > On 12 Apr 2019, at 11:15, David Holmes wrote: > > Hi Robin, > > Sorry for the delay I've been mulling over this one ... :) No worries, I?ve been otherwise occupied anyway.. :) >> The original implementation of SpinYield had the sleep hardwired to 1 ms os::naked_short_sleep - when os::naked_short_nanosleep was introduced this parameter was added as well, with a default value of 1000. But there?s no code that actually sets the parameter, and it?s a bit misleading that the parameter accepts nanoseconds when that cannot be acted upon on Windows. So I figured that it would be better to remove the parameter but retain the existing behavior. But this should probably be revisited when the fate of TimedYield has been decided.. > > I evaluate the abstraction and API that is being provided and even if the initial user doesn't want anything but the default value, the ability to set the value makes perfect sense for that particular abstraction. The fact it can't be acted upon on windows is unfortunate, but these things should always been specified as "best effort" when we can't have guarantees. Fair enough, I?ll avoid touching that part. >>> I think delay() would be a better name than wait(). >> This was inspired by the SpinYield utility, but I certainly wouldn?t mind renaming it. Perhaps SpinYield::wait should be renamed as well to keep the symmetry? > > If its already in use then wait() is okay. Sine I dropped the new TimedYield class I?ll leave this one as well. >> The ?root? problem here is that the existing os primitives unfortunately do not quite map to the goal of this utility. Let me try to break it down a bit and see if it makes sense: The purpose of TimedYield is to wait for a thread rendezvous, for a short time as possible. 
(In this case, waiting for threads to notice that the safepoint poll has been armed). Ideally, we are aiming for sub-millisecond waiting times here. >> If these threads are scheduled on other CPUs this is pretty simple - we could spin or nanosleep and they would make progress. However - if one or more of these threads are scheduled to run on the current cpu things become interesting. Waiting for the OS scheduler to move threads to different CPUs can take milliseconds - much slower than what is possible to achieve. So, we want to try performing a cpu-local yield at first. > > This is heading into territory that we only go into if absolutely necessary. We don't want to be coding to our assumed knowledge of what a particular scheduler will do - especially when we don't even know we will be executing on that scheduler. Unless there is very good reason we should stick with the functionality and semantics provided by the OS. Right, that's why I thought it could be useful to encapsulate the "optimal" strategy for thread rendezvous waiting in something that could be implemented differently depending on the OS / scheduler, as it may not be obvious that nanosleep is the primitive of choice for that. Perhaps TimedYield was not the best name for that either though.. Just for the record, here are the differences I observed using different strategies (specjvm2008 time-to-safepoint in microseconds):
             Average  Median  90th percentile
Original:        693     677             1088
SpinYield:       438     222             1014
TimedYield:      226     180              241
>> On Windows, this maps reasonably well to os::naked_yield: >> void os::naked_yield() { >> // Consider passing back the return value from SwitchToThread(). >> SwitchToThread(); >> } >> But for example on Linux, there is instead this: >> // Linux CFS scheduler (since 2.6.23) does not guarantee sched_yield(2) will >> // actually give up the CPU. Since skip buddy (v2.6.28): >> // >> // * Sets the yielding task as skip buddy for current CPU's run queue.
>> // * Picks next from run queue, if empty, picks a skip buddy (can be the yielding task). >> // * Clears skip buddies for this run queue (yielding task no longer a skip buddy). >> // >> // An alternative is calling os::naked_short_nanosleep with a small number to avoid >> // getting re-scheduled immediately. >> // >> void os::naked_yield() { >> sched_yield(); >> } >> In both cases, we may get rescheduled immediately - on Windows this is indicated in the return value from SwitchToThread, but on Linux we don?t know. On Windows, it is then fine to spin a little while as there is nothing else ready to run. But on Linux, the CFS scheduler penalizes spinning as the runtime counter is increased, which will hurt the waiter when the time comes to perform actual work. So we don?t want to spin on a no-op sched_yield, we have to use nanosleep instead. But then we are back to the original problem - the current nanosleep is not what we want to do on Windows in this situation So why not change nanosleep on Windows: > > Yielding should always be a hint, not a requirement. Trying to second guess who may be running on which core and what the load may be is not a game we want to play lightly. There are just too factors out of our control that can change on a different piece of hardware or a different OS release etc. > >>> If the existing os api's need adjustment or expansion to provide the functionality desired by this code then I would much prefer to see the os API's updated to address that. >>> >>> That said, given the original problem is that os::naked_short_nanosleep on Windows is too coarse with the use of WaitableTimer why not just replace that with a simple loop of the form: >>> >>> while (elapsed < sleep_ns) { >>> if (SwitchToThread() == 0) { >>> SpinPause(); >>> elapsed = ? >>> } >> So this would actually work fine in this case - but it's probably not what you would expect from a sleep function in the general case. 
On Linux, you would get control back after the provided nanosecond period even if another thread executed in the meantime. But on Windows, you are potentially giving up your entire timeslice if another thread is ready to run - this would be much worse than plain naked_short_sleep as you may not get control back for another 15 ms or so. > > Perhaps you have different expectations on what a sleep may do, but I don't expect absolute precision here. I expect a sleep function to take me off CPU for (ideally**) at least a given amount of time, but have no expectations about the maximum - that depends on a lot of factors about os timers and scheduling that I don't want to have to think about or know about. I don't even assume I go off cpu for all that time as I know there are limits around timer resolution etc and so the OS may in fact do some spinning instead. That's all fine by me. If the sleeping thread doesn't get back on CPU for 15-20ms then so be it, some other thread is getting to run and do what it needs to do. I guess my main concern is that I would probably expect better precision from the sleep method accepting nanoseconds instead of milliseconds, but I can certainly live with not changing that. > ** Windows returns early from timed-waits in many cases. > >> That all being said, switching the Windows naked_short_nanosleep to the above implementation would be just fine - but I really think it should be renamed in that case. Perhaps something like os::timed_yield(jlong ns) would make sense? The additional backoff mechanism in ThreadYield can be reverted back to being handled by the safepointing code. The reason I made TimedYield into a separate utility was that it may be useful in other places as well, but such future use can of course be handled separately if the need actually arises. > > I'd prefer to fix a windows problem, just on windows. 
I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means). > > If we need the TimedYield capability in the future then let's revisit that then. Sure, here's a lighter version of this change that changes the Windows implementation of naked_short_nanosleep, with a few adjustments to some assumptions in the waiting-for-safepoint backoff strategy. Still passes tier1, with the same performance improvements on Windows (and no obvious regressions on Linux). New webrev: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ Best regards, Robin > > Thanks, > David > ----- > >> Best regards, >> Robin >>> >>> ? >>> >>> Thanks, >>> David >>> ----- >>> >>> >>>> Best regards, >>>> Robin >>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>>> >>>>> Hi David, >>>>> >>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>>> >>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>> Hi David, >>>>>>> Thanks for taking a look! >>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>>> >>>>>>>> Hi Robin, >>>>>>>> >>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>> Hi all, >>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>>>> >>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken.
>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>>> That said, perhaps os::naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>> >>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>>>> >>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>> >>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>> >>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>>> >>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms.
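The waiting styles discussed in this thread (pure spinning, spinning combined with yielding, and finally short timed sleeps) can be pictured with a minimal Java sketch. This is only an illustration under stated assumptions: the class BackoffWaiter, its method await and all of its parameters are hypothetical names, and the code is not HotSpot's SpinYield, TimedYield or os::naked_short_nanosleep.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration only; not HotSpot's SpinYield or TimedYield.
final class BackoffWaiter {
    private BackoffWaiter() {}

    /**
     * Waits until the flag becomes true or timeoutNanos has elapsed.
     * The first spinLimit attempts spin, the next yieldLimit attempts yield,
     * and all later attempts park for roughly parkNanos each.
     * Returns the value of the flag when waiting stopped.
     */
    static boolean await(AtomicBoolean flag, int spinLimit, int yieldLimit,
                         long parkNanos, long timeoutNanos) {
        final long deadline = System.nanoTime() + timeoutNanos;
        int attempts = 0;
        while (!flag.get()) {
            if (System.nanoTime() - deadline >= 0) {
                return flag.get();                // timed out
            }
            if (attempts < spinLimit) {
                Thread.onSpinWait();              // stay on CPU, hint to the processor
            } else if (attempts - spinLimit < yieldLimit) {
                Thread.yield();                   // let another ready thread run
            } else {
                // May return early; the loop rechecks the flag either way.
                LockSupport.parkNanos(parkNanos);
            }
            if (attempts < Integer.MAX_VALUE) {
                attempts++;                       // saturate instead of overflowing
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean stopped = new AtomicBoolean(false);
        Thread other = new Thread(() -> {
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(5));
            stopped.set(true);
        });
        other.start();
        boolean seen = await(stopped, 100, 100, 10_000L, TimeUnit.SECONDS.toNanos(5));
        other.join();
        System.out.println("flag observed: " + seen);
    }
}
```

The sketch makes no promise about how long a park actually lasts; as noted in the thread, timer resolution and scheduling decide that, so the loop only relies on rechecking the flag after each step.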
>>>>> >>>>> Best regards, >>>>> Robin >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Best regards, >>>>>>> Robin >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. >>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>> Testing: tier1 >>>>>>>>> Best regards, >>>>>>>>> Robin From ioi.lam at oracle.com Wed Apr 17 14:16:21 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 17 Apr 2019 07:16:21 -0700 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Message-ID: Looks good. Thanks for fixing this! - Ioi On 4/17/19 5:24 AM, coleen.phillimore at oracle.com wrote: > Summary: Give fatal error if CDS loses archive mapping; but map > Windows RW because remapping is dangerous. > > Ioi and I discussed this change and thought it is best.? Windows only > maps the CDS archive around 50% time because of ASLR and this retains > the startup performance improvements for CDS on windows. > > Tested with mach5 tier1-3. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222379 > > Thanks, > Coleen > From robbin.ehn at oracle.com Wed Apr 17 14:32:40 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 17 Apr 2019 16:32:40 +0200 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <6b46fdbf-0dbb-ce9b-047a-fc6d502653e6@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <6b46fdbf-0dbb-ce9b-047a-fc6d502653e6@oracle.com> Message-ID: <8bdd971b-9e15-5f6f-b14f-e955c735c327@oracle.com> Hi Dan, thanks for digging! Yes, I have already forward the mail to compiler, added here also. 
Thanks, Robbin On 2019-04-17 15:46, Daniel D. Daugherty wrote: > On 4/17/19 4:35 AM, Robbin Ehn wrote: >> Hi all, please consider this change. >> >> The code for deopt suspend is no longer needed since today the register window >> is always flushed when this code executes. Exactly when this code was needed is not clear, entered via duke changeset >> 1. I did not dig since we no longer have such use case. >> >> Webrev: >> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222640 >> >> Passes t1-5. >> >> Thanks, Robbin > > Since this code was added by the Compiler team, I think you're going > to want at least one Compiler team member to chime in on this review... > > > I was going to add a historical comment to your bug, but JBS appears to > be down at the moment... This code was added by this delta: > > $ sp -r1.795.1.1 src/share/vm/runtime/thread.cpp > src/share/vm/runtime/SCCS/s.thread.cpp: > > D 1.795.1.1 06/12/07 10:06:52 sgoldman 2086 2084 00031/00010/04023 > MRs: > COMMENTS: > 6463133 - patchless deopt. Support specialized deopt suspend for register window > based machines. Pass registerMap to revoke_bias to prevent redundant stack > walks. frames now cache the codeBlob. > > > Looks like 6463133 was not a bug that I was tracking way back then > so I don't have an email folder for it. I did find Steve Goldman's > push message for it, but the fix for 6463133 is included with four > other bug fixes: > > --------------------------------------------------------- > > Job ID: 20061207101238.sgoldman.6463133_deopt-M > Original workspace: gretch:/disk2/ws/6463133_deopt-M > Submitter:
sgoldman > Archived data: /net/prt-data.east/archives/main/c2_baseline/2006/20061207101238.sgoldman.6463133_deopt-M/ > Webrev: > http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2006/20061207101238.sgoldman.6463133_deopt-M/workspace/webrevs/webrev-2006.12.07/index.html > > > Fixed 6463133: Deoptimization should not use code patching > Fixed 6490483: Java support for pstack broken > Fixed 6490489: java pstack support for server on x86 corrupts stack frame info > Fixed 6490492: java support for pstack rarely gets server compiled frames correct. > Fixed 6500866: jvm crash running forte stress kit. > > This converts deoptimization to no longer do patching of code > and now only patches return address. This made a rather large > change to the frame object so that now a frame always carries > along the codeBlob it refers to if it in fact does refer to > a codeBlob. This removes lots of redundant CodeCache::find_blob > calls. The testing of this fix which obviously changes the way > frames look on the stack discovered that both SA and pstack support > have been broken. pstack support has been broken for years. > As part of the SA changes I found that the fix for: > > 6252656: Putative invariant for TLABS _start+_size==_end+alignment_reserve() not being maintained > > didn't properly update SA and I've added that fix. > > There is a discussion of patchless deopt here > > http://j2se.sfbay/web/bin/view/HotspotCompilers/PatchlessDeopt > > which is currently (Dec. 7, 2006) out of date but which I will clean up. > > One other notable change with this putback is that there is no longer a separate > exception handler for each codeblob. Now the exception handler (and the new > deopt handler) are stored in the stub (read uncommon code) area. > > > Reviewed by: Tom > > Fix verified (y/n): yes > > Verification testing: > > PRT with various stress options. NSK and JDI tests with and without stress > options.
Dan's forte stress tests on sparc/x86/amd64. Lots of various hand > testing of SA (in addition to sasanity) pstack, and dtrace. > [end sgoldman Thu Dec 7 13:56:55 2006 EDT] > > sgoldman Mon Dec 11 15:28:48 2006 PDT > ------------------------------------- > > > Since this is an integration push from c2_baseline -> main/baseline, I do > not have the list of files modified. I would have to find the c2_baseline > TeamWare workspace if it still exists. However, I'm not sure it would help > much since 6463133 is combined with 4 other fixes... > > I did a search for all of the files that mention 6463133 in their SCCS > history and that list is 83 files long. Ouch. I've attached that list to > this email. > > Dan From coleen.phillimore at oracle.com Wed Apr 17 14:45:54 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 10:45:54 -0400 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Message-ID: Thank you Ioi! Coleen On 4/17/19 10:16 AM, Ioi Lam wrote: > Looks good. Thanks for fixing this! > > - Ioi > > On 4/17/19 5:24 AM, coleen.phillimore at oracle.com wrote: >> Summary: Give fatal error if CDS loses archive mapping; but map >> Windows RW because remapping is dangerous. >> >> Ioi and I discussed this change and thought it is best. Windows only >> maps the CDS archive around 50% time because of ASLR and this retains >> the startup performance improvements for CDS on windows. >> >> Tested with mach5 tier1-3.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222379 >> >> Thanks, >> Coleen >> > From calvin.cheung at oracle.com Wed Apr 17 16:05:59 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 17 Apr 2019 09:05:59 -0700 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Message-ID: <5CB74EE7.8040803@oracle.com> Hi Coleen, Thanks for fixing it and the change looks good. I'm wondering instead of having has_jfr_option(), could the JfrRecorder::is_enabled() be used instead? The _enable field is set via the JfrRecorder::on_vm_init() and it is indirectly called during vm init in Threads::create_vm() via JFR_ONLY(Jfr::on_vm_init();). However, JfrRecorder::is_enabled() is currently used within only JFR code. thanks, Calvin On 4/17/19, 5:24 AM, coleen.phillimore at oracle.com wrote: > Summary: Give fatal error if CDS loses archive mapping; but map > Windows RW because remapping is dangerous. > > Ioi and I discussed this change and thought it is best. Windows only > maps the CDS archive around 50% time because of ASLR and this retains > the startup performance improvements for CDS on windows. > > Tested with mach5 tier1-3. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222379 > > Thanks, > Coleen > From rkennke at redhat.com Wed Apr 17 16:59:09 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 17 Apr 2019 18:59:09 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> Message-ID: <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> >> Various code paths in oopDesc, Klass and their subclasses assert >> something that fetches the object's _klass field. With upcoming >> Shenandoah's changes this is not always safe and requires an additional >> indirection. >> >> The trouble here is that we can, for example, call >> Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of >> oopDesc::oop_iterate() which would call oopDesc::klass() on its own, >> which would be racy on some GC internal call paths, but we can't >> (currently) control some calls to klass() further down the call stack >> (all in asserts). >> >> We'd also like a way to ensure that non-GC calls to klass() are sane. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8222545 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ >> Testing: >> hotspot_gc_shenandoah with and without the prototype, hotspot/tier1 >> >> The change introduces only two ASSERT-level GC-interfaces, and afaict, >> this with JDK-8222537 will be all that we need for the upcoming >> elimination of forward pointers in Shenandoah. Notice that one assert in >> objArrayKlass is strengthened from is_array() to is_objArray(), but that >> seems only sane in that context. >> >> Can I please get reviews? > > This looks very awkward to me. Using: > > Universe::heap()->safe_klass(obj)->is_objArray_klass() > > instead of the obvious: > > obj->is_objArray() > > is very unintuitive. 
Can this not be handled inside is_objArray (and > is_typeArray) ? Not really. Then it would get exposed to many more code paths, most of which don't actually need it/don't want it, and many of which are outside of asserts, and rely on the usual klass() with the sanity assert there instead. I am open for suggestions, but it would have to be restricted to ASSERT code IMO, and ideally with as few as possible GC interface additions. Roman From jianglizhou at google.com Wed Apr 17 17:50:58 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 17 Apr 2019 10:50:58 -0700 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Message-ID: Hi Coleen, Looks reasonable to me. - src/hotspot/os/windows/os_windows.cpp 5005 // There is a very small theoretical window between the unmap_memory() 5006 // call above and the map_memory() call below where a thread in native 5007 // code may be able to access an address that is no longer mapped. The comment refers to the 'unmap_memory() call above and the map_memory() call below'. Since the calls are removed, could you please update the comment as well. - src/hotspot/share/memory/filemap.cpp 853 // If a tool agent is in use (debugging enabled), or JFR, we must map the address space RW 854 if (JvmtiExport::can_modify_any_class() || JvmtiExport::can_walk_any_space() || 855 Arguments::has_jfr_option()) { 856 si->_read_only = false; 857 } 858 #ifdef _WINDOWS 859 // Windows cannot remap read-only shared memory to read-write when required for 860 // RedefineClasses, which is also used by JFR. Always map windows regions as RW. 861 si->_read_only = false; 862 #endif To avoid setting si->_read_only twice on Windows, maybe putting the code at line #853-#857 under #else after #ifdef _WINDOWS? 
Best regards, Jiangli On Wed, Apr 17, 2019 at 5:25 AM wrote: > Summary: Give fatal error if CDS loses archive mapping; but map Windows > RW because remapping is dangerous. > > Ioi and I discussed this change and thought it is best. Windows only > maps the CDS archive around 50% time because of ASLR and this retains > the startup performance improvements for CDS on windows. > > Tested with mach5 tier1-3. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222379 > > Thanks, > Coleen > > From coleen.phillimore at oracle.com Wed Apr 17 18:41:49 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 14:41:49 -0400 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: <5CB74EE7.8040803@oracle.com> References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> <5CB74EE7.8040803@oracle.com> Message-ID: On 4/17/19 12:05 PM, Calvin Cheung wrote: > Hi Coleen, > > Thanks for fixing it and the change looks good. > > I'm wondering instead of having has_jfr_option(), could the > JfrRecorder::is_enabled() be used instead? Yes, this would be a lot better, but JfrRecorder::on_vm_init() is called after we map in the CDS regions (I just checked). Coleen > > The _enable field is set via the JfrRecorder::on_vm_init() and it is > indirectly called during vm init in Threads::create_vm() via > JFR_ONLY(Jfr::on_vm_init();). However, JfrRecorder::is_enabled() is > currently used within only JFR code. > > thanks, > Calvin > > On 4/17/19, 5:24 AM, coleen.phillimore at oracle.com wrote: >> Summary: Give fatal error if CDS loses archive mapping; but map >> Windows RW because remapping is dangerous. >> >> Ioi and I discussed this change and thought it is best.? Windows only >> maps the CDS archive around 50% time because of ASLR and this retains >> the startup performance improvements for CDS on windows. 
>> >> Tested with mach5 tier1-3. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222379 >> >> Thanks, >> Coleen >> From coleen.phillimore at oracle.com Wed Apr 17 18:54:10 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 14:54:10 -0400 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> Message-ID: <7c8e821b-5469-8e70-2256-d9608686e88a@oracle.com> On 4/17/19 1:50 PM, Jiangli Zhou wrote: > Hi Coleen, > > Looks reasonable to me. > > - src/hotspot/os/windows/os_windows.cpp > 5005 // There is a very small theoretical window between the unmap_memory() > 5006 // call above and the map_memory() call below where a thread in native > 5007 // code may be able to access an address that is no longer mapped. > The comment refers to the 'unmap_memory() call above and the map_memory() call below'. Since the calls are removed, could you please update the comment as well. I can update the comments. How about: // Remap a block of memory. char* os::pd_remap_memory(int fd, const char* file_name, size_t file_offset, char *addr, size_t bytes, bool read_only, bool allow_exec) { // This OS does not allow existing memory maps to be remapped so we // would have to unmap the memory before we remap it. // Because there is a small window between unmapping memory and mapping // it in again with different protections, CDS archives are mapped RW // on windows, so this function isn't called. ShouldNotReachHere();
return NULL; } > - src/hotspot/share/memory/filemap.cpp > 853 // If a tool agent is in use (debugging enabled), or JFR, we must > map the address space RW > 854 if (JvmtiExport::can_modify_any_class() || > JvmtiExport::can_walk_any_space() || > 855 Arguments::has_jfr_option()) { > 856 si->_read_only = false; > 857 } > 858 #ifdef _WINDOWS > 859 // Windows cannot remap read-only shared memory to read-write when > required for > 860 // RedefineClasses, which is also used by JFR. Always map windows > regions as RW. > 861 si->_read_only = false; > 862 #endif > To avoid setting si->_read_only twice on Windows, maybe putting the code at line #853-#857 under #else after #ifdef _WINDOWS? Okay, that's an improvement too. Thanks Jiangli! Coleen > Best regards, > Jiangli > > On Wed, Apr 17, 2019 at 5:25 AM > wrote: > > Summary: Give fatal error if CDS loses archive mapping; but map > Windows > RW because remapping is dangerous. > > Ioi and I discussed this change and thought it is best. Windows only > maps the CDS archive around 50% time because of ASLR and this retains > the startup performance improvements for CDS on windows. > > Tested with mach5 tier1-3. > > open webrev at > http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222379 > > Thanks, > Coleen > From jianglizhou at google.com Wed Apr 17 18:55:31 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 17 Apr 2019 11:55:31 -0700 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: <7c8e821b-5469-8e70-2256-d9608686e88a@oracle.com> References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> <7c8e821b-5469-8e70-2256-d9608686e88a@oracle.com> Message-ID: Looks good! Best, Jiangli On Wed, Apr 17, 2019 at 11:54 AM wrote: > > > On 4/17/19 1:50 PM, Jiangli Zhou wrote: > > Hi Coleen, > > Looks reasonable to me. 
> > - src/hotspot/os/windows/os_windows.cpp > > 5005 // There is a very small theoretical window between the unmap_memory() > 5006 // call above and the map_memory() call below where a thread in native > 5007 // code may be able to access an address that is no longer mapped. > > The comment refers to the 'unmap_memory() call above and the map_memory() call below'. Since the calls are removed, could you please update the comment as well. > > > I can update the comments. How about: > > // Remap a block of memory. > char* os::pd_remap_memory(int fd, const char* file_name, size_t > file_offset, > char *addr, size_t bytes, bool read_only, > bool allow_exec) { > // This OS does not allow existing memory maps to be remapped so we > // would have to unmap the memory before we remap it. > > // Because there is a small window between unmapping memory and mapping > // it in again with different protections, CDS archives are mapped RW > // on windows, so this function isn't called. > ShouldNotReachHere(); > return NULL; > } > > > - src/hotspot/share/memory/filemap.cpp > > 853 // If a tool agent is in use (debugging enabled), or JFR, we must map the address space RW 854 if (JvmtiExport::can_modify_any_class() || JvmtiExport::can_walk_any_space() || 855 Arguments::has_jfr_option()) { > 856 si->_read_only = false; > 857 } 858 #ifdef _WINDOWS 859 // Windows cannot remap read-only shared memory to read-write when required for 860 // RedefineClasses, which is also used by JFR. Always map windows regions as RW. 861 si->_read_only = false; 862 #endif > > To avoid setting si->_read_only twice on Windows, maybe putting the code at line #853-#857 under #else after #ifdef _WINDOWS? > > > Okay, that's an improvement too. > > Thanks Jiangli! > Coleen > > Best regards, > > Jiangli > > > On Wed, Apr 17, 2019 at 5:25 AM wrote: > >> Summary: Give fatal error if CDS loses archive mapping; but map Windows >> RW because remapping is dangerous. 
>> >> Ioi and I discussed this change and thought it is best. Windows only >> maps the CDS archive around 50% time because of ASLR and this retains >> the startup performance improvements for CDS on windows. >> >> Tested with mach5 tier1-3. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222379 >> >> Thanks, >> Coleen >> >> > From coleen.phillimore at oracle.com Wed Apr 17 19:00:11 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 15:00:11 -0400 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> <7c8e821b-5469-8e70-2256-d9608686e88a@oracle.com> Message-ID: Thanks Jiangli! Coleen On 4/17/19 2:55 PM, Jiangli Zhou wrote: > Looks good! > > Best, > Jiangli > > On Wed, Apr 17, 2019 at 11:54 AM > wrote: > > > > On 4/17/19 1:50 PM, Jiangli Zhou wrote: >> Hi Coleen, >> >> Looks reasonable to me. >> >> - src/hotspot/os/windows/os_windows.cpp >> 5005 // There is a very small theoretical window between the unmap_memory() >> 5006 // call above and the map_memory() call below where a thread in native >> 5007 // code may be able to access an address that is no longer mapped. >> The comment refers to the 'unmap_memory() call above and the map_memory() call below'. Since the calls are removed, could you please update the comment as well. > > I can update the comments. How about: > > // Remap a block of memory. > char* os::pd_remap_memory(int fd, const char* file_name, size_t > file_offset, > char *addr, size_t bytes, bool read_only, > bool allow_exec) { > // This OS does not allow existing memory maps to be remapped so we > // would have to unmap the memory before we remap it. > > // Because there is a small window between unmapping memory and > mapping >
// it in again with different protections, CDS archives are > mapped RW > // on windows, so this function isn't called. > ShouldNotReachHere(); > return NULL; > } > > >> - src/hotspot/share/memory/filemap.cpp >> 853 // If a tool agent is in use (debugging enabled), or JFR, we >> must map the address space RW >> 854 if (JvmtiExport::can_modify_any_class() || >> JvmtiExport::can_walk_any_space() || >> 855 Arguments::has_jfr_option()) { >> 856 si->_read_only = false; >> 857 } >> 858 #ifdef _WINDOWS >> 859 // Windows cannot remap read-only shared memory to read-write >> when required for >> 860 // RedefineClasses, which is also used by JFR. Always map >> windows regions as RW. >> 861 si->_read_only = false; >> 862 #endif >> To avoid setting si->_read_only twice on Windows, maybe putting the code at line #853-#857 under #else after #ifdef _WINDOWS? > > Okay, that's an improvement too. > > Thanks Jiangli! > Coleen > >> Best regards, >> Jiangli >> >> On Wed, Apr 17, 2019 at 5:25 AM > > wrote: >> >> Summary: Give fatal error if CDS loses archive mapping; but >> map Windows >> RW because remapping is dangerous. >> >> Ioi and I discussed this change and thought it is best. >> Windows only >> maps the CDS archive around 50% time because of ASLR and this >> retains >> the startup performance improvements for CDS on windows. >> >> Tested with mach5 tier1-3.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222379 >> >> Thanks, >> Coleen >> > From calvin.cheung at oracle.com Wed Apr 17 21:34:21 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 17 Apr 2019 14:34:21 -0700 Subject: RFR 8222379: JFR TestClassLoadEvent.java failed due to EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: <749c3c58-24e2-6feb-00c7-232d6cd3a18a@oracle.com> <5CB74EE7.8040803@oracle.com> Message-ID: <5CB79BDD.4040209@oracle.com> On 4/17/19, 11:41 AM, coleen.phillimore at oracle.com wrote: > > > On 4/17/19 12:05 PM, Calvin Cheung wrote: >> Hi Coleen, >> >> Thanks for fixing it and the change looks good. >> >> I'm wondering instead of having has_jfr_option(), could the >> JfrRecorder::is_enabled() be used instead? > > Yes, this would be a lot better, but JfrRecorder::on_vm_init() is > called after we map in the CDS regions (I just checked). Too bad we couldn't use that function. Your updated comment looks good. thanks, Calvin > > Coleen >> >> The _enable field is set via the JfrRecorder::on_vm_init() and it is >> indirectly called during vm init in Threads::create_vm() via >> JFR_ONLY(Jfr::on_vm_init();). However, JfrRecorder::is_enabled() is >> currently used within only JFR code. >> >> thanks, >> Calvin >> >> On 4/17/19, 5:24 AM, coleen.phillimore at oracle.com wrote: >>> Summary: Give fatal error if CDS loses archive mapping; but map >>> Windows RW because remapping is dangerous. >>> >>> Ioi and I discussed this change and thought it is best. Windows >>> only maps the CDS archive around 50% time because of ASLR and this >>> retains the startup performance improvements for CDS on windows. >>> >>> Tested with mach5 tier1-3. 
>>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222379.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8222379 >>> >>> Thanks, >>> Coleen >>> > From coleen.phillimore at oracle.com Thu Apr 18 02:09:26 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Apr 2019 22:09:26 -0400 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> Message-ID: I didn't see the 02 change below. I think the shouldContain function should be added to output analyzer. And as a utility patch, it should be separate to help with backports that might need it. So I chopped out Stefan's function: open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8222713 Leaving the MemberNameLeak.java change as: open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev Tested with the new MemberNameLeak.java test. hs tier1-3 testing in progress. All of which look good to me. Please review! Thanks, Coleen On 4/17/19 2:41 AM, Stefan Karlsson wrote: > On 2019-04-17 05:58, David Holmes wrote: >> Hi Stefan, >> >> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review this patch to fix a timeout in the MemberNameLeak test. >>> >>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>> >>> The test could fail if GCs happened during the setup phase when >>> entries for all generated methods were created. When this happened >>> the code to grow the table was triggered, which in turn cleaned out >>> all so-far created entries. This put the table in a condition where >>> the grow / cleaning code didn't have to be triggered again. But the >>> test still waited for it to happen.
This patch adds all >>> MethodHandles to an ArrayList, so that they are kept alive until >>> it's time for them to be cleaned out. While debugging this timeout I >>> added some extra logging. I've left it in the test in case we ever >>> need to debug it again. >> >> Fix seems reasonable. > > Thanks. > >> A couple of comments: >> >> 119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >> 120 >> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >> >> I'm assuming you only actually want line 120? > > It was a quick and dirty way to get logging from 119 to the > outputAnalyzer, and more comprehensive logging from line 120 saved to > disk. > >> >> Is the log file copied across with the test artifacts in mach5? > > Yes. >> I'm assuming you're using the file for gc logging so that the normal >> test .jtr file is not inundated with excessive logging data. > > Yes, and because jtreg cuts in the middle of the output of tests with > excessive logging. > > I created a more elaborate version that only logs to files, and > perform the verification on those files: > https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta > https://cr.openjdk.java.net/~stefank/8222550/webrev.02 > > Thanks, > StefanK >> >> Thanks, >> David >> ----- >> >>> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >>> timeouts reproduced.
>>> >>> The patch is applied on top of the patch in: >>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>> >>> >>> Thanks, >>> StefanK > From david.holmes at oracle.com Thu Apr 18 04:26:26 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Apr 2019 14:26:26 +1000 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> Message-ID: <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> Hi Coleen, On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: > > I didn't see the 02 change below. I think the shouldContain function > should be added to output analyzer. And as a utility patch, it should > be separate to help with backports that might need it. So I chopped out > Stefan's function: > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8222713 Metacomment: not sure why OutputAnalyzer should have a utility that takes an arbitrary file and searches it for strings? That's not part of the output that OutputAnalyzer is designed to analyze. This may be a useful utility but belongs elsewhere IMHO. The fact this doesn't reference any internal state of the OutputAnalyzer also suggests it should be a static utility method, not an instance method. Or define a new OutputAnalyzer constructor that takes a File and operates on its contents - though in that case you may be able to just convert the file contents to a String and use the existing OutputAnalyzer constructor that takes a String to start with. Not clear what the expected semantics should be for searching for multiple lines. I would have expected to be searching for lines in the given order, but the code will match them in any order. That may be what you wanted, but it's not clear to me it's what you'd always want. Needs to be documented either way.
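The two possible semantics raised above (matching in any order versus matching in the given order) can be sketched as a small static utility. This is a minimal illustration under stated assumptions: the class FileLineSearch and both method names are hypothetical, it is not the code in the webrev, and it is not part of OutputAnalyzer or the jtreg test library. It reads the file with java.nio.file.Files and treats a match as a substring of a line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of OutputAnalyzer or the jtreg test library.
final class FileLineSearch {
    private FileLineSearch() {}

    /** True if every expected string occurs in some line of the file, in any order. */
    static boolean containsAllAnyOrder(Path file, String... expected) throws IOException {
        // Mutable copy: Arrays.asList alone returns a fixed-size list.
        List<String> remaining = new ArrayList<>(Arrays.asList(expected));
        for (String line : Files.readAllLines(file)) {
            remaining.removeIf(line::contains);
            if (remaining.isEmpty()) {
                return true;
            }
        }
        return remaining.isEmpty();
    }

    /**
     * True if the expected strings occur in the given order, each on a later
     * line than the previous match (the lines need not be adjacent).
     */
    static boolean containsAllInOrder(Path file, String... expected) throws IOException {
        int next = 0;
        for (String line : Files.readAllLines(file)) {
            if (next < expected.length && line.contains(expected[next])) {
                next++;
            }
        }
        return next == expected.length;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("search-demo", ".log");
        try {
            Files.write(log, Arrays.asList("alpha one", "beta two", "gamma three"));
            System.out.println(containsAllAnyOrder(log, "gamma", "alpha"));
            System.out.println(containsAllInOrder(log, "gamma", "alpha"));
        } finally {
            Files.delete(log);
        }
    }
}
```

Keeping the two behaviours as separately named methods is one way to make the chosen semantics visible at the call site, which is the documentation concern raised in the review.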

+      * Verify that the contents of the file contains the given the set of strings

s/the set/set/

+         LinkedList<String> expectedList = new LinkedList<>();
+         for (String s : expectedStrings) {
+           expectedList.add(s);
+         }

List<String> expectedList = Arrays.asList(expectedStrings);

+         FileReader fr = new FileReader(file);
+         BufferedReader reader = new BufferedReader(fr);

These should be part of a try-with-resources block.

+         for (String line; (line = reader.readLine()) != null;) {

I'd find this cleaner as a while-loop:

String line;
while ((line = reader.readLine()) != null) {

Though I think there are simpler ways to deal with files - see
java.nio.file.Files utility class.

> Leaving the MemberNameLeak.java change as:
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev

Seems okay.

Thanks,
David
-----

> Tested with the new MemberNameLeak.java test.  hs tier1-3 testing in
> progress.
>
> All which look good to me.  Please review!
>
> Thanks,
> Coleen
>
> On 4/17/19 2:41 AM, Stefan Karlsson wrote:
>> On 2019-04-17 05:58, David Holmes wrote:
>>> Hi Stefan,
>>>
>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote:
>>>> Hi all,
>>>>
>>>> Please review this patch to fix a timeout in the MemberNameLeak test.
>>>>
>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/
>>>> https://bugs.openjdk.java.net/browse/JDK-8222550
>>>>
>>>> The test could fail if GCs happened during the setup phase when
>>>> entries for all generated methods were created. When this happened
>>>> the code to grow the table was triggered, which in turn cleaned out
>>>> all so-far created entries.  This put the table in a condition where
>>>> the grow / cleaning code didn't have to be triggered again. But the
>>>> test still waited for it to happen. This patch adds all
>>>> MethodHandles to an ArrayList, so that they are kept alive until
>>>> it's time for them to be cleaned out. While debugging this timeout I
>>>> added some extra logging.
I've left it in the test in case we ever >>>> need to debug it again. >>> >>> Fix seems reasonable. >> >> Thanks. >> >>> A couple of comments: >>> >>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>> ?120 >>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>> >>> >>> I'm assuming you only actually want line 120? >> >> It was a quick and dirty way to get logging from 119 to the >> outputAnalyzer, and more comprehensive logging from line 120 saved to >> disk. >> >>> >>> Is the log file copied across with the test artifacts in mach5? >> >> Yes. >>> I'm assuming you're using the file for gc logging so that the normal >>> test .jtr file is not inundated with excessive logging data. >> >> Yes, and because jtreg cuts in the middle of the output of tests with >> excessive logging. >> >> I created a more elaborate version that only logs to files, and >> perform the verification on those files: >> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >> >> Thanks, >> StefanK >>> >>> Thanks, >>> David >>> ----- >>> >>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >>>> timeouts reproduced. 
>>>> >>>> The patch is applied on top of the patch in: >>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>> >>>> >>>> Thanks, >>>> StefanK >> > From stefan.karlsson at oracle.com Thu Apr 18 07:31:37 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 18 Apr 2019 09:31:37 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> Message-ID: <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> On 2019-04-18 06:26, David Holmes wrote: > Hi Coleen, > > On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >> >> I didn't see the 02 change below.?? I think the shouldContain function >> should be added to output analyzer.? And as a utility patch, it should >> be separate to help with backports that might need it.? So I chopped >> out Stefan's function: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 > > Metacomment: not sure why OutputAnalyzer should have a utility that > takes an arbitrary file and searches it for strings? That's not part of > the output that OutputAnalyzer is designed to analyze. This may be a > useful utility but belongs elsewhere IMHO. The fact this doesn't > reference any internal state of the OutputAnalyzer also suggests it > should be a static utility method, not an instance method. Or define a > new OutputAnalyzer constructor that takes a File and operates in its > contents Yes, this is exactly what I would have expected. - though in that case you may be able to just convert the file > contents to a String and use the existing OutputAnalyzer constructor > that takes a String to start with. 

I would rather have the OutputAnalyzer constructor do it for me, instead
of having to do this conversion every time I want to analyze a file.

>
> Not clear what the expected semantics should be for searching for
> multiple lines. I would have expected to be searching for lines in the
> given order, but the code will match them in any order. That may be what
> you wanted, but it's not clear to me its what you'd always want. Needs
> to be documented either way.

If we create an OutputAnalyzer(File) constructor, we don't have to care
about that.

I've added that function:
 http://cr.openjdk.java.net/~stefank/8222713/webrev.01/

and updated the webrev for the original bug:
 https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/
 https://cr.openjdk.java.net/~stefank/8222550/webrev.03/

Thanks,
StefanK

>
> +      * Verify that the contents of the file contains the given the set
> of strings
>
> s/the set/set/
>
> +         LinkedList expectedList = new LinkedList<>();
> +         for (String s : expectedStrings) {
> +           expectedList.add(s);
> +         }
>
> List expectedList = Arrays.asList(expectedStrings);
>
> +         FileReader fr = new FileReader(file);
> +         BufferedReader reader = new BufferedReader(fr);
>
> These should be part of a try-with-resources block.
>
> +         for (String line; (line = reader.readLine()) != null;) {
>
> I'd find this cleaner as a while-loop:
>
> String line;
> while ((line = reader.readLine()) != NULL) {
>
> Though I think there are simpler ways to deal with files - see
> java.nio.file.Files utility class.
>
>> Leaving the MemberNameLeak.java change as:
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev
>
> Seems okay.
>
> Thanks,
> David
> -----
>
>> Tested with the new MemberNameLeak.java test.  hs tier1-3 testing in
>> progress.
>>
>> All which look good to me.  Please review!
>> >> Thanks, >> Coleen >> >> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>> On 2019-04-17 05:58, David Holmes wrote: >>>> Hi Stefan, >>>> >>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>> Hi all, >>>>> >>>>> Please review this patch to fix a timeout in the MemberNameLeak test. >>>>> >>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>> >>>>> The test could fail if GCs happened during the setup phase when >>>>> entries for all generated methods were created. When this happened >>>>> the code to grow the table was triggered, which in turn cleaned out >>>>> all so-far created entries.? This put the table in a condition >>>>> where the grow / cleaning code didn't have to be triggered again. >>>>> But the test still waited for it to happen. This patch adds all >>>>> MethodHandles to an ArrayList, so that they are kept alive until >>>>> it's time for them to be cleaned out. While debugging this timeout >>>>> I added some extra logging. I've left it in the test in case we >>>>> ever need to debug it again. >>>> >>>> Fix seems reasonable. >>> >>> Thanks. >>> >>>> A couple of comments: >>>> >>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>> ?120 >>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>> >>>> >>>> I'm assuming you only actually want line 120? >>> >>> It was a quick and dirty way to get logging from 119 to the >>> outputAnalyzer, and more comprehensive logging from line 120 saved to >>> disk. >>> >>>> >>>> Is the log file copied across with the test artifacts in mach5? >>> >>> Yes. >>>> I'm assuming you're using the file for gc logging so that the normal >>>> test .jtr file is not inundated with excessive logging data. >>> >>> Yes, and because jtreg cuts in the middle of the output of tests with >>> excessive logging. 
>>> >>> I created a more elaborate version that only logs to files, and >>> perform the verification on those files: >>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>> >>> Thanks, >>> StefanK >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >>>>> timeouts reproduced. >>>>> >>>>> The patch is applied on top of the patch in: >>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>>> >>>>> >>>>> Thanks, >>>>> StefanK >>> >> From david.holmes at oracle.com Thu Apr 18 07:55:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Apr 2019 17:55:08 +1000 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> Message-ID: <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> On 18/04/2019 5:31 pm, Stefan Karlsson wrote: > On 2019-04-18 06:26, David Holmes wrote: >> Hi Coleen, >> >> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >>> >>> I didn't see the 02 change below.?? I think the shouldContain >>> function should be added to output analyzer.? And as a utility patch, >>> it should be separate to help with backports that might need it.? So >>> I chopped out Stefan's function: >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >> >> Metacomment: not sure why OutputAnalyzer should have a utility that >> takes an arbitrary file and searches it for strings? That's not part >> of the output that OutputAnalyzer is designed to analyze. This may be >> a useful utility but belongs elsewhere IMHO. 
The fact this doesn't
>> reference any internal state of the OutputAnalyzer also suggests it
>> should be a static utility method, not an instance method. Or define a
>> new OutputAnalyzer constructor that takes a File and operates in its
>> contents
>
> Yes, this is exactly what I would have expected.
>
>  - though in that case you may be able to just convert the file
>> contents to a String and use the existing OutputAnalyzer constructor
>> that takes a String to start with.
>
>
> I would rather have the OutputAnalyzer constructor do it for me, instead
> of having to do this conversion every time I want to analyze a file.
>
>
>>
>> Not clear what the expected semantics should be for searching for
>> multiple lines. I would have expected to be searching for lines in the
>> given order, but the code will match them in any order. That may be
>> what you wanted, but it's not clear to me its what you'd always want.
>> Needs to be documented either way.
>
> If we create a OutputAnalyzer(File) constructor, we don't have to care
> about that.
>
> I've added that function:
>  http://cr.openjdk.java.net/~stefank/8222713/webrev.01/

We don't need the Utils function

Utils.fileAsString(file.toString())

since JDK 11 we can just use:

Files.readString(file)

Cheers,
David
-----

> and updated the webrev for the original bug:
>  https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/
>  https://cr.openjdk.java.net/~stefank/8222550/webrev.03/
>
> Thanks,
> StefanK
>
>>
>> +      * Verify that the contents of the file contains the given the
>> set of strings
>>
>> s/the set/set/
>>
>> +         LinkedList expectedList = new LinkedList<>();
>> +         for (String s : expectedStrings) {
>> +           expectedList.add(s);
>> +         }
>>
>> List expectedList = Arrays.asList(expectedStrings);
>>
>> +         FileReader fr = new FileReader(file);
>> +         BufferedReader reader = new BufferedReader(fr);
>>
>> These should be part of a try-with-resources block.
>>
>> +         
for (String line; (line = reader.readLine()) != null;) { >> >> I'd find this cleaner as a while-loop: >> >> String line; >> while ((line = reader.readLine()) != NULL) { >> >> Though I think there are simpler ways to deal with files - see >> java.nio.file.Files utility class. >> >>> Leaving the MemberNameLeak.java change as: >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev >> >> Seems okay. >> >> Thanks, >> David >> ----- >> >>> Tested with the new MemberNameLeak.java test.? hs tier1-3 testing in >>> progress. >>> >>> All which look good to me.? Please review! >>> >>> Thanks, >>> Coleen >>> >>> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>>> On 2019-04-17 05:58, David Holmes wrote: >>>>> Hi Stefan, >>>>> >>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review this patch to fix a timeout in the MemberNameLeak test. >>>>>> >>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>>> >>>>>> The test could fail if GCs happened during the setup phase when >>>>>> entries for all generated methods were created. When this happened >>>>>> the code to grow the table was triggered, which in turn cleaned >>>>>> out all so-far created entries.? This put the table in a condition >>>>>> where the grow / cleaning code didn't have to be triggered again. >>>>>> But the test still waited for it to happen. This patch adds all >>>>>> MethodHandles to an ArrayList, so that they are kept alive until >>>>>> it's time for them to be cleaned out. While debugging this timeout >>>>>> I added some extra logging. I've left it in the test in case we >>>>>> ever need to debug it again. >>>>> >>>>> Fix seems reasonable. >>>> >>>> Thanks. 
>>>> >>>>> A couple of comments: >>>>> >>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>> ?120 >>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>> >>>>> >>>>> I'm assuming you only actually want line 120? >>>> >>>> It was a quick and dirty way to get logging from 119 to the >>>> outputAnalyzer, and more comprehensive logging from line 120 saved >>>> to disk. >>>> >>>>> >>>>> Is the log file copied across with the test artifacts in mach5? >>>> >>>> Yes. >>>>> I'm assuming you're using the file for gc logging so that the >>>>> normal test .jtr file is not inundated with excessive logging data. >>>> >>>> Yes, and because jtreg cuts in the middle of the output of tests >>>> with excessive logging. >>>> >>>> I created a more elaborate version that only logs to files, and >>>> perform the verification on those files: >>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>> >>>> Thanks, >>>> StefanK >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >>>>>> timeouts reproduced. 
>>>>>> >>>>>> The patch is applied on top of the patch in: >>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>>>> >>>>>> >>>>>> Thanks, >>>>>> StefanK >>>> >>> From stefan.karlsson at oracle.com Thu Apr 18 08:16:09 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 18 Apr 2019 10:16:09 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> Message-ID: <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com> On 2019-04-18 09:55, David Holmes wrote: > On 18/04/2019 5:31 pm, Stefan Karlsson wrote: >> On 2019-04-18 06:26, David Holmes wrote: >>> Hi Coleen, >>> >>> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >>>> >>>> I didn't see the 02 change below.?? I think the shouldContain >>>> function should be added to output analyzer.? And as a utility >>>> patch, it should be separate to help with backports that might need >>>> it.? So I chopped out Stefan's function: >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >>> >>> Metacomment: not sure why OutputAnalyzer should have a utility that >>> takes an arbitrary file and searches it for strings? That's not part >>> of the output that OutputAnalyzer is designed to analyze. This may be >>> a useful utility but belongs elsewhere IMHO. The fact this doesn't >>> reference any internal state of the OutputAnalyzer also suggests it >>> should be a static utility method, not an instance method. Or define >>> a new OutputAnalyzer constructor that takes a File and operates in >>> its contents >> >> Yes, this is exactly what I would have expected. 
>> >> ??- though in that case you may be able to just convert the file >>> contents to a String and use the existing OutputAnalyzer constructor >>> that takes a String to start with. >> >> >> I would rather have the OutputAnalyzer constructor do it for me, >> instead of having to do this conversion every time I want to analyze a >> file. >> >> >>> >>> Not clear what the expected semantics should be for searching for >>> multiple lines. I would have expected to be searching for lines in >>> the given order, but the code will match them in any order. That may >>> be what you wanted, but it's not clear to me its what you'd always >>> want. Needs to be documented either way. >> >> If we create a OutputAnalyzer(File) constructor, we don't have to care >> about that. >> >> I've added that function: >> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.01/ > > We don't need the Utils function > > Utils.fileAsString(file.toString()) > > ?since JDK 11 we can just use: > > Files.readString(file) Updated: http://cr.openjdk.java.net/~stefank/8222713/webrev.02.delta/ http://cr.openjdk.java.net/~stefank/8222713/webrev.02/ Thanks, StefanK > > Cheers, > David > ----- > > > >> and updated the webrev for the original bug: >> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/ >> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03/ >> >> Thanks, >> StefanK >> >>> >>> +????? * Verify that the contents of the file contains the given the >>> set of strings >>> >>> s/the set/set/ >>> >>> +???????? LinkedList expectedList = new LinkedList<>(); >>> +???????? for (String s : expectedStrings) { >>> +?????????? expectedList.add(s); >>> +???????? } >>> >>> List expectedList = Arrays.asList(expectedStrings); >>> >>> +???????? FileReader fr = new FileReader(file); >>> +???????? BufferedReader reader = new BufferedReader(fr); >>> >>> These should be part of a try-with-resources block. >>> >>> +???????? 
for (String line; (line = reader.readLine()) != null;) { >>> >>> I'd find this cleaner as a while-loop: >>> >>> String line; >>> while ((line = reader.readLine()) != NULL) { >>> >>> Though I think there are simpler ways to deal with files - see >>> java.nio.file.Files utility class. >>> >>>> Leaving the MemberNameLeak.java change as: >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev >>> >>> Seems okay. >>> >>> Thanks, >>> David >>> ----- >>> >>>> Tested with the new MemberNameLeak.java test.? hs tier1-3 testing in >>>> progress. >>>> >>>> All which look good to me.? Please review! >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>>>> On 2019-04-17 05:58, David Holmes wrote: >>>>>> Hi Stefan, >>>>>> >>>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this patch to fix a timeout in the MemberNameLeak >>>>>>> test. >>>>>>> >>>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>>>> >>>>>>> The test could fail if GCs happened during the setup phase when >>>>>>> entries for all generated methods were created. When this >>>>>>> happened the code to grow the table was triggered, which in turn >>>>>>> cleaned out all so-far created entries.? This put the table in a >>>>>>> condition where the grow / cleaning code didn't have to be >>>>>>> triggered again. But the test still waited for it to happen. This >>>>>>> patch adds all MethodHandles to an ArrayList, so that they are >>>>>>> kept alive until it's time for them to be cleaned out. While >>>>>>> debugging this timeout I added some extra logging. I've left it >>>>>>> in the test in case we ever need to debug it again. >>>>>> >>>>>> Fix seems reasonable. >>>>> >>>>> Thanks. 
>>>>> >>>>>> A couple of comments: >>>>>> >>>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>>> ?120 >>>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>>> >>>>>> >>>>>> I'm assuming you only actually want line 120? >>>>> >>>>> It was a quick and dirty way to get logging from 119 to the >>>>> outputAnalyzer, and more comprehensive logging from line 120 saved >>>>> to disk. >>>>> >>>>>> >>>>>> Is the log file copied across with the test artifacts in mach5? >>>>> >>>>> Yes. >>>>>> I'm assuming you're using the file for gc logging so that the >>>>>> normal test .jtr file is not inundated with excessive logging data. >>>>> >>>>> Yes, and because jtreg cuts in the middle of the output of tests >>>>> with excessive logging. >>>>> >>>>> I created a more elaborate version that only logs to files, and >>>>> perform the verification on those files: >>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>>> >>>>> Thanks, >>>>> StefanK >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where the >>>>>>> timeouts reproduced. 
>>>>>>> >>>>>>> The patch is applied on top of the patch in: >>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> StefanK >>>>> >>>> From david.holmes at oracle.com Thu Apr 18 09:13:42 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Apr 2019 19:13:42 +1000 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com> Message-ID: <535878a1-3f62-2c96-7029-0fb502849c6e@oracle.com> Works for me :) Thanks, David On 18/04/2019 6:16 pm, Stefan Karlsson wrote: > > > On 2019-04-18 09:55, David Holmes wrote: >> On 18/04/2019 5:31 pm, Stefan Karlsson wrote: >>> On 2019-04-18 06:26, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >>>>> >>>>> I didn't see the 02 change below.?? I think the shouldContain >>>>> function should be added to output analyzer.? And as a utility >>>>> patch, it should be separate to help with backports that might need >>>>> it.? So I chopped out Stefan's function: >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >>>> >>>> Metacomment: not sure why OutputAnalyzer should have a utility that >>>> takes an arbitrary file and searches it for strings? That's not part >>>> of the output that OutputAnalyzer is designed to analyze. This may >>>> be a useful utility but belongs elsewhere IMHO. The fact this >>>> doesn't reference any internal state of the OutputAnalyzer also >>>> suggests it should be a static utility method, not an instance >>>> method. 
Or define a new OutputAnalyzer constructor that takes a File >>>> and operates in its contents >>> >>> Yes, this is exactly what I would have expected. >>> >>> ??- though in that case you may be able to just convert the file >>>> contents to a String and use the existing OutputAnalyzer constructor >>>> that takes a String to start with. >>> >>> >>> I would rather have the OutputAnalyzer constructor do it for me, >>> instead of having to do this conversion every time I want to analyze >>> a file. >>> >>> >>>> >>>> Not clear what the expected semantics should be for searching for >>>> multiple lines. I would have expected to be searching for lines in >>>> the given order, but the code will match them in any order. That may >>>> be what you wanted, but it's not clear to me its what you'd always >>>> want. Needs to be documented either way. >>> >>> If we create a OutputAnalyzer(File) constructor, we don't have to >>> care about that. >>> >>> I've added that function: >>> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.01/ >> >> We don't need the Utils function >> >> Utils.fileAsString(file.toString()) >> >> ??since JDK 11 we can just use: >> >> Files.readString(file) > > Updated: > ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02.delta/ > ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02/ > > Thanks, > StefanK > >> >> Cheers, >> David >> ----- >> >> >> >>> and updated the webrev for the original bug: >>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/ >>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03/ >>> >>> Thanks, >>> StefanK >>> >>>> >>>> +????? * Verify that the contents of the file contains the given the >>>> set of strings >>>> >>>> s/the set/set/ >>>> >>>> +???????? LinkedList expectedList = new LinkedList<>(); >>>> +???????? for (String s : expectedStrings) { >>>> +?????????? expectedList.add(s); >>>> +???????? } >>>> >>>> List expectedList = Arrays.asList(expectedStrings); >>>> >>>> +???????? 
FileReader fr = new FileReader(file); >>>> +???????? BufferedReader reader = new BufferedReader(fr); >>>> >>>> These should be part of a try-with-resources block. >>>> >>>> +???????? for (String line; (line = reader.readLine()) != null;) { >>>> >>>> I'd find this cleaner as a while-loop: >>>> >>>> String line; >>>> while ((line = reader.readLine()) != NULL) { >>>> >>>> Though I think there are simpler ways to deal with files - see >>>> java.nio.file.Files utility class. >>>> >>>>> Leaving the MemberNameLeak.java change as: >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev >>>> >>>> Seems okay. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Tested with the new MemberNameLeak.java test.? hs tier1-3 testing >>>>> in progress. >>>>> >>>>> All which look good to me.? Please review! >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>>>>> On 2019-04-17 05:58, David Holmes wrote: >>>>>>> Hi Stefan, >>>>>>> >>>>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review this patch to fix a timeout in the MemberNameLeak >>>>>>>> test. >>>>>>>> >>>>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>>>>> >>>>>>>> The test could fail if GCs happened during the setup phase when >>>>>>>> entries for all generated methods were created. When this >>>>>>>> happened the code to grow the table was triggered, which in turn >>>>>>>> cleaned out all so-far created entries.? This put the table in a >>>>>>>> condition where the grow / cleaning code didn't have to be >>>>>>>> triggered again. But the test still waited for it to happen. >>>>>>>> This patch adds all MethodHandles to an ArrayList, so that they >>>>>>>> are kept alive until it's time for them to be cleaned out. While >>>>>>>> debugging this timeout I added some extra logging. 
I've left it >>>>>>>> in the test in case we ever need to debug it again. >>>>>>> >>>>>>> Fix seems reasonable. >>>>>> >>>>>> Thanks. >>>>>> >>>>>>> A couple of comments: >>>>>>> >>>>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>>>> ?120 >>>>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>>>> >>>>>>> >>>>>>> I'm assuming you only actually want line 120? >>>>>> >>>>>> It was a quick and dirty way to get logging from 119 to the >>>>>> outputAnalyzer, and more comprehensive logging from line 120 saved >>>>>> to disk. >>>>>> >>>>>>> >>>>>>> Is the log file copied across with the test artifacts in mach5? >>>>>> >>>>>> Yes. >>>>>>> I'm assuming you're using the file for gc logging so that the >>>>>>> normal test .jtr file is not inundated with excessive logging data. >>>>>> >>>>>> Yes, and because jtreg cuts in the middle of the output of tests >>>>>> with excessive logging. >>>>>> >>>>>> I created a more elaborate version that only logs to files, and >>>>>> perform the verification on those files: >>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>>>> >>>>>> Thanks, >>>>>> StefanK >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where >>>>>>>> the timeouts reproduced. 
>>>>>>>> >>>>>>>> The patch is applied on top of the patch in: >>>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> StefanK >>>>>> >>>>> From stefan.karlsson at oracle.com Thu Apr 18 10:10:07 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 18 Apr 2019 12:10:07 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <535878a1-3f62-2c96-7029-0fb502849c6e@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com> <535878a1-3f62-2c96-7029-0fb502849c6e@oracle.com> Message-ID: <21f2be58-252d-1d60-4d5f-75e58d546fa6@oracle.com> Thanks! In the interest of getting this fixed before the holidays, I intend to push this as soon as the last round of testing passes. StefanK On 2019-04-18 11:13, David Holmes wrote: > Works for me :) > > Thanks, > David > > On 18/04/2019 6:16 pm, Stefan Karlsson wrote: >> >> >> On 2019-04-18 09:55, David Holmes wrote: >>> On 18/04/2019 5:31 pm, Stefan Karlsson wrote: >>>> On 2019-04-18 06:26, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> I didn't see the 02 change below.?? I think the shouldContain >>>>>> function should be added to output analyzer.? And as a utility >>>>>> patch, it should be separate to help with backports that might >>>>>> need it.? So I chopped out Stefan's function: >>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >>>>> >>>>> Metacomment: not sure why OutputAnalyzer should have a utility that >>>>> takes an arbitrary file and searches it for strings? 
That's not >>>>> part of the output that OutputAnalyzer is designed to analyze. This >>>>> may be a useful utility but belongs elsewhere IMHO. The fact this >>>>> doesn't reference any internal state of the OutputAnalyzer also >>>>> suggests it should be a static utility method, not an instance >>>>> method. Or define a new OutputAnalyzer constructor that takes a >>>>> File and operates in its contents >>>> >>>> Yes, this is exactly what I would have expected. >>>> >>>> ??- though in that case you may be able to just convert the file >>>>> contents to a String and use the existing OutputAnalyzer >>>>> constructor that takes a String to start with. >>>> >>>> >>>> I would rather have the OutputAnalyzer constructor do it for me, >>>> instead of having to do this conversion every time I want to analyze >>>> a file. >>>> >>>> >>>>> >>>>> Not clear what the expected semantics should be for searching for >>>>> multiple lines. I would have expected to be searching for lines in >>>>> the given order, but the code will match them in any order. That >>>>> may be what you wanted, but it's not clear to me its what you'd >>>>> always want. Needs to be documented either way. >>>> >>>> If we create a OutputAnalyzer(File) constructor, we don't have to >>>> care about that. >>>> >>>> I've added that function: >>>> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.01/ >>> >>> We don't need the Utils function >>> >>> Utils.fileAsString(file.toString()) >>> >>> ??since JDK 11 we can just use: >>> >>> Files.readString(file) >> >> Updated: >> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.02.delta/ >> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.02/ >> >> Thanks, >> StefanK >> >>> >>> Cheers, >>> David >>> ----- >>> >>> >>> >>>> and updated the webrev for the original bug: >>>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/ >>>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03/ >>>> >>>> Thanks, >>>> StefanK >>>> >>>>> >>>>> +????? 
* Verify that the contents of the file contains the given
>>>>> the set of strings
>>>>>
>>>>> s/the set/set/
>>>>>
>>>>> +         LinkedList<String> expectedList = new LinkedList<>();
>>>>> +         for (String s : expectedStrings) {
>>>>> +           expectedList.add(s);
>>>>> +         }
>>>>>
>>>>> List<String> expectedList = Arrays.asList(expectedStrings);
>>>>>
>>>>> +         FileReader fr = new FileReader(file);
>>>>> +         BufferedReader reader = new BufferedReader(fr);
>>>>>
>>>>> These should be part of a try-with-resources block.
>>>>>
>>>>> +         for (String line; (line = reader.readLine()) != null;) {
>>>>>
>>>>> I'd find this cleaner as a while-loop:
>>>>>
>>>>> String line;
>>>>> while ((line = reader.readLine()) != null) {
>>>>>
>>>>> Though I think there are simpler ways to deal with files - see
>>>>> java.nio.file.Files utility class.
>>>>>
>>>>>> Leaving the MemberNameLeak.java change as:
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev
>>>>>
>>>>> Seems okay.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Tested with the new MemberNameLeak.java test. hs tier1-3 testing
>>>>>> in progress.
>>>>>>
>>>>>> All which look good to me. Please review!
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 4/17/19 2:41 AM, Stefan Karlsson wrote:
>>>>>>> On 2019-04-17 05:58, David Holmes wrote:
>>>>>>>> Hi Stefan,
>>>>>>>>
>>>>>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this patch to fix a timeout in the MemberNameLeak
>>>>>>>>> test.
>>>>>>>>>
>>>>>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550
>>>>>>>>>
>>>>>>>>> The test could fail if GCs happened during the setup phase when
>>>>>>>>> entries for all generated methods were created. When this
>>>>>>>>> happened the code to grow the table was triggered, which in
>>>>>>>>> turn cleaned out all so-far created entries.
This put the >>>>>>>>> table in a condition where the grow / cleaning code didn't have >>>>>>>>> to be triggered again. But the test still waited for it to >>>>>>>>> happen. This patch adds all MethodHandles to an ArrayList, so >>>>>>>>> that they are kept alive until it's time for them to be cleaned >>>>>>>>> out. While debugging this timeout I added some extra logging. >>>>>>>>> I've left it in the test in case we ever need to debug it again. >>>>>>>> >>>>>>>> Fix seems reasonable. >>>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>>> A couple of comments: >>>>>>>> >>>>>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>>>>> ?120 >>>>>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>>>>> >>>>>>>> >>>>>>>> I'm assuming you only actually want line 120? >>>>>>> >>>>>>> It was a quick and dirty way to get logging from 119 to the >>>>>>> outputAnalyzer, and more comprehensive logging from line 120 >>>>>>> saved to disk. >>>>>>> >>>>>>>> >>>>>>>> Is the log file copied across with the test artifacts in mach5? >>>>>>> >>>>>>> Yes. >>>>>>>> I'm assuming you're using the file for gc logging so that the >>>>>>>> normal test .jtr file is not inundated with excessive logging data. >>>>>>> >>>>>>> Yes, and because jtreg cuts in the middle of the output of tests >>>>>>> with excessive logging. >>>>>>> >>>>>>> I created a more elaborate version that only logs to files, and >>>>>>> perform the verification on those files: >>>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>>>>> >>>>>>> Thanks, >>>>>>> StefanK >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where >>>>>>>>> the timeouts reproduced. 
>>>>>>>>>
>>>>>>>>> The patch is applied on top of the patch in:
>>>>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> StefanK
>>>>>>>
>>>>>>

From rkennke at redhat.com Thu Apr 18 09:54:05 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 18 Apr 2019 11:54:05 +0200
Subject: RFR: JDK-8222545: Safe klass asserts
In-Reply-To: <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com>
References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com>
Message-ID:

To add a little more detail: I could move the change up into is_objArray(), but I don't want to expose it to any non-assert paths. I could therefore do two different implementations there, guarded by #ifdef ASSERT, but I don't think it's a good idea to behave differently under ASSERT; that kind of defeats the point of assert, right?

What do you think?

Roman

On 17 April 2019 18:59:09 CEST, Roman Kennke wrote:
>>> Various code paths in oopDesc, Klass and their subclasses assert
>>> something that fetches the object's _klass field. With upcoming
>>> Shenandoah's changes this is not always safe and requires an additional
>>> indirection.
>>>
>>> The trouble here is that we can, for example, call
>>> Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of
>>> oopDesc::oop_iterate() which would call oopDesc::klass() on its own,
>>> which would be racy on some GC internal call paths, but we can't
>>> (currently) control some calls to klass() further down the call stack
>>> (all in asserts).
>>>
>>> We'd also like a way to ensure that non-GC calls to klass() are sane.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8222545
>>> Webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/
>>> Testing:
>>> hotspot_gc_shenandoah with and without the prototype, hotspot/tier1
>>>
>>> The change introduces only two ASSERT-level GC-interfaces, and afaict,
>>> this with JDK-8222537 will be all that we need for the upcoming
>>> elimination of forward pointers in Shenandoah. Notice that one assert in
>>> objArrayKlass is strengthened from is_array() to is_objArray(), but that
>>> seems only sane in that context.
>>>
>>> Can I please get reviews?
>>
>> This looks very awkward to me. Using:
>>
>> Universe::heap()->safe_klass(obj)->is_objArray_klass()
>>
>> instead of the obvious:
>>
>> obj->is_objArray()
>>
>> is very unintuitive. Can this not be handled inside is_objArray (and
>> is_typeArray)?
>
>Not really. Then it would get exposed to many more code paths, most of
>which don't actually need it/don't want it, and many of which are
>outside of asserts, and rely on the usual klass() with the sanity assert
>there instead. I am open for suggestions, but it would have to be
>restricted to ASSERT code IMO, and ideally with as few as possible GC
>interface additions.
>
>Roman

--
Sent from my Android device with K-9 Mail.

From david.holmes at oracle.com Thu Apr 18 11:10:54 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 18 Apr 2019 21:10:54 +1000
Subject: RFR: JDK-8222545: Safe klass asserts
In-Reply-To:
References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com>
Message-ID: <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com>

On 18/04/2019 7:54 pm, Roman Kennke wrote:
> To add a little more detail, I could move the change up into
> is_objArray(), but I don't want to expose it to any non-assert paths.
> Therefore I could do 2 different impls there, guarded by #ifdef ASSERT
> but I don't think it's a good idea to behave differently under ASSERT,
> that kind of defeats the point of assert, right?
>
> What do you think?

I don't follow your argument. Under asserts you need to access the klass pointer "safely" but otherwise you do not. So there are two behaviours related to accessing the klass pointer anyway. I'd rather see that encapsulated in the accessor.

I assume it's not just asserts but any debug-only code that wants to access the klass pointer.

Thanks,
David
-----

> Roman
>
>
> On 17 April 2019 18:59:09 CEST, Roman Kennke wrote:
>
> Various code paths in oopDesc, Klass and their subclasses assert
> something that fetches the object's _klass field. With upcoming
> Shenandoah's changes this is not always safe and requires an
> additional indirection.
>
> The trouble here is that we can, for example, call
> Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of
> oopDesc::oop_iterate() which would call oopDesc::klass() on its own,
> which would be racy on some GC internal call paths, but we can't
> (currently) control some calls to klass() further down the call stack
> (all in asserts).
>
> We'd also like a way to ensure that non-GC calls to klass() are sane.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8222545
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/
> Testing:
> hotspot_gc_shenandoah with and without the prototype, hotspot/tier1
>
> The change introduces only two ASSERT-level GC-interfaces, and afaict,
> this with JDK-8222537 will be all that we need for the upcoming
> elimination of forward pointers in Shenandoah. Notice that one assert in
> objArrayKlass is strengthened from is_array() to is_objArray(), but that
> seems only sane in that context.
>
> Can I please get reviews?
>
>
> This looks very awkward to me.
Using:
>
> Universe::heap()->safe_klass(obj)->is_objArray_klass()
>
> instead of the obvious:
>
> obj->is_objArray()
>
> is very unintuitive. Can this not be handled inside is_objArray
> (and is_typeArray)?
>
>
> Not really. Then it would get exposed to many more code paths, most of
> which don't actually need it/don't want it, and many of which are
> outside of asserts, and rely on the usual klass() with the sanity assert
> there instead. I am open for suggestions, but it would have to be
> restricted to ASSERT code IMO, and ideally with as few as possible GC
> interface additions.
>
> Roman
>
>
> --
> Sent from my Android device with K-9 Mail.

From coleen.phillimore at oracle.com Thu Apr 18 11:26:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 18 Apr 2019 07:26:04 -0400
Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out
In-Reply-To: <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com>
References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com>
Message-ID: <5983ad03-c7a9-e5ec-21fc-5c8f8b6851ff@oracle.com>

This looks great!
thanks,
Coleen

On 4/18/19 4:16 AM, Stefan Karlsson wrote:
>
>
> On 2019-04-18 09:55, David Holmes wrote:
>> On 18/04/2019 5:31 pm, Stefan Karlsson wrote:
>>> On 2019-04-18 06:26, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> I didn't see the 02 change below. I think the shouldContain
>>>>> function should be added to output analyzer. And as a utility
>>>>> patch, it should be separate to help with backports that might
>>>>> need it.
So I chopped out Stefan's function: >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >>>> >>>> Metacomment: not sure why OutputAnalyzer should have a utility that >>>> takes an arbitrary file and searches it for strings? That's not >>>> part of the output that OutputAnalyzer is designed to analyze. This >>>> may be a useful utility but belongs elsewhere IMHO. The fact this >>>> doesn't reference any internal state of the OutputAnalyzer also >>>> suggests it should be a static utility method, not an instance >>>> method. Or define a new OutputAnalyzer constructor that takes a >>>> File and operates in its contents >>> >>> Yes, this is exactly what I would have expected. >>> >>> ??- though in that case you may be able to just convert the file >>>> contents to a String and use the existing OutputAnalyzer >>>> constructor that takes a String to start with. >>> >>> >>> I would rather have the OutputAnalyzer constructor do it for me, >>> instead of having to do this conversion every time I want to analyze >>> a file. >>> >>> >>>> >>>> Not clear what the expected semantics should be for searching for >>>> multiple lines. I would have expected to be searching for lines in >>>> the given order, but the code will match them in any order. That >>>> may be what you wanted, but it's not clear to me its what you'd >>>> always want. Needs to be documented either way. >>> >>> If we create a OutputAnalyzer(File) constructor, we don't have to >>> care about that. 
>>> >>> I've added that function: >>> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.01/ >> >> We don't need the Utils function >> >> Utils.fileAsString(file.toString()) >> >> ??since JDK 11 we can just use: >> >> Files.readString(file) > > Updated: > ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02.delta/ > ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02/ > > Thanks, > StefanK > >> >> Cheers, >> David >> ----- >> >> >> >>> and updated the webrev for the original bug: >>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/ >>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03/ >>> >>> Thanks, >>> StefanK >>> >>>> >>>> +????? * Verify that the contents of the file contains the given >>>> the set of strings >>>> >>>> s/the set/set/ >>>> >>>> +???????? LinkedList expectedList = new LinkedList<>(); >>>> +???????? for (String s : expectedStrings) { >>>> +?????????? expectedList.add(s); >>>> +???????? } >>>> >>>> List expectedList = Arrays.asList(expectedStrings); >>>> >>>> +???????? FileReader fr = new FileReader(file); >>>> +???????? BufferedReader reader = new BufferedReader(fr); >>>> >>>> These should be part of a try-with-resources block. >>>> >>>> +???????? for (String line; (line = reader.readLine()) != null;) { >>>> >>>> I'd find this cleaner as a while-loop: >>>> >>>> String line; >>>> while ((line = reader.readLine()) != NULL) { >>>> >>>> Though I think there are simpler ways to deal with files - see >>>> java.nio.file.Files utility class. >>>> >>>>> Leaving the MemberNameLeak.java change as: >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev >>>> >>>> Seems okay. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Tested with the new MemberNameLeak.java test.? hs tier1-3 testing >>>>> in progress. >>>>> >>>>> All which look good to me.? Please review! 
>>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>>>>> On 2019-04-17 05:58, David Holmes wrote: >>>>>>> Hi Stefan, >>>>>>> >>>>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review this patch to fix a timeout in the MemberNameLeak >>>>>>>> test. >>>>>>>> >>>>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>>>>> >>>>>>>> The test could fail if GCs happened during the setup phase when >>>>>>>> entries for all generated methods were created. When this >>>>>>>> happened the code to grow the table was triggered, which in >>>>>>>> turn cleaned out all so-far created entries.? This put the >>>>>>>> table in a condition where the grow / cleaning code didn't have >>>>>>>> to be triggered again. But the test still waited for it to >>>>>>>> happen. This patch adds all MethodHandles to an ArrayList, so >>>>>>>> that they are kept alive until it's time for them to be cleaned >>>>>>>> out. While debugging this timeout I added some extra logging. >>>>>>>> I've left it in the test in case we ever need to debug it again. >>>>>>> >>>>>>> Fix seems reasonable. >>>>>> >>>>>> Thanks. >>>>>> >>>>>>> A couple of comments: >>>>>>> >>>>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>>>> ?120 >>>>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>>>> >>>>>>> >>>>>>> I'm assuming you only actually want line 120? >>>>>> >>>>>> It was a quick and dirty way to get logging from 119 to the >>>>>> outputAnalyzer, and more comprehensive logging from line 120 >>>>>> saved to disk. >>>>>> >>>>>>> >>>>>>> Is the log file copied across with the test artifacts in mach5? >>>>>> >>>>>> Yes. >>>>>>> I'm assuming you're using the file for gc logging so that the >>>>>>> normal test .jtr file is not inundated with excessive logging data. 
>>>>>> >>>>>> Yes, and because jtreg cuts in the middle of the output of tests >>>>>> with excessive logging. >>>>>> >>>>>> I created a more elaborate version that only logs to files, and >>>>>> perform the verification on those files: >>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>>>> >>>>>> Thanks, >>>>>> StefanK >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where >>>>>>>> the timeouts reproduced. >>>>>>>> >>>>>>>> The patch is applied on top of the patch in: >>>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> StefanK >>>>>> >>>>> From rkennke at redhat.com Thu Apr 18 11:45:12 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 13:45:12 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> Message-ID: <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> >> To add a little more detail, I could move the change up into >> is_objArray(), but I don't want to expose it to any non-assert paths. >> Therefore I could do 2 different impls there, guarded by #ifdef ASSERT >> but I don't think it's a good idea to behave differently under ASSERT, >> that kindof defeats the point of assert, right? >> >> What do you think ? > > I don't follow your argument. Under asserts you need to access the klass > pointer "safely" but otherwise you do not. So there are two behaviours > related to accessing the klass pointer anyway. I'd rather see that > encapsulated in the accessor. 
>
> I assume it's not just asserts but any debug only code that wants to
> access the klass pointer.

In general, for any runtime calls into oopDesc::klass() the access should be safe. The acrobatics are only necessary for *GC-internal* calls, which can happen in 'unsafe' situations, where decoding the Klass* would be necessary. The way I do it is to call into accessors like size_given_klass() or the oop_oop_iterate() methods, which take a pre-resolved Klass* as argument. But that only works if those code paths don't call into klass() themselves. This is what this patch addresses.

I guess we could put the call via safe_klass() into the accessors, but it would widen its exposure unnecessarily. If we don't want special paths for ASSERT versus non-ASSERT there, it could actually be done in the GC backend I suppose, but the way I proposed it seems to be the minimal exposure of GC fluff.

Alternatively, it could be argued that we're in the Klass* instance already anyway, and what is the point of asserting it again, but I also see that we might want to ensure that we're not calling anything typeArray-ish on an objArray by accident.

Roman

> Thanks,
> David
> -----
>
>> Roman
>>
>>
>> On 17 April 2019 18:59:09 CEST, Roman Kennke wrote:
>>
>>             Various code paths in oopDesc, Klass and their subclasses assert
>>             something that fetches the object's _klass field. With upcoming
>>             Shenandoah's changes this is not always safe and requires an
>>             additional indirection.
>>
>>             The trouble here is that we can, for example, call
>>             Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of
>>             oopDesc::oop_iterate() which would call oopDesc::klass() on its own,
>>             which would be racy on some GC internal call paths, but we can't
>>             (currently) control some calls to klass() further down the call stack
>>             (all in asserts).
>>
>>             We'd also like a way to ensure that non-GC calls to klass()
>>             are sane.
>>
>>             Bug:
>>             https://bugs.openjdk.java.net/browse/JDK-8222545
>>             Webrev:
>>             http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/
>>             Testing:
>>             hotspot_gc_shenandoah with and without the prototype,
>>             hotspot/tier1
>>
>>             The change introduces only two ASSERT-level GC-interfaces, and afaict,
>>             this with JDK-8222537 will be all that we need for the upcoming
>>             elimination of forward pointers in Shenandoah. Notice that one assert in
>>             objArrayKlass is strengthened from is_array() to is_objArray(), but that
>>             seems only sane in that context.
>>
>>             Can I please get reviews?
>>
>>
>>         This looks very awkward to me. Using:
>>
>>         Universe::heap()->safe_klass(obj)->is_objArray_klass()
>>
>>         instead of the obvious:
>>
>>         obj->is_objArray()
>>
>>         is very unintuitive. Can this not be handled inside is_objArray
>>         (and is_typeArray)?
>>
>>
>>     Not really. Then it would get exposed to many more code paths, most of
>>     which don't actually need it/don't want it, and many of which are
>>     outside of asserts, and rely on the usual klass() with the sanity assert
>>     there instead. I am open for suggestions, but it would have to be
>>     restricted to ASSERT code IMO, and ideally with as few as possible GC
>>     interface additions.
>>
>>     Roman
>>
>>
>> --
>> Sent from my Android device with K-9 Mail.
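
The pattern Roman describes in the message above (GC-internal code resolves the Klass* once and hands it to accessors such as size_given_klass(), which must then avoid calling klass() themselves, even inside asserts) might be sketched roughly as follows. This is a toy C++ model under stated assumptions, not HotSpot code: ToyKlass, ToyOop, the klass_readable flag and the two-argument safe_klass() are hypothetical and exist only for illustration. The names size_given_klass(), safe_klass() and is_objArray_klass() are borrowed from the thread, but their real signatures are not assumed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a class descriptor; not the real HotSpot Klass.
struct ToyKlass {
    bool obj_array;             // true if this describes an array of references
    std::size_t element_words;  // words per element (toy value)
    bool is_objArray_klass() const { return obj_array; }
};

// Hypothetical stand-in for a heap object; not the real oopDesc.
struct ToyOop {
    const ToyKlass* klass_field;
    bool klass_readable;        // assumption: false while the GC makes the field unsafe to read
    std::size_t length;

    // Ordinary accessor with a sanity assert, meant for non-GC callers.
    const ToyKlass* klass() const {
        assert(klass_readable && "klass() used where the field is not safe to read");
        return klass_field;
    }
};

// Hypothetical GC hook: obtains the klass by another route when the field is
// unreadable. In this toy the other route is simply a pointer the caller supplies.
inline const ToyKlass* safe_klass(const ToyOop& obj, const ToyKlass* gc_known_klass) {
    return obj.klass_readable ? obj.klass_field : gc_known_klass;
}

// Accessor that takes a pre-resolved klass. Its assert checks the klass it was
// given and never calls obj.klass(), so it stays usable on GC-internal paths.
inline std::size_t size_given_klass(const ToyOop& obj, const ToyKlass* k) {
    assert(k->is_objArray_klass() && "expected an object array klass");
    return obj.length * k->element_words;
}
```

In this sketch a GC-internal caller would write size_given_klass(obj, safe_klass(obj, known)), while ordinary runtime code keeps using obj.klass() and its sanity assert.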
From stefan.karlsson at oracle.com Thu Apr 18 11:51:47 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 18 Apr 2019 13:51:47 +0200 Subject: RFR: 8222550: runtime/MemberName/MemberNameLeak.java times out In-Reply-To: <5983ad03-c7a9-e5ec-21fc-5c8f8b6851ff@oracle.com> References: <2da15c02-5684-0018-2e79-9054fe5b140b@oracle.com> <48ffc398-3b99-3bee-3578-efb0316fdd75@oracle.com> <7c05517e-d1d7-f628-dbcf-ec0b996ab611@oracle.com> <9ef5aca0-2f56-ec60-94ce-b75b04a2dc33@oracle.com> <0cc32647-2951-b21d-08be-40c9db1203dd@oracle.com> <5983ad03-c7a9-e5ec-21fc-5c8f8b6851ff@oracle.com> Message-ID: Thanks, Coleen! StefanK On 2019-04-18 13:26, coleen.phillimore at oracle.com wrote: > > This looks great! > thanks, > Coleen > > > On 4/18/19 4:16 AM, Stefan Karlsson wrote: >> >> >> On 2019-04-18 09:55, David Holmes wrote: >>> On 18/04/2019 5:31 pm, Stefan Karlsson wrote: >>>> On 2019-04-18 06:26, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> On 18/04/2019 12:09 pm, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> I didn't see the 02 change below.?? I think the shouldContain >>>>>> function should be added to output analyzer.? And as a utility >>>>>> patch, it should be separate to help with backports that might >>>>>> need it.? So I chopped out Stefan's function: >>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222713.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222713 >>>>> >>>>> Metacomment: not sure why OutputAnalyzer should have a utility that >>>>> takes an arbitrary file and searches it for strings? That's not >>>>> part of the output that OutputAnalyzer is designed to analyze. This >>>>> may be a useful utility but belongs elsewhere IMHO. The fact this >>>>> doesn't reference any internal state of the OutputAnalyzer also >>>>> suggests it should be a static utility method, not an instance >>>>> method. 
Or define a new OutputAnalyzer constructor that takes a >>>>> File and operates in its contents >>>> >>>> Yes, this is exactly what I would have expected. >>>> >>>> ??- though in that case you may be able to just convert the file >>>>> contents to a String and use the existing OutputAnalyzer >>>>> constructor that takes a String to start with. >>>> >>>> >>>> I would rather have the OutputAnalyzer constructor do it for me, >>>> instead of having to do this conversion every time I want to analyze >>>> a file. >>>> >>>> >>>>> >>>>> Not clear what the expected semantics should be for searching for >>>>> multiple lines. I would have expected to be searching for lines in >>>>> the given order, but the code will match them in any order. That >>>>> may be what you wanted, but it's not clear to me its what you'd >>>>> always want. Needs to be documented either way. >>>> >>>> If we create a OutputAnalyzer(File) constructor, we don't have to >>>> care about that. >>>> >>>> I've added that function: >>>> ??http://cr.openjdk.java.net/~stefank/8222713/webrev.01/ >>> >>> We don't need the Utils function >>> >>> Utils.fileAsString(file.toString()) >>> >>> ??since JDK 11 we can just use: >>> >>> Files.readString(file) >> >> Updated: >> ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02.delta/ >> ?http://cr.openjdk.java.net/~stefank/8222713/webrev.02/ >> >> Thanks, >> StefanK >> >>> >>> Cheers, >>> David >>> ----- >>> >>> >>> >>>> and updated the webrev for the original bug: >>>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03.delta/ >>>> ??https://cr.openjdk.java.net/~stefank/8222550/webrev.03/ >>>> >>>> Thanks, >>>> StefanK >>>> >>>>> >>>>> +????? * Verify that the contents of the file contains the given >>>>> the set of strings >>>>> >>>>> s/the set/set/ >>>>> >>>>> +???????? LinkedList expectedList = new LinkedList<>(); >>>>> +???????? for (String s : expectedStrings) { >>>>> +?????????? expectedList.add(s); >>>>> +???????? 
} >>>>> >>>>> List expectedList = Arrays.asList(expectedStrings); >>>>> >>>>> +???????? FileReader fr = new FileReader(file); >>>>> +???????? BufferedReader reader = new BufferedReader(fr); >>>>> >>>>> These should be part of a try-with-resources block. >>>>> >>>>> +???????? for (String line; (line = reader.readLine()) != null;) { >>>>> >>>>> I'd find this cleaner as a while-loop: >>>>> >>>>> String line; >>>>> while ((line = reader.readLine()) != NULL) { >>>>> >>>>> Though I think there are simpler ways to deal with files - see >>>>> java.nio.file.Files utility class. >>>>> >>>>>> Leaving the MemberNameLeak.java change as: >>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222550.01/webrev >>>>> >>>>> Seems okay. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Tested with the new MemberNameLeak.java test.? hs tier1-3 testing >>>>>> in progress. >>>>>> >>>>>> All which look good to me.? Please review! >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>> On 4/17/19 2:41 AM, Stefan Karlsson wrote: >>>>>>> On 2019-04-17 05:58, David Holmes wrote: >>>>>>>> Hi Stefan, >>>>>>>> >>>>>>>> On 17/04/2019 7:24 am, Stefan Karlsson wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review this patch to fix a timeout in the MemberNameLeak >>>>>>>>> test. >>>>>>>>> >>>>>>>>> https://cr.openjdk.java.net/~stefank/8222550/webrev.01/ >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222550 >>>>>>>>> >>>>>>>>> The test could fail if GCs happened during the setup phase when >>>>>>>>> entries for all generated methods were created. When this >>>>>>>>> happened the code to grow the table was triggered, which in >>>>>>>>> turn cleaned out all so-far created entries.? This put the >>>>>>>>> table in a condition where the grow / cleaning code didn't have >>>>>>>>> to be triggered again. But the test still waited for it to >>>>>>>>> happen. 
This patch adds all MethodHandles to an ArrayList, so >>>>>>>>> that they are kept alive until it's time for them to be cleaned >>>>>>>>> out. While debugging this timeout I added some extra logging. >>>>>>>>> I've left it in the test in case we ever need to debug it again. >>>>>>>> >>>>>>>> Fix seems reasonable. >>>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>>> A couple of comments: >>>>>>>> >>>>>>>> ?119 "-Xlog:membername+table=trace,gc+verify=debug,gc", >>>>>>>> ?120 >>>>>>>> "-Xlog:membername+table=trace,gc+verify=debug,gc:gc.%p.log:time,utctime,uptime,pid,level,tags", >>>>>>>> >>>>>>>> >>>>>>>> I'm assuming you only actually want line 120? >>>>>>> >>>>>>> It was a quick and dirty way to get logging from 119 to the >>>>>>> outputAnalyzer, and more comprehensive logging from line 120 >>>>>>> saved to disk. >>>>>>> >>>>>>>> >>>>>>>> Is the log file copied across with the test artifacts in mach5? >>>>>>> >>>>>>> Yes. >>>>>>>> I'm assuming you're using the file for gc logging so that the >>>>>>>> normal test .jtr file is not inundated with excessive logging data. >>>>>>> >>>>>>> Yes, and because jtreg cuts in the middle of the output of tests >>>>>>> with excessive logging. >>>>>>> >>>>>>> I created a more elaborate version that only logs to files, and >>>>>>> perform the verification on those files: >>>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02.delta >>>>>>> ?https://cr.openjdk.java.net/~stefank/8222550/webrev.02 >>>>>>> >>>>>>> Thanks, >>>>>>> StefanK >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> Testing: tier1-3 and multiple tier1_runtime runs on osx where >>>>>>>>> the timeouts reproduced. 
>>>>>>>>>
>>>>>>>>> The patch is applied on top of the patch in:
>>>>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-April/033820.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> StefanK
>>>>>>>
>>>>>>
>

From per.liden at oracle.com Thu Apr 18 12:45:33 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 18 Apr 2019 14:45:33 +0200
Subject: RFR: JDK-8222545: Safe klass asserts
In-Reply-To: <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com>
References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com>
Message-ID:

Hi Roman,

On 4/18/19 1:45 PM, Roman Kennke wrote:
>>> To add a little more detail, I could move the change up into
>>> is_objArray(), but I don't want to expose it to any non-assert paths.
>>> Therefore I could do 2 different impls there, guarded by #ifdef
>>> ASSERT but I don't think it's a good idea to behave differently under
>>> ASSERT, that kind of defeats the point of assert, right?
>>>
>>> What do you think?
>>
>> I don't follow your argument. Under asserts you need to access the
>> klass pointer "safely" but otherwise you do not. So there are two
>> behaviours related to accessing the klass pointer anyway. I'd rather
>> see that encapsulated in the accessor.
>>
>> I assume it's not just asserts but any debug only code that wants to
>> access the klass pointer.
>
> In general, for any runtime calls into oopDesc::klass() the access
> should be safe. The acrobatics is only necessary for *GC-internal*

This is the part I don't quite understand, and goes back to my initial question. Why are you doing these operations on from-space objects? I'm thinking you should be in a position in the GC to make sure this can never happen.
If you need to do that in the GC (which is fine), then the GC could apply a "resolve" function to get the to-space object, and call size() (or whatever) on that object. This shouldn't have to leak out of the GC, right? cheers, Per > calls, which can happen in 'unsafe' situations, where decoding the > Klass* would be necessary. The way I do it is to call into acccessors > like size_given_klass() or the oop_oop_iterate() methods, that take a > pre-resolved Klass* as argument. But that only works if those code-paths > don't call into klass() themselves. This is what this patch addresses. > > I guess we could put the call via safe_klass() into the accessors, but > it would widen its exposure unnecessarily. If we don't want special > paths for ASSERT via non-ASSERT there, it could actually be done in the > GC backend I suppose, but the way I proposed it seems minimal exposure > of GC fluff. > > Alternatively, it could be argued that we're in the Klass* instance > already anyway, and what is the point of asserting it again, but I also > see that we might want to ensure that we're not calling anything > typeArray-ish on an objArray by accident. > > Roman > >> Thanks, >> David >> ----- >> >>> Roman >>> >>> >>> Am 17. April 2019 18:59:09 MESZ schrieb Roman Kennke >>> : >>> >>> ??????????? Various code paths in oopDesc, Klass and their subclasses >>> assert >>> ??????????? something that fetches the object's _klass field. With >>> upcoming >>> ??????????? Shenandoah's changes this is not always safe and requires an >>> ??????????? additional >>> ??????????? indirection. >>> >>> ??????????? The trouble here is that we can, for example, call >>> ??????????? Klass::oop_oop_iterate() with a pre-resolved Klass*, >>> instead of >>> ??????????? oopDesc::oop_iterate() which would call oopDesc::klass() on >>> ??????????? its own, >>> ??????????? which would be racy on some GC internal call paths, but >>> we can't >>> ??????????? 
(currently) control some calls to klass() further down the call stack
>>>             (all in asserts).
>>>
>>>             We'd also like a way to ensure that non-GC calls to klass()
>>>             are sane.
>>>
>>>             Bug:
>>>             https://bugs.openjdk.java.net/browse/JDK-8222545
>>>             Webrev:
>>>             http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/
>>>             Testing:
>>>             hotspot_gc_shenandoah with and without the prototype,
>>>             hotspot/tier1
>>>
>>>             The change introduces only two ASSERT-level GC-interfaces,
>>>             and afaict, this with JDK-8222537 will be all that we need
>>>             for the upcoming elimination of forward pointers in
>>>             Shenandoah. Notice that one assert in objArrayKlass is
>>>             strengthened from is_array() to is_objArray(), but that
>>>             seems only sane in that context.
>>>
>>>             Can I please get reviews?
>>>
>>>
>>>         This looks very awkward to me. Using:
>>>
>>>         Universe::heap()->safe_klass(obj)->is_objArray_klass()
>>>
>>>         instead of the obvious:
>>>
>>>         obj->is_objArray()
>>>
>>>         is very unintuitive. Can this not be handled inside is_objArray
>>>         (and is_typeArray)?
>>>
>>>
>>>     Not really. Then it would get exposed to many more code paths, most
>>>     of which don't actually need it/don't want it, and many of which are
>>>     outside of asserts, and rely on the usual klass() with the sanity
>>>     assert there instead. I am open to suggestions, but it would have to
>>>     be restricted to ASSERT code IMO, and ideally with as few GC
>>>     interface additions as possible.
>>>
>>>     Roman
>>>
>>>
>>> --
>>> This message was sent from my Android device with K-9 Mail.
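
[Editorial sketch] The assert style debated above, where the klass lookup goes through a heap hook instead of the object's own accessor, can be illustrated with a small stand-alone model. This is not HotSpot code and not the content of the webrev: ToyKlass, ToyOop, ToyHeap, ToySideTableHeap and assert_style_is_objArray are hypothetical names invented for illustration, and only the name safe_klass() is borrowed from the quoted expression.

```cpp
#include <cassert>

// Toy stand-in for a Klass; only knows whether it describes an object array.
struct ToyKlass {
  bool obj_array;
  bool is_objArray_klass() const { return obj_array; }
};

// Toy stand-in for an object header with a directly readable klass field.
struct ToyOop {
  const ToyKlass* klass_field;
  const ToyKlass* klass() const { return klass_field; }
};

// Hypothetical heap interface: by default the "safe" lookup is a plain header read.
struct ToyHeap {
  virtual ~ToyHeap() {}
  virtual const ToyKlass* safe_klass(const ToyOop* obj) const {
    return obj->klass();
  }
};

// Hypothetical GC that cannot trust the header of one in-flight object and
// answers from its own bookkeeping instead (the "additional indirection").
struct ToySideTableHeap : ToyHeap {
  const ToyOop* tracked;
  const ToyKlass* tracked_klass;
  ToySideTableHeap(const ToyOop* o, const ToyKlass* k)
      : tracked(o), tracked_klass(k) {}
  const ToyKlass* safe_klass(const ToyOop* obj) const override {
    return obj == tracked ? tracked_klass : obj->klass();
  }
};

// The check an assert would evaluate: it never calls obj->klass() directly.
bool assert_style_is_objArray(const ToyHeap& heap, const ToyOop* obj) {
  return heap.safe_klass(obj)->is_objArray_klass();
}
```

The point of the sketch is only the call shape: the check stays correct even when the header field of the tracked object is unusable, at the price of a less obvious expression than obj->is_objArray().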
From per.liden at oracle.com Thu Apr 18 13:02:24 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 18 Apr 2019 15:02:24 +0200
Subject: RFR: JDK-8222537: Avoid fetching _klass twice in TypeArrayOop::size()
In-Reply-To: <5bad6271-2fb3-875f-d0d9-71bac167f768@redhat.com>
References: <5bad6271-2fb3-875f-d0d9-71bac167f768@redhat.com>
Message-ID:

As far as I understand, this would also not be needed if you just never did this on from-space objects, is that correct? I really don't think you want to go down that path, because you're opening yourself up to future problems and bugs. I would suggest a much stricter to-space invariant that makes sure, by design, that this can never happen.

cheers,
Per

On 4/16/19 10:45 PM, Roman Kennke wrote:
> Currently, when calling TypeArrayOop::size(), we end up calling klass()
> twice: once before calling into size_given_klass() and then again before
> calling TypeArrayOop::object_size().
>
> This is currently only a minor performance nuisance.
>
> With Shenandoah's upcoming elimination of the forwarding pointer, loading
> the klass like this is no longer safe, and therefore we only call
> size_given_klass(), and must avoid calling naked klass() altogether.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8222537
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8222537/webrev.00/
> Testing:
> hotspot_gc_shenandoah with and without the prototype, hotspot/tier1
>
> Can I please get reviews?
>
> Thanks,
> Roman
>

From rkennke at redhat.com Thu Apr 18 13:09:11 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 18 Apr 2019 15:09:11 +0200
Subject: RFR: JDK-8222545: Safe klass asserts
In-Reply-To: <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com>
References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com>
Message-ID:

Let me revisit the changes.
I probably can avoid the changes in oop_oop_iterate() and I will think hard about the size() code-path too. Roman > On 18/04/2019 7:54 pm, Roman Kennke wrote: >> To add a little more detail, I could move the change up into >> is_objArray(), but I don't want to expose it to any non-assert paths. >> Therefore I could do 2 different impls there, guarded by #ifdef ASSERT >> but I don't think it's a good idea to behave differently under ASSERT, >> that kindof defeats the point of assert, right? >> >> What do you think ? > > I don't follow your argument. Under asserts you need to access the klass > pointer "safely" but otherwise you do not. So there are two behaviours > related to accessing the klass pointer anyway. I'd rather see that > encapsulated in the accessor. > > I assume it's not just asserts but any debug only code that wants to > access the klass pointer. > > > Thanks, > David > ----- > >> Roman >> >> >> Am 17. April 2019 18:59:09 MESZ schrieb Roman Kennke >> : >> >> ??????????? Various code paths in oopDesc, Klass and their subclasses >> assert >> ??????????? something that fetches the object's _klass field. With >> upcoming >> ??????????? Shenandoah's changes this is not always safe and requires an >> ??????????? additional >> ??????????? indirection. >> >> ??????????? The trouble here is that we can, for example, call >> ??????????? Klass::oop_oop_iterate() with a pre-resolved Klass*, >> instead of >> ??????????? oopDesc::oop_iterate() which would call oopDesc::klass() on >> ??????????? its own, >> ??????????? which would be racy on some GC internal call paths, but we >> can't >> ??????????? (currently) control some calls to klass() further down the >> ??????????? call stack >> ??????????? (all in asserts). >> >> ??????????? We'd also like a way to ensure that non-GC calls to klass() >> ??????????? are sane. >> >> ??????????? Bug: >> ??????????? https://bugs.openjdk.java.net/browse/JDK-8222545 >> ??????????? Webrev: >> ??????????? 
http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/
>>             Testing:
>>             hotspot_gc_shenandoah with and without the prototype,
>>             hotspot/tier1
>>
>>             The change introduces only two ASSERT-level GC-interfaces,
>>             and afaict, this with JDK-8222537 will be all that we need
>>             for the upcoming elimination of forward pointers in
>>             Shenandoah. Notice that one assert in objArrayKlass is
>>             strengthened from is_array() to is_objArray(), but that
>>             seems only sane in that context.
>>
>>             Can I please get reviews?
>>
>>
>>         This looks very awkward to me. Using:
>>
>>         Universe::heap()->safe_klass(obj)->is_objArray_klass()
>>
>>         instead of the obvious:
>>
>>         obj->is_objArray()
>>
>>         is very unintuitive. Can this not be handled inside is_objArray
>>         (and is_typeArray)?
>>
>>
>>     Not really. Then it would get exposed to many more code paths, most
>>     of which don't actually need it/don't want it, and many of which are
>>     outside of asserts, and rely on the usual klass() with the sanity
>>     assert there instead. I am open to suggestions, but it would have to
>>     be restricted to ASSERT code IMO, and ideally with as few GC
>>     interface additions as possible.
>>
>>     Roman
>>
>>
>> --
>> This message was sent from my Android device with K-9 Mail.
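
[Editorial sketch] The size() code path mentioned in this message, and the JDK-8222537 idea of loading the klass once and handing it down, can be modelled in a few lines. This is a toy under stated assumptions, not the HotSpot implementation: ToyKlass, ToyArray and the klass_loads counter are hypothetical, and only the method names size() and size_given_klass() echo the thread.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for an array Klass: fixed header size plus per-element size.
struct ToyKlass {
  std::size_t header_words;
  std::size_t element_words;
};

// Toy array object. klass_loads counts header reads, for illustration only.
struct ToyArray {
  const ToyKlass* klass_field;
  std::size_t length;
  mutable int klass_loads;

  ToyArray(const ToyKlass* k, std::size_t len)
      : klass_field(k), length(len), klass_loads(0) {}

  // Every call models one load of the _klass field from the object header.
  const ToyKlass* klass() const {
    ++klass_loads;
    return klass_field;
  }

  // Works only with the caller-supplied klass; it never re-reads the header.
  std::size_t size_given_klass(const ToyKlass* k) const {
    return k->header_words + length * k->element_words;
  }

  // Loads the klass exactly once and passes it down.
  std::size_t size() const {
    return size_given_klass(klass());
  }
};
```

In this model a caller that already holds a resolved klass can call size_given_klass() without touching the header at all, which is the property the asserts discussed in the thread would break if they re-read the klass internally.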
From rkennke at redhat.com Thu Apr 18 13:13:38 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 15:13:38 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> Message-ID: <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> >>>> To add a little more detail, I could move the change up into >>>> is_objArray(), but I don't want to expose it to any non-assert >>>> paths. Therefore I could do 2 different impls there, guarded by >>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>> differently under ASSERT, that kindof defeats the point of assert, >>>> right? >>>> >>>> What do you think ? >>> >>> I don't follow your argument. Under asserts you need to access the >>> klass pointer "safely" but otherwise you do not. So there are two >>> behaviours related to accessing the klass pointer anyway. I'd rather >>> see that encapsulated in the accessor. >>> >>> I assume it's not just asserts but any debug only code that wants to >>> access the klass pointer. >> >> In general, for any runtime calls into oopDesc::klass() the access >> should be safe. The acrobatics is only necessary for *GC-internal* > > This is the part I don't quite understand, and goes back to my initial > question. Why are you doing these operations on from-space objects? I'm > thinking you should be in a position in the GC to make sure this can > never happen. If you need to do that in the GC (which is fine), then the > GC could apply a "resolve" function to get the to-space object, and call > size() (or whatever) on that object. This shouldn't have to leak out of > the GC, right? It is a problem when we are about to evacuate an object. 
Then we need to know its size in order to allocate and copy an appropriate chunk. The problem is that this part is racy: two threads (e.g. two Java threads via barrier, or one Java thread vs. one GC thread) might compete over this: both would create a copy of the object, but ultimately only one would succeed (by CASing the fwd pointer). Therefore, getting hold of the object size is racy by design, and it requires resolving the _klass. Now, we can do that ahead of time and call oopDesc::size_given_klass(), and all would be good, except that size_given_klass() asserts that the object is indeed of the given klass, and hence fetches _klass again, which, at this point, is racy. Solving this inside the GC would basically require copying all the machinery for getting hold of the object size into the GC. Are you asking me to do that?

About the oop_oop_iterate() paths, I will revisit them. We should never actually need to iterate over a from-space object.

Roman

> cheers,
> Per
>
>> calls, which can happen in 'unsafe' situations, where decoding the
>> Klass* would be necessary. The way I do it is to call into accessors
>> like size_given_klass() or the oop_oop_iterate() methods, that take a
>> pre-resolved Klass* as argument. But that only works if those
>> code-paths don't call into klass() themselves. This is what this patch
>> addresses.
>>
>> I guess we could put the call via safe_klass() into the accessors, but
>> it would widen its exposure unnecessarily. If we don't want special
>> paths for ASSERT vs. non-ASSERT there, it could actually be done in
>> the GC backend I suppose, but the way I proposed it seems minimal
>> exposure of GC fluff.
>>
>> Alternatively, it could be argued that we're in the Klass* instance
>> already anyway, and what is the point of asserting it again, but I
>> also see that we might want to ensure that we're not calling anything
>> typeArray-ish on an objArray by accident.
>> >> Roman >> >>> Thanks, >>> David >>> ----- >>> >>>> Roman >>>> >>>> >>>> Am 17. April 2019 18:59:09 MESZ schrieb Roman Kennke >>>> : >>>> >>>> ??????????? Various code paths in oopDesc, Klass and their >>>> subclasses assert >>>> ??????????? something that fetches the object's _klass field. With >>>> upcoming >>>> ??????????? Shenandoah's changes this is not always safe and >>>> requires an >>>> ??????????? additional >>>> ??????????? indirection. >>>> >>>> ??????????? The trouble here is that we can, for example, call >>>> ??????????? Klass::oop_oop_iterate() with a pre-resolved Klass*, >>>> instead of >>>> ??????????? oopDesc::oop_iterate() which would call oopDesc::klass() on >>>> ??????????? its own, >>>> ??????????? which would be racy on some GC internal call paths, but >>>> we can't >>>> ??????????? (currently) control some calls to klass() further down the >>>> ??????????? call stack >>>> ??????????? (all in asserts). >>>> >>>> ??????????? We'd also like a way to ensure that non-GC calls to klass() >>>> ??????????? are sane. >>>> >>>> ??????????? Bug: >>>> ??????????? https://bugs.openjdk.java.net/browse/JDK-8222545 >>>> ??????????? Webrev: >>>> ??????????? http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ >>>> ??????????? Testing: >>>> ??????????? hotspot_gc_shenandoah with and without the prototype, >>>> ??????????? hotspot/tier1 >>>> >>>> ??????????? The change introduces only two ASSERT-level GC-interfaces, >>>> ??????????? and afaict, >>>> ??????????? this with JDK-8222537 will be all that we need for the >>>> upcoming >>>> ??????????? elimination of forward pointers in Shenandoah. Notice that >>>> ??????????? one assert in >>>> ??????????? objArrayKlass is strengthened from is_array() to >>>> ??????????? is_objArray(), but that >>>> ??????????? seems only sane in that context. >>>> >>>> ??????????? Can I please get reviews? >>>> >>>> >>>> ??????? This looks very awkward to me. Using: >>>> >>>> ??????? 
Universe::heap()->safe_klass(obj)->is_objArray_klass() >>>> >>>> ??????? instead of the obvious: >>>> >>>> ??????? obj->is_objArray() >>>> >>>> ??????? is very unintuitive. Can this not be handled inside is_objArray >>>> ??????? (and >>>> ??????? is_typeArray) ? >>>> >>>> >>>> ??? Not really. Then it would get exposed to many more code paths, >>>> most of >>>> ??? which don't actually need it/don't want it, and many of which are >>>> ??? outside of asserts, and rely on the usual klass() with the >>>> sanity assert >>>> ??? there instead. I am open for suggestions, but it would have to be >>>> ??? restricted to ASSERT code IMO, and ideally with as few as >>>> possible GC >>>> ??? interface additions. >>>> >>>> ??? Roman >>>> >>>> >>>> -- >>>> Diese Nachricht wurde von meinem Android-Ger?t mit K-9 Mail gesendet. From per.liden at oracle.com Thu Apr 18 13:59:10 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 18 Apr 2019 15:59:10 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> Message-ID: <3a9e6896-8c85-de57-3a52-4652b2fe1b54@oracle.com> Hi Roman, On 4/18/19 3:13 PM, Roman Kennke wrote: >>>>> To add a little more detail, I could move the change up into >>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>> differently under ASSERT, that kindof defeats the point of assert, >>>>> right? >>>>> >>>>> What do you think ? >>>> >>>> I don't follow your argument. 
Under asserts you need to access the >>>> klass pointer "safely" but otherwise you do not. So there are two >>>> behaviours related to accessing the klass pointer anyway. I'd rather >>>> see that encapsulated in the accessor. >>>> >>>> I assume it's not just asserts but any debug only code that wants to >>>> access the klass pointer. >>> >>> In general, for any runtime calls into oopDesc::klass() the access >>> should be safe. The acrobatics is only necessary for *GC-internal* >> >> This is the part I don't quite understand, and goes back to my initial >> question. Why are you doing these operations on from-space objects? >> I'm thinking you should be in a position in the GC to make sure this >> can never happen. If you need to do that in the GC (which is fine), >> then the GC could apply a "resolve" function to get the to-space >> object, and call size() (or whatever) on that object. This shouldn't >> have to leak out of the GC, right? > > It is a problem when we are about to evacuate an object. Then we need to > know its size in order to allocate and copy an appropriate chunk. The > problem is that this part is racy: two threads (e.g. two Java threads > via barrier, or one Java thread vs one GC thread) might compete over > this: both would create a copy of the object, but ultimately only one > would succeed (by CASing the fwd pointer). Therefore, getting hold of > the object size is racy, by design, and this requires to resolve the > _klass. Now, we can do that ahead of time, and call > oopDesc::size_given_klass() and all would be good, except that > size_given_klass() asserts that the object is indeed of the given klass, > and hence fetches _klass again, which, at this point, is racy. Solving > this inside the GC would require to basically copy all the machinery to > get hold of object size into the GC. Are you asking me to do that? > > About the oop_oop_iterate() paths, I will revisit them. We should never > actually need to iterate over a from-space object. 
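
[Editorial sketch] The evacuation race described in the quoted text (each contender makes its own copy, a single compare-and-swap on the forwarding slot decides the winner) can be modelled as below. This is a simplified toy, not Shenandoah's implementation: ToyObject, speculative_copy and try_install are hypothetical names, the forwarding slot is a separate field rather than anything HotSpot-specific, and the two contenders are played out sequentially so the interleaving is deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Toy object: a forwarding slot plus the data a copy must carry over.
struct ToyObject {
  std::atomic<ToyObject*> forwardee;
  std::size_t size_in_words;  // stands in for the size derived from the klass
  int payload;

  ToyObject(std::size_t size, int value)
      : forwardee(nullptr), size_in_words(size), payload(value) {}
};

// Step 1: each contender reads the size and makes its own private copy.
// In the real collector this read is the part that must not go through
// an unsafe klass load.
ToyObject* speculative_copy(const ToyObject* from) {
  return new ToyObject(from->size_in_words, from->payload);
}

// Step 2: try to publish the copy with one CAS on the forwarding slot.
// Returns the winning copy; a losing copy is discarded.
ToyObject* try_install(ToyObject* from, ToyObject* copy) {
  ToyObject* expected = nullptr;
  if (from->forwardee.compare_exchange_strong(expected, copy)) {
    return copy;      // this contender won the race
  }
  delete copy;        // lost: throw the private copy away
  return expected;    // compare_exchange_strong stored the winner here
}
```

Both contenders end up with the same to-space copy even though both did the copying work, which is why the size has to be obtainable before the race is decided.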
Ok, so the only thing you really want to do is not assert in ObjArrayKlass::oop_size() and TypeArrayKlass::oop_size(), correct? Given that oopDesc::size_given_klass() exists to solve these kinds of concurrency issues, I would have understood if your patch just removed the offending asserts, with the argument that you can't make these kinds of asserts/assumptions there.

In my view, that would be much easier to swallow, compared to adding glue to CollectedHeap to try to work around the asserts.

cheers,
Per

>
> Roman
>
>
>> cheers,
>> Per
>>
>>> calls, which can happen in 'unsafe' situations, where decoding the
>>> Klass* would be necessary. The way I do it is to call into accessors
>>> like size_given_klass() or the oop_oop_iterate() methods, that take a
>>> pre-resolved Klass* as argument. But that only works if those
>>> code-paths don't call into klass() themselves. This is what this
>>> patch addresses.
>>>
>>> I guess we could put the call via safe_klass() into the accessors,
>>> but it would widen its exposure unnecessarily. If we don't want
>>> special paths for ASSERT vs. non-ASSERT there, it could actually be
>>> done in the GC backend I suppose, but the way I proposed it seems
>>> minimal exposure of GC fluff.
>>>
>>> Alternatively, it could be argued that we're in the Klass* instance
>>> already anyway, and what is the point of asserting it again, but I
>>> also see that we might want to ensure that we're not calling anything
>>> typeArray-ish on an objArray by accident.
>>>
>>> Roman
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Roman
>>>>>
>>>>>
>>>>> On 17 April 2019 18:59:09 CEST, Roman Kennke wrote:
>>>>>
>>>>>             Various code paths in oopDesc, Klass and their
>>>>>             subclasses assert something that fetches the object's
>>>>>             _klass field. With upcoming Shenandoah's changes this is
>>>>>             not always safe and requires an additional indirection.
>>>>> >>>>> ??????????? The trouble here is that we can, for example, call >>>>> ??????????? Klass::oop_oop_iterate() with a pre-resolved Klass*, >>>>> instead of >>>>> ??????????? oopDesc::oop_iterate() which would call >>>>> oopDesc::klass() on >>>>> ??????????? its own, >>>>> ??????????? which would be racy on some GC internal call paths, but >>>>> we can't >>>>> ??????????? (currently) control some calls to klass() further down the >>>>> ??????????? call stack >>>>> ??????????? (all in asserts). >>>>> >>>>> ??????????? We'd also like a way to ensure that non-GC calls to >>>>> klass() >>>>> ??????????? are sane. >>>>> >>>>> ??????????? Bug: >>>>> ??????????? https://bugs.openjdk.java.net/browse/JDK-8222545 >>>>> ??????????? Webrev: >>>>> ??????????? http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ >>>>> ??????????? Testing: >>>>> ??????????? hotspot_gc_shenandoah with and without the prototype, >>>>> ??????????? hotspot/tier1 >>>>> >>>>> ??????????? The change introduces only two ASSERT-level GC-interfaces, >>>>> ??????????? and afaict, >>>>> ??????????? this with JDK-8222537 will be all that we need for the >>>>> upcoming >>>>> ??????????? elimination of forward pointers in Shenandoah. Notice that >>>>> ??????????? one assert in >>>>> ??????????? objArrayKlass is strengthened from is_array() to >>>>> ??????????? is_objArray(), but that >>>>> ??????????? seems only sane in that context. >>>>> >>>>> ??????????? Can I please get reviews? >>>>> >>>>> >>>>> ??????? This looks very awkward to me. Using: >>>>> >>>>> ??????? Universe::heap()->safe_klass(obj)->is_objArray_klass() >>>>> >>>>> ??????? instead of the obvious: >>>>> >>>>> ??????? obj->is_objArray() >>>>> >>>>> ??????? is very unintuitive. Can this not be handled inside >>>>> is_objArray >>>>> ??????? (and >>>>> ??????? is_typeArray) ? >>>>> >>>>> >>>>> ??? Not really. Then it would get exposed to many more code paths, >>>>> most of >>>>> ??? 
which don't actually need it/don't want it, and many of which are >>>>> ??? outside of asserts, and rely on the usual klass() with the >>>>> sanity assert >>>>> ??? there instead. I am open for suggestions, but it would have to be >>>>> ??? restricted to ASSERT code IMO, and ideally with as few as >>>>> possible GC >>>>> ??? interface additions. >>>>> >>>>> ??? Roman >>>>> >>>>> >>>>> -- >>>>> Diese Nachricht wurde von meinem Android-Ger?t mit K-9 Mail gesendet. From rkennke at redhat.com Thu Apr 18 14:09:19 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 16:09:19 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <3a9e6896-8c85-de57-3a52-4652b2fe1b54@oracle.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <3a9e6896-8c85-de57-3a52-4652b2fe1b54@oracle.com> Message-ID: <00a8db91-0bd8-6598-7e7a-911e697cf69e@redhat.com> >>>>>> To add a little more detail, I could move the change up into >>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>>> differently under ASSERT, that kindof defeats the point of assert, >>>>>> right? >>>>>> >>>>>> What do you think ? >>>>> >>>>> I don't follow your argument. Under asserts you need to access the >>>>> klass pointer "safely" but otherwise you do not. So there are two >>>>> behaviours related to accessing the klass pointer anyway. I'd >>>>> rather see that encapsulated in the accessor. >>>>> >>>>> I assume it's not just asserts but any debug only code that wants >>>>> to access the klass pointer. 
>>>> >>>> In general, for any runtime calls into oopDesc::klass() the access >>>> should be safe. The acrobatics is only necessary for *GC-internal* >>> >>> This is the part I don't quite understand, and goes back to my >>> initial question. Why are you doing these operations on from-space >>> objects? I'm thinking you should be in a position in the GC to make >>> sure this can never happen. If you need to do that in the GC (which >>> is fine), then the GC could apply a "resolve" function to get the >>> to-space object, and call size() (or whatever) on that object. This >>> shouldn't have to leak out of the GC, right? >> >> It is a problem when we are about to evacuate an object. Then we need >> to know its size in order to allocate and copy an appropriate chunk. >> The problem is that this part is racy: two threads (e.g. two Java >> threads via barrier, or one Java thread vs one GC thread) might >> compete over this: both would create a copy of the object, but >> ultimately only one would succeed (by CASing the fwd pointer). >> Therefore, getting hold of the object size is racy, by design, and >> this requires to resolve the _klass. Now, we can do that ahead of >> time, and call oopDesc::size_given_klass() and all would be good, >> except that size_given_klass() asserts that the object is indeed of >> the given klass, and hence fetches _klass again, which, at this point, >> is racy. Solving this inside the GC would require to basically copy >> all the machinery to get hold of object size into the GC. Are you >> asking me to do that? >> >> About the oop_oop_iterate() paths, I will revisit them. We should >> never actually need to iterate over a from-space object. > > Ok, so the only thing you really want to do it not assert in > ObjArrayKlass::oop_size() and TypeArrayKlass::oop_size(), correct? 
Given
> that oopDesc::size_given_klass() exists to solve these kinds of
> concurrency issues, I would have understood if your patch just removed
> the offending asserts, with the argument that you can't make these kinds
> of asserts/assumptions there.
>
> In my view, that would be much easier to swallow, compared to adding
> glue to CollectedHeap to try to work around the asserts.

Ha! That was my first reaction when I hit the asserts: why the heck should I assert something about the Klass* when I am in that Klass? But I can see that it serves a purpose: it ensures that size_given_klass() is not accidentally called on an object with a different Klass*, and I wanted to preserve that guarantee. If you're fine with dropping this instead, then I'll do that.

What do you think about the assert in oopDesc::klass()? It exists to ensure we're not accidentally doing an unsafe access on klass(). If called the wrong way without the assert, it would sure enough blow up, but might be harder to track down. Can we leave that assert plus the corresponding interface in CollectedHeap there?

Roman

From stefan.karlsson at oracle.com Thu Apr 18 14:34:01 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 18 Apr 2019 16:34:01 +0200
Subject: RFR: JDK-8222545: Safe klass asserts
In-Reply-To: <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com>
References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com>
Message-ID:

On 2019-04-18 15:13, Roman Kennke wrote:
>>>>> To add a little more detail, I could move the change up into
>>>>> is_objArray(), but I don't want to expose it to any non-assert
>>>>> paths.
Therefore I could do 2 different impls there, guarded by >>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>> differently under ASSERT, that kindof defeats the point of assert, >>>>> right? >>>>> >>>>> What do you think ? >>>> >>>> I don't follow your argument. Under asserts you need to access the >>>> klass pointer "safely" but otherwise you do not. So there are two >>>> behaviours related to accessing the klass pointer anyway. I'd >>>> rather see that encapsulated in the accessor. >>>> >>>> I assume it's not just asserts but any debug only code that wants >>>> to access the klass pointer. >>> >>> In general, for any runtime calls into oopDesc::klass() the access >>> should be safe. The acrobatics is only necessary for *GC-internal* >> >> This is the part I don't quite understand, and goes back to my >> initial question. Why are you doing these operations on from-space >> objects? I'm thinking you should be in a position in the GC to make >> sure this can never happen. If you need to do that in the GC (which >> is fine), then the GC could apply a "resolve" function to get the >> to-space object, and call size() (or whatever) on that object. This >> shouldn't have to leak out of the GC, right? > > It is a problem when we are about to evacuate an object. Then we need > to know its size in order to allocate and copy an appropriate chunk. > The problem is that this part is racy: two threads (e.g. two Java > threads via barrier, or one Java thread vs one GC thread) might > compete over this: both would create a copy of the object, but > ultimately only one would succeed (by CASing the fwd pointer). > Therefore, getting hold of the object size is racy, by design, and > this requires to resolve the _klass. Now, we can do that ahead of > time, and call oopDesc::size_given_klass() and all would be good, > except that size_given_klass() asserts that the object is indeed of > the given klass, and hence fetches _klass again, which, at this point, > is racy. 
Solving this inside the GC would require to basically copy > all the machinery to get hold of object size into the GC. Are you > asking me to do that? Other GCs store forwarding pointers in the mark word. See oopDesc::forward_to and friends. Could you do the same and get rid of this problem? StefanK > > > About the oop_oop_iterate() paths, I will revisit them. We should > never actually need to iterate over a from-space object. > > Roman > > >> cheers, >> Per >> >>> calls, which can happen in 'unsafe' situations, where decoding the >>> Klass* would be necessary. The way I do it is to call into >>> acccessors like size_given_klass() or the oop_oop_iterate() methods, >>> that take a pre-resolved Klass* as argument. But that only works if >>> those code-paths don't call into klass() themselves. This is what >>> this patch addresses. >>> >>> I guess we could put the call via safe_klass() into the accessors, >>> but it would widen its exposure unnecessarily. If we don't want >>> special paths for ASSERT via non-ASSERT there, it could actually be >>> done in the GC backend I suppose, but the way I proposed it seems >>> minimal exposure of GC fluff. >>> >>> Alternatively, it could be argued that we're in the Klass* instance >>> already anyway, and what is the point of asserting it again, but I >>> also see that we might want to ensure that we're not calling >>> anything typeArray-ish on an objArray by accident. >>> >>> Roman >>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Roman >>>>> >>>>> >>>>> Am 17. April 2019 18:59:09 MESZ schrieb Roman Kennke >>>>> : >>>>> >>>>> ??????????? Various code paths in oopDesc, Klass and their >>>>> subclasses assert >>>>> ??????????? something that fetches the object's _klass field. With >>>>> upcoming >>>>> ??????????? Shenandoah's changes this is not always safe and >>>>> requires an >>>>> ??????????? additional >>>>> ??????????? indirection. >>>>> >>>>> ??????????? The trouble here is that we can, for example, call >>>>> ??????????? 
Klass::oop_oop_iterate() with a pre-resolved Klass*, instead of oopDesc::oop_iterate() which would call oopDesc::klass() on its own, which would be racy on some GC internal call paths, but we can't (currently) control some calls to klass() further down the call stack (all in asserts). >>>>> >>>>> We'd also like a way to ensure that non-GC calls to klass() are sane. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222545 >>>>> Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8222545/webrev.00/ >>>>> Testing: hotspot_gc_shenandoah with and without the prototype, hotspot/tier1 >>>>> >>>>> The change introduces only two ASSERT-level GC-interfaces, and afaict, this with JDK-8222537 will be all that we need for the upcoming elimination of forward pointers in Shenandoah. Notice that one assert in objArrayKlass is strengthened from is_array() to is_objArray(), but that seems only sane in that context. >>>>> >>>>> Can I please get reviews? >>>>> >>>>> >>>>> This looks very awkward to me. Using: >>>>> >>>>> Universe::heap()->safe_klass(obj)->is_objArray_klass() >>>>> >>>>> instead of the obvious: >>>>> >>>>> obj->is_objArray() >>>>> >>>>> is very unintuitive. Can this not be handled inside is_objArray (and is_typeArray)? >>>>> >>>>> >>>>> Not really. Then it would get exposed to many more code paths, most of which don't actually need it/don't want it, and many of which are
outside of asserts, and rely on the usual klass() with the sanity assert there instead. I am open for suggestions, but it would have to be restricted to ASSERT code IMO, and ideally with as few as possible GC interface additions. >>>>> >>>>> Roman >>>>> >>>>> >>>>> -- >>>>> This message was sent from my Android device with K-9 Mail. From rkennke at redhat.com Thu Apr 18 15:29:37 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 17:29:37 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> Message-ID: <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: > On 2019-04-18 15:13, Roman Kennke wrote: >>>>>> To add a little more detail, I could move the change up into >>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>>> differently under ASSERT, that kindof defeats the point of assert, >>>>>> right? >>>>>> >>>>>> What do you think ? >>>>> >>>>> I don't follow your argument. Under asserts you need to access the >>>>> klass pointer "safely" but otherwise you do not. So there are two >>>>> behaviours related to accessing the klass pointer anyway. I'd >>>>> rather see that encapsulated in the accessor. >>>>> >>>>> I assume it's not just asserts but any debug only code that wants >>>>> to access the klass pointer. >>>> >>>> In general, for any runtime calls into oopDesc::klass() the access >>>> should be safe.
The acrobatics is only necessary for *GC-internal* >>> >>> This is the part I don't quite understand, and goes back to my >>> initial question. Why are you doing these operations on from-space >>> objects? I'm thinking you should be in a position in the GC to make >>> sure this can never happen. If you need to do that in the GC (which >>> is fine), then the GC could apply a "resolve" function to get the >>> to-space object, and call size() (or whatever) on that object. This >>> shouldn't have to leak out of the GC, right? >> >> It is a problem when we are about to evacuate an object. Then we need >> to know its size in order to allocate and copy an appropriate chunk. >> The problem is that this part is racy: two threads (e.g. two Java >> threads via barrier, or one Java thread vs one GC thread) might >> compete over this: both would create a copy of the object, but >> ultimately only one would succeed (by CASing the fwd pointer). >> Therefore, getting hold of the object size is racy, by design, and >> this requires to resolve the _klass. Now, we can do that ahead of >> time, and call oopDesc::size_given_klass() and all would be good, >> except that size_given_klass() asserts that the object is indeed of >> the given klass, and hence fetches _klass again, which, at this point, >> is racy. Solving this inside the GC would require to basically copy >> all the machinery to get hold of object size into the GC. Are you >> asking me to do that? > > Other GCs store forwarding pointers in the mark word. See > oopDesc::forward_to and friends. Could you do the same and get rid of > this problem? > No. Other GCs store the fwd pointer there, but only during a pause, and while possibly stashing the mark word somewhere else in the meantime. We need to do it outside of GC pauses, plus we need a way (bit) to indicate what it actually is (fwd pointer or Klass*). The mark word is already badly overloaded and also accessed much more often and in critical paths (e.g. 
locking), while the Klass* is basically immutable, and has the lowest 3 bits free (when running with -UseCompressedClassPointers, which would be enforced by Shenandoah). Using the Klass* slot is therefore the simplest and most efficient place to keep the fwd pointer. Attempting to use the mark word would require much more barriers and cause more overhead to manage it. Roman From stefan.karlsson at oracle.com Thu Apr 18 15:34:09 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 18 Apr 2019 17:34:09 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> Message-ID: <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> On 2019-04-18 17:29, Roman Kennke wrote: > > > Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>> To add a little more detail, I could move the change up into >>>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>>>> differently under ASSERT, that kindof defeats the point of >>>>>>> assert, right? >>>>>>> >>>>>>> What do you think ? >>>>>> >>>>>> I don't follow your argument. Under asserts you need to access >>>>>> the klass pointer "safely" but otherwise you do not. So there are >>>>>> two behaviours related to accessing the klass pointer anyway. I'd >>>>>> rather see that encapsulated in the accessor. >>>>>> >>>>>> I assume it's not just asserts but any debug only code that wants >>>>>> to access the klass pointer. 
>>>>> >>>>> In general, for any runtime calls into oopDesc::klass() the access >>>>> should be safe. The acrobatics is only necessary for *GC-internal* >>>> >>>> This is the part I don't quite understand, and goes back to my >>>> initial question. Why are you doing these operations on from-space >>>> objects? I'm thinking you should be in a position in the GC to make >>>> sure this can never happen. If you need to do that in the GC (which >>>> is fine), then the GC could apply a "resolve" function to get the >>>> to-space object, and call size() (or whatever) on that object. This >>>> shouldn't have to leak out of the GC, right? >>> >>> It is a problem when we are about to evacuate an object. Then we >>> need to know its size in order to allocate and copy an appropriate >>> chunk. The problem is that this part is racy: two threads (e.g. two >>> Java threads via barrier, or one Java thread vs one GC thread) might >>> compete over this: both would create a copy of the object, but >>> ultimately only one would succeed (by CASing the fwd pointer). >>> Therefore, getting hold of the object size is racy, by design, and >>> this requires to resolve the _klass. Now, we can do that ahead of >>> time, and call oopDesc::size_given_klass() and all would be good, >>> except that size_given_klass() asserts that the object is indeed of >>> the given klass, and hence fetches _klass again, which, at this >>> point, is racy. Solving this inside the GC would require to >>> basically copy all the machinery to get hold of object size into the >>> GC. Are you asking me to do that? >> >> Other GCs store forwarding pointers in the mark word. See >> oopDesc::forward_to and friends. Could you do the same and get rid of >> this problem? >> > > No. Other GCs store the fwd pointer there, but only during a pause, > and while possibly stashing the mark word somewhere else in the meantime. 
> > We need to do it outside of GC pauses, plus we need a way (bit) to > indicate what it actually is (fwd pointer or Klass*). The mark word is > already badly overloaded and also accessed much more often and in > critical paths (e.g. locking), while the Klass* is basically > immutable, and has the lowest 3 bits free (when running with > -UseCompressedClassPointers, which would be enforced by Shenandoah). > Using the Klass* slot is therefore the simplest and most efficient > place to keep the fwd pointer. Attempting to use the mark word would > require much more barriers and cause more overhead to manage it. Are you sure? Remember, the object is in from-space and no thread is allowed to change it, except the threads that are copying out of the from-space. StefanK > > Roman From gerard.ziemski at oracle.com Thu Apr 18 15:45:24 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Thu, 18 Apr 2019 10:45:24 -0500 Subject: RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes In-Reply-To: <5CB6980F.2050800@oracle.com> References: <1573e3e8-ace5-ee34-1269-96b4ef2fbe78@oracle.com> <5CA4F10F.2010805@oracle.com> <5CAD05AF.1090700@oracle.com> <2404c7a6-b701-bba9-a2d2-a0b3232cde7e@oracle.com> <7002f66b-49af-b38d-8d97-8642e684e998@oracle.com> <5CB0CEBE.5000400@oracle.com> <72c5aeb5-d65c-33d0-fc95-5e469316478f@oracle.com> <5CB6980F.2050800@oracle.com> Message-ID: <16c2e23d-0c69-40bc-40d0-75cad1630e4d@oracle.com> Sorry Erik that it took me a while to respond, but the thread got so long, that I missed your reply initially. On 4/16/19 10:05 PM, Erik Gahlin wrote: >>>>>> >>>>>> If the lock is taken, then it means that someone is scanning >>>>>> through the entire table, or the table is being resized. 
Either >>>>>> way, we're not losing data, but are just temporarily blind - I >>>>>> don't see a problem here for long-running apps, they will start >>>>>> receiving events eventually (which happen every 10 sec by default) >>> A user can set period "everyChunk" which means events are guaranteed >>> to be in the recording. >>> >>> I think we should try to avoid breaking that contract. When event >>> streaming is in place, we can implement requestable events where a >>> user can demand an event programmatically from Java. If they >>> sometimes don't get an event, it will break their code in a subtle way. >> >> No problem, I removed the resize_lock around the JFR table >> statistics, so we might get slightly incorrect stats every now and >> then, but we will be emitting the events on schedule: >> http://cr.openjdk.java.net/~gziemski/8185525_rev7 > Is it sufficient to just remove the lock to make it "work"? Yes, the event statistics might be slightly off though. > > I think it could be OK to use stale data, or perhaps count a value > twice, but are there other issues that need to be fixed as well? > Robbin may have more information on this. Yes, we might end up miscounting some items, as Coleen pointed out before. I already noted that emitting the event in such a situation might result in slightly wrong data, and used that as an argument that I'd prefer not to emit the event at all, but you said that you preferred slightly wrong data as long as we emit the event. I don't want to speak for Robbin here, but I want to note that he already expressed his opinion, by saying that we might as well skip the event when we can't grab the lock, which would be my personal choice here as well. > > An alternative approach would be to use the last known data, if we are > not able to take the lock. It would be old, but not out of whack. This is not really much better than not emitting the event at all, as we had in the previous implementation.
Any client reading the events might as well assume that the missing event would be the same as the last one for this particular event and synthesize one as needed. I don't see this as much of an improvement. > > That said, it would be interesting to have some numbers on what the > cost would be to wait for the lock. Robbin's hash table is concurrent and I personally would hate to introduce, even in the JFR code base, a mechanism that blocks and waits around for the table to be locked (however infrequently such a situation might arise). I'm not saying that it cannot be done, but I personally would not want to do this. If you want to follow up on this yourself, however, you can always do that. > >> >> Last question: what is the recommended way to programmatically tell if >> JFR is ON? I'm wondering whether I should collect the add/remove >> rates for the tables only if JFR is ON. As it is right now, we >> collect them always. It's just an atomic increment, but still, it's >> work only JFR events need. > > You can use the JFR_ONLY macro, if it's not built with JFR. If you > want to check if a recording is running, you can use > Jfr::is_recording(), but perhaps Jfr::is_enabled() is more > accurate/correct if a recording is started/stopped repeatedly? Thanks! I used some of these APIs, but I think that we currently don't have enough granularity here, so I filed JDK-8222736 expressing my concerns and the logic behind it. > > I looked at jfrPeriodic.cpp, and it seems to me that things could be > simplified, i.e. > > template <typename T> > static void emit_table_statistics(TableStatistics& statistics) { > T event; > event.set_bucketCount(statistics._number_of_buckets); > ... > event.commit(); > } Very nice suggestion, thank you. As much as I'd like to move on from this issue, I'm having second thoughts about the "insertion rate" and "deletion rate" attributes for these events. They can be synthesized by clients, and I wonder whether or not we should include them.
I'd rather see only the core attributes be part of the event, and anything else that can be synthesized by clients, not "clutter" the event. Uploaded update revision here http://cr.openjdk.java.net/~gziemski/8185525_rev8 cheers From rkennke at redhat.com Thu Apr 18 16:21:04 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 18:21:04 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> Message-ID: <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Am 18.04.19 um 17:34 schrieb Stefan Karlsson: > On 2019-04-18 17:29, Roman Kennke wrote: >> >> >> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>> To add a little more detail, I could move the change up into >>>>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>>>>> differently under ASSERT, that kindof defeats the point of >>>>>>>> assert, right? >>>>>>>> >>>>>>>> What do you think ? >>>>>>> >>>>>>> I don't follow your argument. Under asserts you need to access >>>>>>> the klass pointer "safely" but otherwise you do not. So there are >>>>>>> two behaviours related to accessing the klass pointer anyway. I'd >>>>>>> rather see that encapsulated in the accessor. >>>>>>> >>>>>>> I assume it's not just asserts but any debug only code that wants >>>>>>> to access the klass pointer. 
>>>>>> >>>>>> In general, for any runtime calls into oopDesc::klass() the access >>>>>> should be safe. The acrobatics is only necessary for *GC-internal* >>>>> >>>>> This is the part I don't quite understand, and goes back to my >>>>> initial question. Why are you doing these operations on from-space >>>>> objects? I'm thinking you should be in a position in the GC to make >>>>> sure this can never happen. If you need to do that in the GC (which >>>>> is fine), then the GC could apply a "resolve" function to get the >>>>> to-space object, and call size() (or whatever) on that object. This >>>>> shouldn't have to leak out of the GC, right? >>>> >>>> It is a problem when we are about to evacuate an object. Then we >>>> need to know its size in order to allocate and copy an appropriate >>>> chunk. The problem is that this part is racy: two threads (e.g. two >>>> Java threads via barrier, or one Java thread vs one GC thread) might >>>> compete over this: both would create a copy of the object, but >>>> ultimately only one would succeed (by CASing the fwd pointer). >>>> Therefore, getting hold of the object size is racy, by design, and >>>> this requires to resolve the _klass. Now, we can do that ahead of >>>> time, and call oopDesc::size_given_klass() and all would be good, >>>> except that size_given_klass() asserts that the object is indeed of >>>> the given klass, and hence fetches _klass again, which, at this >>>> point, is racy. Solving this inside the GC would require to >>>> basically copy all the machinery to get hold of object size into the >>>> GC. Are you asking me to do that? >>> >>> Other GCs store forwarding pointers in the mark word. See >>> oopDesc::forward_to and friends. Could you do the same and get rid of >>> this problem? >>> >> >> No. Other GCs store the fwd pointer there, but only during a pause, >> and while possibly stashing the mark word somewhere else in the meantime. 
>> >> We need to do it outside of GC pauses, plus we need a way (bit) to >> indicate what it actually is (fwd pointer or Klass*). The mark word is >> already badly overloaded and also accessed much more often and in >> critical paths (e.g. locking), while the Klass* is basically >> immutable, and has the lowest 3 bits free (when running with >> -UseCompressedClassPointers, which would be enforced by Shenandoah). >> Using the Klass* slot is therefore the simplest and most efficient >> place to keep the fwd pointer. Attempting to use the mark word would >> require much more barriers and cause more overhead to manage it. > > Are you sure? Remember, the object is in from-space and no thread is > allowed to change it, except the threads that are copying out of the > from-space. Right. But we still need one bit in it to differentiate between fwd ptr and regular mark word. And installing the fwd pointer concurrently with locking seems more of a horror story than dealing with an otherwise immutable Klass*. In any case, I guess we could do without any GC interface changes by dropping the asserts in size_given_klass() if you think that is reasonable, also avoid the extra asserts in klass() and friends (even though I would prefer to have some way to assert sanity there...), plus the changes for JDK-8222537 which are arguably an improvement in any case. 
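For readers following the thread, the evacuation race discussed above can be sketched in a few lines. This is only a minimal model under stated assumptions, not Shenandoah's or HotSpot's code: `Klass`, `Obj`, `FWD_BIT`, `safe_klass()` and `evacuate()` are hypothetical stand-ins, and the low bit of a single header slot is assumed to distinguish a forwarding pointer from a Klass*. It shows why the Klass* has to be resolved once up front, and how the losing thread of the CAS ends up using the winner's copy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for illustration only; none of these are HotSpot types.
struct Klass { std::size_t payload_bytes; };  // bytes of payload to copy

struct Obj {
    // Holds either a Klass* or (address of the copy | FWD_BIT).
    std::atomic<std::uintptr_t> slot;
    char payload[24];
};

// Assumption: Klass and Obj are at least 2-byte aligned, so bit 0 is free.
constexpr std::uintptr_t FWD_BIT = 1;

inline bool is_forwarded(std::uintptr_t v) { return (v & FWD_BIT) != 0; }

// Decode the Klass* whether or not the object has already been forwarded.
inline const Klass* safe_klass(const Obj* o) {
    std::uintptr_t v = o->slot.load(std::memory_order_acquire);
    if (is_forwarded(v)) {
        const Obj* copy = reinterpret_cast<const Obj*>(v & ~FWD_BIT);
        v = copy->slot.load(std::memory_order_acquire);
    }
    return reinterpret_cast<const Klass*>(v);
}

// Evacuate `from` into the caller-provided buffer `to`.
// Returns the copy that won the race, which may belong to another thread.
inline Obj* evacuate(Obj* from, Obj* to) {
    std::uintptr_t v = from->slot.load(std::memory_order_acquire);
    if (is_forwarded(v)) {
        return reinterpret_cast<Obj*>(v & ~FWD_BIT);
    }
    // Resolve the Klass* exactly once; re-reading the slot later would be racy.
    const Klass* k = reinterpret_cast<const Klass*>(v);
    to->slot.store(v, std::memory_order_relaxed);
    std::memcpy(to->payload, from->payload, k->payload_bytes);

    std::uintptr_t fwd = reinterpret_cast<std::uintptr_t>(to) | FWD_BIT;
    if (from->slot.compare_exchange_strong(v, fwd,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire)) {
        return to;  // this thread installed the forwarding pointer
    }
    // Lost the race: `v` now holds the winner's forwarding value.
    return reinterpret_cast<Obj*>(v & ~FWD_BIT);
}
```

In this model, a second call to `evacuate()` on the same object returns the first caller's copy and its own buffer is simply discarded, which mirrors the "both would create a copy, but only one would succeed" description in the message above.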
Roman From per.liden at oracle.com Thu Apr 18 16:47:30 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 18 Apr 2019 18:47:30 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Message-ID: On 04/18/2019 06:21 PM, Roman Kennke wrote: > > > Am 18.04.19 um 17:34 schrieb Stefan Karlsson: >> On 2019-04-18 17:29, Roman Kennke wrote: >>> >>> >>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>>> To add a little more detail, I could move the change up into >>>>>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>>>>> paths. Therefore I could do 2 different impls there, guarded by >>>>>>>>> #ifdef ASSERT but I don't think it's a good idea to behave >>>>>>>>> differently under ASSERT, that kindof defeats the point of >>>>>>>>> assert, right? >>>>>>>>> >>>>>>>>> What do you think ? >>>>>>>> >>>>>>>> I don't follow your argument. Under asserts you need to access >>>>>>>> the klass pointer "safely" but otherwise you do not. So there >>>>>>>> are two behaviours related to accessing the klass pointer >>>>>>>> anyway. I'd rather see that encapsulated in the accessor. >>>>>>>> >>>>>>>> I assume it's not just asserts but any debug only code that >>>>>>>> wants to access the klass pointer. >>>>>>> >>>>>>> In general, for any runtime calls into oopDesc::klass() the >>>>>>> access should be safe. 
The acrobatics is only necessary for >>>>>>> *GC-internal* >>>>>> >>>>>> This is the part I don't quite understand, and goes back to my >>>>>> initial question. Why are you doing these operations on from-space >>>>>> objects? I'm thinking you should be in a position in the GC to >>>>>> make sure this can never happen. If you need to do that in the GC >>>>>> (which is fine), then the GC could apply a "resolve" function to >>>>>> get the to-space object, and call size() (or whatever) on that >>>>>> object. This shouldn't have to leak out of the GC, right? >>>>> >>>>> It is a problem when we are about to evacuate an object. Then we >>>>> need to know its size in order to allocate and copy an appropriate >>>>> chunk. The problem is that this part is racy: two threads (e.g. two >>>>> Java threads via barrier, or one Java thread vs one GC thread) >>>>> might compete over this: both would create a copy of the object, >>>>> but ultimately only one would succeed (by CASing the fwd pointer). >>>>> Therefore, getting hold of the object size is racy, by design, and >>>>> this requires to resolve the _klass. Now, we can do that ahead of >>>>> time, and call oopDesc::size_given_klass() and all would be good, >>>>> except that size_given_klass() asserts that the object is indeed of >>>>> the given klass, and hence fetches _klass again, which, at this >>>>> point, is racy. Solving this inside the GC would require to >>>>> basically copy all the machinery to get hold of object size into >>>>> the GC. Are you asking me to do that? >>>> >>>> Other GCs store forwarding pointers in the mark word. See >>>> oopDesc::forward_to and friends. Could you do the same and get rid >>>> of this problem? >>>> >>> >>> No. Other GCs store the fwd pointer there, but only during a pause, >>> and while possibly stashing the mark word somewhere else in the >>> meantime. >>> >>> We need to do it outside of GC pauses, plus we need a way (bit) to >>> indicate what it actually is (fwd pointer or Klass*). 
The mark word >>> is already badly overloaded and also accessed much more often and in >>> critical paths (e.g. locking), while the Klass* is basically >>> immutable, and has the lowest 3 bits free (when running with >>> -UseCompressedClassPointers, which would be enforced by Shenandoah). >>> Using the Klass* slot is therefore the simplest and most efficient >>> place to keep the fwd pointer. Attempting to use the mark word would >>> require much more barriers and cause more overhead to manage it. >> >> Are you sure? Remember, the object is in from-space and no thread is >> allowed to change it, except the threads that are copying out of the >> from-space. > > Right. But we still need one bit in it to differentiate between fwd ptr > and regular mark word. And installing the fwd pointer concurrently with We already have such bits, and oopDesc::forward_to_atomic() will set them for you. Aren't they enough? > locking seems more of a horror story than dealing with an otherwise > immutable Klass*. With a to-space invariant, I can't see how this can happen concurrently with locking. cheers, Per > > In any case, I guess we could do without any GC interface changes by > dropping the asserts in size_given_klass() if you think that is > reasonable, also avoid the extra asserts in klass() and friends (even > though I would prefer to have some way to assert sanity there...), plus > the changes for JDK-8222537 which are arguably an improvement in any case. > > Roman From daniel.daugherty at oracle.com Thu Apr 18 17:11:08 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 18 Apr 2019 13:11:08 -0400 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Message-ID: <9ab1708c-6164-4622-be05-721bf8906c90@oracle.com> On 4/18/19 12:47 PM, Per Liden wrote: > > On 04/18/2019 06:21 PM, Roman Kennke wrote: >> >> >> Am 18.04.19 um 17:34 schrieb Stefan Karlsson: >>> On 2019-04-18 17:29, Roman Kennke wrote: >>>> >>>> >>>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>>>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>>>> To add a little more detail, I could move the change up into >>>>>>>>>> is_objArray(), but I don't want to expose it to any >>>>>>>>>> non-assert paths. Therefore I could do 2 different impls >>>>>>>>>> there, guarded by #ifdef ASSERT but I don't think it's a good >>>>>>>>>> idea to behave differently under ASSERT, that kindof defeats >>>>>>>>>> the point of assert, right? >>>>>>>>>> >>>>>>>>>> What do you think ? >>>>>>>>> >>>>>>>>> I don't follow your argument. Under asserts you need to access >>>>>>>>> the klass pointer "safely" but otherwise you do not. So there >>>>>>>>> are two behaviours related to accessing the klass pointer >>>>>>>>> anyway. I'd rather see that encapsulated in the accessor. >>>>>>>>> >>>>>>>>> I assume it's not just asserts but any debug only code that >>>>>>>>> wants to access the klass pointer. >>>>>>>> >>>>>>>> In general, for any runtime calls into oopDesc::klass() the >>>>>>>> access should be safe. 
The acrobatics is only necessary for >>>>>>>> *GC-internal* >>>>>>> >>>>>>> This is the part I don't quite understand, and goes back to my >>>>>>> initial question. Why are you doing these operations on >>>>>>> from-space objects? I'm thinking you should be in a position in >>>>>>> the GC to make sure this can never happen. If you need to do >>>>>>> that in the GC (which is fine), then the GC could apply a >>>>>>> "resolve" function to get the to-space object, and call size() >>>>>>> (or whatever) on that object. This shouldn't have to leak out of >>>>>>> the GC, right? >>>>>> >>>>>> It is a problem when we are about to evacuate an object. Then we >>>>>> need to know its size in order to allocate and copy an >>>>>> appropriate chunk. The problem is that this part is racy: two >>>>>> threads (e.g. two Java threads via barrier, or one Java thread vs >>>>>> one GC thread) might compete over this: both would create a copy >>>>>> of the object, but ultimately only one would succeed (by CASing >>>>>> the fwd pointer). Therefore, getting hold of the object size is >>>>>> racy, by design, and this requires to resolve the _klass. Now, we >>>>>> can do that ahead of time, and call oopDesc::size_given_klass() >>>>>> and all would be good, except that size_given_klass() asserts >>>>>> that the object is indeed of the given klass, and hence fetches >>>>>> _klass again, which, at this point, is racy. Solving this inside >>>>>> the GC would require to basically copy all the machinery to get >>>>>> hold of object size into the GC. Are you asking me to do that? >>>>> >>>>> Other GCs store forwarding pointers in the mark word. See >>>>> oopDesc::forward_to and friends. Could you do the same and get rid >>>>> of this problem? >>>>> >>>> >>>> No. Other GCs store the fwd pointer there, but only during a pause, >>>> and while possibly stashing the mark word somewhere else in the >>>> meantime. 
>>>> >>>> We need to do it outside of GC pauses, plus we need a way (bit) to >>>> indicate what it actually is (fwd pointer or Klass*). The mark word >>>> is already badly overloaded and also accessed much more often and >>>> in critical paths (e.g. locking), while the Klass* is basically >>>> immutable, and has the lowest 3 bits free (when running with >>>> -UseCompressedClassPointers, which would be enforced by >>>> Shenandoah). Using the Klass* slot is therefore the simplest and >>>> most efficient place to keep the fwd pointer. Attempting to use the >>>> mark word would require much more barriers and cause more overhead >>>> to manage it. >>> >>> Are you sure? Remember, the object is in from-space and no thread is >>> allowed to change it, except the threads that are copying out of the >>> from-space. >> >> Right. But we still need one bit in it to differentiate between fwd >> ptr and regular mark word. And installing the fwd pointer >> concurrently with > > We already have such bits, and oopDesc::forward_to_atomic() will set > them for you. Aren't they enough? > >> locking seems more of a horror story than dealing with an otherwise >> immutable Klass*. > > With a to-space invariant, I can't see how this can happen > concurrently with locking. Roman might be thinking about Async Monitor Deflation... :-) Dan > > cheers, > Per > >> >> In any case, I guess we could do without any GC interface changes by >> dropping the asserts in size_given_klass() if you think that is >> reasonable, also avoid the extra asserts in klass() and friends (even >> though I would prefer to have some way to assert sanity there...), >> plus the changes for JDK-8222537 which are arguably an improvement in >> any case. 
>> >> Roman From rkennke at redhat.com Thu Apr 18 17:15:55 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 19:15:55 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Message-ID: Am 18.04.19 um 18:47 schrieb Per Liden: > > On 04/18/2019 06:21 PM, Roman Kennke wrote: >> >> >> Am 18.04.19 um 17:34 schrieb Stefan Karlsson: >>> On 2019-04-18 17:29, Roman Kennke wrote: >>>> >>>> >>>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>>>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>>>> To add a little more detail, I could move the change up into >>>>>>>>>> is_objArray(), but I don't want to expose it to any non-assert >>>>>>>>>> paths. Therefore I could do 2 different impls there, guarded >>>>>>>>>> by #ifdef ASSERT but I don't think it's a good idea to behave >>>>>>>>>> differently under ASSERT, that kindof defeats the point of >>>>>>>>>> assert, right? >>>>>>>>>> >>>>>>>>>> What do you think ? >>>>>>>>> >>>>>>>>> I don't follow your argument. Under asserts you need to access >>>>>>>>> the klass pointer "safely" but otherwise you do not. So there >>>>>>>>> are two behaviours related to accessing the klass pointer >>>>>>>>> anyway. I'd rather see that encapsulated in the accessor. >>>>>>>>> >>>>>>>>> I assume it's not just asserts but any debug only code that >>>>>>>>> wants to access the klass pointer. >>>>>>>> >>>>>>>> In general, for any runtime calls into oopDesc::klass() the >>>>>>>> access should be safe. 
The acrobatics is only necessary for >>>>>>>> *GC-internal* >>>>>>> >>>>>>> This is the part I don't quite understand, and goes back to my >>>>>>> initial question. Why are you doing these operations on >>>>>>> from-space objects? I'm thinking you should be in a position in >>>>>>> the GC to make sure this can never happen. If you need to do that >>>>>>> in the GC (which is fine), then the GC could apply a "resolve" >>>>>>> function to get the to-space object, and call size() (or >>>>>>> whatever) on that object. This shouldn't have to leak out of the >>>>>>> GC, right? >>>>>> >>>>>> It is a problem when we are about to evacuate an object. Then we >>>>>> need to know its size in order to allocate and copy an appropriate >>>>>> chunk. The problem is that this part is racy: two threads (e.g. >>>>>> two Java threads via barrier, or one Java thread vs one GC thread) >>>>>> might compete over this: both would create a copy of the object, >>>>>> but ultimately only one would succeed (by CASing the fwd pointer). >>>>>> Therefore, getting hold of the object size is racy, by design, and >>>>>> this requires to resolve the _klass. Now, we can do that ahead of >>>>>> time, and call oopDesc::size_given_klass() and all would be good, >>>>>> except that size_given_klass() asserts that the object is indeed >>>>>> of the given klass, and hence fetches _klass again, which, at this >>>>>> point, is racy. Solving this inside the GC would require to >>>>>> basically copy all the machinery to get hold of object size into >>>>>> the GC. Are you asking me to do that? >>>>> >>>>> Other GCs store forwarding pointers in the mark word. See >>>>> oopDesc::forward_to and friends. Could you do the same and get rid >>>>> of this problem? >>>>> >>>> >>>> No. Other GCs store the fwd pointer there, but only during a pause, >>>> and while possibly stashing the mark word somewhere else in the >>>> meantime. 
>>>> >>>> We need to do it outside of GC pauses, plus we need a way (bit) to >>>> indicate what it actually is (fwd pointer or Klass*). The mark word >>>> is already badly overloaded and also accessed much more often and in >>>> critical paths (e.g. locking), while the Klass* is basically >>>> immutable, and has the lowest 3 bits free (when running with >>>> -UseCompressedClassPointers, which would be enforced by Shenandoah). >>>> Using the Klass* slot is therefore the simplest and most efficient >>>> place to keep the fwd pointer. Attempting to use the mark word would >>>> require much more barriers and cause more overhead to manage it. >>> >>> Are you sure? Remember, the object is in from-space and no thread is >>> allowed to change it, except the threads that are copying out of the >>> from-space. >> >> Right. But we still need one bit in it to differentiate between fwd >> ptr and regular mark word. And installing the fwd pointer concurrently >> with > > We already have such bits, and oopDesc::forward_to_atomic() will set > them for you. Aren't they enough? > >> locking seems more of a horror story than dealing with an otherwise >> immutable Klass*. > > With a to-space invariant, I can't see how this can happen concurrently > with locking. Let's say we have one thread (1) which wants to evacuate an object, and another which does some locking operation on it (2). (1) would optimistically copy the object, then attempts to install the forwarding pointer into header, while (2) modifies the header. Which means 1. oops, the header changed, need to CAS again, 2. How to ensure consistency in the to-space object after evac ? I suppose this can be solved by a reasonable protocol, but even just thinking about it gives me a headache, while the same on an immutable Klass* or external brooks pointer is trivial. Also, I don't understand why we even need to bikeshed this. All I am proposing is: 1. A sanity assert in klass() and friends which hurts nobody. 
We could actually make this a little nicer and check klass->is_klass(), and add the GC-hook CollectedHeap::is_klass() there, similar to what we already do in oopDesc::is_oop(). 2. Making the asserts in path from size_given_klass() reliable in light of concurrent modification. Instead of dropping those asserts, I would actually like to strengthen them and not only assert obj->is_XYZArray() but even assert obj->klass() == klass, with the appropriate precautions needed for GC/Shenandoah. 3. Avoid calling klass() twice in typeArrayOop::size() (JDK-8222537) which is actually a (minor) performance improvement. Is this too much to ask? Removing an otherwise good assert, and/or not allowing an assert in klass() seems to hurt our debugging capabilities. This doesn't make sense to me. Roman From per.liden at oracle.com Thu Apr 18 17:40:44 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 18 Apr 2019 19:40:44 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Message-ID: On 04/18/2019 07:15 PM, Roman Kennke wrote: > Am 18.04.19 um 18:47 schrieb Per Liden: >> >> On 04/18/2019 06:21 PM, Roman Kennke wrote: >>> >>> >>> Am 18.04.19 um 17:34 schrieb Stefan Karlsson: >>>> On 2019-04-18 17:29, Roman Kennke wrote: >>>>> >>>>> >>>>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>>>>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>>>>> To add a little more detail, I could move the change up into >>>>>>>>>>> is_objArray(), but I don't want to expose it to any >>>>>>>>>>> non-assert paths. 
Therefore I could do 2 different impls >>>>>>>>>>> there, guarded by #ifdef ASSERT but I don't think it's a good >>>>>>>>>>> idea to behave differently under ASSERT, that kindof defeats >>>>>>>>>>> the point of assert, right? >>>>>>>>>>> >>>>>>>>>>> What do you think ? >>>>>>>>>> >>>>>>>>>> I don't follow your argument. Under asserts you need to access >>>>>>>>>> the klass pointer "safely" but otherwise you do not. So there >>>>>>>>>> are two behaviours related to accessing the klass pointer >>>>>>>>>> anyway. I'd rather see that encapsulated in the accessor. >>>>>>>>>> >>>>>>>>>> I assume it's not just asserts but any debug only code that >>>>>>>>>> wants to access the klass pointer. >>>>>>>>> >>>>>>>>> In general, for any runtime calls into oopDesc::klass() the >>>>>>>>> access should be safe. The acrobatics is only necessary for >>>>>>>>> *GC-internal* >>>>>>>> >>>>>>>> This is the part I don't quite understand, and goes back to my >>>>>>>> initial question. Why are you doing these operations on >>>>>>>> from-space objects? I'm thinking you should be in a position in >>>>>>>> the GC to make sure this can never happen. If you need to do >>>>>>>> that in the GC (which is fine), then the GC could apply a >>>>>>>> "resolve" function to get the to-space object, and call size() >>>>>>>> (or whatever) on that object. This shouldn't have to leak out of >>>>>>>> the GC, right? >>>>>>> >>>>>>> It is a problem when we are about to evacuate an object. Then we >>>>>>> need to know its size in order to allocate and copy an >>>>>>> appropriate chunk. The problem is that this part is racy: two >>>>>>> threads (e.g. two Java threads via barrier, or one Java thread vs >>>>>>> one GC thread) might compete over this: both would create a copy >>>>>>> of the object, but ultimately only one would succeed (by CASing >>>>>>> the fwd pointer). Therefore, getting hold of the object size is >>>>>>> racy, by design, and this requires to resolve the _klass. 
Now, we >>>>>>> can do that ahead of time, and call oopDesc::size_given_klass() >>>>>>> and all would be good, except that size_given_klass() asserts >>>>>>> that the object is indeed of the given klass, and hence fetches >>>>>>> _klass again, which, at this point, is racy. Solving this inside >>>>>>> the GC would require to basically copy all the machinery to get >>>>>>> hold of object size into the GC. Are you asking me to do that? >>>>>> >>>>>> Other GCs store forwarding pointers in the mark word. See >>>>>> oopDesc::forward_to and friends. Could you do the same and get rid >>>>>> of this problem? >>>>>> >>>>> >>>>> No. Other GCs store the fwd pointer there, but only during a pause, >>>>> and while possibly stashing the mark word somewhere else in the >>>>> meantime. >>>>> >>>>> We need to do it outside of GC pauses, plus we need a way (bit) to >>>>> indicate what it actually is (fwd pointer or Klass*). The mark word >>>>> is already badly overloaded and also accessed much more often and >>>>> in critical paths (e.g. locking), while the Klass* is basically >>>>> immutable, and has the lowest 3 bits free (when running with >>>>> -UseCompressedClassPointers, which would be enforced by >>>>> Shenandoah). Using the Klass* slot is therefore the simplest and >>>>> most efficient place to keep the fwd pointer. Attempting to use the >>>>> mark word would require much more barriers and cause more overhead >>>>> to manage it. >>>> >>>> Are you sure? Remember, the object is in from-space and no thread is >>>> allowed to change it, except the threads that are copying out of the >>>> from-space. >>> >>> Right. But we still need one bit in it to differentiate between fwd >>> ptr and regular mark word. And installing the fwd pointer >>> concurrently with >> >> We already have such bits, and oopDesc::forward_to_atomic() will set >> them for you. Aren't they enough? >> >>> locking seems more of a horror story than dealing with an otherwise >>> immutable Klass*. 
>> >> With a to-space invariant, I can't see how this can happen >> concurrently with locking. > > Let's say we have one thread (1) which wants to evacuate an object, and > another which does some locking operation on it (2). (1) would But how did that second thread even get hold of that from-space object in the first place? That thread would also _first_ race to evacuate it, _then_ try to lock it, when it's in to-space. cheers, Per > optimistically copy the object, then attempts to install the forwarding > pointer into header, while (2) modifies the header. Which means 1. oops, > the header changed, need to CAS again, 2. How to ensure consistency in > the to-space object after evac ? I suppose this can be solved by a > reasonable protocol, but even just thinking about it gives me a > headache, while the same on an immutable Klass* or external brooks > pointer is trivial. > > Also, I don't understand why we even need to bikeshed this. All I am > proposing is: > 1. A sanity assert in klass() and friends which hurts nobody. We could > actually make this a little nicer and check klass->is_klass(), and add > the GC-hook CollectedHeap::is_klass() there, similar to what we already > do in oopDesc::is_oop(). > 2. Making the asserts in path from size_given_klass() reliable in light > of concurrent modification. Instead of dropping those asserts, I would > actually like to strengthen them and not only assert obj->is_XYZArray() > but even assert obj->klass() == klass, with the appropriate precautions > needed for GC/Shenandoah. > 3. Avoid calling klass() twice in typeArrayOop::size() (JDK-8222537) > which is actually a (minor) performance improvement. > > Is this too much to ask? > > Removing an otherwise good assert, and/or not allowing an assert in > klass() seems to hurt our debugging capabilities. This doesn't make > sense to me. 
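The evacuation race discussed in this thread can be illustrated outside HotSpot. The following is only a minimal sketch in plain Java, not Shenandoah code: `Obj`, `forwardee` and `evacuate` are hypothetical names, and an `AtomicReference` stands in for the forwarding-pointer slot. It shows the two points made above: the size is read once, before the copy, and of several racing copiers only the one whose CAS succeeds publishes its copy.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

final class EvacuationSketch {
    /** Toy "heap object": a payload plus a forwarding slot that stays null until evacuated. */
    static final class Obj {
        final int[] payload;
        final AtomicReference<Obj> forwardee = new AtomicReference<>(null);

        Obj(int[] payload) {
            this.payload = payload;
        }
    }

    /** Copies the object optimistically, then races to publish the copy; returns the winning copy. */
    static Obj evacuate(Obj from) {
        Obj existing = from.forwardee.get();
        if (existing != null) {
            return existing;                 // somebody else already evacuated it
        }
        int size = from.payload.length;      // read the size once, before racing
        Obj copy = new Obj(Arrays.copyOf(from.payload, size));
        // Only one thread can swing the slot from null to its copy; losers adopt the winner's copy.
        return from.forwardee.compareAndSet(null, copy) ? copy : from.forwardee.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Obj from = new Obj(new int[] {1, 2, 3});
        Obj[] seen = new Obj[2];
        Thread a = new Thread(() -> seen[0] = evacuate(from));
        Thread b = new Thread(() -> seen[1] = evacuate(from));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("both threads agree on the copy: " + (seen[0] == seen[1]));
    }
}
```

Under a to-space invariant, a thread that wants to lock the object would call something like `evacuate` first and then operate on the returned copy, which is why the lock never races with installing the forwarding pointer.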
> > Roman From rkennke at redhat.com Thu Apr 18 18:22:57 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 18 Apr 2019 20:22:57 +0200 Subject: RFR: JDK-8222545: Safe klass asserts In-Reply-To: References: <83aa18cb-d97c-80d3-5ceb-d7f94ea8b893@redhat.com> <710dc08b-19dc-a8f7-61d5-96f6925265bc@oracle.com> <997a7714-81bc-cab7-4c86-ba71de7d33a9@redhat.com> <7010c39c-9963-51f2-0dac-9f79a1ad789c@oracle.com> <3c0c5076-6041-b974-023d-e1f3e1afba45@redhat.com> <672a71e1-4816-4291-6d04-c2985f4e7fa3@redhat.com> <23c7a1ac-d99d-cd28-b091-432c5b5a5860@redhat.com> <5a075e86-2621-7960-8f52-f7de1660b6e2@oracle.com> <4cf05460-7914-9b1e-99f7-526c364e36e5@redhat.com> Message-ID: Am 18. April 2019 19:40:44 MESZ schrieb Per Liden : >On 04/18/2019 07:15 PM, Roman Kennke wrote: >> Am 18.04.19 um 18:47 schrieb Per Liden: >>> >>> On 04/18/2019 06:21 PM, Roman Kennke wrote: >>>> >>>> >>>> Am 18.04.19 um 17:34 schrieb Stefan Karlsson: >>>>> On 2019-04-18 17:29, Roman Kennke wrote: >>>>>> >>>>>> >>>>>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson: >>>>>>> On 2019-04-18 15:13, Roman Kennke wrote: >>>>>>>>>>>> To add a little more detail, I could move the change up >into >>>>>>>>>>>> is_objArray(), but I don't want to expose it to any >>>>>>>>>>>> non-assert paths. Therefore I could do 2 different impls >>>>>>>>>>>> there, guarded by #ifdef ASSERT but I don't think it's a >good >>>>>>>>>>>> idea to behave differently under ASSERT, that kindof >defeats >>>>>>>>>>>> the point of assert, right? >>>>>>>>>>>> >>>>>>>>>>>> What do you think ? >>>>>>>>>>> >>>>>>>>>>> I don't follow your argument. Under asserts you need to >access >>>>>>>>>>> the klass pointer "safely" but otherwise you do not. So >there >>>>>>>>>>> are two behaviours related to accessing the klass pointer >>>>>>>>>>> anyway. I'd rather see that encapsulated in the accessor. >>>>>>>>>>> >>>>>>>>>>> I assume it's not just asserts but any debug only code that >>>>>>>>>>> wants to access the klass pointer. 
>>>>>>>>>> >>>>>>>>>> In general, for any runtime calls into oopDesc::klass() the >>>>>>>>>> access should be safe. The acrobatics is only necessary for >>>>>>>>>> *GC-internal* >>>>>>>>> >>>>>>>>> This is the part I don't quite understand, and goes back to my > >>>>>>>>> initial question. Why are you doing these operations on >>>>>>>>> from-space objects? I'm thinking you should be in a position >in >>>>>>>>> the GC to make sure this can never happen. If you need to do >>>>>>>>> that in the GC (which is fine), then the GC could apply a >>>>>>>>> "resolve" function to get the to-space object, and call size() > >>>>>>>>> (or whatever) on that object. This shouldn't have to leak out >of >>>>>>>>> the GC, right? >>>>>>>> >>>>>>>> It is a problem when we are about to evacuate an object. Then >we >>>>>>>> need to know its size in order to allocate and copy an >>>>>>>> appropriate chunk. The problem is that this part is racy: two >>>>>>>> threads (e.g. two Java threads via barrier, or one Java thread >vs >>>>>>>> one GC thread) might compete over this: both would create a >copy >>>>>>>> of the object, but ultimately only one would succeed (by CASing > >>>>>>>> the fwd pointer). Therefore, getting hold of the object size is > >>>>>>>> racy, by design, and this requires to resolve the _klass. Now, >we >>>>>>>> can do that ahead of time, and call oopDesc::size_given_klass() > >>>>>>>> and all would be good, except that size_given_klass() asserts >>>>>>>> that the object is indeed of the given klass, and hence fetches > >>>>>>>> _klass again, which, at this point, is racy. Solving this >inside >>>>>>>> the GC would require to basically copy all the machinery to get > >>>>>>>> hold of object size into the GC. Are you asking me to do that? >>>>>>> >>>>>>> Other GCs store forwarding pointers in the mark word. See >>>>>>> oopDesc::forward_to and friends. Could you do the same and get >rid >>>>>>> of this problem? >>>>>>> >>>>>> >>>>>> No. 
Other GCs store the fwd pointer there, but only during a >pause, >>>>>> and while possibly stashing the mark word somewhere else in the >>>>>> meantime. >>>>>> >>>>>> We need to do it outside of GC pauses, plus we need a way (bit) >to >>>>>> indicate what it actually is (fwd pointer or Klass*). The mark >word >>>>>> is already badly overloaded and also accessed much more often and > >>>>>> in critical paths (e.g. locking), while the Klass* is basically >>>>>> immutable, and has the lowest 3 bits free (when running with >>>>>> -UseCompressedClassPointers, which would be enforced by >>>>>> Shenandoah). Using the Klass* slot is therefore the simplest and >>>>>> most efficient place to keep the fwd pointer. Attempting to use >the >>>>>> mark word would require much more barriers and cause more >overhead >>>>>> to manage it. >>>>> >>>>> Are you sure? Remember, the object is in from-space and no thread >is >>>>> allowed to change it, except the threads that are copying out of >the >>>>> from-space. >>>> >>>> Right. But we still need one bit in it to differentiate between fwd > >>>> ptr and regular mark word. And installing the fwd pointer >>>> concurrently with >>> >>> We already have such bits, and oopDesc::forward_to_atomic() will set > >>> them for you. Aren't they enough? >>> >>>> locking seems more of a horror story than dealing with an otherwise > >>>> immutable Klass*. >>> >>> With a to-space invariant, I can't see how this can happen >>> concurrently with locking. >> >> Let's say we have one thread (1) which wants to evacuate an object, >and >> another which does some locking operation on it (2). (1) would > >But how did that second thread even get hold of that from-space object >in the first place? That thread would also _first_ race to evacuate it, > >_then_ try to lock it, when it's in to-space. OK, you are right. My brain is apparently still half-stuck in weak invariant mode. 
Let me try that and see if this works without new GC interfaces ;-)

Thanks,
Roman

>cheers,
>Per
>
>> optimistically copy the object, then attempts to install the forwarding
>> pointer into header, while (2) modifies the header. Which means 1. oops,
>> the header changed, need to CAS again, 2. How to ensure consistency in
>> the to-space object after evac? I suppose this can be solved by a
>> reasonable protocol, but even just thinking about it gives me a
>> headache, while the same on an immutable Klass* or external brooks
>> pointer is trivial.
>>
>> Also, I don't understand why we even need to bikeshed this. All I am
>> proposing is:
>> 1. A sanity assert in klass() and friends which hurts nobody. We could
>> actually make this a little nicer and check klass->is_klass(), and add
>> the GC-hook CollectedHeap::is_klass() there, similar to what we already
>> do in oopDesc::is_oop().
>> 2. Making the asserts in path from size_given_klass() reliable in light
>> of concurrent modification. Instead of dropping those asserts, I would
>> actually like to strengthen them and not only assert obj->is_XYZArray()
>> but even assert obj->klass() == klass, with the appropriate precautions
>> needed for GC/Shenandoah.
>> 3. Avoid calling klass() twice in typeArrayOop::size() (JDK-8222537)
>> which is actually a (minor) performance improvement.
>>
>> Is this too much to ask?
>>
>> Removing an otherwise good assert, and/or not allowing an assert in
>> klass() seems to hurt our debugging capabilities. This doesn't make
>> sense to me.
>>
>> Roman

--
This message was sent from my Android device with K-9 Mail.

From m.sundar85 at gmail.com Thu Apr 18 19:08:09 2019
From: m.sundar85 at gmail.com (Sundara Mohan M)
Date: Thu, 18 Apr 2019 12:08:09 -0700
Subject: When to use ThreadContextClassLoader vs ApplicationClassLoader
Message-ID:

Hi,
I am trying to understand the class loader delegation concepts and came across this ThreadContextClassLoader.
Can someone give me more info on why we need ThreadContextClassLoader if ApplicationClassLoader can be used all time? Any pointers on this topic will help me to understand this better. Thanks Sundar From yumin.qi at gmail.com Thu Apr 18 20:36:18 2019 From: yumin.qi at gmail.com (yumin qi) Date: Thu, 18 Apr 2019 13:36:18 -0700 Subject: When to use ThreadContextClassLoader vs ApplicationClassLoader In-Reply-To: References: Message-ID: You may check this doc: https://docs.oracle.com/javase/jndi/tutorial/beyond/misc/classloader.html On Thu, Apr 18, 2019 at 12:08 PM Sundara Mohan M wrote: > Hi, > I am trying to understand the class loader delegation concepts and came > across this ThreadContextClassLoader. > > Can someone give me more info on why we need ThreadContextClassLoader if > ApplicationClassLoader can be used all time? > Any pointers on this topic will help me to understand this better. > > Thanks > Sundar > From jason_mehrens at hotmail.com Fri Apr 12 20:52:13 2019 From: jason_mehrens at hotmail.com (Jason Mehrens) Date: Fri, 12 Apr 2019 20:52:13 +0000 Subject: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null. In-Reply-To: References: <7c4b0bc27961471e91195bef9e767226@sap.com> <5c445ea9-24fb-0007-78df-41b94aadde2a@oracle.com> <8d1cc0b0-4a01-4564-73a9-4c635bfbfbaf@oracle.com> <3245ec3cefe2471e8382048164c0ba6b@sap.com> , Message-ID: Hi Goetz, Looking at the test cases I didn't see any tests for the single argument java.util.Objects.requireNonNull. Using this prototype is that method treated like a hidden frame? Cheers, Jason ________________________________________ From: core-libs-dev on behalf of Lindenmaier, Goetz Sent: Friday, April 12, 2019 5:33 AM To: Mandy Chung; Roger Riggs Cc: Java Core Libs; hotspot-runtime-dev at openjdk.java.net Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException describing what is null. 

Hi,

while waiting for progress on the corresponding JEP, I improved the implementation of generating the NPE message.

It now uses a single outputStream. This removes several allocations of temporary data.

I also removed TrackingStackSource. The analysis code originally addressed several use cases; for NullPointerExceptions this is not needed.

I cleaned up bytecodeUtils, removing some code that is not (really) needed.

I split get_null_pointer_slot() into two methods: get_NPE_null_slot() and print_NPE_failed_action(). This simplifies the implementation, and aligns it more closely with the text in the JEP.

I print methods using the code added in "8221470: Print methods in exception messages in java-like Syntax.", so it now prints 'void m(int)' instead of 'm(I)V'.

I implemented a number of new test cases, and rearranged the test to test the message parts of print_NPE_failed_action() and print_NPE_cause() separately. I made sure all bytecodes handled in these methods are covered. Further, I arranged the tests in methods according to the functional properties as discussed in the JEP.

http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/07

Best regards,
Goetz.

> -----Original Message-----
> From: Lindenmaier, Goetz
> Sent: Thursday, 14 March 2019 21:56
> To: 'Mandy Chung' ; 'Roger Riggs'
> Cc: 'Java Core Libs' ; 'hotspot-runtime-dev at openjdk.java.net'
> Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException
> describing what is null.
>
> Hi,
>
> I had promised to work on a better wording of the messages.
>
> This I deliver with this webrev:
> http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05-otherMessages/
>
> The test in the webrev is modified to just print messages along with the
> code that raised the messages.
> > Please have a look at these files with test output contained in the webrev: > Messages with debug information: > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05- > otherMessages/output_with_debug_info.txt > Messages without debug information: > http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/05- > otherMessages/output_no_debug_info.txt > > Especially look at the first few messages, they point out the usefulness > of this change. They precisely say what was null in a chain of dereferences. > > Best regards, > Goetz. > > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Wednesday, February 13, 2019 10:09 AM > > To: 'Mandy Chung' ; Roger Riggs > > > > Cc: Java Core Libs ; hotspot-runtime- > > dev at openjdk.java.net > > Subject: RE: RFR(L): 8218628: Add detailed message to NullPointerException > > describing what is null. > > > > Hi Mandy, > > > > Thanks for supporting my intend of adding the message as such! > > I'll start implementing this in Java and come back with a webrev > > in a while. > > > > In parallel, I would like to continue discussing the other > > topics, e.g., the wording of the message. I will probably come up > > with a separate webrev for that. > > > > Best regards, > > Goetz. > > > > > > > > > -----Original Message----- > > > From: core-libs-dev On Behalf > > > Of Mandy Chung > > > Sent: Tuesday, February 12, 2019 7:32 PM > > > To: Roger Riggs > > > Cc: Java Core Libs ; hotspot-runtime- > > > dev at openjdk.java.net > > > Subject: Re: RFR(L): 8218628: Add detailed message to > > NullPointerException > > > describing what is null. > > > > > > On 2/8/19 11:46 AM, Roger Riggs wrote: > > > > Hi, > > > > > > > > A few higher level issues should be considered, though the details > > > > of the webrev captured my immediate attention. > > > > > > > > Is this the right feature and is this the right level of implementation > > > > (C++/native)? 
> > > > : > > > > How much of this can be done in Java code with StackWalker and other > > > > java APIs? > > > > It would be a shame to add this much native code if there was a more > > > robust > > > > way to implement it using APIs with more leverage. > > > > > > Improving the NPE message for better diagnosability is helpful while > > > I share the same concern Roger raised. > > > > > > Implementing this feature in Java and the library would be a better > > > choice as this isn't absolutely required to be done in VM in native. > > > > > > NPE keeps a backtrace capturing the method id and bci of each stack > > > frame. One option to explore is to have StackWalker to accept a > > > Throwable object that returns a stream of StackFrame which allows > > > you to get the method and BCI and also code source (I started a > > > prototype for JDK-8189752 some time ago). It can use the bytecode > > > library e.g. ASM to read the bytecode. For NPE message, you can > > > implement a specialized StackFrameTraverser just for building > > > an exception message purpose. > > > > > > Mandy From m.sundar85 at gmail.com Thu Apr 18 22:05:35 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 18 Apr 2019 15:05:35 -0700 Subject: When to use ThreadContextClassLoader vs ApplicationClassLoader In-Reply-To: References: Message-ID: Thanks a lot for sharing. On Thu, Apr 18, 2019 at 1:36 PM yumin qi wrote: > You may check this doc: > https://docs.oracle.com/javase/jndi/tutorial/beyond/misc/classloader.html > > On Thu, Apr 18, 2019 at 12:08 PM Sundara Mohan M > wrote: > >> Hi, >> I am trying to understand the class loader delegation concepts and came >> across this ThreadContextClassLoader. >> >> Can someone give me more info on why we need ThreadContextClassLoader if >> ApplicationClassLoader can be used all time? >> Any pointers on this topic will help me to understand this better. 
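A small illustration may help with the question quoted above. The application (system) class loader is fixed for the whole JVM, whereas the context class loader is a per-thread setting that can be changed. That lets library code defined by a parent loader (JNDI is the usual example) look up classes that are only visible to a child loader, such as the loader of a web application. The sketch below is a minimal, hedged example; `ContextLoaderSketch` and `loadersSeenBy` are made-up names, and the empty `URLClassLoader` merely stands in for a container-provided loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class ContextLoaderSketch {
    /** Runs a thread whose context loader is {@code custom} and reports what that thread sees. */
    static ClassLoader[] loadersSeenBy(ClassLoader custom) {
        ClassLoader[] seen = new ClassLoader[2];
        Thread worker = new Thread(() -> {
            seen[0] = Thread.currentThread().getContextClassLoader(); // per-thread setting
            seen[1] = ClassLoader.getSystemClassLoader();             // JVM-wide application loader
        });
        worker.setContextClassLoader(custom);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        ClassLoader app = ClassLoader.getSystemClassLoader();
        // Stand-in for a loader created by a container for one deployed application.
        ClassLoader custom = new URLClassLoader(new URL[0], app);
        ClassLoader[] seen = loadersSeenBy(custom);
        System.out.println("context loader is the custom one: " + (seen[0] == custom));
        System.out.println("system loader is unchanged: " + (seen[1] == app));
    }
}
```

In a plain command-line program the two loaders are normally the same object, which is why the difference only becomes visible in containers and plugin systems.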
>>
>> Thanks
>> Sundar
>>
>

From m.sundar85 at gmail.com Thu Apr 18 22:20:53 2019
From: m.sundar85 at gmail.com (Sundara Mohan M)
Date: Thu, 18 Apr 2019 15:20:53 -0700
Subject: Proper way to scan all classes inside application/war files
Message-ID:

Hi,
I was scanning all classes (to find all annotated classes) using the URLClassLoader.getURLs() method to find all URLs, and this worked with JDK8. From JDK9 onwards the App/System class loaders no longer derive from URLClassLoader, so it doesn't work anymore. (The workaround of making the ucp variable inside BuiltinClassLoader accessible and reading the URLs works.)

What is the proper way to scan all (AppClassLoader + any other class loader) classes?

Any pointers on this will be helpful.

Thanks
Sundar

From forax at univ-mlv.fr Thu Apr 18 22:54:24 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 19 Apr 2019 00:54:24 +0200 (CEST)
Subject: Proper way to scan all classes inside application/war files
In-Reply-To:
References:
Message-ID: <1815310213.32448.1555628064375.JavaMail.zimbra@u-pem.fr>

Hi Sundara,
wrong mailing list :)

It's an issue for the "jigsaw-dev" mailing list.

regards,
Rémi

----- Original Message -----
> From: "Sundara Mohan M"
> To: "hotspot-runtime-dev"
> Sent: Friday, 19 April 2019 00:20:53
> Subject: Proper way to scan all classes inside application/war files

> Hi,
> I was scanning all classes (to find all annotated classes) using the
> URLClassLoader.getURLs() method to find all URLs, and this worked with JDK8.
> From JDK9 onwards the App/System class loaders no longer derive from
> URLClassLoader, so it doesn't work anymore. (The workaround of making the ucp
> variable inside BuiltinClassLoader accessible and reading the URLs works.)
>
> What is the proper way to scan all (AppClassLoader + any other class loader)
> classes?
>
> Any pointers on this will be helpful.
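For the scanning question quoted above, one approach that does not depend on the loader being a `URLClassLoader` is to read the scan roots from public APIs. This is only a minimal sketch under stated assumptions: `ScanRootsSketch` and its method names are hypothetical, it covers the class path and the modules of the boot layer only, and classes loaded by container-created loaders (for example inside a war file) still have to be located through whatever the container exposes.

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class ScanRootsSketch {
    /** Class path entries as given to the JVM, replacing the old getURLs() cast. */
    static List<String> classPathEntries() {
        String classPath = System.getProperty("java.class.path", "");
        List<String> entries = new ArrayList<>();
        for (String entry : classPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    /** Locations of the modules resolved into the boot layer, where a location is known. */
    static List<URI> bootLayerModuleLocations() {
        List<URI> locations = new ArrayList<>();
        ModuleLayer.boot().configuration().modules()
                .forEach(module -> module.reference().location().ifPresent(locations::add));
        return locations;
    }

    public static void main(String[] args) {
        System.out.println("class path roots: " + classPathEntries());
        System.out.println("boot layer modules with a location: " + bootLayerModuleLocations().size());
    }
}
```

Each returned root (directory, jar, or module location) can then be walked to find `.class` files to inspect for annotations.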
>
>
> Thanks
> Sundar

From david.holmes at oracle.com Thu Apr 18 23:09:18 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 19 Apr 2019 09:09:18 +1000
Subject: When to use ThreadContextClassLoader vs ApplicationClassLoader
In-Reply-To:
References:
Message-ID: <8d83f6e5-626f-dfdc-7e19-61a032cb37ab@oracle.com>

Please note your question was nothing to do with the development of the Hotspot VM in OpenJDK. Questions on the Java language and API usage should be directed to appropriate user groups/forums.

Thanks,
David

On 19/04/2019 8:05 am, Sundara Mohan M wrote:
> Thanks a lot for sharing.
>
> On Thu, Apr 18, 2019 at 1:36 PM yumin qi wrote:
>
>> You may check this doc:
>> https://docs.oracle.com/javase/jndi/tutorial/beyond/misc/classloader.html
>>
>> On Thu, Apr 18, 2019 at 12:08 PM Sundara Mohan M
>> wrote:
>>
>>> Hi,
>>> I am trying to understand the class loader delegation concepts and came
>>> across this ThreadContextClassLoader.
>>>
>>> Can someone give me more info on why we need ThreadContextClassLoader if
>>> ApplicationClassLoader can be used all time?
>>> Any pointers on this topic will help me to understand this better.
>>>
>>> Thanks
>>> Sundar
>>>
>>

From daniel.daugherty at oracle.com Fri Apr 19 15:58:49 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 19 Apr 2019 11:58:49 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR1/v2.01/4-for-jdk13)
In-Reply-To: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com>
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com>
Message-ID:

Greetings,

I finally have CR1 for the Async Monitor Deflation project ready to go. It's also known as v2.01 (for those with the patches) and as webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the names...

Main bug URL:

    JDK-8153224 Monitor deflation prolong safepoints
    https://bugs.openjdk.java.net/browse/JDK-8153224

Baseline bug fixes URL:
    JDK-8222295 more baseline cleanups from Async Monitor Deflation project
    https://bugs.openjdk.java.net/browse/JDK-8222295

The project is currently baselined on jdk-13+15.

Here's the webrev for the latest baseline changes (JDK-8222295):

http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295

Here's the full webrev URL (JDK-8153224 only):

http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/

Here's the incremental webrev URL (JDK-8153224):

http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/

So I'm looking for reviews for both JDK-8222295 and the latest version of JDK-8153224...

I still have to update the OpenJDK wiki to reflect the CR changes:

https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation

This version of the patch has been thru Mach5 tier[1-3] testing on Oracle's usual set of platforms. Mach5 tier[4-6] is running now and Mach5 tier[78] will be run later today. My stress kit on Solaris-X64 is running now. Linux-X64 stress testing will start on Sunday. I'm planning to do Kitchensink runs, SPECjbb2015 runs and my monitor inflation stress tests on Linux-X64, MacOSX and Solaris-X64.

Thanks, in advance, for any questions, comments or suggestions.

Dan

On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> Here's a link to the OpenJDK wiki that describes my port:
>
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>
> Here's the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>
> Here's a link to Carsten's original webrev:
>
> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>
> Earlier versions of this patch have been through several rounds of
> preliminary review.
> Many thanks to Carsten, Coleen, Robbin, and
> Roman for their preliminary code review comments. A very special
> thanks to Robbin and Roman for building and testing the patch in
> their own environments (including specJBB2015).
>
> This version of the patch has been thru Mach5 tier[1-8] testing on
> Oracle's usual set of platforms. Earlier versions have been run
> through my stress kit on my Linux-X64 and Solaris-X64 servers
> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
> and slowdebug). Earlier versions have run my monitor inflation stress
> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
> fastdebug and slowdebug).
>
> All of the testing done on earlier versions will be redone on the
> latest version of the patch.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>
> P.S.
> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
> is currently failing in -Xcomp mode on Win* only. I've been trying
> to characterize/analyze this failure for more than a week now. At
> this point I'm convinced that Async Monitor Deflation is aggravating
> an existing bug. However, I plan to have a better handle on that
> failure before these bits are pushed to the jdk/jdk repo.
>

From karen.kinnear at oracle.com  Fri Apr 19 16:29:31 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Fri, 19 Apr 2019 12:29:31 -0400
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To: <5CAFAF21.3030007@oracle.com>
References: <5CAFAF21.3030007@oracle.com>
Message-ID:

Calvin,

Many thanks for all the work getting this ready, significantly
enhancing the testing and bug fixes.

I marked the CSR as reviewed-by - it looks great!

I reviewed this set of changes - I did not review the tests - I assume
you can get someone else to do that.
I am grateful that Jiangli and Ioi are going to
review this also - they are much closer to the details than I am.

1. Do you have any performance numbers?
1a. Startup: does using a combined dynamic CDS archive + base archive
give similar startup benefits when you have the same classes in the
archives?

1b. Do you have samples of uses of the combined dynamic CDS archive +
base archive vs. a single static archive built for an application?
  - how do the sets of archived classes differ?
  - one note was that the AtExit approach exclude list adds anything
that has not yet linked - does that make a significant difference in
the number of classes that are archived? Does that make a difference
in either startup time or in application execution time? I could see
that going either way.

1c. Any sense of performance cost for first run - how much time does
it take to create an incremental archive?
  - is the time comparable to an existing dump for a single archive
for the application?
  - this is an ease-of-use feature - so we are not expecting that to
be fast
  - the point is to set expectations in our documentation

2. Footprint
With two archives rather than one, is there a significant footprint
difference? Obviously this will vary by app and archive.
Once again, the point is to set expectations.

3. Runtime performance
With two sets of archived dictionaries & symbolTables - is there any
significant performance cost to larger benchmarks, e.g. for class
loading lookup for classes that are not in the archives? Or symbol
lookup?

4. Platform support
Which platforms is this supported on?
Which ones did you test? For example, did you run the tests on Windows?

Detailed feedback on the code:
Just minor comments - I don't need to see an updated webrev:

1. metaSpaceShared.hpp
line 156:
what is the hardcoded -100 for? Should that be an enum?

2. jfrRecorder.cpp
So JFR recordings are disabled if DynamicDumpSharedSpaces?
why?
Is that a future rfe?

3. systemDictionaryShared.cpp
Could you possibly add a comment to add_verification_constraint
for if (DynamicDumpSharedSpaces)
  return false

-- I think the logic is:
because we have successfully linked any instanceKlass we archive
with DynamicDumpSharedSpaces, we have resolved all the constraint classes.

-- I didn't check the order - is this called before or after
excluding? If after, then would it make sense to add an assertion
here is_linked? Then if you ever change how/when linking is done, this
might catch future errors.

4. systemDictionaryShared.cpp
EstimateSizeForArchive::do_entry
Is it the case that for info.is_builtin() there are no verification
constraints? So you could skip that calculation? Or did I misunderstand?

5. compactHashtable.cpp
serialize/header/calculate_header_size
-- could you dynamically determine size_of header so you don't need
to hardcode a 5?

6. classLoader.cpp
line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
mutually exclusive.
Can you clarify for me:
My memory of the base archive is that we do not allow the following
options at dump time - and these are the same for the dynamic archive:
--limit-modules, --upgrade-module-path, --patch-module.

I have forgotten:
Today with UseSharedSpaces - do we allow these flags? Is that also the
same behavior with the dynamic archive?

7. classLoaderExt.cpp
assert line 66: only used with -Xshare:dump
-> "only used at dump time"

8. symbolTable.cpp
line 473: comment // used by UseSharedArchived2
- command-line arg name has changed

9. filemap.cpp
Comment lines 529 ...
Is this true - that you can only support dynamic dumping with the
default CDS archive? Could you clarify what the restrictions are?
The CSR implies you can support "a specific base CDS archive"
- so base layer can not have appended boot class path
- and base layer can't have a module path

What can you specify for the dynamic dumping relative to the base archive?
- matching class path?
- appended class path?
in future - could it have a module path that matched the base archive?

Should any of these restrictions be clarified in documentation/CSR
since they appear to be new?

10. filemap.cpp
check_archive
Do some of the return false paths skip performing os::close(fd)?

and get_base_archive_name_from_header
Does the first return false path fail to os::free(dynamic_header)?

lines 753-754: two FIXME comments

Could you delete commented out line 1087 in filemap.cpp?

11. filemap.hpp
line 214: TODO left in

12. metaspace.cpp
line 1418 FIXME left in

13. java.cpp
FIXME: is this the right place?
For starting the DynamicArchive::dump

Please check with David Holmes on that one

14. dynamicArchive.hpp
line 55 (and others): MetsapceObj -> MetaspaceObj

15. dynamicArchive.cpp
line 285 rel-ayout -> re-layout

lines 277 && 412
Do we archive array klasses in the base archive but not in the dynamic
archive?
Is that a potential RFE?
Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
called with an array class?
Same question for copy_impl?

line 934: "no onger" -> "no longer"

16. What is AllowArchivingWithJavaAgent? Is that a hook for a
potential future rfe?
Do you want to check in that code at this time? In product?

thanks,
Karen

> On Apr 11, 2019, at 5:18 PM, Calvin Cheung wrote:
>
> This is a follow-up on the preliminary code review sent by Jiangli in January[1].
>
> Highlights of changes since then:
> 1. New vm option for dumping a dynamic archive (-XX:ArchiveClassesAtExit=) and enhancement to the existing -XX:SharedArchiveFile option. Please refer to the corresponding CSR[2] for details.
> 2. New way to run existing AppCDS tests in dynamic CDS archive mode. At the jtreg command line, the user can run many existing AppCDS tests in dynamic CDS archive mode by specifying the following:
>     -vmoptions:-Dtest.dynamic.cds.archive=true /open/test/hotspot/jtreg:hotspot_appcds_dynamic
> We will have a follow-up RFE to determine in which tier the above tests should be run.
> 3.
> Added more tests.
> 4. Various bug fixes to improve stability.
>
> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
> webrev: http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
>
> (The webrev is based on top of the following rev: http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)
>
> Testing:
>     - mach5 tiers 1-3 (including the new tests)
>     - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few tests require more investigation)
>
> thanks,
> Calvin
>
> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
> [2] https://bugs.openjdk.java.net/browse/JDK-8221706

From calvin.cheung at oracle.com  Fri Apr 19 20:44:47 2019
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 19 Apr 2019 13:44:47 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To: <5CAFAF21.3030007@oracle.com>
References: <5CAFAF21.3030007@oracle.com>
Message-ID: <5CBA333F.20702@oracle.com>

Ioi and I have fixed a few issues since webrev.00.

Fixed issues:
- incorrect system dictionary size estimate;
- MetaspaceShared::remap_shared_readonly_as_readwrite() needs to handle
  the dynamic archive case;
- bug in MetaspaceShared::is_shared_dynamic();
- a couple of test fixes.

webrevs:
incremental:
http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_00_01/
full:
http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.01/

thanks,
Calvin

On 4/11/19, 2:18 PM, Calvin Cheung wrote:
> This is a follow-up on the preliminary code review sent by Jiangli in
> January[1].
>
> Highlights of changes since then:
> 1. New vm option for dumping a dynamic archive
> (-XX:ArchiveClassesAtExit=) and enhancement to the
> existing -XX:SharedArchiveFile option. Please refer to the
> corresponding CSR[2] for details.
> 2. New way to run existing AppCDS tests in dynamic CDS archive mode.
> At the jtreg command line, the user can run many existing AppCDS tests
> in dynamic CDS archive mode by specifying the following:
>     -vmoptions:-Dtest.dynamic.cds.archive=true
> /open/test/hotspot/jtreg:hotspot_appcds_dynamic
> We will have a follow-up RFE to determine in which tier the above
> tests should be run.
> 3. Added more tests.
> 4. Various bug fixes to improve stability.
>
> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
> webrev:
> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
>
> (The webrev is based on top of the following rev:
> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)
>
> Testing:
>     - mach5 tiers 1-3 (including the new tests)
>     - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few
> tests require more investigation)
>
> thanks,
> Calvin
>
> [1]
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
> [2] https://bugs.openjdk.java.net/browse/JDK-8221706

From daniel.daugherty at oracle.com  Mon Apr 22 13:19:04 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 22 Apr 2019 09:19:04 -0400
Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR1/v2.01/4-for-jdk13)
In-Reply-To:
References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com>
Message-ID:

I just realized that I forgot to include the list of CR0 -> CR1
changes that were made. I meant to include it as an attachment
to the original CR1 email, but here it is in-line:

Functional:
  - Allow a special cleanup request (e.g., System.gc()) to check the
    global in-use list also.
  - Move setting of per-thread omShouldDeflateIdleMonitors flag from
    ObjectSynchronizer::do_safepoint_work() to
    ObjectSynchronizer::deflate_thread_local_monitors().
  - ObjectMonitor::install_displaced_markword_in_object()
    - add 'const oop obj' parameter for the target object
    - catch race with ObjectMonitor's object not matching target object
    - catch 2nd race with cleared ObjectMonitor header
    - use the target object instead of ObjectMonitor's object to
      restore the object's header (via cas_set_mark).
    - clear ObjectMonitor header if it is still marked so a racing
      install_displaced_markword_in_object() can bail out sooner.
    - clarify/update most comments
  - Catch 2nd race with ref_count in deflate_monitor_using_JT().
  - Add more install_displaced_markword_in_object() calls to reduce
    more retries to one iteration.
  - Set allocation state to "Old" in inflate().
  - Make sure ObjectMonitorHandle::_om_ptr is set before publishing
    an ObjectMonitor.
  - Remove work arounds from ObjectMonitor::dec_ref_count() and
    ObjectMonitor::dec_ref_count().
  - ObjectMonitor::clear_using_JT()
    - cannot check or clear ObjectMonitor header value
    - clarify comments
  - ObjectSynchronizer::omAlloc() needs to clear _header field (just
    like _owner and _contentions) when moving an ObjectMonitor from
    the global free list to the per thread free list.

Style:
  - Remove 'goto' from FastHashCode().
  - Refactor common code from deflate_global_idle_monitors_using_JT()
    and deflate_per_thread_idle_monitors_using_JT() into
    deflate_common_idle_monitors_using_JT().
  - Use Atomic::replace_if_null() instead of cmpxchg() with NULL check.
  - Remove unneeded '(address)' casts.

Debugging/Maintenance:
  - Tighten up ObjectMonitorHandle::set_om_ptr() caller checks.
  - Add ADIM_guarantee() for doing guarantee() calls if
    AsyncDeflateIdleMonitors and assert() calls otherwise.
  - Add/update guarantee()'s and assert()'s.
  - Delete some paranoid assert()'s.
  - Add/move/update comments.

Thanks, in advance, for any questions, comments or suggestions.

Dan


On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I finally have CR1 for the Async Monitor Deflation project ready to
> go.
> It's also known as v2.01 (for those with the patches) and as
> webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the
> names...
>
> Main bug URL:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> Baseline bug fixes URL:
>
>     JDK-8222295 more baseline cleanups from Async Monitor Deflation
> project
>     https://bugs.openjdk.java.net/browse/JDK-8222295
>
> The project is currently baselined on jdk-13+15.
>
> Here's the webrev for the latest baseline changes (JDK-8222295):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295
>
> Here's the full webrev URL (JDK-8153224 only):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/
>
> Here's the incremental webrev URL (JDK-8153224):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/
>
> So I'm looking for reviews for both JDK-8222295 and the latest version
> of JDK-8153224...
>
> I still have to update the OpenJDK wiki to reflect the CR changes:
>
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>
> This version of the patch has been thru Mach5 tier[1-3] testing on
> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
> Mach5 tier[78] will be run later today. My stress kit on Solaris-X64
> is running now. Linux-X64 stress testing will start on Sunday. I'm
> planning to do Kitchensink runs, SPECjbb2015 runs and my monitor
> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>
>
> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>
>>     JDK-8153224 Monitor deflation prolong safepoints
>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>
>> Here's a link to the OpenJDK wiki that describes my port:
>>
>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>
>> Here's the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>
>> Here's a link to Carsten's original webrev:
>>
>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>
>> Earlier versions of this patch have been through several rounds of
>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>> Roman for their preliminary code review comments. A very special
>> thanks to Robbin and Roman for building and testing the patch in
>> their own environments (including specJBB2015).
>>
>> This version of the patch has been thru Mach5 tier[1-8] testing on
>> Oracle's usual set of platforms. Earlier versions have been run
>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>> and slowdebug). Earlier versions have run my monitor inflation stress
>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>> fastdebug and slowdebug).
>>
>> All of the testing done on earlier versions will be redone on the
>> latest version of the patch.
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>> P.S.
>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>> is currently failing in -Xcomp mode on Win* only. I've been trying
>> to characterize/analyze this failure for more than a week now. At
>> this point I'm convinced that Async Monitor Deflation is aggravating
>> an existing bug. However, I plan to have a better handle on that
>> failure before these bits are pushed to the jdk/jdk repo.
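
The style item "Use Atomic::replace_if_null() instead of cmpxchg() with NULL check" in the CR0 -> CR1 list can be illustrated with a minimal standalone sketch. This is not the code from the webrev: it assumes plain C++11 std::atomic rather than HotSpot's Atomic class, and the Owner type and both function names are hypothetical, chosen only to show why the two forms are interchangeable.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for whatever pointer is being published;
// this is NOT HotSpot's ObjectMonitor or its owner field.
struct Owner { int id; };

// "Replace if null" form: install `value` only when `slot` currently
// holds nullptr. Returns true if this call did the installation.
bool replace_if_null(std::atomic<Owner*>& slot, Owner* value) {
    Owner* expected = nullptr;
    return slot.compare_exchange_strong(expected, value);
}

// Open-coded form: do the compare-exchange, then test the value that
// was observed. On failure compare_exchange_strong stores the current
// (non-null) contents of `slot` into `old`, so the check below is false.
bool cmpxchg_then_check(std::atomic<Owner*>& slot, Owner* value) {
    Owner* old = nullptr;
    slot.compare_exchange_strong(old, value);
    return old == nullptr;
}
```

Both functions succeed exactly when the slot was empty; the first simply states that intent in one call, which seems to be the readability gain the change list is after.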

From lois.foltan at oracle.com  Mon Apr 22 15:18:08 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Mon, 22 Apr 2019 11:18:08 -0400
Subject: RFR (XS) JDK-8222502: Replace 19,20 case alternatives with JVM_CONSTANT_Module/Package names
Message-ID: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>

Please review the following change to add JVM_CONSTANT_Module and
JVM_CONSTANT_Package to classfile_constants.h and use these constant
types instead of hard coded 19 & 20 within ClassFileParser.

open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8222502.0/webrev/
bug link: https://bugs.openjdk.java.net/browse/JDK-8222502

Testing: hs-tier1-3, jdk-tier1-3

Thanks,
Lois

From coleen.phillimore at oracle.com  Mon Apr 22 15:20:52 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 22 Apr 2019 11:20:52 -0400
Subject: RFR (XS) JDK-8222502: Replace 19,20 case alternatives with JVM_CONSTANT_Module/Package names
In-Reply-To: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>
References: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>
Message-ID: <6fd427f4-64e7-fa8d-91d3-adb2470b4c6c@oracle.com>

Lois,
Thank you for fixing this!  Looks good.
Coleen

On 4/22/19 11:18 AM, Lois Foltan wrote:
> Please review the following change to add JVM_CONSTANT_Module and
> JVM_CONSTANT_Package to classfile_constants.h and use these constant
> types instead of hard coded 19 & 20 within ClassFileParser.
>
> open webrev at:
> http://cr.openjdk.java.net/~lfoltan/bug_jdk8222502.0/webrev/
> bug link: https://bugs.openjdk.java.net/browse/JDK-8222502
>
>
> Testing: hs-tier1-3, jdk-tier1-3
>
> Thanks,
> Lois

From lois.foltan at oracle.com  Mon Apr 22 15:24:33 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Mon, 22 Apr 2019 11:24:33 -0400
Subject: RFR (XS) JDK-8222502: Replace 19,20 case alternatives with JVM_CONSTANT_Module/Package names
In-Reply-To: <6fd427f4-64e7-fa8d-91d3-adb2470b4c6c@oracle.com>
References: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com> <6fd427f4-64e7-fa8d-91d3-adb2470b4c6c@oracle.com>
Message-ID:

Thanks Coleen!
Lois

On 4/22/2019 11:20 AM, coleen.phillimore at oracle.com wrote:
>
> Lois,
> Thank you for fixing this!  Looks good.
> Coleen
>
> On 4/22/19 11:18 AM, Lois Foltan wrote:
>> Please review the following change to add JVM_CONSTANT_Module and
>> JVM_CONSTANT_Package to classfile_constants.h and use these constant
>> types instead of hard coded 19 & 20 within ClassFileParser.
>>
>> open webrev at:
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8222502.0/webrev/
>> bug link: https://bugs.openjdk.java.net/browse/JDK-8222502
>>
>>
>> Testing: hs-tier1-3, jdk-tier1-3
>>
>> Thanks,
>> Lois
>

From harold.seigel at oracle.com  Mon Apr 22 17:01:38 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Mon, 22 Apr 2019 13:01:38 -0400
Subject: RFR (XS) JDK-8222502: Replace 19,20 case alternatives with JVM_CONSTANT_Module/Package names
In-Reply-To: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>
References: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>
Message-ID:

Hi Lois,

This looks good.

Thanks, Harold

On 4/22/2019 11:18 AM, Lois Foltan wrote:
> Please review the following change to add JVM_CONSTANT_Module and
> JVM_CONSTANT_Package to classfile_constants.h and use these constant
> types instead of hard coded 19 & 20 within ClassFileParser.
>
> open webrev at:
> http://cr.openjdk.java.net/~lfoltan/bug_jdk8222502.0/webrev/
> bug link: https://bugs.openjdk.java.net/browse/JDK-8222502
>
>
> Testing: hs-tier1-3, jdk-tier1-3
>
> Thanks,
> Lois

From calvin.cheung at oracle.com  Mon Apr 22 17:06:16 2019
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 22 Apr 2019 10:06:16 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To:
References: <5CAFAF21.3030007@oracle.com>
Message-ID: <5CBDF488.7000601@oracle.com>

Hi Karen,

Thanks for your review!
Please see my replies in-line below.

On 4/19/19, 9:29 AM, Karen Kinnear wrote:
> Calvin,
>
> Many thanks for all the work getting this ready, significantly
> enhancing the testing and bug fixes.
>
> I marked the CSR as reviewed-by - it looks great!
>
> I reviewed this set of changes - I did not review the tests - I assume
> you can get someone
> else to do that. I am grateful that Jiangli and Ioi are going to
> review this also - they are much closer to
> the details than I am.
>
> 1. Do you have any performance numbers?
> 1a. Startup: does using a combined dynamic CDS archive + base archive
> give similar startup benefits
> when you have the same classes in the archives?
Below are some performance numbers from Eric, each number is for 50 runs:
(base: using the default CDS archive,
 test: using the dynamic archive,
 Eric will get some numbers with a single archive, which I think is
what you're looking for)

Lambda-noop:
base: 0.066441427 seconds time elapsed
test: 0.075428824 seconds time elapsed

Noop:
base: 0.057614537 seconds time elapsed
test: 0.066061557 seconds time elapsed

Netty:
base: 0.827013307 seconds time elapsed
test: 0.604982805 seconds time elapsed

Spring:
base: 2.376707358 seconds time elapsed
test: 1.927618893 seconds time elapsed

The first 2 apps only have 2 to 3 classes in the dynamic archive. So the
overhead is likely due to having to open and map the dynamic archive and
perform checks on the header, etc.
For small apps, I think it's better to
use a single archive. The Netty app has around 1400 classes in the
dynamic archive; the Spring app has about 3700 classes in the dynamic
archive.

I also used our LotsOfClasses test to collect some perf numbers. This is
more like runtime performance, not startup performance.

With dynamic archive (100 runs each):
real    2m37.191s
real    2m36.003s
Total loaded classes = 24254
Loaded from base archive = 1186
Loaded from top archive = 23042
Loaded from jrt:/ (runtime module) = 26

With single archive (100 runs each):
real    2m38.346s
real    2m36.947s
Total loaded classes = 24254
Loaded from archive = 24228
Loaded from jrt:/ (runtime module) = 26

>
> 1b. Do you have samples of uses of the combined dynamic CDS archive +
> base archive vs. a single
> static archive built for an application?
>   - how do the sets of archived classes differ?
Currently, the default CDS archive contains around 1187 classes. With
the -XX:ArchiveClassesAtExit option, if the classes are not found in the
default CDS archive, they will be archived in the dynamic archive. The
above LotsOfClasses example shows some distributions between various
archives.
>   - one note was that the AtExit approach exclude list adds anything
> that has not yet linked - does that make a significant difference in
> the number of classes that are archived? Does that make a difference
> in either startup time or in application execution time? I could see
> that going either way.
As the above numbers indicated, there's not much difference in terms of
execution time using a dynamic vs a single archive with a large number
of classes loaded. The numbers from Netty and Spring apps show an
improvement over default CDS archive.
>
> 1c. Any sense of performance cost for first run - how much time does
> it take to create an incremental archive?
>   - is the time comparable to an existing dump for a single archive
> for the application?
>   - this is an ease-of-use feature - so we are not expecting that to
> be fast
>   - the point is to set expectations in our documentation
I did some rough measurements with the LotsOfClasses test with around
15000 classes in the classlist.

Dynamic archive dumping (one run each):
real    0m19.756s
real    0m20.241s

Static archive dumping (one run each):
real    0m17.725s
real    0m16.993s
>
> 2. Footprint
> With two archives rather than one, is there a significant footprint
> difference? Obviously this will vary by app and archive.
> Once again, the point is to set expectations.
Sizes of the archives for the LotsOfClasses test in 1a.
Single archive: 242962432
Default CDS archive: 12365824
Dynamic archive: 197525504
>
> 3. Runtime performance
> With two sets of archived dictionaries & symbolTables - is there any
> significant performance cost to larger benchmarks, e.g. for class
> loading lookup for classes that are not in the archives? Or symbol
> lookup?
I used the LotsOfClasses test again. This time archiving about half of
the classes which will be loaded during runtime.

Dynamic archive (10 runs each):
real    0m30.214s
real    0m29.633s
Loaded classes = 24254
Loaded from dynamic archive: 13168

Single archive (10 runs each):
real    0m32.383s
real    0m32.905s
Loaded classes = 24254
Loaded from single archive = 15063
>
> 4. Platform support
> Which platforms is this supported on?
> Which ones did you test? For example, did you run the tests on Windows?
I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris,
Windows).
>
> Detailed feedback on the code: Just minor comments - I don't need to
> see an updated webrev:
I'm going to look into your detailed feedback below and may reply in a
separate email.

thanks,
Calvin
>
> 1. metaSpaceShared.hpp
> line 156:
> what is the hardcoded -100 for? Should that be an enum?
>
> 2. jfrRecorder.cpp
> So JFR recordings are disabled if DynamicDumpSharedSpaces?
> why?
> Is that a future rfe?
>
> 3.
> systemDictionaryShared.cpp
> Could you possibly add a comment to add_verification_constraint
> for if (DynamicDumpSharedSpaces)
>   return false
>
> -- I think the logic is:
> because we have successfully linked any instanceKlass we archive
> with DynamicDumpSharedSpaces, we have resolved all the constraint classes.
>
> -- I didn't check the order - is this called before or after
> excluding? If after, then would it make sense to add an assertion
> here is_linked? Then if you ever change how/when linking is done, this
> might catch future errors.
>
> 4. systemDictionaryShared.cpp
> EstimateSizeForArchive::do_entry
> Is it the case that for info.is_builtin() there are no verification
> constraints? So you could skip that calculation? Or did I misunderstand?
>
> 5. compactHashtable.cpp
> serialize/header/calculate_header_size
> -- could you dynamically determine size_of header so you don't need
> to hardcode a 5?
>
> 6. classLoader.cpp
> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
> mutually exclusive.
> Can you clarify for me:
> My memory of the base archive is that we do not allow the following
> options at dump time - and these
> are the same for the dynamic archive: --limit-modules,
> --upgrade-module-path, --patch-module.
>
> I have forgotten:
> Today with UseSharedSpaces - do we allow these flags? Is that also the
> same behavior with the dynamic
> archive?
>
> 7. classLoaderExt.cpp
> assert line 66: only used with -Xshare:dump
> -> "only used at dump time"
>
> 8. symbolTable.cpp
> line 473: comment // used by UseSharedArchived2
> - command-line arg name has changed
>
> 9. filemap.cpp
> Comment lines 529 ...
> Is this true - that you can only support dynamic dumping with the
> default CDS archive? Could you clarify what the restrictions are?
> The CSR implies you can support "a specific base CDS archive"
> - so base layer can not have appended boot class path
> - and base layer can't have a module path
>
> What can you specify for the dynamic dumping relative to the base archive?
> - matching class path?
> - appended class path?
> in future - could it have a module path that matched the base archive?
>
> Should any of these restrictions be clarified in documentation/CSR
> since they appear to be new?
>
> 10. filemap.cpp
> check_archive
> Do some of the return false paths skip performing os::close(fd)?
>
> and get_base_archive_name_from_header
> Does the first return false path fail to os::free(dynamic_header)
>
> lines 753-754: two FIXME comments
>
> Could you delete commented out line 1087 in filemap.cpp ?
>
> 11. filemap.hpp
> line 214: TODO left in
>
> 12. metaspace.cpp
> line 1418 FIXME left in
>
> 13. java.cpp
> FIXME: is this the right place?
> For starting the DynamicArchive::dump
>
> Please check with David Holmes on that one
>
> 14. dynamicArchive.hpp
> line 55 (and others): MetsapceObj -> MetaspaceObj
>
> 15. dynamicArchive.cpp
> line 285 rel-ayout -> re-layout
>
> lines 277 && 412
> Do we archive array klasses in the base archive but not in the dynamic
> archive?
> Is that a potential RFE?
> Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
> called with an array class?
> Same question for copy_impl?
>
> line 934: "no onger" -> "no longer"
>
> 16. What is AllowArchivingWithJavaAgent? Is that a hook for a
> potential future rfe?
> Do you want to check in that code at this time? In product?
>
> thanks,
> Karen
>
>
>> On Apr 11, 2019, at 5:18 PM, Calvin Cheung wrote:
>>
>> This is a follow-up on the preliminary code review sent by Jiangli in
>> January[1].
>>
>> Highlights of changes since then:
>> 1. New vm option for dumping a dynamic archive
>> (-XX:ArchiveClassesAtExit=) and enhancement to the
>> existing -XX:SharedArchiveFile option.
>> Please refer to the
>> corresponding CSR[2] for details.
>> 2. New way to run existing AppCDS tests in dynamic CDS archive mode.
>> At the jtreg command line, the user can run many existing AppCDS
>> tests in dynamic CDS archive mode by specifying the following:
>>     -vmoptions:-Dtest.dynamic.cds.archive=true
>> /open/test/hotspot/jtreg:hotspot_appcds_dynamic
>> We will have a follow-up RFE to determine in which tier the above
>> tests should be run.
>> 3. Added more tests.
>> 4. Various bug fixes to improve stability.
>>
>> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
>> webrev:
>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
>>
>>
>> (The webrev is based on top of the following rev:
>> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738)
>>
>> Testing:
>>     - mach5 tiers 1-3 (including the new tests)
>>     - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few
>> tests require more investigation)
>>
>> thanks,
>> Calvin
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
>> [2] https://bugs.openjdk.java.net/browse/JDK-8221706
>

From lois.foltan at oracle.com  Mon Apr 22 17:13:32 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Mon, 22 Apr 2019 13:13:32 -0400
Subject: RFR (XS) JDK-8222502: Replace 19,20 case alternatives with JVM_CONSTANT_Module/Package names
In-Reply-To:
References: <6cb68b86-9bcb-d779-0616-f91108ddea80@oracle.com>
Message-ID: <57fa4497-6585-8fd7-c467-6e667b51b8bb@oracle.com>

Thanks Harold!
Lois

On 4/22/2019 1:01 PM, Harold Seigel wrote:
> Hi Lois,
>
> This looks good.
>
> Thanks, Harold
>
> On 4/22/2019 11:18 AM, Lois Foltan wrote:
>> Please review the following change to add JVM_CONSTANT_Module and
>> JVM_CONSTANT_Package to classfile_constants.h and use these constant
>> types instead of hard coded 19 & 20 within ClassFileParser.
>>
>> open webrev at:
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8222502.0/webrev/
>> bug link: https://bugs.openjdk.java.net/browse/JDK-8222502
>>
>>
>> Testing: hs-tier1-3, jdk-tier1-3
>>
>> Thanks,
>> Lois

From claes.redestad at oracle.com  Mon Apr 22 18:30:21 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 22 Apr 2019 20:30:21 +0200
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To: <5CBDF488.7000601@oracle.com>
References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com>
Message-ID: <2d31e0ef-4f13-3551-83d4-3bb9153fd17f@oracle.com>

On 2019-04-22 19:06, Calvin Cheung wrote:
>
> The first 2 apps only have 2 to 3 classes in the dynamic archive. So the
> overhead is likely due to having to open and map the dynamic archive and
> performs checking on header, etc. For small apps, I think it's better to
> use a single archive. The Netty app has around 1400 classes in the
> dynamic archive; the Spring app has about 3700 classes in the dynamic
> archive.

Could this be the overhead due to -XX:+VerifySharedSpaces being enabled
when enabling dynamic archives, possibly leading us to verify even the
base CDS archive (which we wouldn't normally do)?

I've recently proposed that this verification be disabled by default in
all configurations[1], which would keep it from causing surprise
regressions when layering archives.

Thanks!

/Claes

[1] https://bugs.openjdk.java.net/browse/JDK-8221478

From jianglizhou at google.com  Mon Apr 22 21:07:18 2019
From: jianglizhou at google.com (Jiangli Zhou)
Date: Mon, 22 Apr 2019 14:07:18 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To: <5CBA333F.20702@oracle.com>
References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com>
Message-ID:

Hi Calvin,

Congrats on finalizing the dynamic archiving work and completing testing.
After the integration of dynamic archiving, a follow-up RFE can be done
to merge the archiving/copying code in dynamicArchive.* and
metaspaceShared.* for better maintenance in the future. As there is a lot
of duplication between the two, a shared implementation for both static
and dynamic archiving will be beneficial and reduce the maintenance cost.

Here are my comments, mainly for additional cleanups and some minor issues.

- src/hotspot/share/classfile/classLoader.cpp

1337     // FIXME: DynamicDumpSharedSpaces and --patch-modules are mutually exclusive
1338     assert(!DynamicDumpSharedSpaces, "sanity");

I tagged the comment with 'FIXME' to serve as a reminder to add more
details. DynamicDumpSharedSpaces is 'mutually exclusive' with
--patch-module because DynamicDumpSharedSpaces is only enabled when
UseSharedSpaces is also enabled. As --patch-module is not supported with
UseSharedSpaces, it is not supported with DynamicDumpSharedSpaces either.

1518       ik->set_shared_classpath_index(UNREGISTERED_INDEX);
1519       SystemDictionaryShared::set_shared_class_misc_info(ik, (ClassFileStream*)stream);

Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above code.
With the new dynamic archiving capability, it is now possible to
load/archive a class with a user-defined classloader via this call path.
A comment explaining this is also needed.

- src/hotspot/share/classfile/classLoaderExt.cpp

  64 void ClassLoaderExt::setup_app_search_path() {
  65   assert(DumpSharedSpaces || DynamicDumpSharedSpaces,
  66          "this function is only used with -Xshare:dump");

The above message needs to be updated to reflect the new command-line
option.

304   result->set_shared_classpath_index(UNREGISTERED_INDEX);
305   SystemDictionaryShared::set_shared_class_misc_info(result, stream);   <<<<<<<<<<

Why is the set_shared_class_misc_info call being removed? If this is a
bug fix for loading classes from the classlist for user-defined
classloaders, it should be handled separately, and with a separate bug ID
as well.

- src/hotspot/share/classfile/compactHashtable.cpp

207 size_t SimpleCompactHashtable::calculate_header_size() {
208   // We have 5 fields. Each takes up sizeof(intptr_t). See WriteClosure::do_u4
209   size_t bytes = sizeof(intptr_t) * 5;
210   return bytes;
211 }
212
213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) {
214   // NOTE: if you change this function, you MUST change the number 5 in
215   // calculate_header_size() accordingly.
...

As a cleanup, a better way to handle this is to calculate the size within
SimpleCompactHashtable::serialize_header while serializing the data and
store the size in a variable.
SimpleCompactHashtable::calculate_header_size() should simply retrieve
that value. SimpleCompactHashtable::calculate_header_size() could also be
renamed.

- src/hotspot/share/classfile/dictionary.cpp

315 InstanceKlass* Dictionary::find_class(Symbol* name) {
316   unsigned int hash = compute_hash(name);
317   int index = hash_to_index(hash);
318   return find_class(index, hash, name);
319 }

It looks like the new function is not referenced (unless I'm missing
something). Please remove the function.

- src/hotspot/share/classfile/dictionary.hpp

65   InstanceKlass* find_class(Symbol* name);

Same comment as above.

- src/hotspot/share/classfile/symbolTable.cpp

473   Symbol* const _archived; // used by UseSharedArchived2

Please remove 'UseSharedArchived2'. The comment also needs more
clarification.

I couldn't find any references to SymbolTableCreateEntry. Can you please
point me to where it is being used?

- src/hotspot/share/classfile/systemDictionaryShared.cpp

1218   if (DynamicDumpSharedSpaces) {
1219     return false;
1220   } else {

The above case for DynamicDumpSharedSpaces needs to be examined
carefully. Can you please ask Harold (and Coleen or Karen) to take a
look? Also, a comment is needed to explain that we can complete all
verification checks at dynamic dumping time.
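
The compactHashtable.cpp cleanup suggested above (recording the header size while serializing instead of hard-coding the field count 5) might look roughly like the sketch below. Everything in it is hypothetical: SketchClosure, SketchWriter and SketchTable are illustrative stand-ins, not HotSpot's SerializeClosure or SimpleCompactHashtable, and the five fields are placeholders. It only shows the shape of the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a serialization closure (not HotSpot's SerializeClosure).
class SketchClosure {
 public:
  virtual ~SketchClosure() {}
  virtual void do_word(intptr_t* p) = 0;
};

// Appends every visited field to a buffer, one intptr_t slot per field.
class SketchWriter : public SketchClosure {
 public:
  std::vector<intptr_t> out;
  void do_word(intptr_t* p) override { out.push_back(*p); }
};

// Hypothetical table with five placeholder header fields.
class SketchTable {
  intptr_t _f0 = 1, _f1 = 2, _f2 = 3, _f3 = 4, _f4 = 5;
  size_t _header_bytes = 0;  // recorded by serialize_header()

  // Serialize one field and account for the space it takes in the header.
  void visit(SketchClosure* soc, intptr_t* p) {
    soc->do_word(p);
    _header_bytes += sizeof(intptr_t);
  }

 public:
  void serialize_header(SketchClosure* soc) {
    _header_bytes = 0;
    visit(soc, &_f0);
    visit(soc, &_f1);
    visit(soc, &_f2);
    visit(soc, &_f3);
    visit(soc, &_f4);
  }

  // Simply retrieves the recorded value; no magic number to keep in sync.
  size_t header_size() const { return _header_bytes; }
};
```

One caveat with this shape: the size is only known after serialize_header() has run once, so any caller that needs the size before writing would have to run a counting pass first.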

- src/hotspot/share/classfile/systemDictionaryShared.cpp

1279   ResourceMark rm;

You can use 'ResourceMark rm(THREAD)'.

- src/hotspot/share/memory/allocation.hpp

255   //
256   // When CDS is not enabled, both pointers are set to NULL.
257   static void* _shared_metaspace_base;  // (inclusive) low address
258   static void* _shared_metaspace_top;   // (exclusive) high addres

Why was the comment at line 256 removed?

- src/hotspot/share/memory/filemap.cpp

101 void FileMapInfo::fail_continue(const char *msg, ...) {
102   va_list ap;
103   va_start(ap, msg);
104   if (_runtime_dynamic_info == NULL) {
105     MetaspaceShared::set_archive_loading_failed();
106   } else {
107     DynamicArchive::disable();
108   }

The above fail_continue only works if _runtime_dynamic_info is set up
after mapping the base archive. A comment should be added to explain
that.

Can you please rename '_runtime_dynamic_info' so it's more descriptive?
Maybe use 'dynamic_archive_info'.

587 bool FileMapInfo::same_files(const char* file1, const char* file2) {

The use of FileMapInfo::same_files is not necessary and should be
removed. The base archive's CRC checksum values are recorded in the
dynamic archive. The runtime verifies the CRC values to make sure the
same archive is used at dump time and runtime, regardless of the base
archive path or name. It is designed for all use cases:

  * the base CDS archive is specified in -XX:SharedArchiveFile at
    dynamic dumping time
  * -XX:SharedArchiveFile is not specified at dynamic dumping time, and
    the default location for the default CDS archive is used
  * the default CDS archive is specified in -XX:SharedArchiveFile at
    runtime
  * the default CDS archive is not specified in -XX:SharedArchiveFile at
    runtime, and the default location for the default CDS archive is used

In all of the above cases, checking the base archive CRC values is
sufficient. The use of path/name is fragile and should be avoided. That
will allow you to remove the _base_archive_name_size from the dynamic
archive.
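
To illustrate the point about validating the base archive by checksum rather than by path, here is a minimal sketch. It assumes a standard reflected CRC-32 over the archive bytes; the actual CDS code may compute and store its checksums differently (for example per region), and SketchBaseArchive, SketchDynamicHeader, sketch_dump_dynamic and sketch_validate_base are hypothetical names, not FileMapInfo APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise reflected CRC-32 (polynomial 0xEDB88320); an assumption for the
// sketch, not necessarily the routine the VM uses.
uint32_t sketch_crc32(const unsigned char* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      // Mask is all ones when the low bit is set, zero otherwise.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical model of a base archive: where it lives and what it contains.
struct SketchBaseArchive {
  std::string path;
  std::string contents;
};

// Hypothetical dynamic archive header: records only the base archive's CRC.
struct SketchDynamicHeader {
  uint32_t base_crc;
};

static uint32_t crc_of(const SketchBaseArchive& base) {
  return sketch_crc32(
      reinterpret_cast<const unsigned char*>(base.contents.data()),
      base.contents.size());
}

// Dump time: remember the checksum of the base archive that was mapped.
SketchDynamicHeader sketch_dump_dynamic(const SketchBaseArchive& base) {
  return SketchDynamicHeader{crc_of(base)};
}

// Runtime: accept the base archive if its contents match, whatever its path.
bool sketch_validate_base(const SketchDynamicHeader& header,
                          const SketchBaseArchive& base) {
  return header.base_crc == crc_of(base);
}
```

With this shape, a base archive that was moved or renamed still validates, while a different archive placed at the original path is rejected, which is the behaviour described above.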

 752   if (is_static) {
 753     // FIXME check for dynamic header as well
 754     // FIXME Don't just check the last region -- check all regions!

Can you please address the first FIXME at line 753? Checking the last
region is sufficient since the archive is written in sequential order.
The second FIXME is not necessary.

- src/hotspot/share/memory/metaspace.cpp

1417 bool Metaspace::contains(const void* ptr) {
1418   // FIXME: need to check the dynamic archive

Can you please remove the above FIXME? There is no need for a separate
check.

- src/hotspot/share/memory/metaspaceShared.cpp

830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive

Can you please rename the function to fix_cpp_vtable_for_dynamic_archive?

- src/hotspot/share/oops/klass.cpp

527   assert (DumpSharedSpaces || DynamicDumpSharedSpaces,
528           "only called for DumpSharedSpaces");

544 void Klass::remove_java_mirror() {
545   assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only called for DumpSharedSpaces");

Please fix the messages above.

- src/hotspot/share/prims/whitebox.cpp

2332   {CC"getResolvedReferences", CC"(Ljava/lang/Class;)Ljava/lang/Object;", (void*)&WB_GetResolvedReferences},
2333   {CC"linkClass", CC"(Ljava/lang/Class;)V", (void*)&WB_LinkClass},
2334   {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", (void*)&WB_AreOpenArchiveHeapObjectsMapped},

Can you please align the indentation of line 2333 (to be the same as
line 2332 or 2334)?

- src/hotspot/share/runtime/arguments.cpp

1491 bool Arguments::check_unsupported_cds_runtime_properties() {
1492   assert(UseSharedSpaces, "this function is only used with -Xshare:{on,auto}");
1493   assert(ARRAY_SIZE(unsupported_properties) == ARRAY_SIZE(unsupported_options), "must be");
1494   if (ArchiveClassesAtExit != NULL) {
1495     // dynamic dumping, just return false, check_unsupported_dumping_properties() will be called
1496     // in init_shared_archive_paths().
1497     return false;
1498   }

The check_unsupported_cds_runtime_properties() check should be done for
the 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a
combination of both dump time and runtime.

2729     // -Xshare:auto || -Xshare:dynamicDump

As you've renamed the command-line argument for dynamic dumping support,
the comment needs to be fixed.

3125     // Compiler threads may concurrently update the class metadata (such as method entries), so it's
3126     // unsafe with DumpSharedSpaces (which modifies the class metadata in place). Let's disable
3127     // compiler just to be safe.
3128     //
3129     // Note: this is not a concern for DynamicDumpSharedSpaces, which makes a copy of the class metadata
3130     // instead of modifying them in place. The copy is inaccessible to the compiler.
3131     set_mode_flags(_int);

We need to come back and revisit the above for 'static' archive dumping
at some point. There is an RFE filed for that, if I remember correctly.
Could you please add a 'TODO' note to the above comment?

A check should be done in arguments.cpp to make sure
DynamicDumpSharedSpaces is not manipulated from the command line
directly. DynamicDumpSharedSpaces should not be enabled on the command
line without ArchiveClassesAtExit being specified.

- src/hotspot/share/runtime/java.cpp

509
510   // FIXME: is this the right place?
511   if (DynamicDumpSharedSpaces) {
512     DynamicArchive::dump();
513   }

Again, the above 'FIXME' serves as a cleanup reminder. Please get
opinions from others on this change. If the calling place is okay, please
remove the FIXME.

- test

Could you please add a test case for setting DynamicDumpSharedSpaces from
the command line?

I only took a brief look at the test changes. Please ask Misha to review
the test changes as well.
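
As a rough illustration of the arguments.cpp check requested above (rejecting DynamicDumpSharedSpaces when it is enabled without ArchiveClassesAtExit), a sketch could look like the following. SketchFlags and sketch_check_dynamic_dump are hypothetical; the real check would use HotSpot's flag machinery and its own error reporting, neither of which is modelled here.

```cpp
#include <cstdio>

// Hypothetical snapshot of the two relevant flags after argument parsing.
struct SketchFlags {
  bool DynamicDumpSharedSpaces;      // should only be turned on by the VM itself
  const char* ArchiveClassesAtExit;  // nullptr when the option was not given
};

// Returns true when the flag combination is acceptable.
bool sketch_check_dynamic_dump(const SketchFlags& flags) {
  if (flags.DynamicDumpSharedSpaces && flags.ArchiveClassesAtExit == nullptr) {
    // The flag was switched on directly rather than via -XX:ArchiveClassesAtExit.
    std::fprintf(stderr,
                 "DynamicDumpSharedSpaces requires -XX:ArchiveClassesAtExit\n");
    return false;
  }
  return true;
}
```

A jtreg test for this would presumably launch a VM with the flag set directly and expect startup to fail, but the exact test shape is left to the test owners.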
Thanks and regards, Jiangli From jianglizhou at google.com Mon Apr 22 21:16:35 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Mon, 22 Apr 2019 14:16:35 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBDF488.7000601@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> Message-ID: Hi Calvin, Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section? Thanks, Jiangli On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung wrote: > Hi Karen, > > Thanks for your review! > Please see my replies in-line below. > > On 4/19/19, 9:29 AM, Karen Kinnear wrote: > > Calvin, > > > > Many thanks for all the work getting this ready, significantly > > enhancing the testing and bug fixes. > > > > I marked the CSR as reviewed-by - it looks great! > > > > I reviewed this set of changes - I did not review the tests - I assume > > you can get someone > > else to do that. I am grateful that Jiangli and Ioi are going to > > review this also - they are much closer to > > the details than I am. > > > > 1. Do you have any performance numbers? > > 1a. Startup: does using a combined dynamic CDS archive + base archive > > give similar startup benefits > > when you have the same classes in the archives? 
> Below are some performance numbers from Eric, each number is for 50 runs: > (base: using the default CDS archive, > test: using the dynamic archive, > Eric will get some numbers with a single archive which I think that's > what you're looking for) > > Lambda-noop: > base: > 0.066441427 seconds time elapsed > test: > 0.075428824 seconds time elapsed > > Noop: > base: > 0.057614537 seconds time elapsed > test: > 0.066061557 seconds time elapsed > > Netty: > base: > 0.827013307 seconds time elapsed > test: > 0.604982805 seconds time elapsed > > Spring: > base: > 2.376707358 seconds time elapsed > test: > 1.927618893 seconds time elapsed > > The first 2 apps only have 2 to 3 classes in the dynamic archive. So the > overhead is likely due to having to open and map the dynamic archive and > performs checking on header, etc. For small apps, I think it's better to > use a single archive. The Netty app has around 1400 classes in the > dynamic archive; the Spring app has about 3700 classes in the dynamic > archive. > > I also used our LotsOfClasses test to collect some perf numbers. This is > more like runtime performance, not startup performance. > > With dynamic archive (100 runs each): > real 2m37.191s > real 2m36.003s > Total loaded classes = 24254 > Loaded from base archive = 1186 > Loaded from top archive = 23042 > Loaded from jrt:/ (runtime module) = 26 > > With single archive (100 runs each): > real 2m38.346s > real 2m36.947s > Total loaded classes = 24254 > Loaded from archive = 24228 > Loaded from jrt:/ (runtime module) = 26 > > > > > 1b. Do you have samples of uses of the combined dynamic CDS archive + > > base archive vs. a single > > static archive built for an application? > > - how do the sets of archived classes differ? > Currently, the default CDS archive contains around 1187 classes. With > the -XX:ArchiveClassesAtExit option, if the classes are not found in the > default CDS archive, they will be archived in the dynamic archive. 
The > above LotsOfClasses example shows some distributions between various > archives. > > - one note was that the AtExit approach exclude list adds anything > > that has not yet linked - does that make a significant difference in > > the number of classes that are archived? Does that make a difference > > in either startup time or in application execution time? I could see > > that going either way. > As the above numbers indicated, there's not much difference in terms of > execution time using a dynamic vs a single archive with a large number > of classes loaded. The numbers from Netty and Spring apps show an > improvement over default CDS archive. > > > > 1c. Any sense of performance cost for first run - how much time does > > it take to create an incremental archive? > > - is the time comparable to an existing dump for a single archive > > for the application? > > - this is an ease-of-use feature - so we are not expecting that to > > be fast > > - the point is to set expectations in our documentation > I did some rough measurements with the LotsOfClasses test with around > 15000 classes in the classlist. > > Dynamic archive dumping (one run each): > real 0m19.756s > real 0m20.241s > > Static archive dumping (one run each): > real 0m17.725s > real 0m16.993s > > > > 2. Footprint > > With two archives rather than one, is there a significant footprint > > difference? Obviously this will vary by app and archive. > > Once again, the point is to set expectations. > Sizes of the archives for the LotsOfClasses test in 1a. > > Single archive: 242962432 > Default CDS archive: 12365824 > Dynamic archive: 197525504 > > > > > 3. Runtime performance > > With two sets of archived dictionaries & symbolTables - is there any > > significant performance cost to larger benchmarks, e.g. for class > > loading lookup for classes that are not in the archives? Or symbol > > lookup? > I used the LotsOfClasses test again. 
This time archiving about half of > the classes which will be loaded during runtime. > > Dynamic archive (10 runs each): > real 0m30.214s > real 0m29.633s > Loaded classes = 24254 > Loaded from dynamic archive: 13168 > > Single archive (10 runs each): > real 0m32.383s > real 0m32.905s > Loaded classes = 24254 > Loaded from single archive = 15063 > > > > 4. Platform support > > Which platforms is this supported on? > > Which ones did you test? For example, did you run the tests on Windows? > I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris, > Windows). > > > > Detailed feedback on the code: Just minor comments - I don?t need to > > see an updated webrev: > I'm going to look into your detailed feedback below and may reply in a > separate email. > > thanks, > Calvin > > > > 1. metaSpaceShared.hpp > > line 156: > > what is the hardcoded -100 for? Should that be an enum? > > > > 2. jfrRecorder.cpp > > So JFR recordings are disabled if DynamicDumpSharedSpaces? > > why? > > Is that a future rfe? > > > > 3. systemDictionaryShared.cpp > > Could you possibly add a comment to add_verification_constraint > > for if (DynamicDumpSharedSpaces) > > return false > > > > -- I think the logic is: > > because we have successfully linked any instanceKlass we archive > > with DynamicDumpSharedSpaces, we have resolved all the constraint > classes. > > > > -- I didn't check the order - is this called before or after > > excluding? If after, then would it make sense to add an assertion > > here is_linked? Then if you ever change how/when linking is done, this > > might catch future errors. > > > > 4. systemDictionaryShared.cpp > > EstimateSizeForArchive::do_entry > > Is it the case that for info.is_builtin() there are no verification > > constraints? So you could skip that calculation? Or did I misunderstand? > > > > 5. 
compactHashtable.cpp > > serialize/header/calculate_header_size > > -- could you dynamically determine size_of header so you don't need > > to hardcode a 5? > > > > 6. classLoader.cpp > > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are > > mutually exclusive. > > Can you clarify for me: > > My memory of the base archive is that we do not allow the following > > options at dump time - and these > > are the same for the dynamic archive: ?limit-modules, > > ?upgrade-module-path, ?patch-module. > > > > I have forgotten: > > Today with UseSharedSpaces - do we allow these flags? Is that also the > > same behavior with the dynamic > > archive? > > > > 7. classLoaderExt.cpp > > assert line 66: only used with -Xshare:dump > > -> "only used at dump time" > > > > 8. symbolTable.cpp > > line 473: comment // used by UseSharedArchived2 > > ? command-line arg name has changed > > > > 9. filemap.cpp > > Comment lines 529 ... > > Is this true - that you can only support dynamic dumping with the > > default CDS archive? Could you clarify what the restrictions are? > > The CSR implies you can support ?a specific base CDS archive" > > - so base layer can not have appended boot class path > > - and base layer can't have a module path > > > > What can you specify for the dynamic dumping relative to the base > archive? > > - matching class path? > > - appended class path? > > in future - could it have a module path that matched the base archive? > > > > Should any of these restrictions be clarified in documentation/CSR > > since they appear to be new? > > > > 10. filemap.cpp > > check_archive > > Do some of the return false paths skip performing os::close(fd)? > > > > and get_base_archive_name_from_header > > Does the first return false path fail to os::free(dynamic_header) > > > > lines 753-754: two FIXME comments > > > > Could you delete commented out line 1087 in filemap.cpp ? > > > > 11. filemap.hpp > > line 214: TODO left in > > > > 12. 
metaspace.cpp > > line 1418 FIXME left in > > > > 13. java.cpp > > FIXME: is this the right place? > > For starting the DynamicArchive::dump > > > > Please check with David Holmes on that one > > > > 14. dynamicArchive.hpp > > line 55 (and others): MetsapceObj -> MetaspaceObj > > > > 15. dynamicArchive.cpp > > line 285 rel-ayout -> re-layout > > > > lines 277 && 412 > > Do we archive array klasses in the base archive but not in the dynamic > > archive? > > Is that a potential RFE? > > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be > > called with an array class? > > Same question for copy_impl? > > > > line 934: "no onger" -> "no longer" > > > > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a > > potential future rfe? > > Do you want to check in that code at this time? In product? > > > > thanks, > > Karen > > > > > >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung >> > wrote: > >> > >> This is a follow-up on the preliminary code review sent by Jiangli in > >> January[1]. > >> > >> Highlights of changes since then: > >> 1. New vm option for dumping a dynamic archive > >> (-XX:ArchiveClassesAtExit=) and enhancement to the > >> existing -XX:SharedArchiveFile option. Please refer to the > >> corresponding CSR[2] for details. > >> 2. New way to run existing AppCDS tests in dynamic CDS archive mode. > >> At the jtreg command line, the user can run many existing AppCDS > >> tests in dynamic CDS archive mode by specifying the following: > >> -vmoptions:-Dtest.dynamic.cds.archive=true > >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic > >> We will have a follow-up RFE to determine in which tier the above > >> tests should be run. > >> 3. Added more tests. > >> 4. Various bug fixes to improve stability. 
> >> > >> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812 > >> webrev: > >> > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ > >> < > http://cr.openjdk.java.net/%7Eccheung/8207812_dynamic_cds_archive/webrev.00/ > > > >> > >> (The webrev is based on top of the following rev: > >> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) > >> > >> Testing: > >> - mach5 tiers 1- 3 (including the new tests) > >> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few > >> tests require more investigation) > >> > >> thanks, > >> Calvin > >> > >> [1] > >> > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html > >> [2] https://bugs.openjdk.java.net/browse/JDK-8221706 > > > From eric.caspole at oracle.com Mon Apr 22 22:13:30 2019 From: eric.caspole at oracle.com (Eric Caspole) Date: Mon, 22 Apr 2019 18:13:30 -0400 Subject: RFR (XS) 8222818: NMT summary could show the GC in use Message-ID: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> Hi, could I have reviews and any opinions on this little change to show the GC name in the NMT output, as this helps us to more easily triage performance data. This passed tier 1 and 2. Thanks, Eric JBS: https://bugs.openjdk.java.net/browse/JDK-8222818 webrev: http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From david.holmes at oracle.com Tue Apr 23 00:19:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Apr 2019 10:19:31 +1000 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> Message-ID: Hi Eric, On 23/04/2019 8:13 am, Eric Caspole wrote: > Hi, could I have reviews and any opinions on this little change to show > the GC name in the NMT output, as this helps us to more easily triage > performance data. The idea seems fine. 

For the implementation wouldn't it be simpler to do something like:

if (flag == mtGC) {
  out->print("%s - %s (", NMTUtil::flag_to_name(flag),
                          GCConfig::hs_err_name());
} else {
  out->print("-%26s (", NMTUtil::flag_to_name(flag));
}

and skip the need for a local buffer and snprintf?

Aside: it's probably used in enough different contexts that
GCConfig::hs_err_name should be renamed.

Also, if the VM terminates during initialization, is it possible for this
code to be executed before the GCConfig has been set up? And if so, how
will it behave?

Thanks,
David

> This passed tier 1 and 2.
> Thanks,
> Eric
>
>
> JBS:
> https://bugs.openjdk.java.net/browse/JDK-8222818
>
> webrev:
> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/

From david.holmes at oracle.com  Tue Apr 23 01:39:09 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 23 Apr 2019 11:39:09 +1000
Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows
In-Reply-To:
References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com>
Message-ID:

Hi Robin,

Sorry, now Easter break got in the way :)

On 17/04/2019 11:55 pm, Robin Westberg wrote:
>> On 12 Apr 2019, at 11:15, David Holmes wrote:
>> I'd prefer to fix a windows problem, just on windows. I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means).
>>
>> If we need the TimedYield capability in the future then lets revisit that then.
>
> Sure, here's a lighter version of this change that changes the Windows implementation of naked_short_nanosleep, with a few adjustments to some assumptions in the waiting-for-safepoint backoff strategy.
>
> Still passes tier1, with the same performance improvements on Windows (and no obvious regressions on Linux).
>
> New webrev:
> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/

Windows changes look fine - thanks.

Safepoint backoff change seems okay, but what effect does it have on
performance on non-Windows? (javaTimeNanos can sometimes be expensive)

Thanks,
David

> Best regards,
> Robin
>
>>
>> Thanks,
>> David
>> -----
>>
>>> Best regards,
>>> Robin
>>>>
>>>> ...
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>
>>>>> Best regards,
>>>>> Robin
>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote:
>>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote:
>>>>>>>
>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote:
>>>>>>>> Hi David,
>>>>>>>> Thanks for taking a look!
>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote:
>>>>>>>>>
>>>>>>>>> Hi Robin,
>>>>>>>>>
>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used.
>>>>>>>>>
>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken.
>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style).
>>>>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>>> >>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>>>>> >>>>>> Okay, then I?d prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>>> >>>>>> I did try to make TimedYield into a different ?policy? for the existing SpinYield class at first, but didn?t quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>>> >>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>>>> >>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms. >>>>>> >>>>>> Best regards, >>>>>> Robin >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> Best regards, >>>>>>>> Robin >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. 
>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/
>>>>>>>>>> Testing: tier1
>>>>>>>>>> Best regards,
>>>>>>>>>> Robin
>

From zgu at redhat.com  Tue Apr 23 01:57:14 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 22 Apr 2019 21:57:14 -0400
Subject: RFR (XS) 8222818: NMT summary could show the GC in use
In-Reply-To:
References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com>
Message-ID: <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com>

On 4/22/19 8:19 PM, David Holmes wrote:
> Hi Eric,
>
> On 23/04/2019 8:13 am, Eric Caspole wrote:
>> Hi, could I have reviews and any opinions on this little change to
>> show the GC name in the NMT output, as this helps us to more easily
>> triage performance data.
>
> The idea seems fine.
>
> For the implementation wouldn't it be simpler to do something like:
>
> if (flag == mtGC) {
>   out->print("%s - %s (", NMTUtil::flag_to_name(flag),
>                           GCConfig::hs_err_name());
> } else {
>   out->print("-%26s (", NMTUtil::flag_to_name(flag));
> }
>

Yes, this is simpler.

I don't like where the name is placed; it throws off the section
alignment. I would prefer to place the name inside the parentheses, e.g.

- GC (g1 gc reserved=379056KB, committed=93220KB)

Thanks,

-Zhengyu

> and skip the need for a local buffer and snprintf?
>
> Aside: it's probably used in enough different contexts that
> GCConfig::hs_err_name should be renamed.
>
> Also if the VM terminates during initialization is it possible for this
> code to be executed before the GCConfig has been setup? And if so how
> will it behave?
>
> Thanks,
> David
>
>> This passed tier 1 and 2.
>> Thanks,
>> Eric
>>
>>
>> JBS:
>> https://bugs.openjdk.java.net/browse/JDK-8222818
>>
>> webrev:
>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/

From kirk at kodewerk.com  Tue Apr 23 02:34:19 2019
From: kirk at kodewerk.com (Kodewerk)
Date: Mon, 22 Apr 2019 19:34:19 -0700
Subject: RFR (XS) 8222818: NMT summary could show the GC in use
In-Reply-To: <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com>
References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com>
Message-ID:

> On Apr 22, 2019, at 6:57 PM, Zhengyu Gu wrote:
>
>
>
> On 4/22/19 8:19 PM, David Holmes wrote:
>> Hi Eric,
>> On 23/04/2019 8:13 am, Eric Caspole wrote:
>>> Hi, could I have reviews and any opinions on this little change to show the GC name in the NMT output, as this helps us to more easily triage performance data.
>> The idea seems fine.
>
>> For the implementation wouldn't it be simpler to do something like:
>> if (flag == mtGC) {
>>   out->print("%s - %s (", NMTUtil::flag_to_name(flag),
>>                           GCConfig::hs_err_name());
>> } else {
>>   out->print("-%26s (", NMTUtil::flag_to_name(flag));
>> }
> Yes, this is simpler.
>
> I don't like where the name is placed, it screws up section alignments. I would prefer to place name inside parenthesis. e.g.
>
> - GC (g1 gc reserved=379056KB, committed=93220KB)

- GC [g1 gc reserved=379056KB, committed=93220KB]

is a format that is more in line with how other information is presented.
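
For readers following the alignment argument, here is a small standalone sketch of the "name inside the parentheses" variant. It assumes the "-%26s (" prefix from the snippet quoted above; sketch_summary_line is a hypothetical helper that formats into a std::string and is not the NMT reporter code, and the KB figures are just the ones from the example in this thread.

```cpp
#include <cstdio>
#include <string>

// Formats one summary line. The category name is right-aligned in 26 columns,
// so the opening parenthesis lands in the same column for every category.
// gc_name may be nullptr for categories other than GC.
std::string sketch_summary_line(const char* category, const char* gc_name,
                                unsigned long reserved_kb,
                                unsigned long committed_kb) {
  char buf[256];
  if (gc_name != nullptr) {
    std::snprintf(buf, sizeof(buf),
                  "-%26s (%s reserved=%luKB, committed=%luKB)",
                  category, gc_name, reserved_kb, committed_kb);
  } else {
    std::snprintf(buf, sizeof(buf),
                  "-%26s (reserved=%luKB, committed=%luKB)",
                  category, reserved_kb, committed_kb);
  }
  return std::string(buf);
}
```

Because the GC name goes after the opening parenthesis, the category column keeps its width; putting the name before the parenthesis, as in the "%s - %s (" variant, shifts the rest of that line relative to the others.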
Kind regards,
Kirk

From robbin.ehn at oracle.com Tue Apr 23 08:07:36 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 23 Apr 2019 10:07:36 +0200
Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows
In-Reply-To:
References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com>
Message-ID: <8a6c8f0b-21b0-b7f9-e64e-ba7e348a57b3@oracle.com>

Hi Robin,

> New webrev:
> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/

Looks good, thanks for fixing.

/Robbin

>
> Best regards,
> Robin
>
>>
>> Thanks,
>> David
>> -----
>>
>>> Best regards,
>>> Robin
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>
>>>>> Best regards,
>>>>> Robin
>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote:
>>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote:
>>>>>>>
>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote:
>>>>>>>> Hi David,
>>>>>>>> Thanks for taking a look!
>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote:
>>>>>>>>>
>>>>>>>>> Hi Robin,
>>>>>>>>>
>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used.
>>>>>>>>>
>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken.
>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style).
>>>>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case.
>>>>>>>
>>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one).
>>>>>>
>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous.
>>>>>>
>>>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes.
>>>>>>
>>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :)
>>>>>>
>>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms.
>>>>>>
>>>>>> Best regards,
>>>>>> Robin
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Robin
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux.
>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/
>>>>>>>>>> Testing: tier1
>>>>>>>>>> Best regards,
>>>>>>>>>> Robin
>

From daniel.daugherty at oracle.com Tue Apr 23 14:28:00 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 23 Apr 2019 10:28:00 -0400
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
Message-ID: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>

Greetings,

I have a (S)mall patch extracted from the Async Monitor Deflation project
that is ready for code review.

Karen, a number of the changes here are from your code review comments
to the parent bug:

    JDK-8153224 Monitor deflation prolong safepoints
    https://bugs.openjdk.java.net/browse/JDK-8153224

The short version of what this patch is about:

    More baseline cleanups to the ObjectMonitor subsystem.

The details are in the bug report:

    JDK-8222295 more baseline cleanups from Async Monitor Deflation project
    https://bugs.openjdk.java.net/browse/JDK-8222295

Here's the webrev:

http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/

This patch along with the current patch for Async Monitor Deflation
project have been through Mach5 tier[1-8] testing.

I have been actively using the revised assert()'s and guarantee()'s with
additional diagnostic info while debugging my port of the Async Monitor
Deflation project code.

Thanks, in advance, for any questions, comments or suggestions.
Dan

From coleen.phillimore at oracle.com Tue Apr 23 15:41:53 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 23 Apr 2019 11:41:53 -0400
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
In-Reply-To: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>
References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>
Message-ID:

http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html

  37     if (mon == NULL) {
  38       st->print("NULL (this should never be seen!)");
  39     } else {
  40       st->print("{contentions=0x%08x,waiters=0x%08x"
  41                 ",recursions=" INTPTR_FORMAT ",owner=" INTPTR_FORMAT "}",
  42                 mon->contentions(), mon->waiters(), mon->recursions(),
  43                 p2i(mon->owner()));
  44     }

Following convention, it seems like this code should be in
ObjectMonitor::print_on(outputStream* st) so markOop doesn't have to
know objectMonitor fields/accessors.

Otherwise looks like a good self-contained cleanup to me.

Coleen

On 4/23/19 10:28 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a (S)mall patch extracted from the Async Monitor Deflation project
> that is ready for code review.
>
> Karen, a number of the changes here are from your code review comments
> to the parent bug:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> The short version of what this patch is about:
>
>     More baseline cleanups to the ObjectMonitor subsystem.
>
> The details are in the bug report:
>
>     JDK-8222295 more baseline cleanups from Async Monitor Deflation
> project
>     https://bugs.openjdk.java.net/browse/JDK-8222295
>
> Here's the webrev:
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/
>
> This patch along with the current patch for Async Monitor Deflation
> project have been through Mach5 tier[1-8] testing.
>
> I have been actively using the revised assert()'s and guarantee()'s with
> additional diagnostic info while debugging my port of the Async Monitor
> Deflation project code.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan

From daniel.daugherty at oracle.com Tue Apr 23 16:36:12 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 23 Apr 2019 12:36:12 -0400
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
In-Reply-To:
References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>
Message-ID: <43313488-8580-6958-9d87-f298527017fa@oracle.com>

Coleen,

Thanks for the quick review! More below...

On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote:
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html
>
>
>   37     if (mon == NULL) {
>   38       st->print("NULL (this should never be seen!)");
>   39     } else {
>   40       st->print("{contentions=0x%08x,waiters=0x%08x"
>   41                 ",recursions=" INTPTR_FORMAT ",owner=" INTPTR_FORMAT "}",
>   42                 mon->contentions(), mon->waiters(), mon->recursions(),
>   43                 p2i(mon->owner()));
>   44     }
>
>
> Following convention, it seems like this code should be in
> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have to
> know objectMonitor fields/accessors.

That's a really interesting point... When you take a look at the
whole of the markOopDesc::print_on() function, it is trying to
give _some_ visibility into the interpretation of the various
things that we have encoded into the mark oop word/header.
For example, if the mark "is locked", it has this code:

  45   } else if (is_locked()) {
  46     st->print(" locked(" INTPTR_FORMAT ")->", value());
  47     if (is_neutral()) {
  48       st->print("is_neutral");
  49       if (has_no_hash()) {
  50         st->print(" no_hash");
  51       } else {
  52         
st->print(" hash=" INTPTR_FORMAT, hash());
  53       }
  54       st->print(" age=%d", age());
  55     } else if (has_bias_pattern()) {
  56       st->print("is_biased");
  57       JavaThread* jt = biased_locker();
  58       st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt));
  59     } else {
  60       st->print("??");
  61     }

and if the mark "is unlocked", it has this code:

  62   } else {
  63     assert(is_unlocked() || has_bias_pattern(), "just checking");
  64     st->print("mark(");
  65     if (has_bias_pattern()) st->print("biased,");
  66     st->print("hash " INTPTR_FORMAT ",", hash());
  67     st->print("age %d)", age());
  68   }

So I understand the reasons for the limited peek into the
ObjectMonitor for the mark "has monitor" case since we do
that limited level of detail for the other interpretations
of the mark oop header.

Summary: I'm not planning on changing that for this bug.

However, now that I've pasted these code snippets, I think I
see some confusion here. The mark "is locked" and mark "is unlocked"
branches both have code for biased locking. That seems strange to
me, but that should be looked at separately.

> Otherwise looks like a good self-contained cleanup to me.

Thanks! You'll see some of your other requested changes in the
review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13).

Dan

>
> Coleen
>
> On 4/23/19 10:28 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have a (S)mall patch extracted from the Async Monitor Deflation
>> project
>> that is ready for code review.
>>
>> Karen, a number of the changes here are from your code review comments
>> to the parent bug:
>>
>>     JDK-8153224 Monitor deflation prolong safepoints
>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>
>> The short version of what this patch is about:
>>
>>     More baseline cleanups to the ObjectMonitor subsystem.
>>
>> The details are in the bug report:
>>
>>     JDK-8222295 more baseline cleanups from Async Monitor Deflation
>> project
>>     
https://bugs.openjdk.java.net/browse/JDK-8222295
>>
>> Here's the webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/
>>
>> This patch along with the current patch for Async Monitor Deflation
>> project have been through Mach5 tier[1-8] testing.
>>
>> I have been actively using the revised assert()'s and guarantee()'s with
>> additional diagnostic info while debugging my port of the Async Monitor
>> Deflation project code.
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>

From daniel.daugherty at oracle.com Tue Apr 23 16:58:19 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 23 Apr 2019 12:58:19 -0400
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
In-Reply-To: <43313488-8580-6958-9d87-f298527017fa@oracle.com>
References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com>
Message-ID: <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com>

Filed the following new bug:

    JDK-8222893 markOopDesc::print_on() is a bit confused
    https://bugs.openjdk.java.net/browse/JDK-8222893

Coleen, please let me know if I've captured the confusion here... :-)

Dan

P.S.
What can I say? It's code that deals with mark oops, on-stack locks,
biased locks and inflated locks... If there was ever code that had
a right to be confused... ROFL...

On 4/23/19 12:36 PM, Daniel D. Daugherty wrote:
> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote:
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html
>>
>>
>>   37     if (mon == NULL) {
>>   38       st->print("NULL (this should never be seen!)");
>>   39     } else {
>>   40       st->print("{contentions=0x%08x,waiters=0x%08x"
>>   41                 ",recursions=" INTPTR_FORMAT ",owner=" INTPTR_FORMAT "}",
>>   42                 mon->contentions(), mon->waiters(), mon->recursions(),
>>   43                 
p2i(mon->owner()));
>>   44     }
>>
>>
>> Following convention, it seems like this code should be in
>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have to
>> know objectMonitor fields/accessors.
>
> That's a really interesting point... When you take a look at the
> whole of the markOopDesc::print_on() function, it is trying to
> give _some_ visibility into the interpretation of the various
> things that we have encoded into the mark oop word/header.
> For example, if the mark "is locked", it has this code:
>
>   45   } else if (is_locked()) {
>   46     st->print(" locked(" INTPTR_FORMAT ")->", value());
>   47     if (is_neutral()) {
>   48       st->print("is_neutral");
>   49       if (has_no_hash()) {
>   50         st->print(" no_hash");
>   51       } else {
>   52         st->print(" hash=" INTPTR_FORMAT, hash());
>   53       }
>   54       st->print(" age=%d", age());
>   55     } else if (has_bias_pattern()) {
>   56       st->print("is_biased");
>   57       JavaThread* jt = biased_locker();
>   58       st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt));
>   59     } else {
>   60       st->print("??");
>   61     }
>
> and if the mark "is unlocked", it has this code:
>
>   62   } else {
>   63     assert(is_unlocked() || has_bias_pattern(), "just checking");
>   64     st->print("mark(");
>   65     if (has_bias_pattern()) st->print("biased,");
>   66     st->print("hash " INTPTR_FORMAT ",", hash());
>   67     st->print("age %d)", age());
>   68   }
>
> So I understand the reasons for the limited peek into the
> ObjectMonitor for the mark "has monitor" case since we do
> that limited level of detail for the other interpretations
> of the mark oop header.
>
> Summary: I'm not planning on changing that for this bug.
>
> However, now that I've pasted these code snippets, I think I
> see some confusion here. The mark "is locked" and mark "is unlocked"
> branches both have code for biased locking.
That seems strange to
> me, but that should be looked at separately.
>
>
>> Otherwise looks like a good self-contained cleanup to me.
>
> Thanks! You'll see some of your other requested changes in the
> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13).
>
> Dan
>
>
>>
>> Coleen

From karen.kinnear at oracle.com Tue Apr 23 18:16:36 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 23 Apr 2019 14:16:36 -0400
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To:
References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com>
Message-ID: <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com>

Calvin,

I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags. He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits from the archive.

The comment was that it would be very helpful if the user only needed to change their scripts once - so a single command-line argument would create a dynamic archive if one did not exist, and use it if it already existed.

Is there a way to evolve the ArchiveClassesAtExit= to have that functionality?

thanks,
Karen

p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP comments

> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou wrote:
>
> Hi Calvin,
>
> Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section?
>
> Thanks,
> Jiangli
>
> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung
> wrote:
> Hi Karen,
>
> Thanks for your review!
> Please see my replies in-line below.
>
> On 4/19/19, 9:29 AM, Karen Kinnear wrote:
> > Calvin,
> >
> > Many thanks for all the work getting this ready, significantly
> > enhancing the testing and bug fixes.
> >
> > I marked the CSR as reviewed-by - it looks great!
> >
> > I reviewed this set of changes - I did not review the tests - I assume
> > you can get someone
> > else to do that. I am grateful that Jiangli and Ioi are going to
> > review this also - they are much closer to
> > the details than I am.
> >
> > 1. Do you have any performance numbers?
> > 1a. Startup: does using a combined dynamic CDS archive + base archive
> > give similar startup benefits
> > when you have the same classes in the archives?
> Below are some performance numbers from Eric, each number is for 50 runs:
> (base: using the default CDS archive,
> test: using the dynamic archive,
> Eric will get some numbers with a single archive which I think that's
> what you're looking for)
>
> Lambda-noop:
> base:
> 0.066441427 seconds time elapsed
> test:
> 0.075428824 seconds time elapsed
>
> Noop:
> base:
> 0.057614537 seconds time elapsed
> test:
> 0.066061557 seconds time elapsed
>
> Netty:
> base:
> 0.827013307 seconds time elapsed
> test:
> 0.604982805 seconds time elapsed
>
> Spring:
> base:
> 2.376707358 seconds time elapsed
> test:
> 1.927618893 seconds time elapsed
>
> The first 2 apps only have 2 to 3 classes in the dynamic archive. So the
> overhead is likely due to having to open and map the dynamic archive and
> performs checking on header, etc. For small apps, I think it's better to
> use a single archive. The Netty app has around 1400 classes in the
> dynamic archive; the Spring app has about 3700 classes in the dynamic
> archive.
>
> I also used our LotsOfClasses test to collect some perf numbers. This is
> more like runtime performance, not startup performance.
>
> With dynamic archive (100 runs each):
> real 2m37.191s
> real 2m36.003s
> Total loaded classes = 24254
> Loaded from base archive = 1186
> Loaded from top archive = 23042
> Loaded from jrt:/ (runtime module) = 26
>
> With single archive (100 runs each):
> real 2m38.346s
> real 2m36.947s
> Total loaded classes = 24254
> Loaded from archive = 24228
> Loaded from jrt:/ (runtime module) = 26
>
> >
> > 1b. Do you have samples of uses of the combined dynamic CDS archive +
> > base archive vs. a single
> > static archive built for an application?
> > - how do the sets of archived classes differ?
> Currently, the default CDS archive contains around 1187 classes. With
> the -XX:ArchiveClassesAtExit option, if the classes are not found in the
> default CDS archive, they will be archived in the dynamic archive. The
> above LotsOfClasses example shows some distributions between various
> archives.
> > - one note was that the AtExit approach exclude list adds anything
> > that has not yet linked - does that make a significant difference in
> > the number of classes that are archived? Does that make a difference
> > in either startup time or in application execution time? I could see
> > that going either way.
> As the above numbers indicated, there's not much difference in terms of
> execution time using a dynamic vs a single archive with a large number
> of classes loaded. The numbers from Netty and Spring apps show an
> improvement over default CDS archive.
> >
> > 1c. Any sense of performance cost for first run - how much time does
> > it take to create an incremental archive?
> > - is the time comparable to an existing dump for a single archive
> > for the application?
> > - this is an ease-of-use feature - so we are not expecting that to
> > be fast
> > - the point is to set expectations in our documentation
> I did some rough measurements with the LotsOfClasses test with around
> 15000 classes in the classlist.
>
> Dynamic archive dumping (one run each):
> real 0m19.756s
> real 0m20.241s
>
> Static archive dumping (one run each):
> real 0m17.725s
> real 0m16.993s
> >
> > 2. Footprint
> > With two archives rather than one, is there a significant footprint
> > difference? Obviously this will vary by app and archive.
> > Once again, the point is to set expectations.
> Sizes of the archives for the LotsOfClasses test in 1a.
>
> Single archive: 242962432
> Default CDS archive: 12365824
> Dynamic archive: 197525504
>
> >
> > 3. Runtime performance
> > With two sets of archived dictionaries & symbolTables - is there any
> > significant performance cost to larger benchmarks, e.g. for class
> > loading lookup for classes that are not in the archives? Or symbol
> > lookup?
> I used the LotsOfClasses test again. This time archiving about half of
> the classes which will be loaded during runtime.
>
> Dynamic archive (10 runs each):
> real 0m30.214s
> real 0m29.633s
> Loaded classes = 24254
> Loaded from dynamic archive: 13168
>
> Single archive (10 runs each):
> real 0m32.383s
> real 0m32.905s
> Loaded classes = 24254
> Loaded from single archive = 15063
> >
> > 4. Platform support
> > Which platforms is this supported on?
> > Which ones did you test? For example, did you run the tests on Windows?
> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris,
> Windows).
> >
> > Detailed feedback on the code: Just minor comments - I don't need to
> > see an updated webrev:
> I'm going to look into your detailed feedback below and may reply in a
> separate email.
>
> thanks,
> Calvin
> >
> > 1. metaSpaceShared.hpp
> > line 156:
> > what is the hardcoded -100 for? Should that be an enum?
> >
> > 2. jfrRecorder.cpp
> > So JFR recordings are disabled if DynamicDumpSharedSpaces?
> > why?
> > Is that a future rfe?
> >
> > 3.
systemDictionaryShared.cpp
> > Could you possibly add a comment to add_verification_constraint
> > for if (DynamicDumpSharedSpaces)
> > return false
> >
> > -- I think the logic is:
> > because we have successfully linked any instanceKlass we archive
> > with DynamicDumpSharedSpaces, we have resolved all the constraint classes.
> >
> > -- I didn't check the order - is this called before or after
> > excluding? If after, then would it make sense to add an assertion
> > here is_linked? Then if you ever change how/when linking is done, this
> > might catch future errors.
> >
> > 4. systemDictionaryShared.cpp
> > EstimateSizeForArchive::do_entry
> > Is it the case that for info.is_builtin() there are no verification
> > constraints? So you could skip that calculation? Or did I misunderstand?
> >
> > 5. compactHashtable.cpp
> > serialize/header/calculate_header_size
> > -- could you dynamically determine size_of header so you don't need
> > to hardcode a 5?
> >
> > 6. classLoader.cpp
> > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
> > mutually exclusive.
> > Can you clarify for me:
> > My memory of the base archive is that we do not allow the following
> > options at dump time - and these
> > are the same for the dynamic archive: --limit-modules,
> > --upgrade-module-path, --patch-module.
> >
> > I have forgotten:
> > Today with UseSharedSpaces - do we allow these flags? Is that also the
> > same behavior with the dynamic
> > archive?
> >
> > 7. classLoaderExt.cpp
> > assert line 66: only used with -Xshare:dump
> > -> "only used at dump time"
> >
> > 8. symbolTable.cpp
> > line 473: comment // used by UseSharedArchived2
> > - command-line arg name has changed
> >
> > 9. filemap.cpp
> > Comment lines 529 ...
> > Is this true - that you can only support dynamic dumping with the
> > default CDS archive? Could you clarify what the restrictions are?
> > The CSR implies you can support "a specific base CDS archive"
> > - so base layer can not have appended boot class path
> > - and base layer can't have a module path
> >
> > What can you specify for the dynamic dumping relative to the base archive?
> > - matching class path?
> > - appended class path?
> > in future - could it have a module path that matched the base archive?
> >
> > Should any of these restrictions be clarified in documentation/CSR
> > since they appear to be new?
> >
> > 10. filemap.cpp
> > check_archive
> > Do some of the return false paths skip performing os::close(fd)?
> >
> > and get_base_archive_name_from_header
> > Does the first return false path fail to os::free(dynamic_header)
> >
> > lines 753-754: two FIXME comments
> >
> > Could you delete commented out line 1087 in filemap.cpp ?
> >
> > 11. filemap.hpp
> > line 214: TODO left in
> >
> > 12. metaspace.cpp
> > line 1418 FIXME left in
> >
> > 13. java.cpp
> > FIXME: is this the right place?
> > For starting the DynamicArchive::dump
> >
> > Please check with David Holmes on that one
> >
> > 14. dynamicArchive.hpp
> > line 55 (and others): MetsapceObj -> MetaspaceObj
> >
> > 15. dynamicArchive.cpp
> > line 285 rel-ayout -> re-layout
> >
> > lines 277 && 412
> > Do we archive array klasses in the base archive but not in the dynamic
> > archive?
> > Is that a potential RFE?
> > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
> > called with an array class?
> > Same question for copy_impl?
> >
> > line 934: "no onger" -> "no longer"
> >
> > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a
> > potential future rfe?
> > Do you want to check in that code at this time? In product?
> >
> > thanks,
> > Karen
> >
> >
> >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung
> >> >> wrote:
> >>
> >> This is a follow-up on the preliminary code review sent by Jiangli in
> >> January[1].
> >>
> >> Highlights of changes since then:
> >> 1.
New vm option for dumping a dynamic archive
> >> (-XX:ArchiveClassesAtExit=) and enhancement to the
> >> existing -XX:SharedArchiveFile option. Please refer to the
> >> corresponding CSR[2] for details.
> >> 2. New way to run existing AppCDS tests in dynamic CDS archive mode.
> >> At the jtreg command line, the user can run many existing AppCDS
> >> tests in dynamic CDS archive mode by specifying the following:
> >> -vmoptions:-Dtest.dynamic.cds.archive=true
> >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic
> >> We will have a follow-up RFE to determine in which tier the above
> >> tests should be run.
> >> 3. Added more tests.
> >> 4. Various bug fixes to improve stability.
> >>
> >> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812
> >> webrev:
> >> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/
> >>
> >>
> >> (The webrev is based on top of the following rev:
> >> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738 )
> >>
> >> Testing:
> >> - mach5 tiers 1- 3 (including the new tests)
> >> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few
> >> tests require more investigation)
> >>
> >> thanks,
> >> Calvin
> >>
> >> [1]
> >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html
> >> [2] https://bugs.openjdk.java.net/browse/JDK-8221706
> >

From patricio.chilano.mateo at oracle.com Tue Apr 23 18:27:48 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Tue, 23 Apr 2019 14:27:48 -0400
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
In-Reply-To: <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com>
References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com> <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com>
Message-ID: <9eca7f6d-0ca5-5b77-ea75-1e4e9ce7d574@oracle.com>

Hi Dan,

On 4/23/19 12:58 PM, Daniel D. Daugherty wrote:
> Filed the following new bug:
>
>     
JDK-8222893 markOopDesc::print_on() is a bit confused
>     https://bugs.openjdk.java.net/browse/JDK-8222893
>
> Coleen, please let me know if I've captured the confusion here... :-)
>
> Dan
>
> P.S.
> What can I say? It's code that deals with mark oops, on-stack locks,
> biased locks and inflated locks... If there was ever code that had
> a right to be confused... ROFL...

I agree, that block of code seems to be in the wrong branch, I updated the bug with more description.

Thanks,
Patricio

> On 4/23/19 12:36 PM, Daniel D. Daugherty wrote:
>> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote:
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html
>>>
>>>
>>>   37     if (mon == NULL) {
>>>   38       st->print("NULL (this should never be seen!)");
>>>   39     } else {
>>>   40       st->print("{contentions=0x%08x,waiters=0x%08x"
>>>   41                 ",recursions=" INTPTR_FORMAT ",owner=" INTPTR_FORMAT "}",
>>>   42                 mon->contentions(), mon->waiters(), mon->recursions(),
>>>   43                 p2i(mon->owner()));
>>>   44     }
>>>
>>>
>>> Following convention, it seems like this code should be in
>>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have to
>>> know objectMonitor fields/accessors.
>>
>> That's a really interesting point... When you take a look at the
>> whole of the markOopDesc::print_on() function, it is trying to
>> give _some_ visibility into the interpretation of the various
>> things that we have encoded into the mark oop word/header.
>> For example, if the mark "is locked", it has this code:
>>
>>   45   } else if (is_locked()) {
>>   46     st->print(" locked(" INTPTR_FORMAT ")->", value());
>>   47     if (is_neutral()) {
>>   48       st->print("is_neutral");
>>   49       if (has_no_hash()) {
>>   50         st->print(" no_hash");
>>   51       } else {
>>   52         st->print(" hash=" INTPTR_FORMAT, hash());
>>   53       }
>>   54       
st->print(" age=%d", age());
>>   55     } else if (has_bias_pattern()) {
>>   56       st->print("is_biased");
>>   57       JavaThread* jt = biased_locker();
>>   58       st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt));
>>   59     } else {
>>   60       st->print("??");
>>   61     }
>>
>> and if the mark "is unlocked", it has this code:
>>
>>   62   } else {
>>   63     assert(is_unlocked() || has_bias_pattern(), "just checking");
>>   64     st->print("mark(");
>>   65     if (has_bias_pattern()) st->print("biased,");
>>   66     st->print("hash " INTPTR_FORMAT ",", hash());
>>   67     st->print("age %d)", age());
>>   68   }
>>
>> So I understand the reasons for the limited peek into the
>> ObjectMonitor for the mark "has monitor" case since we do
>> that limited level of detail for the other interpretations
>> of the mark oop header.
>>
>> Summary: I'm not planning on changing that for this bug.
>>
>> However, now that I've pasted these code snippets, I think I
>> see some confusion here. The mark "is locked" and mark "is unlocked"
>> branches both have code for biased locking. That seems strange to
>> me, but that should be looked at separately.
>>
>>
>>> Otherwise looks like a good self-contained cleanup to me.
>>
>> Thanks! You'll see some of your other requested changes in the
>> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13).
>>
>> Dan
>>
>>
>>>
>>> Coleen
>

From harold.seigel at oracle.com Tue Apr 23 18:34:12 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 23 Apr 2019 14:34:12 -0400
Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options
Message-ID: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com>

Hi,

Please review this change to make the hotspot BytecodeVerification*
options be diagnostic. Use of either of these options without
-XX:+UnlockDiagnosticVMOptions will now result in the following message:

    > java -XX:+BytecodeVerificationLocal -version
    
Error: VM option 'BytecodeVerificationLocal' is diagnostic and must be enabled via -XX:+UnlockDiagnosticVMOptions. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. Additionally, the java command was run to ensure that -XX:+UnlockDiagnosticVMOptions is needed when specifying the BytecodeVerification* options. Thanks, Harold From lois.foltan at oracle.com Tue Apr 23 18:36:45 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 23 Apr 2019 14:36:45 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> Message-ID: <1cc286aa-9dd6-e4e4-7e5a-d51fb0e16849@oracle.com> Looks good Harold. Lois On 4/23/2019 2:34 PM, Harold Seigel wrote: > Hi, > > Please review this change to make the hotspot BytecodeVerification* > options be diagnostic. Use of either of these options without > -XX:+UnlockDiagnosticVMOptions will now result in the following message: > > > java -XX:+BytecodeVerificationLocal -version > Error: VM option 'BytecodeVerificationLocal' is diagnostic and > must be enabled via -XX:+UnlockDiagnosticVMOptions. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 > > The fix was regression tested by running Mach5 tiers 1 and 2 tests and > builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on > Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64.
> Additionally, the java command was run to ensure that > -XX:+UnlockDiagnosticVMOptions is needed when specifying the > BytecodeVerification* options. > > Thanks, Harold > From harold.seigel at oracle.com Tue Apr 23 18:37:27 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Tue, 23 Apr 2019 14:37:27 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: <1cc286aa-9dd6-e4e4-7e5a-d51fb0e16849@oracle.com> References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> <1cc286aa-9dd6-e4e4-7e5a-d51fb0e16849@oracle.com> Message-ID: <94040351-81ab-12cc-bd13-22dd3d2b0c8f@oracle.com> Thanks Lois! Harold On 4/23/2019 2:36 PM, Lois Foltan wrote: > Looks good Harold. > Lois > > On 4/23/2019 2:34 PM, Harold Seigel wrote: >> Hi, >> >> Please review this change to make the hotspot BytecodeVerification* >> options be diagnostic. Use of either of these options without >> -XX:+UnlockDiagnosticVMOptions will now result in the following message: >> >> > java -XX:+BytecodeVerificationLocal -version >> Error: VM option 'BytecodeVerificationLocal' is diagnostic and >> must be enabled via -XX:+UnlockDiagnosticVMOptions. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on >> Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. >> Additionally, the java command was run to ensure that >> -XX:+UnlockDiagnosticVMOptions is needed when specifying the >> BytecodeVerification* options.
>> >> Thanks, Harold >> > From karen.kinnear at oracle.com Tue Apr 23 18:39:23 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Tue, 23 Apr 2019 14:39:23 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> Message-ID: Dan, Looks good. Thank you for extracting these clean ups - that will make the changes for the Async work clearer. thanks, Karen > On Apr 23, 2019, at 10:28 AM, Daniel D. Daugherty wrote: > > Greetings, > > I have a (S)mall patch extracted from the Async Monitor Deflation project > that is ready for code review. > > Karen, a number of the changes here are from your code review comments > to the parent bug: > > JDK-8153224 Monitor deflation prolong safepoints > https://bugs.openjdk.java.net/browse/JDK-8153224 > > The short version of what this patch is about: > > More baseline cleanups to the ObjectMonitor subsystem. > > The details are in the bug report: > > JDK-8222295 more baseline cleanups from Async Monitor Deflation project > https://bugs.openjdk.java.net/browse/JDK-8222295 > > Here's the webrev: > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/ > > This patch along with the current patch for Async Monitor Deflation > project have been through Mach5 tier[1-8] testing. > > I have been actively using the revised assert()'s and guarantee()'s with > additional diagnostic info while debugging my port of the Async Monitor > Deflation project code. > > Thanks, in advance, for any questions, comments or suggestions. 
> > Dan From eric.caspole at oracle.com Tue Apr 23 18:53:27 2019 From: eric.caspole at oracle.com (Eric Caspole) Date: Tue, 23 Apr 2019 14:53:27 -0400 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> Message-ID: Hi Zhengyu, Hopefully this email comes through in monospace, the alignment is OK for me: currently: - GC (reserved=379056KB, committed=93220KB) (malloc=39184KB #2159) (mmap: reserved=339872KB, committed=54036KB) My version: - GC - g1 gc (reserved=379090KB, committed=93254KB) (malloc=39218KB #2194) (mmap: reserved=339872KB, committed=54036KB) so it is aligned going to the left off the parenthesis like the current version. Is that what you mean? I like the way the GC stands out like this but it is OK to put it in the parentheses on the right. Thanks, Eric On 4/22/19 21:57, Zhengyu Gu wrote: > > > On 4/22/19 8:19 PM, David Holmes wrote: >> Hi Eric, >> >> On 23/04/2019 8:13 am, Eric Caspole wrote: >>> Hi, could I have reviews and any opinions on this little change to >>> show the GC name in the NMT output, as this helps us to more easily >>> triage performance data. >> >> The idea seems fine. > >> >> For the implementation wouldn't it be simpler to do something like: >> >> if (flag == mtGC) { >> out->print("%s - %s (", NMTUtil::flag_to_name(flag), >> GCConfig::hs_err_name()); >> } else { >> out->print("-%26s (", NMTUtil::flag_to_name(flag)); >> } >> > Yes, this is simpler. > > I don't like where the name is placed, it screws up section alignments. > I would prefer to place name inside parenthesis. e.g. > > - GC (g1 gc reserved=379056KB, committed=93220KB) > > Thanks, > > -Zhengyu > >> and skip the need for a local buffer and snprintf? >> >> Aside: it's probably used in enough different contexts that >> GCConfig::hs_err_name should be renamed.
>> >> Also if the VM terminates during initialization is it possible for >> this code to be executed before the GCConfig has been setup? And if so >> how will it behave? >> >> Thanks, >> David >> >>> This passed tier 1 and 2. >>> Thanks, >>> Eric >>> >>> >>> JBS: >>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>> >>> webrev: >>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From coleen.phillimore at oracle.com Tue Apr 23 19:01:29 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Apr 2019 15:01:29 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com> References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com> <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com> Message-ID: <65eecc05-75f7-314b-172f-63ad6f33dc14@oracle.com> On 4/23/19 12:58 PM, Daniel D. Daugherty wrote: > Filed the following new bug: > > JDK-8222893 markOopDesc::print_on() is a bit confused > https://bugs.openjdk.java.net/browse/JDK-8222893 > > Coleen, please let me know if I've captured the confusion here... :-) > > Dan > > P.S. > What can I say? It's code that deals with mark oops, on-stack locks, > biased locks and inflated locks... If there was ever code that had > a right to be confused... ROFL... > > > On 4/23/19 12:36 PM, Daniel D. Daugherty wrote: >> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote: >>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html >>> >>> >>> 37 if (mon == NULL) { >>> 38 st->print("NULL (this should never be seen!)"); >>> 39 } else { >>> 40 st->print("{contentions=0x%08x,waiters=0x%08x" >>> 41 ",recursions=" INTPTR_FORMAT ",owner=" >>> INTPTR_FORMAT "}", >>> 42 mon->contentions(), mon->waiters(), mon->recursions(), >>> 43
p2i(mon->owner())); >>> 44 } >>> >>> >>> Following convention, it seems like this code should be in >>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have to >>> know objectMonitor fields/accessors. >> >> That's a really interesting point... When you take a look at the >> whole of the markOopDesc::print_on() function, it is trying to >> give _some_ visibility into the interpretation of the various >> things that we have encoded into the mark oop word/header. >> For example, if the mark "is locked", it has this code: >> >> 45 } else if (is_locked()) { >> 46 st->print(" locked(" INTPTR_FORMAT ")->", value()); >> 47 if (is_neutral()) { >> 48 st->print("is_neutral"); >> 49 if (has_no_hash()) { >> 50 st->print(" no_hash"); >> 51 } else { >> 52 st->print(" hash=" INTPTR_FORMAT, hash()); >> 53 } >> 54 st->print(" age=%d", age()); >> 55 } else if (has_bias_pattern()) { >> 56 st->print("is_biased"); >> 57 JavaThread* jt = biased_locker(); >> 58 st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt)); >> 59 } else { >> 60 st->print("??"); >> 61 } >> >> and if the mark "is unlocked", it has this code: >> >> 62 } else { >> 63 assert(is_unlocked() || has_bias_pattern(), "just checking"); >> 64 st->print("mark("); >> 65 if (has_bias_pattern()) st->print("biased,"); >> 66 st->print("hash " INTPTR_FORMAT ",", hash()); >> 67 st->print("age %d)", age()); >> 68 } >> >> So I understand the reasons for the limited peek into the >> ObjectMonitor for the mark "has monitor" case since we do >> that limited level of detail for the other interpretations >> of the mark oop header. >> >> Summary: I'm not planning on changing that for this bug. >> >> However, now that I've pasted these code snippets, I think I >> see some confusion here.
The mark "is locked" and mark "is unlocked" >> branches both have code for biased locking. That seems strange to >> me, but that should be looked at separately. >> The difference I see is that the is_locked() branches of markOop::print() code don't try to print *inside* another object, like ObjectLocker, which I'd like to see separated from markOop printing. It can be done via this new bug. There are a lot of disparate things in the markOop header (which should be MarkWord but that's another issue). Printing the biased locking thread didn't seem out of place here, I have to admit. If we printed fields in the Thread, that would be different. >> >>> Otherwise looks like a good self-contained cleanup to me. >> >> Thanks! You'll see some of your other requested changes in the >> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13). Thank you for making these changes. Coleen >> >> Dan >> >> >>> >>> Coleen > From daniel.daugherty at oracle.com Tue Apr 23 19:05:00 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 23 Apr 2019 15:05:00 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: <9eca7f6d-0ca5-5b77-ea75-1e4e9ce7d574@oracle.com> References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com> <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com> <9eca7f6d-0ca5-5b77-ea75-1e4e9ce7d574@oracle.com> Message-ID: <5f54fd89-7c29-764d-fc87-ee90b33e19ee@oracle.com> On 4/23/19 2:27 PM, Patricio Chilano wrote: > Hi Dan, > > On 4/23/19 12:58 PM, Daniel D. Daugherty wrote: >> Filed the following new bug: >> >> JDK-8222893 markOopDesc::print_on() is a bit confused >> https://bugs.openjdk.java.net/browse/JDK-8222893 >> >> Coleen, please let me know if I've captured the confusion here... :-) >> >> Dan >> >> P.S. >> What can I say? It's code that deals with mark oops, on-stack locks, >> biased locks and inflated locks...
If there was ever code that had >> a right to be confused... ROFL... > I agree, that block of code seems to be in the wrong branch, I updated > the bug with more description. Thanks for the sanity check Patricio! Dan > > > Thanks, > Patricio >> On 4/23/19 12:36 PM, Daniel D. Daugherty wrote: >>> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote: >>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html >>>> >>>> >>>> 37 if (mon == NULL) { >>>> 38 st->print("NULL (this should never be seen!)"); >>>> 39 } else { >>>> 40 st->print("{contentions=0x%08x,waiters=0x%08x" >>>> 41 ",recursions=" INTPTR_FORMAT ",owner=" >>>> INTPTR_FORMAT "}", >>>> 42 mon->contentions(), mon->waiters(), mon->recursions(), >>>> 43 p2i(mon->owner())); >>>> 44 } >>>> >>>> >>>> Following convention, it seems like this code should be in >>>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have >>>> to know objectMonitor fields/accessors. >>> >>> That's a really interesting point... When you take a look at the >>> whole of the markOopDesc::print_on() function, it is trying to >>> give _some_ visibility into the interpretation of the various >>> things that we have encoded into the mark oop word/header. >>> For example, if the mark "is locked", it has this code: >>> >>> 45 } else if (is_locked()) { >>> 46 st->print(" locked(" INTPTR_FORMAT ")->", value()); >>> 47 if (is_neutral()) { >>> 48 st->print("is_neutral"); >>> 49 if (has_no_hash()) { >>> 50 st->print(" no_hash"); >>> 51 } else { >>> 52 st->print(" hash=" INTPTR_FORMAT, hash()); >>> 53 } >>> 54 st->print(" age=%d", age()); >>> 55 } else if (has_bias_pattern()) { >>> 56 st->print("is_biased"); >>> 57 JavaThread* jt = biased_locker(); >>> 58
st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt)); >>> 59 } else { >>> 60 st->print("??"); >>> 61 } >>> >>> and if the mark "is unlocked", it has this code: >>> >>> 62 } else { >>> 63 assert(is_unlocked() || has_bias_pattern(), "just checking"); >>> 64 st->print("mark("); >>> 65 if (has_bias_pattern()) st->print("biased,"); >>> 66 st->print("hash " INTPTR_FORMAT ",", hash()); >>> 67 st->print("age %d)", age()); >>> 68 } >>> >>> So I understand the reasons for the limited peek into the >>> ObjectMonitor for the mark "has monitor" case since we do >>> that limited level of detail for the other interpretations >>> of the mark oop header. >>> >>> Summary: I'm not planning on changing that for this bug. >>> >>> However, now that I've pasted these code snippets, I think I >>> see some confusion here. The mark "is locked" and mark "is unlocked" >>> branches both have code for biased locking. That seems strange to >>> me, but that should be looked at separately. >>> >>> >>>> Otherwise looks like a good self-contained cleanup to me. >>> >>> Thanks! You'll see some of your other requested changes in the >>> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13). >>> >>> Dan >>> >>> >>>> >>>> Coleen >> > From daniel.daugherty at oracle.com Tue Apr 23 19:05:29 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 23 Apr 2019 15:05:29 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> Message-ID: On 4/23/19 2:39 PM, Karen Kinnear wrote: > Dan, > > Looks good. Thank you for extracting these clean ups - that will make the changes for > the Async work clearer. Thanks for the review! Dan > > thanks, > Karen > >> On Apr 23, 2019, at 10:28 AM, Daniel D.
Daugherty wrote: >> >> Greetings, >> >> I have a (S)mall patch extracted from the Async Monitor Deflation project >> that is ready for code review. >> >> Karen, a number of the changes here are from your code review comments >> to the parent bug: >> >> JDK-8153224 Monitor deflation prolong safepoints >> https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> The short version of what this patch is about: >> >> More baseline cleanups to the ObjectMonitor subsystem. >> >> The details are in the bug report: >> >> JDK-8222295 more baseline cleanups from Async Monitor Deflation project >> https://bugs.openjdk.java.net/browse/JDK-8222295 >> >> Here's the webrev: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/ >> >> This patch along with the current patch for Async Monitor Deflation >> project have been through Mach5 tier[1-8] testing. >> >> I have been actively using the revised assert()'s and guarantee()'s with >> additional diagnostic info while debugging my port of the Async Monitor >> Deflation project code. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan From daniel.daugherty at oracle.com Tue Apr 23 19:04:08 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 23 Apr 2019 15:04:08 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: <65eecc05-75f7-314b-172f-63ad6f33dc14@oracle.com> References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com> <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com> <65eecc05-75f7-314b-172f-63ad6f33dc14@oracle.com> Message-ID: <4e014765-df8d-df6b-8518-5d042959c87a@oracle.com> On 4/23/19 3:01 PM, coleen.phillimore at oracle.com wrote: > > > On 4/23/19 12:58 PM, Daniel D. Daugherty wrote: >> Filed the following new bug: >> >>
JDK-8222893 markOopDesc::print_on() is a bit confused >> https://bugs.openjdk.java.net/browse/JDK-8222893 >> >> Coleen, please let me know if I've captured the confusion here... :-) >> >> Dan >> >> P.S. >> What can I say? It's code that deals with mark oops, on-stack locks, >> biased locks and inflated locks... If there was ever code that had >> a right to be confused... ROFL... >> >> >> On 4/23/19 12:36 PM, Daniel D. Daugherty wrote: >>> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote: >>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html >>>> >>>> >>>> 37 if (mon == NULL) { >>>> 38 st->print("NULL (this should never be seen!)"); >>>> 39 } else { >>>> 40 st->print("{contentions=0x%08x,waiters=0x%08x" >>>> 41 ",recursions=" INTPTR_FORMAT ",owner=" >>>> INTPTR_FORMAT "}", >>>> 42 mon->contentions(), mon->waiters(), mon->recursions(), >>>> 43 p2i(mon->owner())); >>>> 44 } >>>> >>>> >>>> Following convention, it seems like this code should be in >>>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have >>>> to know objectMonitor fields/accessors. >>> >>> That's a really interesting point... When you take a look at the >>> whole of the markOopDesc::print_on() function, it is trying to >>> give _some_ visibility into the interpretation of the various >>> things that we have encoded into the mark oop word/header. >>> For example, if the mark "is locked", it has this code: >>> >>> 45 } else if (is_locked()) { >>> 46 st->print(" locked(" INTPTR_FORMAT ")->", value()); >>> 47 if (is_neutral()) { >>> 48 st->print("is_neutral"); >>> 49 if (has_no_hash()) { >>> 50 st->print(" no_hash"); >>> 51 } else { >>> 52 st->print(" hash=" INTPTR_FORMAT, hash()); >>> 53 } >>> 54 st->print(" age=%d", age()); >>> 55 } else if (has_bias_pattern()) { >>>
56 st->print("is_biased"); >>> 57 JavaThread* jt = biased_locker(); >>> 58 st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt)); >>> 59 } else { >>> 60 st->print("??"); >>> 61 } >>> >>> and if the mark "is unlocked", it has this code: >>> >>> 62 } else { >>> 63 assert(is_unlocked() || has_bias_pattern(), "just checking"); >>> 64 st->print("mark("); >>> 65 if (has_bias_pattern()) st->print("biased,"); >>> 66 st->print("hash " INTPTR_FORMAT ",", hash()); >>> 67 st->print("age %d)", age()); >>> 68 } >>> >>> So I understand the reasons for the limited peek into the >>> ObjectMonitor for the mark "has monitor" case since we do >>> that limited level of detail for the other interpretations >>> of the mark oop header. >>> >>> Summary: I'm not planning on changing that for this bug. >>> >>> However, now that I've pasted these code snippets, I think I >>> see some confusion here. The mark "is locked" and mark "is unlocked" >>> branches both have code for biased locking. That seems strange to >>> me, but that should be looked at separately. >>> > > The difference I see is that the is_locked() branches of > markOop::print() code don't try to print *inside* another object, like > ObjectLocker, which I'd like to see separated from markOop printing. > It can be done via this new bug. There are a lot of disparate things > in the markOop header (which should be MarkWord but that's another issue). > > Printing the biased locking thread didn't seem out of place here, I > have to admit. If we printed fields in the Thread, that would be > different. No argument about "inside" versus what's already there. What I was trying to say was that the only way to print anything interesting about a mark oop word that refers to an ObjectMonitor is to peek inside that ObjectMonitor. Dan > >>> >>>> Otherwise looks like a good self-contained cleanup to me. >>> >>> Thanks!
You'll see some of your other requested changes in the >>> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13). > > Thank you for making these changes. > > Coleen > >>> >>> Dan >>> >>> >>>> >>>> Coleen >> > From coleen.phillimore at oracle.com Tue Apr 23 19:07:47 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Apr 2019 15:07:47 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: <4e014765-df8d-df6b-8518-5d042959c87a@oracle.com> References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> <43313488-8580-6958-9d87-f298527017fa@oracle.com> <715e4a1d-13c8-077e-692b-8a0e627eb728@oracle.com> <65eecc05-75f7-314b-172f-63ad6f33dc14@oracle.com> <4e014765-df8d-df6b-8518-5d042959c87a@oracle.com> Message-ID: <4c4ad63b-81e0-c75f-978e-88827bae8e44@oracle.com> On 4/23/19 3:04 PM, Daniel D. Daugherty wrote: > On 4/23/19 3:01 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 4/23/19 12:58 PM, Daniel D. Daugherty wrote: >>> Filed the following new bug: >>> >>> JDK-8222893 markOopDesc::print_on() is a bit confused >>> https://bugs.openjdk.java.net/browse/JDK-8222893 >>> >>> Coleen, please let me know if I've captured the confusion here... :-) >>> >>> Dan >>> >>> P.S. >>> What can I say? It's code that deals with mark oops, on-stack locks, >>> biased locks and inflated locks... If there was ever code that had >>> a right to be confused... ROFL... >>> >>> >>> On 4/23/19 12:36 PM, Daniel D. Daugherty wrote: >>>> On 4/23/19 11:41 AM, coleen.phillimore at oracle.com wrote: >>>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/src/hotspot/share/oops/markOop.cpp.frames.html >>>>> >>>>> >>>>> 37 if (mon == NULL) { >>>>> 38 st->print("NULL (this should never be seen!)"); >>>>> 39 } else { >>>>> 40 st->print("{contentions=0x%08x,waiters=0x%08x" >>>>> 41
",recursions=" INTPTR_FORMAT ",owner=" >>>>> INTPTR_FORMAT "}", >>>>> 42 mon->contentions(), mon->waiters(), mon->recursions(), >>>>> 43 p2i(mon->owner())); >>>>> 44 } >>>>> >>>>> >>>>> Following convention, it seems like this code should be in >>>>> ObjectMonitor::print_on(outputStream* st) so markOop doesn't have >>>>> to know objectMonitor fields/accessors. >>>> >>>> That's a really interesting point... When you take a look at the >>>> whole of the markOopDesc::print_on() function, it is trying to >>>> give _some_ visibility into the interpretation of the various >>>> things that we have encoded into the mark oop word/header. >>>> For example, if the mark "is locked", it has this code: >>>> >>>> 45 } else if (is_locked()) { >>>> 46 st->print(" locked(" INTPTR_FORMAT ")->", value()); >>>> 47 if (is_neutral()) { >>>> 48 st->print("is_neutral"); >>>> 49 if (has_no_hash()) { >>>> 50 st->print(" no_hash"); >>>> 51 } else { >>>> 52 st->print(" hash=" INTPTR_FORMAT, hash()); >>>> 53 } >>>> 54 st->print(" age=%d", age()); >>>> 55 } else if (has_bias_pattern()) { >>>> 56 st->print("is_biased"); >>>> 57 JavaThread* jt = biased_locker(); >>>> 58 st->print(" biased_locker=" INTPTR_FORMAT, p2i(jt)); >>>> 59 } else { >>>> 60 st->print("??"); >>>> 61 } >>>> >>>> and if the mark "is unlocked", it has this code: >>>> >>>> 62 } else { >>>> 63 assert(is_unlocked() || has_bias_pattern(), "just checking"); >>>> 64 st->print("mark("); >>>> 65 if (has_bias_pattern()) st->print("biased,"); >>>> 66 st->print("hash " INTPTR_FORMAT ",", hash()); >>>> 67 st->print("age %d)", age()); >>>> 68
} >>>> >>>> So I understand the reasons for the limited peek into the >>>> ObjectMonitor for the mark "has monitor" case since we do >>>> that limited level of detail for the other interpretations >>>> of the mark oop header. >>>> >>>> Summary: I'm not planning on changing that for this bug. >>>> >>>> However, now that I've pasted these code snippets, I think I >>>> see some confusion here. The mark "is locked" and mark "is unlocked" >>>> branches both have code for biased locking. That seems strange to >>>> me, but that should be looked at separately. >>>> >> >> The difference I see is that the is_locked() branches of >> markOop::print() code don't try to print *inside* another object, >> like ObjectLocker, which I'd like to see separated from markOop >> printing. It can be done via this new bug. There are a lot of >> disparate things in the markOop header (which should be MarkWord but >> that's another issue). >> >> Printing the biased locking thread didn't seem out of place here, I >> have to admit. If we printed fields in the Thread, that would be >> different. > > No argument about "inside" versus what's already there. What I was > trying to say was that the only way to print anything interesting > about a mark oop word that refers to an ObjectMonitor is to peek > inside that ObjectMonitor. So this is fine as is, if you want to make an ObjectMonitor::print_on(outputStream*st) and call it in this separate bug. Coleen > > Dan > >> >>>> >>>>> Otherwise looks like a good self-contained cleanup to me. >>>> >>>> Thanks! You'll see some of your other requested changes in the >>>> review thread for JDK-8153224 (CR1/v2.01/4-for-jdk13). >> >> Thank you for making these changes.
>> >> Coleen >> >>>> >>>> Dan >>>> >>>> >>>>> >>>>> Coleen >>> >> > From robbin.ehn at oracle.com Tue Apr 23 19:38:13 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 23 Apr 2019 21:38:13 +0200 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> Message-ID: <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> Hi Dean, Is this what you had in mind: diff -r 295029840379 src/hotspot/share/runtime/frame.cpp --- a/src/hotspot/share/runtime/frame.cpp Tue Apr 23 09:58:55 2019 +0200 +++ b/src/hotspot/share/runtime/frame.cpp Tue Apr 23 21:32:00 2019 +0200 @@ -272,4 +272,6 @@ void frame::deoptimize(JavaThread* thread) { + assert(thread->frame_anchor()->has_last_Java_frame() && + thread->frame_anchor()->walkable(), "must be"); // Schedule deoptimization of an nmethod activation with this frame. assert(_cb != NULL && _cb->is_compiled(), "must be"); Passes t1-5. v2: http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ Inc: http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ Thanks, Robbin On 2019-04-18 06:22, dean.long at oracle.com wrote: > In frame::deoptimize(), can we assert that we have an anchor frame and that it is walkable? > > dl > > On 4/17/19 3:09 AM, Robbin Ehn wrote: >> Adding compiler. >> >> /Robbin >> >> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>> Hi all, please consider this change. >>> >>> The code for deopt suspend is no longer needed since today the register window >>> is always flushed when this code executes. Exactly when this code was needed is not clear, entered via duke changeset >>> 1. I did not dig since we no longer have such use case. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>> >>> Passes t1-5. 
>>> >>> Thanks, Robbin > From karen.kinnear at oracle.com Tue Apr 23 19:41:19 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Tue, 23 Apr 2019 15:41:19 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> Message-ID: <522EB38F-8F20-4156-AF82-7733C615F857@oracle.com> Looks good Harold. thanks, Karen > On Apr 23, 2019, at 2:34 PM, Harold Seigel wrote: > > Hi, > > Please review this change to make the hotspot BytecodeVerification* options be diagnostic. Use of either of these options without -XX:+UnlockDiagnosticVMOptions will now result in the following message: > > > java -XX:+BytecodeVerificationLocal -version > Error: VM option 'BytecodeVerificationLocal' is diagnostic and must be enabled via -XX:+UnlockDiagnosticVMOptions. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 > > The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. Additionally, the java command was run to ensure that -XX:+UnlockDiagnosticVMOptions is needed when specifying the BytecodeVerification* options. > > Thanks, Harold > From harold.seigel at oracle.com Tue Apr 23 19:55:11 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Tue, 23 Apr 2019 15:55:11 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: <522EB38F-8F20-4156-AF82-7733C615F857@oracle.com> References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> <522EB38F-8F20-4156-AF82-7733C615F857@oracle.com> Message-ID: <546b301a-eaae-1673-a4c8-9868cf15cf4d@oracle.com> Thanks Karen! 
Harold On 4/23/2019 3:41 PM, Karen Kinnear wrote: > Looks good Harold. > > thanks, > Karen > >> On Apr 23, 2019, at 2:34 PM, Harold Seigel wrote: >> >> Hi, >> >> Please review this change to make the hotspot BytecodeVerification* options be diagnostic. Use of either of these options without -XX:+UnlockDiagnosticVMOptions will now result in the following message: >> >> > java -XX:+BytecodeVerificationLocal -version >> Error: VM option 'BytecodeVerificationLocal' is diagnostic and must be enabled via -XX:+UnlockDiagnosticVMOptions. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. Additionally, the java command was run to ensure that -XX:+UnlockDiagnosticVMOptions is needed when specifying the BytecodeVerification* options. >> >> Thanks, Harold >> From kirk at kodewerk.com Tue Apr 23 20:03:43 2019 From: kirk at kodewerk.com (Kodewerk) Date: Tue, 23 Apr 2019 13:03:43 -0700 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> Message-ID: Hi Eric, G1GC is obviously GC so why not reduce to... 
- G1GC - (reserved=379090KB, committed=93254KB)
         (malloc=39218KB #2194)
         (mmap: reserved=339872KB, committed=54036KB)

Kind regards,
Kirk

> On Apr 23, 2019, at 11:53 AM, Eric Caspole wrote:
>
> Hi Zhengyu,
> Hopefully this email comes through in monospace, the alignment is OK for me:
>
>
> currently:
>
> -                        GC (reserved=379056KB, committed=93220KB)
>                             (malloc=39184KB #2159)
>                             (mmap: reserved=339872KB, committed=54036KB)
>
>
> My version:
>
>
> -                GC - g1 gc (reserved=379090KB, committed=93254KB)
>                             (malloc=39218KB #2194)
>                             (mmap: reserved=339872KB, committed=54036KB)
>
>
> so it is aligned going to the left off the parenthesis like the current version. Is that what you mean? I like the way the GC stands out like this but it is OK to put it in the parentheses on the right.
>
> Thanks,
> Eric
>
>
>
> On 4/22/19 21:57, Zhengyu Gu wrote:
>> On 4/22/19 8:19 PM, David Holmes wrote:
>>> Hi Eric,
>>>
>>> On 23/04/2019 8:13 am, Eric Caspole wrote:
>>>> Hi, could I have reviews and any opinions on this little change to show the GC name in the NMT output, as this helps us to more easily triage performance data.
>>>
>>> The idea seems fine.
>>>
>>> For the implementation wouldn't it be simpler to do something like:
>>>
>>> if (flag == mtGC) {
>>>   out->print("%s - %s (", NMTUtil::flag_to_name(flag),
>>>              GCConfig::hs_err_name());
>>> } else {
>>>   out->print("-%26s (", NMTUtil::flag_to_name(flag));
>>> }
>>>
>> Yes, this is simpler.
>> I don't like where the name is placed, it screws up section alignments. I would prefer to place name inside parenthesis. e.g.
>> - GC (g1 gc reserved=379056KB, committed=93220KB)
>> Thanks,
>> -Zhengyu
>>> and skip the need for a local buffer and snprintf?
>>>
>>> Aside: it's probably used in enough different contexts that GCConfig::hs_err_name should be renamed.
>>>
>>> Also if the VM terminates during initialization is it possible for this code to be executed before the GCConfig has been setup? And if so how will it behave?
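
[Editorial sketch, not part of the original thread.] The alignment concern raised in this thread comes down to whether the label is padded to a fixed width before the opening parenthesis. The effect can be reproduced outside the VM with the shell's printf, which accepts the same %26s field width as the format string quoted in the review. This is only an illustration under stated assumptions: the function names below are hypothetical, nothing here is HotSpot code, and the reserved/committed figures are simply the ones quoted in the thread.

```shell
#!/bin/sh
# Sketch only: mimics the NMT summary header layout with printf(1).
# All function names are hypothetical; this is not HotSpot code.

# Current layout: a dash, then the category name right-aligned in 26 columns.
nmt_header_plain() {
    printf '%s%26s (' '-' "$1"
}

# Layout in the style of the "My version" output: build "GC - g1 gc" first,
# then pad the combined label to the same 26 columns, so the parenthesis
# stays in the same column as for every other category.
nmt_header_padded() {
    printf '%s%26s (' '-' "$1 - $2"
}

# Layout without a field width: the parenthesis column now depends on the
# length of the GC name, which is the alignment problem discussed above.
nmt_header_unpadded() {
    printf '%s - %s (' "$1" "$2"
}

nmt_header_plain GC;            printf 'reserved=379056KB, committed=93220KB)\n'
nmt_header_padded GC 'g1 gc';   printf 'reserved=379090KB, committed=93254KB)\n'
nmt_header_unpadded GC 'g1 gc'; printf 'reserved=379090KB, committed=93254KB)\n'
```

The padded variant only keeps its column as long as the combined label fits in 26 characters, which is one reason placing the GC name inside the parentheses may be the more robust choice.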
>>> >>> Thanks, >>> David >>> >>>> This passed tier 1 and 2. >>>> Thanks, >>>> Eric >>>> >>>> >>>> JBS: >>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From shade at redhat.com Tue Apr 23 20:10:15 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 23 Apr 2019 22:10:15 +0200 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> Message-ID: On 4/23/19 12:13 AM, Eric Caspole wrote: > JBS: > https://bugs.openjdk.java.net/browse/JDK-8222818 > > webrev: > http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From afar, It does not seem such a good idea to customize the NMT classes. Maybe the intent would be better catered by dumping the JVM arguments in the NMT summary? That would also help to further analyze the data, for example, correlate it with the number of compiler/GC threads, heap sizes, exotic options set, etc. -Aleksey From calvin.cheung at oracle.com Tue Apr 23 20:26:26 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 23 Apr 2019 13:26:26 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> Message-ID: <5CBF74F2.9000206@oracle.com> On 4/23/19, 11:16 AM, Karen Kinnear wrote: > Calvin, > > I added to the CSR a comment from my favorite customer - relative to > the user model for the command-line flags. > He likes the proposal to reduce the number of steps a customer has to > perform to get startup and footprint benefits > from the archive. 
> > The comment was that it would be very helpful if the user only needed > to change their scripts once - so > a single command-line argument would create a dynamic archive if one > did not exist, and use it if it > already existed. > > Is there a way to evolve the ArchiveClassesAtExit= to > have that functionality? One drawback of this proposal is that the ArchiveClassesAtExit option has 2 meanings which I find confusing. Maybe eventually we can do some kind of automatic CDS archive dumping without having to specify any command line option. Such as when a java app is run at the first time, there will be some CDS archive created with a unique name. Subsequent run of the same app will make use of the archive. > > thanks, > Karen > > p.s. I think it makes more sense to put performance numbers in the > implementation RFE comments rather than the JEP > comments We found a bug yesterday and will run more performance tests. I'll put the performance numbers in the implementation RFE. thanks, Calvin > >> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou > > wrote: >> >> Hi Calvin, >> >> Can you please also publish the final performance numbers in the JEP >> 350 (or the implementation RFE) comment section? >> >> Thanks, >> Jiangli >> >> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung >> > wrote: >> >> Hi Karen, >> >> Thanks for your review! >> Please see my replies in-line below. >> >> On 4/19/19, 9:29 AM, Karen Kinnear wrote: >> > Calvin, >> > >> > Many thanks for all the work getting this ready, significantly >> > enhancing the testing and bug fixes. >> > >> > I marked the CSR as reviewed-by - it looks great! >> > >> > I reviewed this set of changes - I did not review the tests - I >> assume >> > you can get someone >> > else to do that. I am grateful that Jiangli and Ioi are going to >> > review this also - they are much closer to >> > the details than I am. >> > >> > 1. Do you have any performance numbers? >> > 1a.
Startup: does using a combined dynamic CDS archive + base >> archive >> > give similar startup benefits >> > when you have the same classes in the archives? >> Below are some performance numbers from Eric, each number is for >> 50 runs: >> (base: using the default CDS archive, >> test: using the dynamic archive, >> Eric will get some numbers with a single archive which I think that's >> what you're looking for) >> >> Lambda-noop: >> base: >> 0.066441427 seconds time elapsed >> test: >> 0.075428824 seconds time elapsed >> >> Noop: >> base: >> 0.057614537 seconds time elapsed >> test: >> 0.066061557 seconds time elapsed >> >> Netty: >> base: >> 0.827013307 seconds time elapsed >> test: >> 0.604982805 seconds time elapsed >> >> Spring: >> base: >> 2.376707358 seconds time elapsed >> test: >> 1.927618893 seconds time elapsed >> >> The first 2 apps only have 2 to 3 classes in the dynamic archive. >> So the >> overhead is likely due to having to open and map the dynamic >> archive and >> performs checking on header, etc. For small apps, I think it's >> better to >> use a single archive. The Netty app has around 1400 classes in the >> dynamic archive; the Spring app has about 3700 classes in the dynamic >> archive. >> >> I also used our LotsOfClasses test to collect some perf numbers. >> This is >> more like runtime performance, not startup performance. >> >> With dynamic archive (100 runs each): >> real 2m37.191s >> real 2m36.003s >> Total loaded classes = 24254 >> Loaded from base archive = 1186 >> Loaded from top archive = 23042 >> Loaded from jrt:/ (runtime module) = 26 >> >> With single archive (100 runs each): >> real 2m38.346s >> real 2m36.947s >> Total loaded classes = 24254 >> Loaded from archive = 24228 >> Loaded from jrt:/ (runtime module) = 26 >> >> > >> > 1b. Do you have samples of uses of the combined dynamic CDS >> archive + >> > base archive vs. a single >> > static archive built for an application? >> > - how do the sets of archived classes differ? 
>> Currently, the default CDS archive contains around 1187 classes. With >> the -XX:ArchiveClassesAtExit option, if the classes are not found >> in the >> default CDS archive, they will be archived in the dynamic >> archive. The >> above LotsOfClasses example shows some distributions between various >> archives. >> > - one note was that the AtExit approach exclude list adds >> anything >> > that has not yet linked - does that make a significant >> difference in >> > the number of classes that are archived? Does that make a >> difference >> > in either startup time or in application execution time? I >> could see >> > that going either way. >> As the above numbers indicated, there's not much difference in >> terms of >> execution time using a dynamic vs a single archive with a large >> number >> of classes loaded. The numbers from Netty and Spring apps show an >> improvement over default CDS archive. >> > >> > 1c. Any sense of performance cost for first run - how much time >> does >> > it take to create an incremental archive? >> > - is the time comparable to an existing dump for a single >> archive >> > for the application? >> > - this is an ease-of-use feature - so we are not expecting >> that to >> > be fast >> > - the point is to set expectations in our documentation >> I did some rough measurements with the LotsOfClasses test with around >> 15000 classes in the classlist. >> >> Dynamic archive dumping (one run each): >> real 0m19.756s >> real 0m20.241s >> >> Static archive dumping (one run each): >> real 0m17.725s >> real 0m16.993s >> > >> > 2. Footprint >> > With two archives rather than one, is there a significant footprint >> > difference? Obviously this will vary by app and archive. >> > Once again, the point is to set expectations. >> Sizes of the archives for the LotsOfClasses test in 1a. >> >> Single archive: 242962432 >> Default CDS archive: 12365824 >> Dynamic archive: 197525504 >> >> > >> > 3. 
Runtime performance >> > With two sets of archived dictionaries & symbolTables - is >> there any >> > significant performance cost to larger benchmarks, e.g. for class >> > loading lookup for classes that are not in the archives? Or symbol >> > lookup? >> I used the LotsOfClasses test again. This time archiving about >> half of >> the classes which will be loaded during runtime. >> >> Dynamic archive (10 runs each): >> real 0m30.214s >> real 0m29.633s >> Loaded classes = 24254 >> Loaded from dynamic archive: 13168 >> >> Single archive (10 runs each): >> real 0m32.383s >> real 0m32.905s >> Loaded classes = 24254 >> Loaded from single archive = 15063 >> > >> > 4. Platform support >> > Which platforms is this supported on? >> > Which ones did you test? For example, did you run the tests on >> Windows? >> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, >> Solaris, >> Windows). >> > >> > Detailed feedback on the code: Just minor comments - I don't >> need to >> > see an updated webrev: >> I'm going to look into your detailed feedback below and may reply >> in a >> separate email. >> >> thanks, >> Calvin >> > >> > 1. metaSpaceShared.hpp >> > line 156: >> > what is the hardcoded -100 for? Should that be an enum? >> > >> > 2. jfrRecorder.cpp >> > So JFR recordings are disabled if DynamicDumpSharedSpaces? >> > why? >> > Is that a future rfe? >> > >> > 3. systemDictionaryShared.cpp >> > Could you possibly add a comment to add_verification_constraint >> > for if (DynamicDumpSharedSpaces) >> > return false >> > >> > -- I think the logic is: >> > because we have successfully linked any instanceKlass we archive >> > with DynamicDumpSharedSpaces, we have resolved all the >> constraint classes. >> > >> > -- I didn't check the order - is this called before or after >> > excluding? If after, then would it make sense to add an assertion >> > here is_linked? Then if you ever change how/when linking is >> done, this >> > might catch future errors. >> > >> > 4.
systemDictionaryShared.cpp >> > EstimateSizeForArchive::do_entry >> > Is it the case that for info.is_builtin() there are no verification >> > constraints? So you could skip that calculation? Or did I >> misunderstand? >> > >> > 5. compactHashtable.cpp >> > serialize/header/calculate_header_size >> > -- could you dynamically determine size_of header so you don't need >> > to hardcode a 5? >> > >> > 6. classLoader.cpp >> > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are >> > mutually exclusive. >> > Can you clarify for me: >> > My memory of the base archive is that we do not allow the following >> > options at dump time - and these >> > are the same for the dynamic archive: --limit-modules, >> > --upgrade-module-path, --patch-module. >> > >> > I have forgotten: >> > Today with UseSharedSpaces - do we allow these flags? Is that >> also the >> > same behavior with the dynamic >> > archive? >> > >> > 7. classLoaderExt.cpp >> > assert line 66: only used with -Xshare:dump >> > -> "only used at dump time" >> > >> > 8. symbolTable.cpp >> > line 473: comment // used by UseSharedArchived2 >> > - command-line arg name has changed >> > >> > 9. filemap.cpp >> > Comment lines 529 ... >> > Is this true - that you can only support dynamic dumping with the >> > default CDS archive? Could you clarify what the restrictions are? >> > The CSR implies you can support "a specific base CDS archive" >> > - so base layer can not have appended boot class path >> > - and base layer can't have a module path >> > >> > What can you specify for the dynamic dumping relative to the >> base archive? >> > - matching class path? >> > - appended class path? >> > in future - could it have a module path that matched the base >> archive? >> > >> > Should any of these restrictions be clarified in documentation/CSR >> > since they appear to be new? >> > >> > 10. filemap.cpp >> > check_archive >> > Do some of the return false paths skip performing os::close(fd)?
>> > >> > and get_base_archive_name_from_header >> > Does the first return false path fail to os::free(dynamic_header) >> > >> > lines 753-754: two FIXME comments >> > >> > Could you delete commented out line 1087 in filemap.cpp ? >> > >> > 11. filemap.hpp >> > line 214: TODO left in >> > >> > 12. metaspace.cpp >> > line 1418 FIXME left in >> > >> > 13. java.cpp >> > FIXME: is this the right place? >> > For starting the DynamicArchive::dump >> > >> > Please check with David Holmes on that one >> > >> > 14. dynamicArchive.hpp >> > line 55 (and others): MetsapceObj -> MetaspaceObj >> > >> > 15. dynamicArchive.cpp >> > line 285 rel-ayout -> re-layout >> > >> > lines 277 && 412 >> > Do we archive array klasses in the base archive but not in the >> dynamic >> > archive? >> > Is that a potential RFE? >> > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be >> > called with an array class? >> > Same question for copy_impl? >> > >> > line 934: "no onger" -> "no longer" >> > >> > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a >> > potential future rfe? >> > Do you want to check in that code at this time? In product? >> > >> > thanks, >> > Karen >> > >> > >> >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung >> >> >> > >> wrote: >> >> >> >> This is a follow-up on the preliminary code review sent by >> Jiangli in >> >> January[1]. >> >> >> >> Highlights of changes since then: >> >> 1. New vm option for dumping a dynamic archive >> >> (-XX:ArchiveClassesAtExit=) and enhancement >> to the >> >> existing -XX:SharedArchiveFile option. Please refer to the >> >> corresponding CSR[2] for details. >> >> 2. New way to run existing AppCDS tests in dynamic CDS archive >> mode. 
>> >> At the jtreg command line, the user can run many existing AppCDS >> >> tests in dynamic CDS archive mode by specifying the following: >> >> -vmoptions:-Dtest.dynamic.cds.archive=true >> >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic >> >> We will have a follow-up RFE to determine in which tier the >> above >> >> tests should be run. >> >> 3. Added more tests. >> >> 4. Various bug fixes to improve stability. >> >> >> >> RFE:https://bugs.openjdk.java.net/browse/JDK-8207812 >> >> webrev: >> >>http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ >> >> >> >> >> >> >> >> (The webrev is based on top of the following rev: >> >>http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) >> >> >> >> Testing: >> >> - mach5 tiers 1- 3 (including the new tests) >> >> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few >> >> tests require more investigation) >> >> >> >> thanks, >> >> Calvin >> >> >> >> [1] >> >>https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html >> >> [2]https://bugs.openjdk.java.net/browse/JDK-8221706 >> > >> > From ioi.lam at oracle.com Tue Apr 23 20:43:42 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 23 Apr 2019 13:43:42 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBF74F2.9000206@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> <5CBF74F2.9000206@oracle.com> Message-ID: On 4/23/19 1:26 PM, Calvin Cheung wrote: > > > On 4/23/19, 11:16 AM, Karen Kinnear wrote: >> Calvin, >> >> I added to the CSR a comment from my favorite customer - relative to >> the user model for the command-line flags. >> He likes the proposal to reduce the number of steps a customer has to >> perform to get startup and footprint benefits >> from the archive. 
>>
>> The comment was that it would be very helpful if the user only needed
>> to change their scripts once - so
>> a single command-line argument would create a dynamic archive if one
>> did not exist, and use it if it
>> already existed.
>>
>> Is there a way to evolve the ArchiveClassesAtExit=
>> to have that functionality?
> One drawback of this proposal is that the ArchiveClassesAtExit option
> has 2 meanings which I find confusing.
> Maybe eventually we can do some kind of automatic CDS archive dumping
> without having to specify any command line option. Such as when a java
> app is run at the first time, there will be some CDS archive created
> with a unique name. Subsequent run of the same app will make use of
> the archive.

When we scoped this JEP, we wanted to provide just the minimal building blocks, so a user could implement automation on top of the JVM. Something like

ARCHIVE=foo.jsa
if test -f $ARCHIVE; then
    FLAG="-XX:SharedArchiveFile=$ARCHIVE"
else
    FLAG="-XX:ArchiveClassesAtExit=$ARCHIVE"
fi
$JAVA_HOME/bin/java -cp foo.jar $FLAG FooApp

Note that you also need to update the archive if the Java version has changed, so the test would be a little more complicated

ARCHIVE=foo.jsa
VERSION=foo.version
if test -f $ARCHIVE -a -f $VERSION && cmp -s $VERSION $JAVA_HOME/release; then
    FLAG="-XX:SharedArchiveFile=$ARCHIVE"
else
    FLAG="-XX:ArchiveClassesAtExit=$ARCHIVE"
    cp -f $JAVA_HOME/release $VERSION
fi
$JAVA_HOME/bin/java -cp foo.jar $FLAG FooApp

As Calvin mentioned, we are planning to make the archive management more automatic. So eventually you might be able to do something like

java -Xshare:reallyauto -cp foo.jar FooApp

And the JSA file will be automatically generated if necessary. We probably need some logic to delete older archives to avoid filling up the disk.

I think the automation feature needs to be carefully planned out, so we should do that in a follow-up RFE or JEP.

Thanks
- Ioi

>>
>> thanks,
>> Karen
>>
>> p.s.
I think it makes more sense to put performance numbers in the >> implementation RFE comments rather than the JEP >> comments > We found a bug yesterday and will run more performance tests. I'll put > the performance numbers in the implementation RFE. > > thanks, > Calvin >> >>> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou >> > wrote: >>> >>> Hi Calvin, >>> >>> Can you please also publish the final performance numbers in the JEP >>> 350 (or the implementation RFE) comment section? >>> >>> Thanks, >>> Jiangli >>> >>> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung >>> > wrote: >>> >>> Hi Karen, >>> >>> Thanks for your review! >>> Please see my replies in-line below. >>> >>> On 4/19/19, 9:29 AM, Karen Kinnear wrote: >>> > Calvin, >>> > >>> > Many thanks for all the work getting this ready, significantly >>> > enhancing the testing and bug fixes. >>> > >>> > I marked the CSR as reviewed-by - it looks great! >>> > >>> > I reviewed this set of changes - I did not review the tests - I >>> assume >>> > you can get someone >>> > else to do that. I am grateful that Jiangli and Ioi are going to >>> > review this also - they are much closer to >>> > the details than I am. >>> > >>> > 1. Do you have any performance numbers? >>> > 1a. Startup: does using a combined dynamic CDS archive + base >>> archive >>> > give similar startup benefits >>> > when you have the same classes in the archives? >>> Below are some performance numbers from Eric, each number is for >>> 50 runs: >>> (base: using the default CDS archive, >>> test: using the dynamic archive, >>> Eric will get some numbers with a single archive which I think >>> that's >>> what you're looking for) >>> >>> Lambda-noop: >>> base: >>> 0.066441427 seconds time elapsed >>> test: >>> 0.075428824 seconds time elapsed >>> >>> Noop: >>> base: >>> 0.057614537 seconds time elapsed >>>
test: >>> 0.066061557 seconds time elapsed >>> >>> Netty: >>> base: >>> 0.827013307 seconds time elapsed >>> test: >>> 0.604982805 seconds time elapsed >>> >>> Spring: >>> base: >>> 2.376707358 seconds time elapsed >>> test: >>> 1.927618893 seconds time elapsed >>> >>> The first 2 apps only have 2 to 3 classes in the dynamic archive. >>> So the >>> overhead is likely due to having to open and map the dynamic >>> archive and >>> performs checking on header, etc. For small apps, I think it's >>> better to >>> use a single archive. The Netty app has around 1400 classes in the >>> dynamic archive; the Spring app has about 3700 classes in the >>> dynamic >>> archive. >>> >>> I also used our LotsOfClasses test to collect some perf numbers. >>> This is >>> more like runtime performance, not startup performance. >>> >>> With dynamic archive (100 runs each): >>> real    2m37.191s >>> real    2m36.003s >>> Total loaded classes = 24254 >>> Loaded from base archive = 1186 >>> Loaded from top archive = 23042 >>> Loaded from jrt:/ (runtime module) = 26 >>> >>> With single archive (100 runs each): >>> real    2m38.346s >>> real    2m36.947s >>> Total loaded classes = 24254 >>> Loaded from archive = 24228 >>> Loaded from jrt:/ (runtime module) = 26 >>> >>> > >>> > 1b. Do you have samples of uses of the combined dynamic CDS >>> archive + >>> > base archive vs. a single >>> > static archive built for an application? >>> >     - how do the sets of archived classes differ? >>> Currently, the default CDS archive contains around 1187 classes. >>> With >>> the -XX:ArchiveClassesAtExit option, if the classes are not found >>> in the >>> default CDS archive, they will be archived in the dynamic >>> archive. The >>> above LotsOfClasses example shows some distributions between >>> various >>>
archives. >>> >     - one note was that the AtExit approach exclude list adds >>> anything >>> > that has not yet linked - does that make a significant >>> difference in >>> > the number of classes that are archived? Does that make a >>> difference >>> > in either startup time or in application execution time? I >>> could see >>> > that going either way. >>> As the above numbers indicated, there's not much difference in >>> terms of >>> execution time using a dynamic vs a single archive with a large >>> number >>> of classes loaded. The numbers from Netty and Spring apps show an >>> improvement over default CDS archive. >>> > >>> > 1c. Any sense of performance cost for first run - how much time >>> does >>> > it take to create an incremental archive? >>> >     - is the time comparable to an existing dump for a single >>> archive >>> > for the application? >>> >     - this is an ease-of-use feature - so we are not expecting >>> that to >>> > be fast >>> >     - the point is to set expectations in our documentation >>> I did some rough measurements with the LotsOfClasses test with >>> around >>> 15000 classes in the classlist. >>> >>> Dynamic archive dumping (one run each): >>> real    0m19.756s >>> real    0m20.241s >>> >>> Static archive dumping (one run each): >>> real    0m17.725s >>> real    0m16.993s >>> > >>> > 2. Footprint >>> > With two archives rather than one, is there a significant >>> footprint >>> > difference? Obviously this will vary by app and archive. >>> > Once again, the point is to set expectations. >>> Sizes of the archives for the LotsOfClasses test in 1a. >>> >>> Single archive: 242962432 >>> Default CDS archive: 12365824 >>> Dynamic archive: 197525504 >>> >>> > >>> > 3. Runtime performance >>> > With two sets of archived dictionaries & symbolTables - is >>>
there any >>> > significant performance cost to larger benchmarks, e.g. for class >>> > loading lookup for classes that are not in the archives? Or >>> symbol >>> > lookup? >>> I used the LotsOfClasses test again. This time archiving about >>> half of >>> the classes which will be loaded during runtime. >>> >>> Dynamic archive (10 runs each): >>> real    0m30.214s >>> real    0m29.633s >>> Loaded classes = 24254 >>> Loaded from dynamic archive: 13168 >>> >>> Single archive (10 runs each): >>> real    0m32.383s >>> real    0m32.905s >>> Loaded classes = 24254 >>> Loaded from single archive = 15063 >>> > >>> > 4. Platform support >>> > Which platforms is this supported on? >>> > Which ones did you test? For example, did you run the tests on >>> Windows? >>> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, >>> Solaris, >>> Windows). >>> > >>> > Detailed feedback on the code: Just minor comments - I don't >>> need to >>> > see an updated webrev: >>> I'm going to look into your detailed feedback below and may reply >>> in a >>> separate email. >>> >>> thanks, >>> Calvin >>> > >>> > 1. metaSpaceShared.hpp >>> > line 156: >>> > what is the hardcoded -100 for? Should that be an enum? >>> > >>> > 2. jfrRecorder.cpp >>> > So JFR recordings are disabled if DynamicDumpSharedSpaces? >>> > why? >>> > Is that a future rfe? >>> > >>> > 3. systemDictionaryShared.cpp >>> > Could you possibly add a comment to add_verification_constraint >>> > for if (DynamicDumpSharedSpaces) >>> >    return false >>> > >>> > -- I think the logic is: >>> >   because we have successfully linked any instanceKlass we >>> archive >>> > with DynamicDumpSharedSpaces, we have resolved all the >>> constraint classes. >>> > >>>
> -- I didn't check the order - is this called before or after >>> > excluding? If after, then would it make sense to add an assertion >>> > here is_linked? Then if you ever change how/when linking is >>> done, this >>> > might catch future errors. >>> > >>> > 4. systemDictionaryShared.cpp >>> > EstimateSizeForArchive::do_entry >>> > Is it the case that for info.is_builtin() there are no >>> verification >>> > constraints? So you could skip that calculation? Or did I >>> misunderstand? >>> > >>> > 5. compactHashtable.cpp >>> > serialize/header/calculate_header_size >>> > -- could you dynamically determine size_of header so you don't >>> need >>> > to hardcode a 5? >>> > >>> > 6. classLoader.cpp >>> > line 1337: //FIXME: DynamicDumpSharedSpaces and >>> --patch-modules are >>> > mutually exclusive. >>> > Can you clarify for me: >>> > My memory of the base archive is that we do not allow the >>> following >>> > options at dump time - and these >>> > are the same for the dynamic archive: --limit-modules, >>> > --upgrade-module-path, --patch-module. >>> > >>> > I have forgotten: >>> > Today with UseSharedSpaces - do we allow these flags? Is that >>> also the >>> > same behavior with the dynamic >>> > archive? >>> > >>> > 7. classLoaderExt.cpp >>> > assert line 66: only used with -Xshare:dump >>> >  -> "only used at dump time" >>> > >>> > 8. symbolTable.cpp >>> > line 473: comment // used by UseSharedArchived2 >>> > - command-line arg name has changed >>> > >>> > 9. filemap.cpp >>> > Comment lines 529 ... >>> > Is this true - that you can only support dynamic dumping with the >>> > default CDS archive? Could you clarify what the restrictions are? >>> > The CSR implies you can support "a specific base CDS archive" >>> >   - so base layer can not have appended boot class path >>> >
- and base layer can't have a module path >>> > >>> > What can you specify for the dynamic dumping relative to the >>> base archive? >>> >   - matching class path? >>> >   - appended class path? >>> >   in future - could it have a module path that matched the base >>> archive? >>> > >>> > Should any of these restrictions be clarified in >>> documentation/CSR >>> > since they appear to be new? >>> > >>> > 10. filemap.cpp >>> > check_archive >>> > Do some of the return false paths skip performing os::close(fd)? >>> > >>> > and get_base_archive_name_from_header >>> > Does the first return false path fail to os::free(dynamic_header) >>> > >>> > lines 753-754: two FIXME comments >>> > >>> > Could you delete commented out line 1087 in filemap.cpp ? >>> > >>> > 11. filemap.hpp >>> > line 214: TODO left in >>> > >>> > 12. metaspace.cpp >>> > line 1418 FIXME left in >>> > >>> > 13. java.cpp >>> > FIXME: is this the right place? >>> > For starting the DynamicArchive::dump >>> > >>> > Please check with David Holmes on that one >>> > >>> > 14. dynamicArchive.hpp >>> > line 55 (and others): MetsapceObj -> MetaspaceObj >>> > >>> > 15. dynamicArchive.cpp >>> > line 285 rel-ayout -> re-layout >>> > >>> > lines 277 && 412 >>> > Do we archive array klasses in the base archive but not in the >>> dynamic >>> > archive? >>> > Is that a potential RFE? >>> > Is it possible that GatherKlassesAndSymbols::do_unique_ref >>> could be >>> > called with an array class? >>> > Same question for copy_impl? >>> > >>> > line 934: "no onger" -> "no longer" >>> > >>> > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a >>> > potential future rfe? >>> > Do you want to check in that code at this time? In product? >>> > >>> > thanks, >>> > Karen >>> > >>>
> >>> >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung >>> >>> >> >> >> wrote: >>> >> >>> >> This is a follow-up on the preliminary code review sent by >>> Jiangli in >>> >> January[1]. >>> >> >>> >> Highlights of changes since then: >>> >> 1. New vm option for dumping a dynamic archive >>> >> (-XX:ArchiveClassesAtExit=) and enhancement >>> to the >>> >> existing -XX:SharedArchiveFile option. Please refer to the >>> >> corresponding CSR[2] for details. >>> >> 2. New way to run existing AppCDS tests in dynamic CDS archive >>> mode. >>> >> At the jtreg command line, the user can run many existing AppCDS >>> >> tests in dynamic CDS archive mode by specifying the following: >>> >>    -vmoptions:-Dtest.dynamic.cds.archive=true >>> >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic >>> >>   We will have a follow-up RFE to determine in which tier the >>> above >>> >> tests should be run. >>> >> 3. Added more tests. >>> >> 4. Various bug fixes to improve stability. >>> >> >>> >> RFE:https://bugs.openjdk.java.net/browse/JDK-8207812 >>> >> webrev: >>> >>http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ >>> >>> >> >>> >>> >> >>> >> (The webrev is based on top of the following rev: >>> >>http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) >>> >> >>> >> Testing: >>> >>    - mach5 tiers 1- 3 (including the new tests) >>> >>    - AppCDS tests in dynamic CDS archive mode on linux-x64 (a >>> few >>> >> tests require more investigation) >>> >> >>> >> thanks, >>> >> Calvin >>> >> >>> >> [1] >>> >>https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html >>> >> [2]https://bugs.openjdk.java.net/browse/JDK-8221706 >>>
> >>> >> From zgu at redhat.com Tue Apr 23 20:50:03 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 23 Apr 2019 16:50:03 -0400 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> Message-ID: <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> On 4/23/19 2:53 PM, Eric Caspole wrote: > Hi Zhengyu, > Hopefully this email comes through in monospace, the alignment is OK for > me: > > > currently: > > -??????????????????????? GC (reserved=379056KB, committed=93220KB) > ??????????????????????????? (malloc=39184KB #2159) > ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) > > > My version: > > > -??????????????? GC - g1 gc (reserved=379090KB, committed=93254KB) > ??????????????????????????? (malloc=39218KB #2194) > ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) > > > so it is aligned going to the left off the parenthesis like the current > version. Is that what you mean? I like the way the GC stands out like > this but it is OK to put it in the parentheses on the right. Different GC has different name, it is hard to get them all aligned right, and it does not worth the effort. So, my suggestion is to place GC name inside parentheses, and you don't have to deal with indents to the left. e.g. - GC (reserved=379056KB, committed=93220KB by g1 gc) (malloc=39184KB #2159) (mmap: reserved=339872KB, committed=54036KB) Thanks, -Zhengyu > > Thanks, > Eric > > > > On 4/22/19 21:57, Zhengyu Gu wrote: >> >> >> On 4/22/19 8:19 PM, David Holmes wrote: >>> Hi Eric, >>> >>> On 23/04/2019 8:13 am, Eric Caspole wrote: >>>> Hi, could I have reviews and any opinions on this little change to >>>> show the GC name in the NMT output, as this helps us to more easily >>>> triage performance data. >>> >>> The idea seems fine. 
>> >>> >>> For the implementation wouldn't it be simpler to do something like: >>> >>> if (flag == mtGC) { >>> ?? out->print("%s - %s (", NMTUtil::flag_to_name(flag), >>> ?????????????????????????? GCConfig::hs_err_name()); >>> } else { >>> ?? out->print("-%26s (", NMTUtil::flag_to_name(flag)); >>> } >>> >> Yes, this is simpler. >> >> I don't like where the name is placed, it screws up section >> alignments. I would prefer to place name inside parenthesis. e.g. >> >> - GC (g1 gc reserved=379056KB, committed=93220KB) >> >> Thanks, >> >> -Zhengyu >> >>> and skip the need for a local buffer and snprintf? >>> >>> Aside: it's probably used in enough different contexts that >>> GCConfig::hs_err_name should be renamed. >>> >>> Also if the VM terminates during initialization is it possible for >>> this code to be executed before the GCConfig has been setup? And if >>> so how will it behave? >>> >>> Thanks, >>> David >>> >>>> This passed tier 1 and 2. >>>> Thanks, >>>> Eric >>>> >>>> >>>> JBS: >>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From dean.long at oracle.com Tue Apr 23 21:17:42 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 23 Apr 2019 14:17:42 -0700 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> Message-ID: <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> Yes, looks good! dl On 4/23/19 12:38 PM, Robbin Ehn wrote: > Hi Dean, > > Is this what you had in mind: > diff -r 295029840379 src/hotspot/share/runtime/frame.cpp > --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 09:58:55 > 2019 +0200 > +++ b/src/hotspot/share/runtime/frame.cpp?????? 
Tue Apr 23 21:32:00 > 2019 +0200 > @@ -272,4 +272,6 @@ > > ?void frame::deoptimize(JavaThread* thread) { > +? assert(thread->frame_anchor()->has_last_Java_frame() && > +???????? thread->frame_anchor()->walkable(), "must be"); > ?? // Schedule deoptimization of an nmethod activation with this frame. > ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); > > Passes t1-5. > > v2: > http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ > Inc: > http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ > > Thanks, Robbin > > On 2019-04-18 06:22, dean.long at oracle.com wrote: >> In frame::deoptimize(), can we assert that we have an anchor frame >> and that it is walkable? >> >> dl >> >> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>> Adding compiler. >>> >>> /Robbin >>> >>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>> Hi all, please consider this change. >>>> >>>> The code for deopt suspend is no longer needed since today the >>>> register window >>>> is always flushed when this code executes. Exactly when this code >>>> was needed is not clear, entered via duke changeset 1. I did not >>>> dig since we no longer have such use case. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>> >>>> Passes t1-5. >>>> >>>> Thanks, Robbin >> From robbin.ehn at oracle.com Tue Apr 23 21:32:12 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 23 Apr 2019 23:32:12 +0200 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> Message-ID: <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> Thanks Dean! /Robbin On 2019-04-23 23:17, dean.long at oracle.com wrote: > Yes, looks good! 
> > dl > > On 4/23/19 12:38 PM, Robbin Ehn wrote: >> Hi Dean, >> >> Is this what you had in mind: >> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 09:58:55 2019 +0200 >> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 21:32:00 2019 +0200 >> @@ -272,4 +272,6 @@ >> >> ?void frame::deoptimize(JavaThread* thread) { >> +? assert(thread->frame_anchor()->has_last_Java_frame() && >> +???????? thread->frame_anchor()->walkable(), "must be"); >> ?? // Schedule deoptimization of an nmethod activation with this frame. >> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >> >> Passes t1-5. >> >> v2: >> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >> Inc: >> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >> >> Thanks, Robbin >> >> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>> In frame::deoptimize(), can we assert that we have an anchor frame and that it is walkable? >>> >>> dl >>> >>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>> Adding compiler. >>>> >>>> /Robbin >>>> >>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>> Hi all, please consider this change. >>>>> >>>>> The code for deopt suspend is no longer needed since today the register window >>>>> is always flushed when this code executes. Exactly when this code was needed is not clear, entered via duke >>>>> changeset 1. I did not dig since we no longer have such use case. >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>> >>>>> Passes t1-5. 
>>>>> >>>>> Thanks, Robbin >>> > From eric.caspole at oracle.com Tue Apr 23 21:43:31 2019 From: eric.caspole at oracle.com (Eric Caspole) Date: Tue, 23 Apr 2019 17:43:31 -0400 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> Message-ID: <4e0611c1-b616-86dc-cf6f-1f6b96e601d9@oracle.com> On 4/23/19 16:10, Aleksey Shipilev wrote: > On 4/23/19 12:13 AM, Eric Caspole wrote: >> JBS: >> https://bugs.openjdk.java.net/browse/JDK-8222818 >> >> webrev: >> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ > > From afar, It does not seem such a good idea to customize the NMT classes. Maybe the intent would be > better catered by dumping the JVM arguments in the NMT summary? That would also help to further > analyze the data, for example, correlate it with the number of compiler/GC threads, heap sizes, > exotic options set, etc. That would be more or less a JEP to standardize how to print all the diagnostic info in every place the VM prints any diagnostic info. Because we will never train the users to send enough data to actually diagnose their problem ;) > > -Aleksey > From david.holmes at oracle.com Tue Apr 23 22:17:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Apr 2019 08:17:02 +1000 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> Message-ID: On 24/04/2019 6:50 am, Zhengyu Gu wrote: > > > On 4/23/19 2:53 PM, Eric Caspole wrote: >> Hi Zhengyu, >> Hopefully this email comes through in monospace, the alignment is OK >> for me: >> >> >> currently: >> >> -??????????????????????? GC (reserved=379056KB, committed=93220KB) >> ???????????????????????????? (malloc=39184KB #2159) >> ???????????????????????????? 
(mmap: reserved=339872KB, committed=54036KB) >> >> >> My version: >> >> >> -??????????????? GC - g1 gc (reserved=379090KB, committed=93254KB) >> ???????????????????????????? (malloc=39218KB #2194) >> ???????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >> >> >> so it is aligned going to the left off the parenthesis like the >> current version. Is that what you mean? I like the way the GC stands >> out like this but it is OK to put it in the parentheses on the right. > > Different GC has different name, it is hard to get them all aligned > right, and it does not worth the effort. AFAICS The code already "reserves" 26 characters just to print "GC", which is right-aligned. So all this does is take some of the 24 existing spaces and fill them in with the GC name so you end up with: GC - g1 gc (reserved=379056KB, committed=93220KB) (malloc=39184KB #2159) (mmap: reserved=339872KB, committed=54036KB) or GC - shenandoah gc (reserved=379056KB, committed=93220KB) (malloc=39184KB #2159) (mmap: reserved=339872KB, committed=54036KB) etc. Only nit is that 26 seems to small for "concurrent mark sweep gc". Also the alignment of 26 could be specified dynamically based on the length of the hs_err_name() string if needed. David ----- > > So, my suggestion is to place GC name inside parentheses, and you don't > have to deal with indents to the left. > > e.g. > > > -???????????????????? GC (reserved=379056KB, committed=93220KB by g1 gc) > ????????????????????????? (malloc=39184KB #2159) > ????????????????????????? (mmap: reserved=339872KB, committed=54036KB) > > > Thanks, > > -Zhengyu > >> >> Thanks, >> Eric >> >> >> >> On 4/22/19 21:57, Zhengyu Gu wrote: >>> >>> >>> On 4/22/19 8:19 PM, David Holmes wrote: >>>> Hi Eric, >>>> >>>> On 23/04/2019 8:13 am, Eric Caspole wrote: >>>>> Hi, could I have reviews and any opinions on this little change to >>>>> show the GC name in the NMT output, as this helps us to more easily >>>>> triage performance data. 
>>>> >>>> The idea seems fine. >>> >>>> >>>> For the implementation wouldn't it be simpler to do something like: >>>> >>>> if (flag == mtGC) { >>>> ?? out->print("%s - %s (", NMTUtil::flag_to_name(flag), >>>> ?????????????????????????? GCConfig::hs_err_name()); >>>> } else { >>>> ?? out->print("-%26s (", NMTUtil::flag_to_name(flag)); >>>> } >>>> >>> Yes, this is simpler. >>> >>> I don't like where the name is placed, it screws up section >>> alignments. I would prefer to place name inside parenthesis. e.g. >>> >>> - GC (g1 gc reserved=379056KB, committed=93220KB) >>> >>> Thanks, >>> >>> -Zhengyu >>> >>>> and skip the need for a local buffer and snprintf? >>>> >>>> Aside: it's probably used in enough different contexts that >>>> GCConfig::hs_err_name should be renamed. >>>> >>>> Also if the VM terminates during initialization is it possible for >>>> this code to be executed before the GCConfig has been setup? And if >>>> so how will it behave? >>>> >>>> Thanks, >>>> David >>>> >>>>> This passed tier 1 and 2. >>>>> Thanks, >>>>> Eric >>>>> >>>>> >>>>> JBS: >>>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From coleen.phillimore at oracle.com Tue Apr 23 22:47:29 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Apr 2019 18:47:29 -0400 Subject: RFR(s): 8222640: Remove deopt suspend In-Reply-To: <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> Message-ID: +1? This looks good! Coleen On 4/23/19 5:32 PM, Robbin Ehn wrote: > Thanks Dean! > > /Robbin > > On 2019-04-23 23:17, dean.long at oracle.com wrote: >> Yes, looks good! 
>> >> dl >> >> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>> Hi Dean, >>> >>> Is this what you had in mind: >>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 09:58:55 >>> 2019 +0200 >>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 21:32:00 >>> 2019 +0200 >>> @@ -272,4 +272,6 @@ >>> >>> ?void frame::deoptimize(JavaThread* thread) { >>> +? assert(thread->frame_anchor()->has_last_Java_frame() && >>> +???????? thread->frame_anchor()->walkable(), "must be"); >>> ?? // Schedule deoptimization of an nmethod activation with this frame. >>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>> >>> Passes t1-5. >>> >>> v2: >>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>> Inc: >>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>> >>> Thanks, Robbin >>> >>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>> In frame::deoptimize(), can we assert that we have an anchor frame >>>> and that it is walkable? >>>> >>>> dl >>>> >>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>> Adding compiler. >>>>> >>>>> /Robbin >>>>> >>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>> Hi all, please consider this change. >>>>>> >>>>>> The code for deopt suspend is no longer needed since today the >>>>>> register window >>>>>> is always flushed when this code executes. Exactly when this code >>>>>> was needed is not clear, entered via duke changeset 1. I did not >>>>>> dig since we no longer have such use case. >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>> Issue: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>> >>>>>> Passes t1-5. 
>>>>>>
>>>>>> Thanks, Robbin
>>>>
>>

From calvin.cheung at oracle.com Tue Apr 23 22:55:29 2019
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 23 Apr 2019 15:55:29 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To:
References: <5CAFAF21.3030007@oracle.com>
Message-ID: <5CBF97E1.8080500@oracle.com>

Hi Karen,

The following incremental webrev should have addressed most of your comments:
http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/

Please see my replies inline below.

On 4/19/19, 9:29 AM, Karen Kinnear wrote:
> 1. metaSpaceShared.hpp
> line 156:
> what is the hardcoded -100 for? Should that be an enum?

I don't know what the -100 means either. The function in question is not new code. It was moved from metaSpaceShared.cpp. I've moved it back there.

> 2. jfrRecorder.cpp
> So JFR recordings are disabled if DynamicDumpSharedSpaces?
> why?

It was also done for DumpSharedSpaces via the fix for https://bugs.openjdk.java.net/browse/JDK-8203664.

> Is that a future rfe?
>
> 3. systemDictionaryShared.cpp
> Could you possibly add a comment to add_verification_constraint
> for if (DynamicDumpSharedSpaces)
>   return false

Comment added.

> -- I think the logic is:
> because we have successfully linked any instanceKlass we archive
> with DynamicDumpSharedSpaces, we have resolved all the constraint classes.
>
> -- I didn't check the order - is this called before or after
> excluding? If after, then would it make sense to add an assertion
> here is_linked? Then if you ever change how/when linking is done, this
> might catch future errors.

Not all InstanceKlasses are linked at that point; the ones that failed verification won't be linked.

> 4. systemDictionaryShared.cpp
> EstimateSizeForArchive::do_entry
> Is it the case that for info.is_builtin() there are no verification
> constraints? So you could skip that calculation? Or did I misunderstand?
The size also includes header and crc sizes:

    static size_t byte_size(InstanceKlass* klass, int num_constraints) {
      return header_size_size() +
             crc_size(klass) +
             verifier_constraints_size(num_constraints) +
             verifier_constraint_flags_size(num_constraints);
    }

> 5. compactHashtable.cpp
> serialize/header/calculate_header_size
> -- could you dynamically determine size_of header so you don't need
> to hardcode a 5?

I checked with Ioi on this one. The problem is that calculate_header_size() needs to be called during size estimation, and serialize_header is called after size estimation.

> 6. classLoader.cpp
> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are
> mutually exclusive.
> Can you clarify for me:
> My memory of the base archive is that we do not allow the following
> options at dump time - and these
> are the same for the dynamic archive: --limit-modules,
> --upgrade-module-path, --patch-module.

Yes, the same should apply for the dynamic archive. FIXME has been replaced with a comment.

> I have forgotten:
> Today with UseSharedSpaces - do we allow these flags? Is that also the
> same behavior with the dynamic
> archive?

Yes, same behavior with the dynamic archive. We ran those tests in dynamic archive mode.

> 7. classLoaderExt.cpp
> assert line 66: only used with -Xshare:dump
> -> "only used at dump time"

Done.

> 8. symbolTable.cpp
> line 473: comment // used by UseSharedArchived2
> -- command-line arg name has changed

Actually, the entire SymbolTableCreateEntry class can be removed. It was probably left there due to a merge error.

> 9. filemap.cpp
> Comment lines 529 ...
> Is this true - that you can only support dynamic dumping with the
> default CDS archive? Could you clarify what the restrictions are?
> The CSR implies you can support "a specific base CDS archive"

Yes.

> - so base layer can not have appended boot class path
> - and base layer can't have a module path

Correct.
> What can you specify for the dynamic dumping relative to the base archive?
> - matching class path?
> - appended class path?

Yes.

> in future - could it have a module path that matched the base archive?

Sure, in another RFE.

> Should any of these restrictions be clarified in documentation/CSR
> since they appear to be new?

I'll update the doc.

> 10. filemap.cpp
> check_archive
> Do some of the return false paths skip performing os::close(fd)?

Fixed.

> and get_base_archive_name_from_header
> Does the first return false path fail to os::free(dynamic_header)

Fixed.

> lines 753-754: two FIXME comments

I implemented the first FIXME. The second one is not needed. I've changed it to a comment.

> Could you delete commented out line 1087 in filemap.cpp ?

Done.

> 11. filemap.hpp
> line 214: TODO left in

I'm leaving it there for now. It isn't simple to get rid of the static declaration. I can do a follow-up after this RFE.

> 12. metaspace.cpp
> line 1418 FIXME left in

Removed.

> 13. java.cpp
> FIXME: is this the right place?
> For starting the DynamicArchive::dump
>
> Please check with David Holmes on that one

I've removed the FIXME. I've also checked with David H. He said the following:

    > Not an easy question to answer. It depends on all the code that might
    > be touched through DynamicArchive::dump() and whether it might
    > interact with anything already "shutdown". It will really come down to
    > testing (run all tests with dynamic dumping enabled) to see if there
    > are any unexpected interactions.

> 14. dynamicArchive.hpp
> line 55 (and others): MetsapceObj -> MetaspaceObj

Done

> 15. dynamicArchive.cpp
> line 285 rel-ayout -> re-layout

Done

> lines 277 && 412
> Do we archive array klasses in the base archive but not in the dynamic
> archive?
> Is that a potential RFE?

We currently don't handle array klasses in the AppCDS archive either. Please refer to: open/test/hotspot/jtreg/runtime/appcds/javaldr/ArrayTest.java.
> Is it possible that GatherKlassesAndSymbols::do_unique_ref could be
> called with an array class?
> Same question for copy_impl?
>
> line 934: "no onger" -> "no longer"

Done

> 16. What is AllowArchivingWithJavaAgent? Is that a hook for a
> potential future rfe?
> Do you want to check in that code at this time? In product?

The flag was added for https://bugs.openjdk.java.net/browse/JDK-8201375 in JDK12. I needed to add the same check for the dynamic archive case.

thanks,
Calvin

From robbin.ehn at oracle.com Tue Apr 23 23:49:01 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 24 Apr 2019 01:49:01 +0200
Subject: RFR(s): 8222640: Remove deopt suspend
In-Reply-To:
References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com>
Message-ID: <45681477-9a04-a623-6686-c72b36be6c38@oracle.com>

Thanks Coleen!

/Robbin

On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote:
> +1 This looks good!
> Coleen
>
> On 4/23/19 5:32 PM, Robbin Ehn wrote:
>> Thanks Dean!
>>
>> /Robbin
>>
>> On 2019-04-23 23:17, dean.long at oracle.com wrote:
>>> Yes, looks good!
>>>
>>> dl
>>>
>>> On 4/23/19 12:38 PM, Robbin Ehn wrote:
>>>> Hi Dean,
>>>>
>>>> Is this what you had in mind:
>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp
>>>> --- a/src/hotspot/share/runtime/frame.cpp       Tue Apr 23 09:58:55 2019 +0200
>>>> +++ b/src/hotspot/share/runtime/frame.cpp       Tue Apr 23 21:32:00 2019 +0200
>>>> @@ -272,4 +272,6 @@
>>>>
>>>>  void frame::deoptimize(JavaThread* thread) {
>>>> +  assert(thread->frame_anchor()->has_last_Java_frame() &&
>>>> +         thread->frame_anchor()->walkable(), "must be");
>>>>    // Schedule deoptimization of an nmethod activation with this frame.
>>>>    assert(_cb != NULL && _cb->is_compiled(), "must be");
>>>>
>>>> Passes t1-5.
>>>>
>>>> v2:
>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/
>>>> Inc:
>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/
>>>>
>>>> Thanks, Robbin
>>>>
>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote:
>>>>> In frame::deoptimize(), can we assert that we have an anchor frame and that it is walkable?
>>>>>
>>>>> dl
>>>>>
>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote:
>>>>>> Adding compiler.
>>>>>>
>>>>>> /Robbin
>>>>>>
>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote:
>>>>>>> Hi all, please consider this change.
>>>>>>>
>>>>>>> The code for deopt suspend is no longer needed since today the register window
>>>>>>> is always flushed when this code executes. Exactly when this code was needed is not clear, entered via duke
>>>>>>> changeset 1. I did not dig since we no longer have such use case.
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/
>>>>>>> Issue:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640
>>>>>>>
>>>>>>> Passes t1-5.
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>
>>>
>

From calvin.cheung at oracle.com Wed Apr 24 00:08:38 2019
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Tue, 23 Apr 2019 17:08:38 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To:
References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com>
Message-ID: <5CBFA906.1030205@oracle.com>

Hi Jiangli,

Thanks a lot for your review!

On 4/22/19, 2:07 PM, Jiangli Zhou wrote:
> Hi Calvin,
>
> Congrats on finalizing the dynamic archiving work and completing
> testing. After the integration of the dynamic archiving, a follow-up
> RFE can be done to merge the archiving/copying code in
> dynamicArchive.* and metaspaceShared.* for better maintenance in the
> future. As there are many duplicates between those two, having shared
> implementation for both static and dynamic will be beneficial and
> reduce the maintenance cost.

I'll file an RFE for the above.
> Here are my comments mainly for additional cleanups and some minor issues.
>
> - src/hotspot/share/classfile/classLoader.cpp
>
> 1337   // FIXME: DynamicDumpSharedSpaces and --patch-modules are mutually exclusive
> 1338   assert(!DynamicDumpSharedSpaces, "sanity");
>
> I tagged the comment with 'FIXME' to serve as a reminder to add more
> details. The reason DynamicDumpSharedSpaces is 'mutually exclusive'
> with --patch-modules is that DynamicDumpSharedSpaces is only
> enabled when UseSharedSpaces is also enabled. As --patch-modules is
> not supported with UseSharedSpaces, it is not supported with
> DynamicDumpSharedSpaces either.

I've converted the FIXME to a comment.

> 1518   ik->set_shared_classpath_index(UNREGISTERED_INDEX);
> 1519   SystemDictionaryShared::set_shared_class_misc_info(ik, (ClassFileStream*)stream);
>
> Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above
> code. With the new dynamic archiving capability, it's now able to
> load/archive a class with user defined classloader via this call path.
> A comment explaining this is also needed.

I tried the assert but it didn't work. DynamicDumpSharedSpaces is not the only mode that goes through that code path.

> - src/hotspot/share/classfile/classLoaderExt.cpp
>
> 64 void ClassLoaderExt::setup_app_search_path() {
> 65   assert(DumpSharedSpaces || DynamicDumpSharedSpaces,
> 66          "this function is only used with -Xshare:dump");
>
> The above message needs to be updated to reflect the new command-line option.

Done.

> 304   result->set_shared_classpath_index(UNREGISTERED_INDEX);
> 305   SystemDictionaryShared::set_shared_class_misc_info(result, stream);<<<<<<<<<<
>
> Why is the set_shared_class_misc_info call being removed? If this is a
> bug fix for loading classes from the classlist for user defined
> classloaders, it should be handled separately, and with a separate bug
> ID as well.

It is called in ClassLoader::record_result() from KlassFactory::create_from_stream().
> - src/hotspot/share/classfile/compactHashtable.cpp
>
> 207 size_t SimpleCompactHashtable::calculate_header_size() {
> 208   // We have 5 fields. Each takes up sizeof(intptr_t). See WriteClosure::do_u4
> 209   size_t bytes = sizeof(intptr_t) * 5;
> 210   return bytes;
> 211 }
> 212
> 213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) {
> 214   // NOTE: if you change this function, you MUST change the number 5 in
> 215   // calculate_header_size() accordingly.
> ...
>
> As a cleanup, a better way to handle this is to calculate the size
> within SimpleCompactHashtable::serialize_header while serializing the
> data and store the size value in a variable.
> SimpleCompactHashtable::calculate_header_size() should simply retrieve
> the value. A renaming of
> SimpleCompactHashtable::calculate_header_size() can also be done.

I've checked with Ioi on this one. The problem is that calculate_header_size() needs to be called during size estimation, and serialize_header is called after size estimation.

> - src/hotspot/share/classfile/dictionary.cpp
>
> 315 InstanceKlass* Dictionary::find_class(Symbol* name) {
> 316   unsigned int hash = compute_hash(name);
> 317   int index = hash_to_index(hash);
> 318   return find_class(index, hash, name);
> 319 }
>
> Looks like the new function is not referenced (unless I'm missing
> something). Please remove the function.
>
> - src/hotspot/share/classfile/dictionary.hpp
>
> 65   InstanceKlass* find_class(Symbol* name);
>
> Same comment as the above.

I've removed the function.

> - src/hotspot/share/classfile/symbolTable.cpp.
>
> 473   Symbol* const _archived; // used by UseSharedArchived2
>
> Please remove 'UseSharedArchived2'. The comment also needs more clarification.
>
> I couldn't find any references to SymbolTableCreateEntry. Can you
> please point me to where it is being used?

I've removed the entire SymbolTableCreateEntry class. It was probably left there due to a merge error.
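
On the calculate_header_size() exchange above: since the size has to be known before serialize_header() runs, one option is to keep the field count in a single named constant and let the serializer assert that it still matches. The following is only a minimal sketch of that idea, not the HotSpot code. CountingClosure, SimpleTable, kHeaderFields and the field names are hypothetical stand-ins invented for illustration; the one-slot-per-field assumption comes from the quoted comment ("Each takes up sizeof(intptr_t)").

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a SerializeClosure: it only counts the fields
// it is asked to serialize.
struct CountingClosure {
  std::size_t fields = 0;
  void do_u4(std::uint32_t* /*p*/) { ++fields; }
  void do_ptr(void** /*p*/)        { ++fields; }
};

// Hypothetical table with a five-field header, mirroring the "5 fields"
// mentioned in the quoted review comment.
struct SimpleTable {
  // Single source of truth for the number of header fields.
  static constexpr std::size_t kHeaderFields = 5;

  std::uint32_t base_offset  = 0;
  std::uint32_t bucket_count = 0;
  std::uint32_t entry_count  = 0;
  void*         buckets      = nullptr;
  void*         entries      = nullptr;

  // Usable during size estimation, before anything has been serialized.
  static std::size_t calculate_header_size() {
    return sizeof(std::intptr_t) * kHeaderFields;
  }

  // Runs later; the assert fires in debug builds if someone adds or removes
  // a field here without updating kHeaderFields.
  void serialize_header(CountingClosure* soc) {
    const std::size_t before = soc->fields;
    soc->do_u4(&base_offset);
    soc->do_u4(&bucket_count);
    soc->do_u4(&entry_count);
    soc->do_ptr(&buckets);
    soc->do_ptr(&entries);
    assert(soc->fields - before == kHeaderFields &&
           "serialize_header() and kHeaderFields are out of sync");
  }
};
```

This keeps the ordering constraint Calvin describes (estimate first, serialize later) while replacing the comment-enforced "change the number 5" rule with a debug-time check.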
> - src/hotspot/share/classfile/systemDictionaryShared.cpp
>
> 1218   if (DynamicDumpSharedSpaces) {
> 1219     return false;
> 1220   } else {
>
> The above case for DynamicDumpSharedSpaces needs to be examined
> carefully. Can you please ask Harold (and Coleen or Karen) to take a
> look? Also, a comment is needed to explain that we can complete all
> verification checks at dynamic dumping time.

I've added a comment. If it returns false, the caller will call VerificationType::resolve_and_check_assignability().

> - src/hotspot/share/classfile/systemDictionaryShared.cpp
>
> 1279   ResourceMark rm;
>
> You can use 'ResourceMark rm(THREAD)'.

Fixed.

> - src/hotspot/share/memory/allocation.hpp
>
> 255   //
> 256   // When CDS is not enabled, both pointers are set to NULL.
> 257   static void* _shared_metaspace_base; // (inclusive) low address
> 258   static void* _shared_metaspace_top;  // (exclusive) high address
>
> Why was the comment at line 256 removed?

I've added back the comment.

> - src/hotspot/share/memory/filemap.cpp
>
> 101 void FileMapInfo::fail_continue(const char *msg, ...) {
> 102   va_list ap;
> 103   va_start(ap, msg);
> 104   if (_runtime_dynamic_info == NULL) {
> 105     MetaspaceShared::set_archive_loading_failed();
> 106   } else {
> 107     DynamicArchive::disable();
> 108   }
>
> The above fail_continue only works if _runtime_dynamic_info is set up
> after mapping the base archive. Comments should be added to explain
> that.

Comment added.

> Can you please rename '_runtime_dynamic_info' so it's more
> descriptive? Maybe use 'dynamic_archive_info'.

Renamed to '_dynamic_archive_info'.

> 587 bool FileMapInfo::same_files(const char* file1, const char* file2) {
>
> The usage of FileMapInfo::same_files is not necessary and should be
> removed. The base archive's CRC checksum values are recorded in the
> dynamic archive. The runtime verifies the CRC values to make sure the
> same archive is used at dump time and runtime, regardless of the base
> archive path or name.
It is designed for all use cases: The same_files() function is also used in arguments.cpp: 3530 if (DynamicDumpSharedSpaces) { 3531 if (FileMapInfo::same_files(SharedArchiveFile, ArchiveClassesAtExit)) { 3532 vm_exit_during_initialization( 3533 "Cannot have the same archive file specified for -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", 3534 SharedArchiveFile); 3535 } 3536 } The function is also needed for the RFE: https://bugs.openjdk.java.net/browse/JDK-8211723 We still verify the CRC values during runtime. > > * base CDS archive is specified in the -XX:SharedArchiveFile at > dynamic dumping time > * -XX:SharedArchiveFile is not specified at dynamic dumping time, > default location for the default CDS archive is used > * default CDS archive is specified in the -XX:SharedArchiveFile at runtime > * default CDS archive is not specified in the -XX:SharedArchiveFile at > runtime, default location for the default CDS archive is used Regarding the fourth point above, the user could have a non-default base archive and only specify the top archive during runtime. > > In all above cases, the base archive CRC values check is sufficient. > The use of path/name is fragile and should be avoided. That will allow > you to remove the _base_archive_name_size from the dynamic archive. We still need the _base_archive_name_size and the base archive name in the header because of the above reason. > > 752 if (is_static) { > 753 // FIXME check for dynamic header as well > 754 // FIXME Don't just check the last region -- check all regions! > > Can you please address the first FIXME at line 753? > > Checking the last region is sufficient since the archive is written is > sequential order. The second FIXME is not necessary. I've addressed the first FIXME and converted the second one to a comment. 
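To make the CRC point above concrete: the dynamic archive records the base archive's CRC values at dump time and the runtime compares them against what it actually mapped. A rough sketch of such a check follows. It assumes the common reflected CRC-32 polynomial, which this thread does not specify, and RecordedBaseInfo, crc32_of() and base_archive_matches() are hypothetical names, not FileMapInfo functions.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320.
// The polynomial is an assumption made for this illustration only.
uint32_t crc32_of(const unsigned char* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      // If the low bit is set, shift and xor in the polynomial; else just shift.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Hypothetical record written into the dynamic archive header at dump time.
struct RecordedBaseInfo {
  uint32_t base_region_crc;
};

// Runtime side: accept the mapped base region only if its CRC matches
// the value recorded when the dynamic archive was dumped.
bool base_archive_matches(const RecordedBaseInfo& recorded,
                          const unsigned char* region, size_t len) {
  return crc32_of(region, len) == recorded.base_region_crc;
}
```

A check like this identifies the base archive by content. It does not by itself say where to find the base archive, which is the reason given above for keeping the base archive name in the header.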
> > - src/hotspot/share/memory/metaspace.cpp > > 1417 bool Metaspace::contains(const void* ptr) { > 1418 // FIXME: need to check the dynamic archive > > Can you please remove the above FIXME? There is no need for a separate check. Done. > > - src/hotspot/share/memory/metaspaceShared.cpp > > 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive > > Can you please rename the function to fix_cpp_vtable_for_dynamic_archive? Done. > > - src/hotspot/share/oops/klass.cpp > > 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces, > 528 "only called for DumpSharedSpaces"); > > 544 void Klass::remove_java_mirror() { > 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only > called for DumpSharedSpaces"); > > Please fix the messages above. Done. > > - src/hotspot/share/prims/whitebox.cpp > > 2332 {CC"getResolvedReferences", > CC"(Ljava/lang/Class;)Ljava/lang/Object;", > (void*)&WB_GetResolvedReferences}, > 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V", > (void*)&WB_LinkClass}, > 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", > (void*)&WB_AreOpenArchiveHeapObjectsMapped}, > > Can you please align the indentation of line 2333 (to be the same as > line 2332 or 2334)? Aligned (void*) with line 2334. (It doesn't show in the webrev since only blank space changes) > > - src/hotspot/share/runtime/arguments.cpp > > 1491 bool Arguments::check_unsupported_cds_runtime_properties() { > 1492 assert(UseSharedSpaces, "this function is only used with > -Xshare:{on,auto}"); > 1493 assert(ARRAY_SIZE(unsupported_properties) == > ARRAY_SIZE(unsupported_options), "must be"); > 1494 if (ArchiveClassesAtExit != NULL) { > 1495 // dynamic dumping, just return false, > check_unsupported_dumping_properties() will be called > 1496 // in init_shared_archive_paths(). > 1497 return false; > 1498 } > > The check_unsupported_cds_runtime_properties() should be done for the > 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a > combination of both dump time and runtime. 
The 'ArchiveClassesAtExit != NULL' is for dumping CDS archive to the user's point of view, that's why the comments in lines 1495 and 1496. During runtime, ArchiveClassesAtExit will be NULL, so the check_unsupported_cds_runtime_properties() will be called as usual. > > 2729 // -Xshare:auto || -Xshare:dynamicDump > > As you've renamed the command-line argument for dynamic dumping > support, the comment needs to be fixed. Fixed. > > 3125 // Compiler threads may concurrently update the class > metadata (such as method entries), so it's > 3126 // unsafe with DumpSharedSpaces (which modifies the class > metadata in place). Let's disable > 3127 // compiler just to be safe. > 3128 // > 3129 // Note: this is not a concern for DynamicDumpSharedSpaces, > which makes a copy of the class metadata > 3130 // instead of modifying them in place. The copy is > inaccessible to the compiler. > 3131 set_mode_flags(_int); > > We need to come back to revisit the above for the 'static' archive > dumping at one point. There is a RFE filed for that, if I remember > correctly. Could you please add a 'TODO' notes in the above comment. Added TODO. > > A check should be done in arguments.cpp to make sure > DynamicDumpSharedSpaces is not manipulated from the command-line > directly. DynamicDumpSharedSpaces should not be enabled in the > command-line without ArchiveClassesAtExit being specified. Done. > > - src/hotspot/share/runtime/java.cpp > > 509 > 510 // FIXME: is this the right place? > 511 if (DynamicDumpSharedSpaces) { > 512 DynamicArchive::dump(); > 513 } > > Again, the above 'FIXME' is served as a cleanup reminder. Please get > opinions from others on this change. If the calling place is okay, > please remove the FIXME. Removed the FIXME for now. Checked with David H. He indicated there's no easy answer for this. Just need to do a lot of testing. > - test > > Could you please add a test case for setting DynamicDumpSharedSpaces > from command-line? 
Here's an incremental webrev which contains a new test:
http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/

thanks,
Calvin

>
> I only took a brief look at the test changes. Please ask Misha to
> review the test changes as well.
>
> Thanks and regards,
> Jiangli

From david.holmes at oracle.com Wed Apr 24 01:51:43 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 24 Apr 2019 11:51:43 +1000
Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options
In-Reply-To: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com>
References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com>
Message-ID:

Hi Harold,

Looks good. Minor nit:

-  product(bool, BytecodeVerificationRemote, true,                          \
+  diagnostic(bool, BytecodeVerificationRemote, true,                       \
           "Enable the Java bytecode verifier for remote classes")          \
                                                                            \
-  product(bool, BytecodeVerificationLocal, false,                          \
+  diagnostic(bool, BytecodeVerificationLocal, false,                       \
           "Enable the Java bytecode verifier for local classes")           \

can you fix the indentation on the "Enable ..." lines.

Thanks,
David
-----

On 24/04/2019 4:34 am, Harold Seigel wrote:
> Hi,
>
> Please review this change to make the hotspot BytecodeVerification*
> options be diagnostic. Use of either of these options without
> -XX:+UnlockDiagnosticVMOptions will now result in the following message:
>
>     > java -XX:+BytecodeVerificationLocal -version
>     Error: VM option 'BytecodeVerificationLocal' is diagnostic and must
> be enabled via -XX:+UnlockDiagnosticVMOptions.
>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685
>
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and
> builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3-5 on
> Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64.
> Additionally, the java command was run to ensure that
> -XX:+UnlockDiagnosticVMOptions is needed when specifying the
> BytecodeVerification* options.
>
> Thanks, Harold
>

From david.holmes at oracle.com Wed Apr 24 02:05:38 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 24 Apr 2019 12:05:38 +1000
Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project
In-Reply-To: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>
References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com>
Message-ID:

Hi Dan,

That all seems fine.

Thanks,
David

On 24/04/2019 12:28 am, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a (S)mall patch extracted from the Async Monitor Deflation project
> that is ready for code review.
>
> Karen, a number of the changes here are from your code review comments
> to the parent bug:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> The short version of what this patch is about:
>
>     More baseline cleanups to the ObjectMonitor subsystem.
>
> The details are in the bug report:
>
>     JDK-8222295 more baseline cleanups from Async Monitor Deflation
> project
>     https://bugs.openjdk.java.net/browse/JDK-8222295
>
> Here's the webrev:
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/
>
> This patch along with the current patch for Async Monitor Deflation
> project have been through Mach5 tier[1-8] testing.
>
> I have been actively using the revised assert()'s and guarantee()'s with
> additional diagnostic info while debugging my port of the Async Monitor
> Deflation project code.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan

From david.holmes at oracle.com Wed Apr 24 07:12:47 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 24 Apr 2019 17:12:47 +1000
Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread
Message-ID: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8222518
webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/

The original implementation of Unsafe.unpark simply extracted the
JavaThread reference from the java.lang.Thread oop and, if non-null,
extracted the Parker instance from it and invoked unpark. This was racy,
however, as the native JavaThread could terminate at any time and
deallocate the Parker.

That logic was fixed by JDK-6271298, which used a combination of
type-stable-memory "event" objects for the Parker, along with use of the
Threads_lock to obtain the initial reference to the Parker (from a
JavaThread guaranteed to be alive), together with caching the native
Parker pointer in a field of java.lang.Thread. Even though the native
thread may have terminated, the Parker was still valid (even if
associated with a different thread) and the unpark at worst was a
spurious wakeup for that other thread.

When JDK-8167108 introduced Thread Safe Memory Reclamation (Thread-SMR),
the logic was updated to always use the safe mechanism - we grab a
ThreadsListHandle then check the cached field, else look up the native
thread to see if it is alive and locate the Parker instance that way.

With SMR the caching of the Parker pointer no longer serves any purpose
- we no longer have a lock-free use-the-cache path versus a lock-using
populate-the-cache path. With SMR we've already "paid" for the ability
to ensure the native thread can't terminate regardless of whether we
look up the field from the java.lang.Thread or the JavaThread.
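
For readers less familiar with park/unpark, here is a minimal sketch of why an unpark() that lands on a reused Parker is at worst a spurious wakeup: the parked side always re-checks its own condition after waking. This is plain standard C++, not HotSpot's Parker; SimpleParker and wait_until_set() are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, much simplified parker that holds at most one permit.
class SimpleParker {
 public:
  // Blocks until a permit is available, then consumes it.
  void park() {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [this] { return _permit; });
    _permit = false;
  }

  // Makes the permit available; repeated calls do not accumulate.
  void unpark() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _permit = true;
    }
    _cv.notify_one();
  }

 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  bool _permit = false;
};

// Caller-side pattern: the condition is re-checked after every park(),
// so an unpark() intended for a previous owner of the parker only costs
// one extra trip around the loop.
void wait_until_set(SimpleParker& parker, const std::atomic<bool>& flag) {
  while (!flag.load()) {
    parker.park();
  }
}
```

A stray unpark() before or during the wait leaves the waiter correct: it wakes, sees the flag still unset, and parks again.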
So we can simplify the code and save a little footprint by removing the
cache from java.lang.Thread:

    /*
     * JVM-private state that persists after native thread termination.
     */
    private long nativeParkEventPointer;

and the supporting code from unsafe.cpp and javaClass.*pp in the JVM.

I considered restoring the fast-path use of the cache without recourse
to Thread-SMR, but performance measurements failed to show any benefit
in doing so. See the bug report for details.

Thanks,
David

From stefan.karlsson at oracle.com Wed Apr 24 08:47:38 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 24 Apr 2019 10:47:38 +0200
Subject: RFR (XS) 8222818: NMT summary could show the GC in use
In-Reply-To:
References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com>
Message-ID:

On 2019-04-24 00:17, David Holmes wrote:
> On 24/04/2019 6:50 am, Zhengyu Gu wrote:
>>
>>
>> On 4/23/19 2:53 PM, Eric Caspole wrote:
>>> Hi Zhengyu,
>>> Hopefully this email comes through in monospace, the alignment is OK
>>> for me:
>>>
>>>
>>> currently:
>>>
>>> -                        GC (reserved=379056KB, committed=93220KB)
>>>                             (malloc=39184KB #2159)
>>>                             (mmap: reserved=339872KB, committed=54036KB)
>>>
>>>
>>> My version:
>>>
>>>
>>> -                GC - g1 gc (reserved=379090KB, committed=93254KB)
>>>                             (malloc=39218KB #2194)
>>>                             (mmap: reserved=339872KB, committed=54036KB)
>>>
>>>
>>> so it is aligned going to the left off the parenthesis like the
>>> current version. Is that what you mean? I like the way the GC stands
>>> out like this but it is OK to put it in the parentheses on the right.
>>
>> Different GCs have different names; it is hard to get them all aligned
>> right, and it is not worth the effort.
>
> AFAICS The code already "reserves" 26 characters just to print "GC",
> which is right-aligned. So all this does is take some of the 24 existing
> spaces and fill them in with the GC name so you end up with:
>
>                 GC - g1 gc (reserved=379056KB, committed=93220KB)
>                            (malloc=39184KB #2159)
>                            (mmap: reserved=339872KB, committed=54036KB)
>
> or
>         GC - shenandoah gc (reserved=379056KB, committed=93220KB)
>                            (malloc=39184KB #2159)
>                            (mmap: reserved=339872KB, committed=54036KB)
>
> etc. Only nit is that 26 seems too small for "concurrent mark sweep gc".
> Also the alignment of 26 could be specified dynamically based on the
> length of the hs_err_name() string if needed.

FYI, the name of the function hs_err_name() was chosen to deter people
from using it in other places. Maybe it's time to add another function
in GCConfig that returns better names? Maybe use names that match our
-XX:UseGC flags?

If you would find that useful, I've created a patch that exposes three
new functions that return the flag names, or parts of them:
http://cr.openjdk.java.net/~stefank/8222818/gcFlagNames/webrev.01/

GCConfig::flag_name() gives:
 UseConcMarkSweepGC
 UseEpsilonGC
 UseG1GC
 UseParallelGC
 UseSerialGC
 UseShenandoahGC
 UseZGC

GCConfig::flag_name_no_use() gives:
 ConcMarkSweepGC
 EpsilonGC
 G1GC
 ParallelGC
 SerialGC
 ShenandoahGC
 ZGC

GCConfig::flag_name_no_use_no_gc() gives:
 ConcMarkSweep
 Epsilon
 G1
 Parallel
 Serial
 Shenandoah
 Z

The universe.cpp changes are only temporary changes to test the output.

StefanK

>
> David
> -----
>
>>
>> So, my suggestion is to place GC name inside parentheses, and you
>> don't have to deal with indents to the left.
>>
>> e.g.
>>
>>
>> -                        GC (reserved=379056KB, committed=93220KB by g1 gc)
>>                             (malloc=39184KB #2159)
>>                             (mmap: reserved=339872KB, committed=54036KB)
>>
>>
>> Thanks,
>>
>> -Zhengyu
>>
>>>
>>> Thanks,
>>> Eric
>>>
>>>
>>>
>>> On 4/22/19 21:57, Zhengyu Gu wrote:
>>>>
>>>>
>>>> On 4/22/19 8:19 PM, David Holmes wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On 23/04/2019 8:13 am, Eric Caspole wrote:
>>>>>> Hi, could I have reviews and any opinions on this little change to
>>>>>> show the GC name in the NMT output, as this helps us to more
>>>>>> easily triage performance data.
>>>>>
>>>>> The idea seems fine.
>>>>
>>>>>
>>>>> For the implementation wouldn't it be simpler to do something like:
>>>>>
>>>>> if (flag == mtGC) {
>>>>>    out->print("%s - %s (", NMTUtil::flag_to_name(flag),
>>>>>                            GCConfig::hs_err_name());
>>>>> } else {
>>>>>    out->print("-%26s (", NMTUtil::flag_to_name(flag));
>>>>> }
>>>>>
>>>> Yes, this is simpler.
>>>>
>>>> I don't like where the name is placed, it screws up section
>>>> alignments. I would prefer to place name inside parenthesis. e.g.
>>>>
>>>> - GC (g1 gc reserved=379056KB, committed=93220KB)
>>>>
>>>> Thanks,
>>>>
>>>> -Zhengyu
>>>>
>>>>> and skip the need for a local buffer and snprintf?
>>>>>
>>>>> Aside: it's probably used in enough different contexts that
>>>>> GCConfig::hs_err_name should be renamed.
>>>>>
>>>>> Also if the VM terminates during initialization is it possible for
>>>>> this code to be executed before the GCConfig has been setup? And if
>>>>> so how will it behave?
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> This passed tier 1 and 2.
>>>>>> Thanks, >>>>>> Eric >>>>>> >>>>>> >>>>>> JBS: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>>>> >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From david.holmes at oracle.com Wed Apr 24 09:18:52 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Apr 2019 19:18:52 +1000 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> Message-ID: <955ef400-511d-1c03-0c34-a10f7834591d@oracle.com> Hi Stefan, On 24/04/2019 6:47 pm, Stefan Karlsson wrote: > > > On 2019-04-24 00:17, David Holmes wrote: >> On 24/04/2019 6:50 am, Zhengyu Gu wrote: >>> >>> >>> On 4/23/19 2:53 PM, Eric Caspole wrote: >>>> Hi Zhengyu, >>>> Hopefully this email comes through in monospace, the alignment is OK >>>> for me: >>>> >>>> >>>> currently: >>>> >>>> -??????????????????????? GC (reserved=379056KB, committed=93220KB) >>>> ???????????????????????????? (malloc=39184KB #2159) >>>> ???????????????????????????? (mmap: reserved=339872KB, >>>> committed=54036KB) >>>> >>>> >>>> My version: >>>> >>>> >>>> -??????????????? GC - g1 gc (reserved=379090KB, committed=93254KB) >>>> ???????????????????????????? (malloc=39218KB #2194) >>>> ???????????????????????????? (mmap: reserved=339872KB, >>>> committed=54036KB) >>>> >>>> >>>> so it is aligned going to the left off the parenthesis like the >>>> current version. Is that what you mean? I like the way the GC stands >>>> out like this but it is OK to put it in the parentheses on the right. >>> >>> Different GC has different name, it is hard to get them all aligned >>> right, and it does not worth the effort. >> >> AFAICS The code already "reserves" 26 characters just to print "GC", >> which is right-aligned. 
So all this does is take some of the 24 >> existing spaces and fill them in with the GC name so you end up with: >> >> ???????????????? GC - g1 gc (reserved=379056KB, committed=93220KB) >> ??????????????????????????? (malloc=39184KB #2159) >> ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >> >> or >> ???????? GC - shenandoah gc (reserved=379056KB, committed=93220KB) >> ??????????????????????????? (malloc=39184KB #2159) >> ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >> >> etc. Only nit is that 26 seems to small for "concurrent mark sweep >> gc". Also the alignment of 26 could be specified dynamically based on >> the length of the hs_err_name() string if needed. > > > FYI, the name of the function hs_err_name() was chosen to deter people > from using it in other places. Maybe it's time to add another function > in GCConfig that return better names? Maybe use names that match our > -XX:UseGC flags? I was thinking more simply that we might rename hs_err_name() to name() and make the strings a little nicer e.g. "G1 GC" instead of "g1 gc". But if you want to add more general GC functionality here that's up to you :) Cheers, David > If you would find that useful, I've created a patch that exposes three > new functions that return the flag names, or parts of it: > http://cr.openjdk.java.net/~stefank/8222818/gcFlagNames/webrev.01/ > > GCConfig::flag_name() gives: > ?UseConcMarkSweepGC > ?UseEpsilonGC > ?UseG1GC > ?UseParallelGC > ?UseSerialGC > ?UseShenandoahGC > ?UseZGC > > GCConfig::flag_name_no_use() gives: > ?ConcMarkSweepGC > ?EpsilonGC > ?G1GC > ?ParallelGC > ?SerialGC > ?ShenandoahGC > ?ZGC > > GCConfig::flag_name_no_use_no_gc() gives: > ?ConcMarkSweep > ?Epsilon > ?G1 > ?Parallel > ?Serial > ?Shenandoah > ?Z > > The universe.cpp changes, are only temporary changes to test the output. 
> > StefanK > >> >> David >> ----- >> >>> >>> So, my suggestion is to place GC name inside parentheses, and you >>> don't have to deal with indents to the left. >>> >>> e.g. >>> >>> >>> -???????????????????? GC (reserved=379056KB, committed=93220KB by g1 gc) >>> ?????????????????????????? (malloc=39184KB #2159) >>> ?????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >>> >>> >>> Thanks, >>> >>> -Zhengyu >>> >>>> >>>> Thanks, >>>> Eric >>>> >>>> >>>> >>>> On 4/22/19 21:57, Zhengyu Gu wrote: >>>>> >>>>> >>>>> On 4/22/19 8:19 PM, David Holmes wrote: >>>>>> Hi Eric, >>>>>> >>>>>> On 23/04/2019 8:13 am, Eric Caspole wrote: >>>>>>> Hi, could I have reviews and any opinions on this little change >>>>>>> to show the GC name in the NMT output, as this helps us to more >>>>>>> easily triage performance data. >>>>>> >>>>>> The idea seems fine. >>>>> >>>>>> >>>>>> For the implementation wouldn't it be simpler to do something like: >>>>>> >>>>>> if (flag == mtGC) { >>>>>> ?? out->print("%s - %s (", NMTUtil::flag_to_name(flag), >>>>>> ?????????????????????????? GCConfig::hs_err_name()); >>>>>> } else { >>>>>> ?? out->print("-%26s (", NMTUtil::flag_to_name(flag)); >>>>>> } >>>>>> >>>>> Yes, this is simpler. >>>>> >>>>> I don't like where the name is placed, it screws up section >>>>> alignments. I would prefer to place name inside parenthesis. e.g. >>>>> >>>>> - GC (g1 gc reserved=379056KB, committed=93220KB) >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>>> and skip the need for a local buffer and snprintf? >>>>>> >>>>>> Aside: it's probably used in enough different contexts that >>>>>> GCConfig::hs_err_name should be renamed. >>>>>> >>>>>> Also if the VM terminates during initialization is it possible for >>>>>> this code to be executed before the GCConfig has been setup? And >>>>>> if so how will it behave? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> This passed tier 1 and 2. 
>>>>>>> Thanks, >>>>>>> Eric >>>>>>> >>>>>>> >>>>>>> JBS: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>>>>> >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From stefan.karlsson at oracle.com Wed Apr 24 09:25:52 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 24 Apr 2019 11:25:52 +0200 Subject: RFR (XS) 8222818: NMT summary could show the GC in use In-Reply-To: <955ef400-511d-1c03-0c34-a10f7834591d@oracle.com> References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> <955ef400-511d-1c03-0c34-a10f7834591d@oracle.com> Message-ID: On 2019-04-24 11:18, David Holmes wrote: > Hi Stefan, > > On 24/04/2019 6:47 pm, Stefan Karlsson wrote: >> >> >> On 2019-04-24 00:17, David Holmes wrote: >>> On 24/04/2019 6:50 am, Zhengyu Gu wrote: >>>> >>>> >>>> On 4/23/19 2:53 PM, Eric Caspole wrote: >>>>> Hi Zhengyu, >>>>> Hopefully this email comes through in monospace, the alignment is >>>>> OK for me: >>>>> >>>>> >>>>> currently: >>>>> >>>>> -??????????????????????? GC (reserved=379056KB, committed=93220KB) >>>>> ???????????????????????????? (malloc=39184KB #2159) >>>>> ???????????????????????????? (mmap: reserved=339872KB, >>>>> committed=54036KB) >>>>> >>>>> >>>>> My version: >>>>> >>>>> >>>>> -??????????????? GC - g1 gc (reserved=379090KB, committed=93254KB) >>>>> ???????????????????????????? (malloc=39218KB #2194) >>>>> ???????????????????????????? (mmap: reserved=339872KB, >>>>> committed=54036KB) >>>>> >>>>> >>>>> so it is aligned going to the left off the parenthesis like the >>>>> current version. Is that what you mean? I like the way the GC >>>>> stands out like this but it is OK to put it in the parentheses on >>>>> the right. >>>> >>>> Different GC has different name, it is hard to get them all aligned >>>> right, and it does not worth the effort. 
>>> >>> AFAICS The code already "reserves" 26 characters just to print "GC", >>> which is right-aligned. So all this does is take some of the 24 >>> existing spaces and fill them in with the GC name so you end up with: >>> >>> ???????????????? GC - g1 gc (reserved=379056KB, committed=93220KB) >>> ??????????????????????????? (malloc=39184KB #2159) >>> ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >>> >>> or >>> ???????? GC - shenandoah gc (reserved=379056KB, committed=93220KB) >>> ??????????????????????????? (malloc=39184KB #2159) >>> ??????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >>> >>> etc. Only nit is that 26 seems to small for "concurrent mark sweep >>> gc". Also the alignment of 26 could be specified dynamically based on >>> the length of the hs_err_name() string if needed. >> >> >> FYI, the name of the function hs_err_name() was chosen to deter people >> from using it in other places. Maybe it's time to add another function >> in GCConfig that return better names? Maybe use names that match our >> -XX:UseGC flags? > > I was thinking more simply that we might rename hs_err_name() to name() > and make the strings a little nicer e.g. "G1 GC" instead of "g1 gc". But > if you want to add more general GC functionality here that's up to you :) Either way works for me. 
We tried to change the name before, but it was met with resistance: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-April/021937.html StefanK > > Cheers, > David > >> If you would find that useful, I've created a patch that exposes three >> new functions that return the flag names, or parts of it: >> http://cr.openjdk.java.net/~stefank/8222818/gcFlagNames/webrev.01/ >> >> GCConfig::flag_name() gives: >> ??UseConcMarkSweepGC >> ??UseEpsilonGC >> ??UseG1GC >> ??UseParallelGC >> ??UseSerialGC >> ??UseShenandoahGC >> ??UseZGC >> >> GCConfig::flag_name_no_use() gives: >> ??ConcMarkSweepGC >> ??EpsilonGC >> ??G1GC >> ??ParallelGC >> ??SerialGC >> ??ShenandoahGC >> ??ZGC >> >> GCConfig::flag_name_no_use_no_gc() gives: >> ??ConcMarkSweep >> ??Epsilon >> ??G1 >> ??Parallel >> ??Serial >> ??Shenandoah >> ??Z >> >> The universe.cpp changes, are only temporary changes to test the output. >> >> StefanK >> >>> >>> David >>> ----- >>> >>>> >>>> So, my suggestion is to place GC name inside parentheses, and you >>>> don't have to deal with indents to the left. >>>> >>>> e.g. >>>> >>>> >>>> -???????????????????? GC (reserved=379056KB, committed=93220KB by g1 >>>> gc) >>>> ?????????????????????????? (malloc=39184KB #2159) >>>> ?????????????????????????? (mmap: reserved=339872KB, committed=54036KB) >>>> >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>>> >>>>> Thanks, >>>>> Eric >>>>> >>>>> >>>>> >>>>> On 4/22/19 21:57, Zhengyu Gu wrote: >>>>>> >>>>>> >>>>>> On 4/22/19 8:19 PM, David Holmes wrote: >>>>>>> Hi Eric, >>>>>>> >>>>>>> On 23/04/2019 8:13 am, Eric Caspole wrote: >>>>>>>> Hi, could I have reviews and any opinions on this little change >>>>>>>> to show the GC name in the NMT output, as this helps us to more >>>>>>>> easily triage performance data. >>>>>>> >>>>>>> The idea seems fine. >>>>>> >>>>>>> >>>>>>> For the implementation wouldn't it be simpler to do something like: >>>>>>> >>>>>>> if (flag == mtGC) { >>>>>>> ?? 
out->print("%s - %s (", NMTUtil::flag_to_name(flag), >>>>>>> ?????????????????????????? GCConfig::hs_err_name()); >>>>>>> } else { >>>>>>> ?? out->print("-%26s (", NMTUtil::flag_to_name(flag)); >>>>>>> } >>>>>>> >>>>>> Yes, this is simpler. >>>>>> >>>>>> I don't like where the name is placed, it screws up section >>>>>> alignments. I would prefer to place name inside parenthesis. e.g. >>>>>> >>>>>> - GC (g1 gc reserved=379056KB, committed=93220KB) >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>>> and skip the need for a local buffer and snprintf? >>>>>>> >>>>>>> Aside: it's probably used in enough different contexts that >>>>>>> GCConfig::hs_err_name should be renamed. >>>>>>> >>>>>>> Also if the VM terminates during initialization is it possible >>>>>>> for this code to be executed before the GCConfig has been setup? >>>>>>> And if so how will it behave? >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> This passed tier 1 and 2. >>>>>>>> Thanks, >>>>>>>> Eric >>>>>>>> >>>>>>>> >>>>>>>> JBS: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222818 >>>>>>>> >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/ From harold.seigel at oracle.com Wed Apr 24 11:46:59 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Wed, 24 Apr 2019 07:46:59 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> Message-ID: Thanks David. I'll fix the indents before pushing the change. Harold On 4/23/2019 9:51 PM, David Holmes wrote: > Hi Harold, > > Looks good. Minor nit: > > -? product(bool, BytecodeVerificationRemote, true, ??? \ > +? diagnostic(bool, BytecodeVerificationRemote, true, ??? \ > ?????????? "Enable the Java bytecode verifier for remote classes") ???? \ > > ???? \ > -? product(bool, BytecodeVerificationLocal, false, ??? \ > +? 
diagnostic(bool, BytecodeVerificationLocal, false, ??? \ > ?????????? "Enable the Java bytecode verifier for local classes") ???? \ > > can you fix the indentation on the "Enable ..." lines. > > Thanks, > David > ----- > > On 24/04/2019 4:34 am, Harold Seigel wrote: >> Hi, >> >> Please review this change to make the hotspot BytecodeVerification* >> options be diagnostic.? Use of either of these options without >> -XX:+UnlockDiagnosticVMOptions will now result in the following message: >> >> ???? > java -XX:+BytecodeVerificationLocal -version >> ???? Error: VM option 'BytecodeVerificationLocal' is diagnostic and >> must be enabled via -XX:+UnlockDiagnosticVMOptions. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on >> Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. >> Additionally, the java command was run to ensure that >> -XX:+UnlockDiagnosticVMOptions is needed when specifying the >> BytecodeVerification* options. >> >> Thanks, Harold >> From harold.seigel at oracle.com Wed Apr 24 11:58:07 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Wed, 24 Apr 2019 07:58:07 -0400 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> Message-ID: <4d295b61-8dd1-dd5f-2b4b-d77bf6bfd84a@oracle.com> Hi David, Actually, I think the indent is correct.? The message text is lined up at column 10 regardless of if the option is develop, diagnostic, product, etc. Thanks, Harold On 4/24/2019 7:46 AM, Harold Seigel wrote: > Thanks David. > > I'll fix the indents before pushing the change. > > Harold > > On 4/23/2019 9:51 PM, David Holmes wrote: >> Hi Harold, >> >> Looks good. 
Minor nit: >> >> -? product(bool, BytecodeVerificationRemote, true, ??? \ >> +? diagnostic(bool, BytecodeVerificationRemote, true, ??? \ >> ?????????? "Enable the Java bytecode verifier for remote classes") >> ???? \ >> >> ???? \ >> -? product(bool, BytecodeVerificationLocal, false, ??? \ >> +? diagnostic(bool, BytecodeVerificationLocal, false, ??? \ >> ?????????? "Enable the Java bytecode verifier for local classes") ???? \ >> >> can you fix the indentation on the "Enable ..." lines. >> >> Thanks, >> David >> ----- >> >> On 24/04/2019 4:34 am, Harold Seigel wrote: >>> Hi, >>> >>> Please review this change to make the hotspot BytecodeVerification* >>> options be diagnostic.? Use of either of these options without >>> -XX:+UnlockDiagnosticVMOptions will now result in the following >>> message: >>> >>> ???? > java -XX:+BytecodeVerificationLocal -version >>> ???? Error: VM option 'BytecodeVerificationLocal' is diagnostic and >>> must be enabled via -XX:+UnlockDiagnosticVMOptions. >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 >>> >>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>> and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on >>> Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. >>> Additionally, the java command was run to ensure that >>> -XX:+UnlockDiagnosticVMOptions is needed when specifying the >>> BytecodeVerification* options. 
>>> >>> Thanks, Harold >>> From david.holmes at oracle.com Wed Apr 24 12:06:52 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Apr 2019 22:06:52 +1000 Subject: RFR 8221685: -XX:BytecodeVerificationRemote and -XX:BytecodeVerificationLocal should be diagnostic options In-Reply-To: <4d295b61-8dd1-dd5f-2b4b-d77bf6bfd84a@oracle.com> References: <5ae3a167-680c-ba15-5468-76a4e13e37e1@oracle.com> <4d295b61-8dd1-dd5f-2b4b-d77bf6bfd84a@oracle.com> Message-ID: <7f5eaca6-4d4b-bdc1-d528-f680c877b78e@oracle.com> On 24/04/2019 9:58 pm, Harold Seigel wrote: > Hi David, > > Actually, I think the indent is correct.? The message text is lined up > at column 10 regardless of if the option is develop, diagnostic, > product, etc. Ah I see that now - was just looking at the diff. Thanks, David > Thanks, Harold > > On 4/24/2019 7:46 AM, Harold Seigel wrote: >> Thanks David. >> >> I'll fix the indents before pushing the change. >> >> Harold >> >> On 4/23/2019 9:51 PM, David Holmes wrote: >>> Hi Harold, >>> >>> Looks good. Minor nit: >>> >>> -? product(bool, BytecodeVerificationRemote, true, ??? \ >>> +? diagnostic(bool, BytecodeVerificationRemote, true, ??? \ >>> ?????????? "Enable the Java bytecode verifier for remote classes") >>> ???? \ >>> >>> ???? \ >>> -? product(bool, BytecodeVerificationLocal, false, ??? \ >>> +? diagnostic(bool, BytecodeVerificationLocal, false, ??? \ >>> ?????????? "Enable the Java bytecode verifier for local classes") ???? \ >>> >>> can you fix the indentation on the "Enable ..." lines. >>> >>> Thanks, >>> David >>> ----- >>> >>> On 24/04/2019 4:34 am, Harold Seigel wrote: >>>> Hi, >>>> >>>> Please review this change to make the hotspot BytecodeVerification* >>>> options be diagnostic.? Use of either of these options without >>>> -XX:+UnlockDiagnosticVMOptions will now result in the following >>>> message: >>>> >>>> ???? > java -XX:+BytecodeVerificationLocal -version >>>> ???? 
Error: VM option 'BytecodeVerificationLocal' is diagnostic and >>>> must be enabled via -XX:+UnlockDiagnosticVMOptions. >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8221685/webrev/index.html >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8221685 >>>> >>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>>> and builds on Linux-x64, Windows, and Mac OS X, Mach5 tiers 3 -5 on >>>> Linux-x64, and by running JCK-13 Lang and VM tests on Linux-x64. >>>> Additionally, the java command was run to ensure that >>>> -XX:+UnlockDiagnosticVMOptions is needed when specifying the >>>> BytecodeVerification* options. >>>> >>>> Thanks, Harold >>>> From robin.westberg at oracle.com Wed Apr 24 12:57:31 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Wed, 24 Apr 2019 14:57:31 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <8a6c8f0b-21b0-b7f9-e64e-ba7e348a57b3@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> <8a6c8f0b-21b0-b7f9-e64e-ba7e348a57b3@oracle.com> Message-ID: <57F9A59B-4674-4CEF-88C5-8DF4E0DFC6CF@oracle.com> Hi Robbin, Thanks for reviewing! Best regards, Robin > On 23 Apr 2019, at 10:07, Robbin Ehn wrote: > > Hi Robin, > >> New webrev: >> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ > > Looks good, thanks for fixing. > > /Robbin > >> Best regards, >> Robin >>> >>> Thanks, >>> David >>> ----- >>> >>>> Best regards, >>>> Robin >>>>> >>>>> ? 
>>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> >>>>>> Best regards, >>>>>> Robin >>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>>>>> >>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>>>> Hi David, >>>>>>>>> Thanks for taking a look! >>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>>>>> >>>>>>>>>> Hi Robin, >>>>>>>>>> >>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>>>>>> >>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>>>> >>>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). 
>>>>>>> >>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>>>> >>>>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>>>> >>>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>>>>> >>>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best way to actually obtain 1ms resolution; plain Sleep(1) usually sleeps around 2ms. >>>>>>> >>>>>>> Best regards, >>>>>>> Robin >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Robin >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>>> Various benchmarks (specjbb2015, specjvm2008) show better numbers for Windows and no regressions for Linux.
>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>>>> Testing: tier1 >>>>>>>>>>> Best regards, >>>>>>>>>>> Robin From robin.westberg at oracle.com Wed Apr 24 13:12:55 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Wed, 24 Apr 2019 15:12:55 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> Message-ID: Hi David, > On 23 Apr 2019, at 03:39, David Holmes wrote: > > Hi Robin, > > Sorry, now Easter break got in the way :) > > > > On 17/04/2019 11:55 pm, Robin Westberg wrote: >>> On 12 Apr 2019, at 11:15, David Holmes wrote: >>> I'd prefer to fix a windows problem, just on windows. I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means). >>> >>> If we need the TimedYield capability in the future then lets revisit that then. >> Sure, here?s a lighter version of this change that changes the Windows implementation of naked_short_nanosleep, with a few adjustments to some assumptions in the waiting-for-safepoint backoff strategy. >> Still passes tier1, with the same performance improvements on Windows (and no obvious regressions on Linux). >> New webrev: >> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ > > Windows changes look fine - thanks. > > Safepoint backoff change seems okay but what affect does it have on performance on non-Windows? (javaTimeNanos can sometimes be expensive) In the case where we have to wait for threads to stop it shouldn't matter, it will be insignificant compared to the time spent actually sleeping. 
But in the case where all threads manage to stop before we resort to waiting, there is indeed an unnecessary call to javaTimeNanos - I have not observed any measurable difference, but I've reworked the code a little bit to avoid this just in case: Full webrev: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.03/ Incremental: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02-03/ Best regards, Robin > > Thanks, > David > >> Best regards, >> Robin >>> >>> Thanks, >>> David >>> ----- >>> >>>> Best regards, >>>> Robin >>>>> >>>>> ? >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> >>>>>> Best regards, >>>>>> Robin >>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>>>>> >>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>>>> Hi David, >>>>>>>>> Thanks for taking a look! >>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>>>>> >>>>>>>>>> Hi Robin, >>>>>>>>>> >>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>>>>>> >>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that.
Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>>>>> That said, perhaps os::naked_short_nanosleep should be removed; the only remaining usage is in SpinYield, which turns to sleeping after giving up on yielding. For Windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>>>> >>>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>>>>>> >>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>>>> >>>>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>>>> >>>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>>>>> >>>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best way to actually obtain 1ms resolution; plain Sleep(1) usually sleeps around 2ms.
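[Editorial sketch, not part of the original mail.] The backoff idea discussed in this thread (spin first, and only start reading the clock once spinning has failed, so the fast path never pays for a time query) can be illustrated roughly as follows. This is a Java analogue under stated assumptions, not the HotSpot C++ code from the webrev: the class name BackoffSketch, the method waitFor and all of its parameters are hypothetical.

```java
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names come from the webrev.
final class BackoffSketch {
    /**
     * Waits until done reports true. Spins for the first spinLimit checks,
     * then yields for up to yieldNanos, then sleeps in sleepNanos steps.
     * Returns how many times the clock was read, so callers can see that
     * the fast path (condition met while spinning) reads it zero times.
     */
    static int waitFor(BooleanSupplier done, int spinLimit, long yieldNanos, long sleepNanos) {
        int clockReads = 0;
        int attempts = 0;
        boolean timed = false;
        long start = 0L;
        while (!done.getAsBoolean()) {
            if (attempts++ < spinLimit) {
                Thread.onSpinWait();          // cheap busy-wait hint, no clock access
                continue;
            }
            if (!timed) {                     // first slow-path iteration: start the timer
                start = System.nanoTime();
                clockReads++;
                timed = true;
            }
            long elapsed = System.nanoTime() - start;
            clockReads++;
            if (elapsed < yieldNanos) {
                Thread.yield();               // let other runnable threads make progress
            } else {
                LockSupport.parkNanos(sleepNanos); // back off harder once yielding has not helped
            }
        }
        return clockReads;
    }
}
```

The actual sleep granularity of parkNanos is platform dependent, which is exactly the kind of issue the thread describes for Windows; the sketch makes no claim about real timings.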
>>>>>>> >>>>>>> Best regards, >>>>>>> Robin >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Robin >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. >>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>>>> Testing: tier1 >>>>>>>>>>> Best regards, >>>>>>>>>>> Robin From daniel.daugherty at oracle.com Wed Apr 24 13:25:18 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 24 Apr 2019 09:25:18 -0400 Subject: RFR(S): 8222295 more baseline cleanups from Async Monitor Deflation project In-Reply-To: References: <4d5a1981-9fbe-04f8-5203-a4d254b5cff3@oracle.com> Message-ID: <59e984c3-1043-8c6b-4b2a-14cbe6afe0ae@oracle.com> Thanks for the review! Dan On 4/23/19 10:05 PM, David Holmes wrote: > Hi Dan, > > That all seems fine. > > Thanks, > David > > On 24/04/2019 12:28 am, Daniel D. Daugherty wrote: >> Greetings, >> >> I have a (S)mall patch extracted from the Async Monitor Deflation >> project >> that is ready for code review. >> >> Karen, a number of the changes here are from your code review comments >> to the parent bug: >> >> ?? ? JDK-8153224 Monitor deflation prolong safepoints >> ???? https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> The short version of what this patch is about: >> >> ???? More baseline cleanups to the ObjectMonitor subsystem. >> >> The details are in the bug report: >> >> ???? JDK-8222295 more baseline cleanups from Async Monitor Deflation >> project >> ???? https://bugs.openjdk.java.net/browse/JDK-8222295 >> >> Here's the webrev: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295/ >> >> This patch along with the current patch for Async Monitor Deflation >> project have been through Mach5 tier[1-8] testing. 
>> >> I have been actively using the revised assert()'s and guarantee()'s with >> additional diagnostic info while debugging my port of the Async Monitor >> Deflation project code. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan > From daniel.daugherty at oracle.com Wed Apr 24 15:16:17 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 24 Apr 2019 11:16:17 -0400 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> Message-ID: <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> On 4/24/19 3:12 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 > webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ src/hotspot/share/classfile/javaClasses.cpp L1629: macro(_park_blocker_offset, k, "parkBlocker", object_signature, false); Line ends with a ';' and the previous last line did not. When the THREAD_FIELDS_DO macro is called, it is already followed by a ';': L1635: THREAD_FIELDS_DO(FIELD_COMPUTE_OFFSET); L1640: THREAD_FIELDS_DO(FIELD_SERIALIZE_OFFSET); src/hotspot/share/classfile/javaClasses.hpp No comments. src/hotspot/share/prims/unsafe.cpp No comments. src/java.base/share/classes/java/lang/Thread.java No comments. Thumbs up. I don't need to see another webrev if you choose to remove the ';' on L1629. Dan > > The original implementation of Unsafe.unpark simply extracted the > JavaThread reference from the java.lang.Thread oop and if non-null > extracted the Parker instance from it and invoked unpark. This was > racy, however, as the native JavaThread could terminate at any time and > deallocate the Parker.
> > That logic was fixed by JDK-6271298 which used a combination of > type-stable-memory "event" objects for the Parker, along with use of > the Threads_lock to obtain the initial reference to the Parker (from a > JavaThread guaranteed to be alive), together with caching the native > Parker pointer in a field of java.lang.Thread. Even though the native > thread may have terminated the Parker was still valid (even if > associated with a different thread) and the unpark at worst was a > spurious wakeup for that other thread. > > When JDK-8167108 introduced Thread Safe-Memory-Reclamation (SMR) the > logic was updated to always use the safe mechanism - we grab a > ThreadsListHandle then check the cached field, else lookup the native > thread to see if it is alive and locate the Parker instance that way. > With SMR the caching of the Parker pointer no longer serves any > purpose - we no longer have a lock-free use-the-cache path versus a > lock-using populate-the-cache path. With SMR we've already "paid" for > the ability to ensure the native thread can't terminate regardless of > whether we lookup the field from the java.lang.Thread or the > JavaThread. So we can simplify the code and save a little footprint by > removing the cache from java.lang.Thread: > > /* > * JVM-private state that persists after native thread termination. > */ > private long nativeParkEventPointer; > > and the supporting code from unsafe.cpp and javaClasses.*pp in the JVM. > > I considered restoring the fast-path use of the cache without recourse > to Thread-SMR but performance measurements failed to show any benefit > in doing so. See bug report for details.
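[Editorial sketch, not part of the original mail.] The remark above that a stray unpark is "at worst a spurious wakeup" relies on callers of park re-checking their condition in a loop. A minimal Java illustration of that contract follows, using only the public java.util.concurrent.locks.LockSupport API; the class ParkLoopSketch and its members are hypothetical names, and nothing here reflects the JVM-internal Parker code being changed.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration only; not code from the webrev.
final class ParkLoopSketch {
    static final AtomicBoolean ready = new AtomicBoolean(false);

    /** Blocks until ready is set, tolerating wakeups that carry no meaning. */
    static void awaitReady() {
        while (!ready.get()) {
            // park may return for no reason the caller cares about,
            // so the condition is always re-checked before proceeding.
            LockSupport.park(ParkLoopSketch.class);
        }
    }

    /** Returns true if the waiter terminates after one stray and one real unpark. */
    static boolean demo() {
        ready.set(false);
        Thread waiter = new Thread(ParkLoopSketch::awaitReady);
        waiter.start();
        LockSupport.unpark(waiter);   // stray wakeup: condition not set, waiter keeps waiting
        ready.set(true);
        LockSupport.unpark(waiter);   // the wakeup that matters
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }
}
```

The object passed to park is recorded as the thread's park blocker for diagnostics, which is the parkBlocker field that the review comments above mention.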
> > Thanks, > David From mikhailo.seledtsov at oracle.com Wed Apr 24 20:03:23 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 24 Apr 2019 13:03:23 -0700 Subject: 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" Message-ID: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> Please review this change that makes a test more robust. The test originally relied on the fact that the JAVA_MAIN_CLASS variable is always present when running jtreg tests. This assumption is wrong, hence I reworked the test to define its own unique environment variable. In order to introduce the environment variable I reworked the DockerTestUtils a little bit, by splitting the dockerRunJava() method into two: buildJavaCommand() to build the command; dockerRunJava() uses buildJavaCommand() then runs the command. This allowed me to introduce the test environment variable to the child process. The behavior of dockerRunJava() should not be affected by this change, it is a simple split. JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ Testing: 1. Ran hotspot docker tests on Linux-x64 machine with docker engine configured. Ran both via jtreg directly and via make All PASS Thank you, Misha From mikhailo.seledtsov at oracle.com Wed Apr 24 20:06:39 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 24 Apr 2019 13:06:39 -0700 Subject: RFR(S): 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" Message-ID: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> Please review this change that makes a test more robust. The test originally relied on the fact that the JAVA_MAIN_CLASS variable is always present when running jtreg tests.
This assumption is wrong, hence I reworked the test to define its own unique environment variable. In order to introduce the environment variable I reworked the DockerTestUtils a little bit, by splitting the dockerRunJava() method into two: buildJavaCommand() to build the command; dockerRunJava() uses buildJavaCommand() then runs the command. This allowed me to introduce the test environment variable to the child process. The behavior of dockerRunJava() should not be affected by this change, it is a simple split. JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ Testing: 1. Ran hotspot docker tests on Linux-x64 machine with docker engine configured. Ran both via jtreg directly and via make All PASS Thank you, Misha From erik.gahlin at oracle.com Wed Apr 24 20:10:02 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Wed, 24 Apr 2019 22:10:02 +0200 Subject: 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> References: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> Message-ID: <5CC0C29A.4070303@oracle.com> Looks good Erik > Please review this change that makes a test more robust. The test > originally relied on the fact that the JAVA_MAIN_CLASS variable > is always present when running jtreg tests. This assumption is wrong, > hence I reworked the test to define its own unique > environment variable. > > In order to introduce the environment variable I reworked the > DockerTestUtils a little bit, by splitting the > dockerRunJava() method into two: buildJavaCommand() to build the > command; dockerRunJava() uses buildJavaCommand() > then runs the command. This allowed me to introduce the test > environment variable to the child process. > The behavior of dockerRunJava() should not be affected by this change, > it is a simple split.
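[Editorial sketch, not part of the original mail.] The split described above, one method that only builds the command and another that builds and then runs it, might look roughly like this. It is a hypothetical stand-in, not the real DockerTestUtils: the class name DockerCommandSketch, the method signatures, and the choice of docker's -e option to pass the variable are all assumptions made for illustration; the webrev may do it differently.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real test utility; signatures are assumed.
final class DockerCommandSketch {
    /** Builds the "docker run ... java <mainClass>" command line without running it. */
    static List<String> buildJavaCommand(String image, Map<String, String> env, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--rm");
        for (Map.Entry<String, String> e : env.entrySet()) {
            cmd.add("-e");                                // docker option that sets a container env variable
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        cmd.add(image);
        cmd.add("java");
        cmd.add(mainClass);
        return cmd;
    }

    /** Same behavior as before the split: build the command, run it, return the exit code. */
    static int dockerRunJava(String image, Map<String, String> env, String mainClass)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildJavaCommand(image, env, mainClass))
                .inheritIO()
                .start();
        return p.waitFor();
    }
}
```

Because the builder returns the command instead of running it, a test can inspect or extend the command (for example to add its own uniquely named variable) before execution, while existing callers of dockerRunJava are unaffected.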
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ > Testing: > 1. Ran hotspot docker tests on Linux-x64 machine with docker > enigne configured. > Ran both via jtreg directly and via make > All PASS > > Thank you, > Misha > From mikhailo.seledtsov at oracle.com Wed Apr 24 20:18:45 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 24 Apr 2019 13:18:45 -0700 Subject: 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <5CC0C29A.4070303@oracle.com> References: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> <5CC0C29A.4070303@oracle.com> Message-ID: <22a29b59-16d2-654b-c266-e817220ff984@oracle.com> Thank you Erik, Misha On 4/24/19 1:10 PM, Erik Gahlin wrote: > Looks good > > Erik > >> Please review this change that makes a test more robust. The test >> originally relied on the fact that JAVA_MAIN_CLASS variable >> is always present when running jtreg tests. This assumption is wrong, >> hence I reworked the test to define its own unique >> environment variable. >> >> In order to introduce environment variable I reworked the >> DockerTestUtils a little bit, by splitting the >> dockerRunJava() method into 2: buildJavaCommand() to build the >> command; dockerRunJava() uses buildJavaCommand() >> then runs the command. This allowed me to introduce the test >> environment variable to the child process. >> The behavior of dockerRunJava() should not be affected by this >> change, it is a simle split. >> >> ??? JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 >> ??? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ >> ??? Testing: >> ??????? 1. Ran hotspot docker tests on Linux-x64 machine with docker >> enigne configured. >> ?????????? Ran both via jtreg directly and via make >> ?????????? 
All PASS >> >> Thank you, >> Misha >> > From karen.kinnear at oracle.com Wed Apr 24 20:23:15 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Wed, 24 Apr 2019 16:23:15 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBF97E1.8080500@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBF97E1.8080500@oracle.com> Message-ID: <32BA074F-4049-4F07-A102-B3CECDD69110@oracle.com> Calvin, Thank you for the updated webrev as well as the detailed explanations. These changes look good. thanks, Karen minor notes below > On Apr 23, 2019, at 6:55 PM, Calvin Cheung wrote: > > Hi Karen, > > The following incremental webrev should have addressed most of your comments: > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/ > > Please see my replies inline below. > >> >> 2. jfrRecorder.cpp >> So JFR recordings are disabled if DynamicDumpSharedSpaces? >> why? > It was also done for DumpSharedSpaces via the fix for https://bugs.openjdk.java.net/browse/JDK-8203664. >> Is that a future rfe? So I get that right now we don't support JFR recordings. Thank you for the link - which explores at least two issues with doing that: 1) JFR rewrites some core libraries classes to add tracing and 2) issues with add-exports/add-reads. So I get that it would not be easy to figure out a way to do this. My concern is for the longer-term goal of having the user able to do a first run which creates an archive and subsequent runs - with the same command-line - which use the archive. So could you please consider filing an RFE for the future to investigate this more to see if there is a possible way to make this work? >> >> >> 4. systemDictionaryShared.cpp >> EstimateSizeForArchive::do_entry >> Is it the case that for info.is_builtin() there are no verification constraints? So you could skip that calculation? Or did I misunderstand?
> > The size also includes header and crc sizes: > static size_t byte_size(InstanceKlass* klass, int num_constraints) { > return header_size_size() + > crc_size(klass) + > verifier_constraints_size(num_constraints) + > verifier_constraint_flags_size(num_constraints); > } Thank you. So my assumption is accurate about no verification constraints, and the call is needed anyway. >> >> 5. compactHashtable.cpp >> serialize/header/calculate_header_size >> -- could you dynamically determine size_of header so you don't need >> to hardcode a 5? > I checked with Ioi on this one. The problem is calculate_header_size() needs to be called during size estimation, and serialize_header is called after size estimation. Thank you for trying. >> >> 6. classLoader.cpp >> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are mutually exclusive. >> Can you clarify for me: >> My memory of the base archive is that we do not allow the following options at dump time - and these >> are the same for the dynamic archive: --limit-modules, --upgrade-module-path, --patch-module. > Yes, the same should apply for the dynamic archive. > FIXME has been replaced with a comment. Thank you. >> >> 9. filemap.cpp >> Comment lines 529 ... >> Is this true - that you can only support dynamic dumping with the default CDS archive? Could you clarify what the restrictions are? >> The CSR implies you can support "a specific base CDS archive" > Yes. >> - so base layer can not have appended boot class path >> - and base layer can't have a module path > Correct. >> >> What can you specify for the dynamic dumping relative to the base archive? >> - matching class path? >> - appended class path? > Yes. >> in future - could it have a module path that matched the base archive? > Sure, in another RFE. Yes - part of this exercise is identifying future RFEs. >> >> Should any of these restrictions be clarified in documentation/CSR since they appear to be new? > I'll update the doc. Thank you. >> >> 11. 
filemap.hpp >> line 214: TODO left in > I leave it there for now. It isn't too simple to get rid of the static declaration. > I can do a follow up after this RFE. It doesn't bother me that they are static - I don't need an RFE here. I just figured with a TODO that you meant to get back to it. >> >> 13. java.cpp >> FIXME: is this the right place? >> For starting the DynamicArchive::dump >> >> Please check with David Holmes on that one > > I've removed the FIXME. I've also checked with David H. He said the following: >> Not an easy question to answer. It depends on all the code that might be touched through DynamicArchive::dump() and whether it might interact with anything already "shutdown". It will really come down to testing (run all tests with dynamic dumping enabled) to see if there are any unexpected interactions. Have to agree with him - not an easy question to answer - and the benefits of testing :-) >> >> lines 277 && 412 >> Do we archive array klasses in the base archive but not in the dynamic archive? >> Is that a potential RFE? > We currently don't handle array klasses in the AppCDS archive either. > Please refer to: open/test/hotspot/jtreg/runtime/appcds/javaldr/ArrayTest.java. Thank you. I missed that. >> > > thanks, > Calvin From leonid.mesnik at oracle.com Wed Apr 24 23:46:31 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 24 Apr 2019 16:46:31 -0700 Subject: 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> References: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> Message-ID: <440BCDC7-DF35-4F97-90CF-CCF31E145478@oracle.com> Looks good. Leonid > On Apr 24, 2019, at 1:03 PM, mikhailo.seledtsov at oracle.com wrote: > > Please review this change that makes a test more robust. The test originally relied on the fact that the JAVA_MAIN_CLASS variable > is always present when running jtreg tests.
This assumption is wrong, hence I reworked the test to define its own unique > environment variable. > > In order to introduce environment variable I reworked the DockerTestUtils a little bit, by splitting the > dockerRunJava() method into 2: buildJavaCommand() to build the command; dockerRunJava() uses buildJavaCommand() > then runs the command. This allowed me to introduce the test environment variable to the child process. > The behavior of dockerRunJava() should not be affected by this change, it is a simle split. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ > Testing: > 1. Ran hotspot docker tests on Linux-x64 machine with docker enigne configured. > Ran both via jtreg directly and via make > All PASS > > Thank you, > Misha > From mikhailo.seledtsov at oracle.com Wed Apr 24 23:47:29 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 24 Apr 2019 16:47:29 -0700 Subject: 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <440BCDC7-DF35-4F97-90CF-CCF31E145478@oracle.com> References: <53cccc16-386a-8a86-930c-decadeeec874@oracle.com> <440BCDC7-DF35-4F97-90CF-CCF31E145478@oracle.com> Message-ID: Thank you, Misha On 4/24/19 4:46 PM, Leonid Mesnik wrote: > Looks good. > > Leonid > >> On Apr 24, 2019, at 1:03 PM, mikhailo.seledtsov at oracle.com wrote: >> >> Please review this change that makes a test more robust. The test originally relied on the fact that JAVA_MAIN_CLASS variable >> is always present when running jtreg tests. This assumption is wrong, hence I reworked the test to define its own unique >> environment variable. >> >> In order to introduce environment variable I reworked the DockerTestUtils a little bit, by splitting the >> dockerRunJava() method into 2: buildJavaCommand() to build the command; dockerRunJava() uses buildJavaCommand() >> then runs the command. 
This allowed me to introduce the test environment variable to the child process. >> The behavior of dockerRunJava() should not be affected by this change, it is a simle split. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ >> Testing: >> 1. Ran hotspot docker tests on Linux-x64 machine with docker enigne configured. >> Ran both via jtreg directly and via make >> All PASS >> >> Thank you, >> Misha >> From jianglizhou at google.com Thu Apr 25 02:54:59 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 24 Apr 2019 19:54:59 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBFA906.1030205@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> Message-ID: Please see comments inlined. On Tue, Apr 23, 2019 at 5:08 PM Calvin Cheung wrote: > > Hi Jiangli, > > Thanks a lot for your review! > > On 4/22/19, 2:07 PM, Jiangli Zhou wrote: > > Hi Calvin, > > > > Congrats on finalizing the dynamic archiving work and completing > > testing. After the integration of the dynamic archiving, a follow-up > > RFE can be done to merge the archiving/copying code in > > dynamicArchive.* and metaspaceShared.* for better maintenance in the > > future. As there are many duplicates between those two, having shared > > implementation for both static and dynamic will be beneficial and > > reduce the maintenance cost. > I'll file an RFE for the above. > > > > Here are my comments mainly for additional cleanups and some minor issues. > > > > - src/hotspot/share/classfile/classLoader.cpp > > > > 1337 // FIXME: DynamicDumpSharedSpaces and --patch-modules are > > mutually exclusive > > 1338 assert(!DynamicDumpSharedSpaces, "sanity"); > > > > I tagged the comment with 'FIXME' to serve as a reminder to add more > > details. 
The reason DynamicDumpSharedSpaces is 'mutually exclusive' > > with --patch-modules is that DynamicDumpSharedSpaces is only > > enabled when UseSharedSpaces is also enabled. As --patch-modules is > > not supported with UseSharedSpaces, it is not supported with > > DynamicDumpSharedSpaces either. > I've converted the FIXME to a comment. > > > > 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); > > 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, > > (ClassFileStream*)stream); > > > > Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above > > code. With the new dynamic archiving capability, it's now able to > > load/archive a class with user defined classloader via this call path. > > A comment explaining this is also needed. > I tried the assert but it didn't work. Not only DynamicDumpSharedSpaces > will go through that code path. I should be more clear. The new code is only intended for the DynamicDumpSharedSpaces, since the shared_classpath_index is set to UNREGISTERED_INDEX by ClassLoaderExt::load_class when loading a class with "source:" in the class list file at static dumping time. 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, (ClassFileStream*)stream); After thinking more, it's probably better to remove the following marked code from ClassLoaderExt::load_class. That avoids setting twice in two different places during static dumping. It also makes the code cleaner. InstanceKlass* ClassLoaderExt::load_class(Symbol* name, const char* path, TRAPS) { ...
result->set_shared_classpath_index(UNREGISTERED_INDEX); <<<<<<<<<<< SystemDictionaryShared::set_shared_class_misc_info(result, stream); <<<<<<<<<<<< > > > > - src/hotspot/share/classfile/classLoaderExt.cpp > > > > 64 void ClassLoaderExt::setup_app_search_path() { > > 65 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, > > 66 "this function is only used with -Xshare:dump"); > > > > The above message needs to be updated to reflect the new command-line option. > Done. > > > > 304 result->set_shared_classpath_index(UNREGISTERED_INDEX); > > 305 SystemDictionaryShared::set_shared_class_misc_info(result, > > stream);<<<<<<<<<< > > > > Why is the set_shared_class_misc_info call being removed? If this is a > > bug fix for loading classes from the classlist for user defined > > classloaders, it should be handled separately, and with a separate bug > > ID as well. > It is called in ClassLoader::record_result() from > KlassFactory::create_from_stream(). Ok, this is related to the above comment. > > > > - src/hotspot/share/classfile/compactHashtable.cpp > > > > 207 size_t SimpleCompactHashtable::calculate_header_size() { > > 208 // We have 5 fields. Each takes up sizeof(intptr_t). See > > WriteClosure::do_u4 > > 209 size_t bytes = sizeof(intptr_t) * 5; > > 210 return bytes; > > 211 } > > > > 212 > > 213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) { > > 214 // NOTE: if you change this function, you MUST change the number 5 in > > 215 // calculate_header_size() accordingly. > > ... > > > > As a cleanup, a better way to handle this is to calculate the size > > within SimpleCompactHashtable::serialize_header while serializing the > > data and set the size value in a variable. > > SimpleCompactHashtable::calculate_header_size() should simply retrieve > > the value. A renaming of > > SimpleCompactHashtable::calculate_header_size() can also be done. > I've checked with Ioi on this one.
The problem is > calculate_header_size() needs to be called during size estimation, and > serialize_header is called after size estimation. Can you please file an RFE for this? The current code is okay for the first integration. It deserves some effort to make it cleaner (probably with a different solution) since it can be error-prone. > > > > - src/hotspot/share/classfile/dictionary.cpp > > > > 315 InstanceKlass* Dictionary::find_class(Symbol* name) { > > 316 unsigned int hash = compute_hash(name); > > 317 int index = hash_to_index(hash); > > 318 return find_class(index, hash, name); > > 319 } > > > > Looks like the new function is not referenced (unless I'm missing > > something). Please remove the function. > > > > - src/hotspot/share/classfile/dictionary.hpp > > > > 65 InstanceKlass* find_class(Symbol* name); > > > > Same comment as the above. > I've removed the function. > > > > - src/hotspot/share/classfile/symbolTable.cpp. > > > > 473 Symbol* const _archived; // used by UseSharedArchived2 > > > > Please remove 'UseSharedArchived2'. The comment also needs more clarification. > > > > I couldn't find any references to SymbolTableCreateEntry. Can you > > please point me to where it is being used? > I've removed the entire SymbolTableCreateEntry class. It was left there > probably due to a merge error. > > > > - src/hotspot/share/classfile/systemDictionaryShared.cpp > > > > 1218 if (DynamicDumpSharedSpaces) { > > 1219 return false; > > 1220 } else { > > > > The above case for DynamicDumpSharedSpaces needs to be examined > > carefully. Can you please ask Harold (and Coleen or Karen) to take a > > look? Also, a comment is needed to explain that we can complete all > > verification checks at dynamic dumping time. > I've added a comment. If it returns false, the caller will call > VerificationType::resolve_and_check_assignability().
> > > > - src/hotspot/share/classfile/systemDictionaryShared.cpp > > > > 1279 ResourceMark rm; > > > > You can use 'ResourceMark rm(THREAD)'. > Fixed. > > > > - src/hotspot/share/memory/allocation.hpp > > > > 255 // > > 256 // When CDS is not enabled, both pointers are set to NULL. > > 257 static void* _shared_metaspace_base; // (inclusive) low address > > 258 static void* _shared_metaspace_top; // (exclusive) high address > > > > Why was the comment at line 256 removed? > I've added back the comment. > > > > - src/hotspot/share/memory/filemap.cpp > > > > 101 void FileMapInfo::fail_continue(const char *msg, ...) { > > 102 va_list ap; > > 103 va_start(ap, msg); > > 104 if (_runtime_dynamic_info == NULL) { > > 105 MetaspaceShared::set_archive_loading_failed(); > > 106 } else { > > 107 DynamicArchive::disable(); > > 108 } > > > > The above fail_continue only works if _runtime_dynamic_info is set up > > after mapping the base archive. Comments should be added to explain > > that. > Comment added. > > > > Can you please rename '_runtime_dynamic_info' so it's more > > descriptive? Maybe use 'dynamic_archive_info'. > Renamed to '_dynamic_archive_info'. > > > > 587 bool FileMapInfo::same_files(const char* file1, const char* file2) { > > > > The usage of FileMapInfo::same_files is not necessary and should be > > removed. The base archive's CRC checksum values are recorded in the > > dynamic archive. The runtime verifies the CRC values to make sure the > > same archive is used at dump time and runtime, regardless of the base > > archive path or name.
It is designed for all use cases: > The same_files() function is also used in arguments.cpp: > 3530 if (DynamicDumpSharedSpaces) { > 3531 if (FileMapInfo::same_files(SharedArchiveFile, > ArchiveClassesAtExit)) { > 3532 vm_exit_during_initialization( > 3533 "Cannot have the same archive file specified for > -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", > 3534 SharedArchiveFile); > 3535 } > 3536 } > > The function is also needed for the RFE: > https://bugs.openjdk.java.net/browse/JDK-8211723 Ok. It should be treated as a bug, not an RFE. The shared path table check does not verify the path ordering (also including the case when new path components are inserted). The bug should be handled as a high priority task for dynamic archive. > > We still verify the CRC values during runtime. > > > > * base CDS archive is specified in the -XX:SharedArchiveFile at > > dynamic dumping time > > * -XX:SharedArchiveFile is not specified at dynamic dumping time, > > default location for the default CDS archive is used > > * default CDS archive is specified in the -XX:SharedArchiveFile at runtime > > * default CDS archive is not specified in the -XX:SharedArchiveFile at > > runtime, default location for the default CDS archive is used > Regarding the fourth point above, the user could have a non-default base > archive and only specify the top archive during runtime. I would argue against it since it doesn't always work and adds extra code. When the archive path/name is changed, the recorded one in the dynamic archive would no longer work. Users would still need to specify the path/name on the command line. The use case only works for the default CDS archive. For a non-default CDS archive, specifying it in the command-line option results in a cleaner design and less fragile code. > > > > In all the above cases, the base archive CRC values check is sufficient. > > The use of path/name is fragile and should be avoided.
That will allow > > you to remove the _base_archive_name_size from the dynamic archive. > We still need the _base_archive_name_size and the base archive name in > the header because of the above reason. Please see my comment above. > > > > 752 if (is_static) { > > 753 // FIXME check for dynamic header as well > > 754 // FIXME Don't just check the last region -- check all regions! > > > > Can you please address the first FIXME at line 753? > > > > Checking the last region is sufficient since the archive is written in > > sequential order. The second FIXME is not necessary. > I've addressed the first FIXME and converted the second one to a comment. > > > > - src/hotspot/share/memory/metaspace.cpp > > > > 1417 bool Metaspace::contains(const void* ptr) { > > 1418 // FIXME: need to check the dynamic archive > > > > Can you please remove the above FIXME? There is no need for a separate check. > Done. > > > > - src/hotspot/share/memory/metaspaceShared.cpp > > > > 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive > > > > Can you please rename the function to fix_cpp_vtable_for_dynamic_archive? > Done. > > > > - src/hotspot/share/oops/klass.cpp > > > > 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces, > > 528 "only called for DumpSharedSpaces"); > > > > 544 void Klass::remove_java_mirror() { > > 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only > > called for DumpSharedSpaces"); > > > > Please fix the messages above. > Done. > > > > - src/hotspot/share/prims/whitebox.cpp > > > > 2332 {CC"getResolvedReferences", > > CC"(Ljava/lang/Class;)Ljava/lang/Object;", > > (void*)&WB_GetResolvedReferences}, > > 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V", > > (void*)&WB_LinkClass}, > > 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", > > (void*)&WB_AreOpenArchiveHeapObjectsMapped}, > > > > Can you please align the indentation of line 2333 (to be the same as > > line 2332 or 2334)? > Aligned (void*) with line 2334.
(It doesn't show in the webrev since > only blank space changes) > > > > - src/hotspot/share/runtime/arguments.cpp > > > > 1491 bool Arguments::check_unsupported_cds_runtime_properties() { > > 1492 assert(UseSharedSpaces, "this function is only used with > > -Xshare:{on,auto}"); > > 1493 assert(ARRAY_SIZE(unsupported_properties) == > > ARRAY_SIZE(unsupported_options), "must be"); > > 1494 if (ArchiveClassesAtExit != NULL) { > > 1495 // dynamic dumping, just return false, > > check_unsupported_dumping_properties() will be called > > 1496 // in init_shared_archive_paths(). > > 1497 return false; > > 1498 } > > > > The check_unsupported_cds_runtime_properties() should be done for the > > 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a > > combination of both dump time and runtime. > The 'ArchiveClassesAtExit != NULL' is for dumping CDS archive from the > user's point of view, that's why the comments in lines 1495 and 1496. > During runtime, ArchiveClassesAtExit will be NULL, so the > check_unsupported_cds_runtime_properties() will be called as usual. During dynamic dumping, UseSharedSpaces is true. Dynamic dumping is a special case of the 'runtime'; that's why dynamic dumping is a combination of both dump time and runtime. So check_unsupported_cds_runtime_properties() is also needed for dynamic dumping. > > > > 2729 // -Xshare:auto || -Xshare:dynamicDump > > > > As you've renamed the command-line argument for dynamic dumping > > support, the comment needs to be fixed. > Fixed. > > > > 3125 // Compiler threads may concurrently update the class > > metadata (such as method entries), so it's > > 3126 // unsafe with DumpSharedSpaces (which modifies the class > > metadata in place). Let's disable > > 3127 // compiler just to be safe. > > 3128 // > > 3129 // Note: this is not a concern for DynamicDumpSharedSpaces, > > which makes a copy of the class metadata > > 3130 // instead of modifying them in place. The copy is > > inaccessible to the compiler.
> > 3131 set_mode_flags(_int); > > > > We need to come back to revisit the above for the 'static' archive > > dumping at some point. There is an RFE filed for that, if I remember > > correctly. Could you please add a 'TODO' note in the above comment. > Added TODO. > > > > A check should be done in arguments.cpp to make sure > > DynamicDumpSharedSpaces is not manipulated from the command-line > > directly. DynamicDumpSharedSpaces should not be enabled in the > > command-line without ArchiveClassesAtExit being specified. > Done. > > > > - src/hotspot/share/runtime/java.cpp > > > > 509 > > 510 // FIXME: is this the right place? > > 511 if (DynamicDumpSharedSpaces) { > > 512 DynamicArchive::dump(); > > 513 } > > > > Again, the above 'FIXME' serves as a cleanup reminder. Please get > > opinions from others on this change. If the calling place is okay, > > please remove the FIXME. > Removed the FIXME for now. Checked with David H. He indicated there's no > easy answer for this. Just need to do a lot of testing. > > - test > > > > Could you please add a test case for setting DynamicDumpSharedSpaces > > from the command line? > Here's an incremental webrev which contains a new test: > > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/ > > thanks, > Calvin > > > > I only took a brief look at the test changes. Please ask Misha to > > review the test changes as well. > > > > Thanks and regards, > > Jiangli Thanks, Jiangli From robbin.ehn at oracle.com Thu Apr 25 07:13:32 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 09:13:32 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> Message-ID: <5d9a43c1-44af-383c-70f8-3e77218d3813@oracle.com> Still good, thanks!
/Robbin On 4/24/19 3:12 PM, Robin Westberg wrote: > Hi David, > >> On 23 Apr 2019, at 03:39, David Holmes wrote: >> >> Hi Robin, >> >> Sorry, now Easter break got in the way :) >> >> >> >> On 17/04/2019 11:55 pm, Robin Westberg wrote: >>>> On 12 Apr 2019, at 11:15, David Holmes wrote: >>>> I'd prefer to fix a windows problem, just on windows. I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means). >>>> >>>> If we need the TimedYield capability in the future then let's revisit that then. >>> Sure, here's a lighter version of this change that changes the Windows implementation of naked_short_nanosleep, with a few adjustments to some assumptions in the waiting-for-safepoint backoff strategy. >>> Still passes tier1, with the same performance improvements on Windows (and no obvious regressions on Linux). >>> New webrev: >>> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ >> >> Windows changes look fine - thanks. >> >> Safepoint backoff change seems okay but what effect does it have on performance on non-Windows? (javaTimeNanos can sometimes be expensive) > > In the case where we have to wait for threads to stop it shouldn't matter, it will be insignificant compared to the time spent actually sleeping. But in the case where all threads manage to stop before we resort to waiting there is indeed an unnecessary call to javaTimeNanos - I have not observed any measurable difference, but I've reworked the code a little bit to avoid this just in case: > > Full webrev: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.03/ > Incremental: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02-03/ > > Best regards, > Robin > >> >> Thanks, >> David >> >>> Best regards, >>> Robin >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Best regards, >>>>> Robin >>>>>> >>>>>> ?
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>> >>>>>>> Best regards, >>>>>>> Robin >>>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>>>>>> >>>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>>>>> Hi David, >>>>>>>>>> Thanks for taking a look! >>>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Robin, >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used. >>>>>>>>>>> >>>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>>>>> >>>>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). 
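For illustration only: the waiting styles weighed in the quoted discussion above (pure spinning, spinning combined with yielding, and short sleeps) can be combined into a staged backoff. The following is a minimal Java-level sketch, not the HotSpot SpinYield or TimedYield code; the class name BackoffWaitDemo, the two thresholds, and the 10 microsecond park step are assumptions picked for the example.

```java
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

public class BackoffWaitDemo {
    // Waits until 'done' reports true or the timeout expires.
    // Stage 1 spins, stage 2 yields, stage 3 parks for short intervals.
    // All thresholds are arbitrary illustration values, not HotSpot's.
    static boolean waitFor(BooleanSupplier done, long timeoutNanos) {
        final long deadline = System.nanoTime() + timeoutNanos;
        int attempts = 0;
        while (!done.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;                      // gave up: timeout reached
            }
            if (attempts < 100) {
                Thread.onSpinWait();               // cheap busy-wait hint
            } else if (attempts < 200) {
                Thread.yield();                    // let other runnable threads in
            } else {
                LockSupport.parkNanos(10_000L);    // short timed sleep
            }
            attempts++;
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Condition becomes true roughly 2 ms after start.
        boolean ok = waitFor(() -> System.nanoTime() - start > 2_000_000L, 1_000_000_000L);
        System.out.println(ok);
    }
}
```

How long the final stage really sleeps depends on the platform timer granularity, which is the point raised about Windows in this thread.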
>>>>>>>> >>>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>>>>> >>>>>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>>>>> >>>>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that. :) >>>>>>>> >>>>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Robin >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Robin >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux.
>>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>>>>> Testing: tier1 >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Robin > From robbin.ehn at oracle.com Thu Apr 25 07:53:56 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 09:53:56 +0200 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> Message-ID: <9b063a96-1859-6280-7412-75d54c1a1fb6@oracle.com> Hi David, Looks good. Just a question: It seems like we could just hold the ThreadsList over p->unpark() and not rely on TSM? Not sure in how many places we do rely on it, but it would be nice to remove TSM for parkers. The exiting thread would set parker to NULL before removing itself from the threadslist and free it when it's off. Thanks, Robbin On 4/24/19 9:12 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 > webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ > > The original implementation of Unsafe.unpark simply extracted the JavaThread > reference from the java.lang.Thread oop and if non-null extracted the Parker > instance from it and invoked unpark. This was racy however as the native > JavaThread could terminate at any time and deallocate the Parker. > > That logic was fixed by JDK-6271298 which used a combination of > type-stable-memory "event" objects for the Parker, along with use of the > Threads_lock to obtain the initial reference to the Parker (from a JavaThread > guaranteed to be alive), together with caching the native Parker pointer in a > field of java.lang.Thread. Even though the native thread may have terminated the > Parker was still valid (even if associated with a different thread) and the > unpark at worst was a spurious wakeup for that other thread.
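As an aside on why a stray unpark is tolerable at the Java level: callers of park are expected to re-check their condition in a loop, since park may return without the intended wakeup. Below is a minimal sketch of that usage pattern with java.util.concurrent.locks.LockSupport. It says nothing about the VM-internal Parker code discussed here, and the class and method names (ParkLoopDemo, awaitReady, demo) are made up for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

public class ParkLoopDemo {
    // Blocks until 'ready' becomes true. park() may return without a matching
    // unpark (or after a stray one), so the condition is re-checked in a loop.
    static void awaitReady(AtomicBoolean ready) {
        while (!ready.get()) {
            LockSupport.park(ready);
        }
    }

    // Starts a waiter, sends it one stray wakeup and then the intended one.
    // Returns true if the waiter only proceeded once 'ready' was set.
    static boolean demo() {
        AtomicBoolean ready = new AtomicBoolean(false);
        AtomicBoolean sawReady = new AtomicBoolean(false);
        Thread waiter = new Thread(() -> {
            awaitReady(ready);
            sawReady.set(ready.get());
        });
        waiter.start();
        LockSupport.unpark(waiter);   // stray wakeup: condition still false
        ready.set(true);
        LockSupport.unpark(waiter);   // intended wakeup
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawReady.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the waiter loops on its own condition, an extra permit delivered to the wrong thread costs at most one extra pass through the loop.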
> > When JDK-8167108 introduced Thread Safe-Memory-Reclamation (SMR) the logic was > updated to always use the safe mechanism - we grab a ThreadsListHandle then > check the cached field, else look up the native thread to see if it is alive and > locate the Parker instance that way. > With SMR the caching of the Parker pointer no longer serves any purpose - we no > longer have a lock-free use-the-cache path versus a lock-using > populate-the-cache path. With SMR we've already "paid" for the ability to ensure > the native thread can't terminate regardless of whether we look up the field from > the java.lang.Thread or the JavaThread. So we can simplify the code and save a > little footprint by removing the cache from java.lang.Thread: > > /* > * JVM-private state that persists after native thread termination. > */ > private long nativeParkEventPointer; > > and the supporting code from unsafe.cpp and javaClass.*pp in the JVM. > > I considered restoring the fast-path use of the cache without recourse to > Thread-SMR but performance measurements failed to show any benefit in doing so. See > bug report for details. > > Thanks, > David From sgehwolf at redhat.com Thu Apr 25 08:17:19 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 25 Apr 2019 10:17:19 +0200 Subject: RFR(S): 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> References: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> Message-ID: Hi Misha, Thanks for fixing this! On Wed, 2019-04-24 at 13:06 -0700, mikhailo.seledtsov at oracle.com wrote: > Please review this change that makes a test more robust. The test > originally relied on the fact that JAVA_MAIN_CLASS variable > is always present when running jtreg tests. This assumption is wrong, > hence I reworked the test to define its own unique > environment variable.
> > In order to introduce the environment variable I reworked the > DockerTestUtils a little bit, by splitting the > dockerRunJava() method into 2: buildJavaCommand() to build the command; > dockerRunJava() uses buildJavaCommand() > then runs the command. This allowed me to introduce the test environment > variable to the child process. > The behavior of dockerRunJava() should not be affected by this change, > it is a simple split. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ > Testing: > 1. Ran hotspot docker tests on Linux-x64 machine with docker > engine configured. > Ran both via jtreg directly and via make > All PASS
This doesn't seem right to me:

     private static void testEnvironmentVariables() throws Exception {
         Common.logNewTestCase("EnvironmentVariables");

-        DockerTestUtils.dockerRunJava(
+        List<String> cmd = DockerTestUtils.buildJavaCommand(
             commonDockerOpts()
-                .addClassOptions("jdk.InitialEnvironmentVariable"))
+                .addClassOptions("jdk.InitialEnvironmentVariable"));
+
+        ProcessBuilder pb = new ProcessBuilder(cmd);
+
+        // Container has JAVA_HOME defined via the Dockerfile; make sure
+        // it is reported by JFR event.
+        // Environment variable set in host system should not be visible inside a container,
+        // and should not be reported by JFR.
+        pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE);
+        DockerTestUtils.execute(cmd)
             .shouldHaveExitValue(0)
             .shouldContain("key = JAVA_HOME")
-            .shouldNotContain(getTestEnvironmentVariable());
+            .shouldContain("value = /jdk")
+            .shouldNotContain(TEST_ENV_VARIABLE)
+            .shouldNotContain(TEST_ENV_VALUE);

So you are creating a new ProcessBuilder instance and adding the environment variable, but then never starting it? This means that not even the docker run cmd will have the environment, let alone the JVM inside docker. Note that DockerTestUtils.execute() has its own version of ProcessBuilder.
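The underlying behaviour can be shown without the test library: each ProcessBuilder carries its own copy of the environment map, so a variable put into one builder is not seen by another builder created from the same command list. A minimal sketch using only java.lang.ProcessBuilder follows; the class name EnvDemo and the variable name are hypothetical, no process is started, and DockerTestUtils itself is not modeled.

```java
import java.util.List;

public class EnvDemo {
    // Hypothetical variable name, used only for this illustration.
    static final String NAME = "DEMO_ONLY_VARIABLE_8222888";

    // Returns true if a variable put into one builder's environment is
    // visible in a second builder created from the same command list.
    static boolean visibleInSecondBuilder(List<String> cmd) {
        ProcessBuilder first = new ProcessBuilder(cmd);
        first.environment().put(NAME, "demo-value");      // 'first' is never started

        ProcessBuilder second = new ProcessBuilder(cmd);   // what a helper with its own builder does
        return second.environment().containsKey(NAME);
    }

    public static void main(String[] args) {
        // Expected to print false unless the host environment already defines NAME.
        System.out.println(visibleInSecondBuilder(List.of("java", "-version")));
    }
}
```

So for the variable to reach the child process, it has to be set on the builder that is actually started.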
If I'm reading this right, that should be the function that needs to conditionally add the environment. Am I missing something? Thanks, Severin From david.holmes at oracle.com Thu Apr 25 08:44:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Apr 2019 18:44:55 +1000 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: <5d9a43c1-44af-383c-70f8-3e77218d3813@oracle.com> References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> <5d9a43c1-44af-383c-70f8-3e77218d3813@oracle.com> Message-ID: +1 Thanks, David On 25/04/2019 5:13 pm, Robbin Ehn wrote: > Still good, thanks! > > /Robbin > > On 4/24/19 3:12 PM, Robin Westberg wrote: >> Hi David, >> >>> On 23 Apr 2019, at 03:39, David Holmes wrote: >>> >>> Hi Robin, >>> >>> Sorry, now Easter break got in the way :) >>> >>> >>> >>> On 17/04/2019 11:55 pm, Robin Westberg wrote: >>>>> On 12 Apr 2019, at 11:15, David Holmes >>>>> wrote: >>>>> I'd prefer to fix a windows problem, just on windows. I'm not hung >>>>> up on having sleep in the name, but if you prefer timed_yield to >>>>> naked_short_nanosleep then that's fine (and avoids people wondering >>>>> what the "naked" part means). >>>>> >>>>> If we need the TimedYield capability in the future then let's >>>>> revisit that then. >>>> Sure, here's a lighter version of this change that changes the >>>> Windows implementation of naked_short_nanosleep, with a few >>>> adjustments to some assumptions in the waiting-for-safepoint backoff >>>> strategy. >>>> Still passes tier1, with the same performance improvements on >>>> Windows (and no obvious regressions on Linux). >>>> New webrev: >>>> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ >>> >>> Windows changes look fine - thanks.
>>> >>> Safepoint backoff change seems okay but what effect does it have on >>> performance on non-Windows? (javaTimeNanos can sometimes be expensive) >> >> In the case where we have to wait for threads to stop it shouldn't >> matter, it will be insignificant compared to the time spent actually >> sleeping. But in the case where all threads manage to stop before we >> resort to waiting there is indeed an unnecessary call to javaTimeNanos >> - I have not observed any measurable difference, but I've reworked the >> code a little bit to avoid this just in case: >> >> Full webrev: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.03/ >> Incremental: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02-03/ >> >> Best regards, >> Robin >> >>> >>> Thanks, >>> David >>> >>>> Best regards, >>>> Robin >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Best regards, >>>>>> Robin >>>>>>> >>>>>>> ? >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> >>>>>>>> Best regards, >>>>>>>> Robin >>>>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>>> On 5 Apr 2019, at 12:10, David Holmes >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> Thanks for taking a look! >>>>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Robin, >>>>>>>>>>>> >>>>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> Please review the following change that modifies the way >>>>>>>>>>>>> safepointing waits for other threads to stop. As part of >>>>>>>>>>>>> JDK-8203469, os::naked_short_nanosleep was used for all >>>>>>>>>>>>> platforms. However, on Windows, this function has >>>>>>>>>>>>> millisecond granularity which is too coarse and caused >>>>>>>>>>>>> performance regressions.
Instead, we can take advantage of >>>>>>>>>>>>> the fact that Windows can tell us if anyone else is waiting >>>>>>>>>>>>> to run on the current cpu. For other platforms the original >>>>>>>>>>>>> waiting is used. >>>>>>>>>>>> >>>>>>>>>>>> Can't you just make the new code the implementation of >>>>>>>>>>>> os::naked_short_nanosleep on Windows and avoid adding yet >>>>>>>>>>>> another sleep/yield abstraction? If Windows nanosleep only >>>>>>>>>>>> has millisecond granularity then it's broken. >>>>>>>>>>> Right, I considered that approach, but it's not obvious to me >>>>>>>>>>> what semantics would be appropriate for that. Depending on >>>>>>>>>>> whether you want an accurate sleep or want other threads to >>>>>>>>>>> make progress, it could conceivably be either implemented as >>>>>>>>>>> pure spinning, or spinning combined with yielding (either >>>>>>>>>>> cpu-local or Sleep(0)-style). >>>>>>>>>>> That said, perhaps os_naked_short_nanosleep should be >>>>>>>>>>> removed, the only remaining usage is in spinYield, which >>>>>>>>>>> turns to sleeping after giving up on yielding. For windows, >>>>>>>>>>> it may be more appropriate to switch to Sleep(0) in that case. >>>>>>>>>> >>>>>>>>>> I definitely don't think we need to introduce another >>>>>>>>>> abstraction (though I'm okay with replacing one with a new one). >>>>>>>>> >>>>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As >>>>>>>>> I see it, you would use SpinYield when waiting for something >>>>>>>>> like a CAS race, and TimedYield when waiting for a thread >>>>>>>>> rendezvous. >>>>>>>>> >>>>>>>>> I did try to make TimedYield into a different "policy" for the >>>>>>>>> existing SpinYield class at first, but didn't quite feel like I >>>>>>>>> found a nice API for it. So perhaps it's fine to just have them >>>>>>>>> as separate classes.
>>>>>>>>> >>>>>>>>>> It was probably discussed at the time but the >>>>>>>>>> naked_short_nanosleep on Windows definitely seems to have far >>>>>>>>>> too much overhead - even having to create a new waitable-timer >>>>>>>>>> each time. Do we know if the overhead is actually the >>>>>>>>>> resolution of the timer or whether it's all the extra work >>>>>>>>>> that method needs to do? I should try to dig up the email on >>>>>>>>>> that. :) >>>>>>>>> >>>>>>>>> Yes, as far as I know the best possible sleep resolution you >>>>>>>>> can obtain on Windows is one millisecond (perhaps 0.5ms in >>>>>>>>> theory if you go directly for NtSetTimerResolution). The >>>>>>>>> SetWaitableTimer approach is the best though to actually obtain >>>>>>>>> 1ms resolution, plain Sleep(1) usually sleeps around 2ms. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Robin >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Robin >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> ----- >>>>>>>>>>>> >>>>>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better >>>>>>>>>>>>> numbers for Windows and no regressions for Linux. 
>>>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>>>>>> Webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>>>>>> Testing: tier1 >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Robin >> From robbin.ehn at oracle.com Thu Apr 25 08:53:57 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 10:53:57 +0200 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> Message-ID: Hi, The same patch as in 8222640 but with obsoleting of the flag also. Issue: https://bugs.openjdk.java.net/browse/JDK-8222637 CSR: https://bugs.openjdk.java.net/browse/JDK-8222639 The incremental change is thus: http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html Full: http://cr.openjdk.java.net/~rehn/8222637/webrev/ Dead and Coleen had previously review 8222640, so if they can acknowledge this inc change. Thanks, Robbin On 4/24/19 1:49 AM, Robbin Ehn wrote: > Thanks Coleen! > > /Robbin > > On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >> +1 This looks good! >> Coleen >> >> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>> Thanks Dean! >>> >>> /Robbin >>> >>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>> Yes, looks good! >>>> >>>> dl >>>> >>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>> Hi Dean, >>>>> >>>>> Is this what you had in mind: >>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>> --- a/src/hotspot/share/runtime/frame.cpp
Tue Apr 23 09:58:55 2019 +0200 >>>>> +++ b/src/hotspot/share/runtime/frame.cpp Tue Apr 23 21:32:00 2019 +0200 >>>>> @@ -272,4 +272,6 @@ >>>>> >>>>> void frame::deoptimize(JavaThread* thread) { >>>>> +  assert(thread->frame_anchor()->has_last_Java_frame() && >>>>> +         thread->frame_anchor()->walkable(), "must be"); >>>>>   // Schedule deoptimization of an nmethod activation with this frame. >>>>>   assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>> >>>>> Passes t1-5. >>>>> >>>>> v2: >>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>> Inc: >>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>> >>>>> Thanks, Robbin >>>>> >>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>> In frame::deoptimize(), can we assert that we have an anchor frame and >>>>>> that it is walkable? >>>>>> >>>>>> dl >>>>>> >>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>> Adding compiler. >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>> Hi all, please consider this change. >>>>>>>> >>>>>>>> The code for deopt suspend is no longer needed since today the register >>>>>>>> window >>>>>>>> is always flushed when this code executes. Exactly when this code was >>>>>>>> needed is not clear, entered via duke changeset 1. I did not dig since >>>>>>>> we no longer have such use case. >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>> Issue: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>> >>>>>>>> Passes t1-5.
>>>>>>>> >>>>>>>> Thanks, Robbin >>>>>> >>>> >> From robin.westberg at oracle.com Thu Apr 25 10:01:24 2019 From: robin.westberg at oracle.com (Robin Westberg) Date: Thu, 25 Apr 2019 12:01:24 +0200 Subject: RFR: 8220795: Introduce TimedYield utility to improve time-to-safepoint on Windows In-Reply-To: References: <6F8DA727-95CE-466F-B3D2-20E4A3484533@oracle.com> <7611b6a2-2f85-cee4-234c-fbefbb04e8de@oracle.com> <8aa76f8d-9717-77c4-6a80-d03f1db02cd2@oracle.com> <94c00e4b-cbda-b9b1-2db3-94a4e15f1e73@oracle.com> <5d9a43c1-44af-383c-70f8-3e77218d3813@oracle.com> Message-ID: <588E5CB0-6B84-4854-82D1-0FA4469A0E96@oracle.com> Hi David and Robbin, Thanks again for the reviews! Best regards, Robin > On 25 Apr 2019, at 10:44, David Holmes wrote: > > +1 > > Thanks, > David > > On 25/04/2019 5:13 pm, Robbin Ehn wrote: >> Still good, thanks! >> /Robbin >> On 4/24/19 3:12 PM, Robin Westberg wrote: >>> Hi David, >>> >>>> On 23 Apr 2019, at 03:39, David Holmes wrote: >>>> >>>> Hi Robin, >>>> >>>> Sorry, now Easter break got in the way :) >>>> >>>> >>>> >>>> On 17/04/2019 11:55 pm, Robin Westberg wrote: >>>>>> On 12 Apr 2019, at 11:15, David Holmes wrote: >>>>>> I'd prefer to fix a windows problem, just on windows. I'm not hung up on having sleep in the name, but if you prefer timed_yield to naked_short_nanosleep then that's fine (and avoids people wondering what the "naked" part means). >>>>>> >>>>>> If we need the TimedYield capability in the future then let's revisit that then. >>>>> Sure, here's a lighter version of this change that changes the Windows implementation of naked_short_nanosleep, with a few adjustments to some assumptions in the waiting-for-safepoint backoff strategy. >>>>> Still passes tier1, with the same performance improvements on Windows (and no obvious regressions on Linux). >>>>> New webrev: >>>>> https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02/ >>>> >>>> Windows changes look fine - thanks.
>>>> >>>> Safepoint backoff change seems okay but what effect does it have on performance on non-Windows? (javaTimeNanos can sometimes be expensive) >>> >>> In the case where we have to wait for threads to stop it shouldn't matter, it will be insignificant compared to the time spent actually sleeping. But in the case where all threads manage to stop before we resort to waiting there is indeed an unnecessary call to javaTimeNanos - I have not observed any measurable difference, but I've reworked the code a little bit to avoid this just in case: >>> >>> Full webrev: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.03/ >>> Incremental: https://cr.openjdk.java.net/~rwestberg/8220795/webrev.02-03/ >>> >>> Best regards, >>> Robin >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Best regards, >>>>> Robin >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Best regards, >>>>>>> Robin >>>>>>>> >>>>>>>> ? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Robin >>>>>>>>>> On 5 Apr 2019, at 13:54, Robin Westberg wrote: >>>>>>>>>> >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>>> On 5 Apr 2019, at 12:10, David Holmes wrote: >>>>>>>>>>> >>>>>>>>>>> On 5/04/2019 7:53 pm, Robin Westberg wrote: >>>>>>>>>>>> Hi David, >>>>>>>>>>>> Thanks for taking a look! >>>>>>>>>>>>> On 5 Apr 2019, at 10:49, David Holmes wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Robin, >>>>>>>>>>>>> >>>>>>>>>>>>> On 5/04/2019 6:05 pm, Robin Westberg wrote: >>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>> Please review the following change that modifies the way safepointing waits for other threads to stop. As part of JDK-8203469, os::naked_short_nanosleep was used for all platforms. However, on Windows, this function has millisecond granularity which is too coarse and caused performance regressions. Instead, we can take advantage of the fact that Windows can tell us if anyone else is waiting to run on the current cpu. For other platforms the original waiting is used.
>>>>>>>>>>>>> >>>>>>>>>>>>> Can't you just make the new code the implementation of os::naked_short_nanosleep on Windows and avoid adding yet another sleep/yield abstraction? If Windows nanosleep only has millisecond granularity then it's broken. >>>>>>>>>>>> Right, I considered that approach, but it's not obvious to me what semantics would be appropriate for that. Depending on whether you want an accurate sleep or want other threads to make progress, it could conceivably be either implemented as pure spinning, or spinning combined with yielding (either cpu-local or Sleep(0)-style). >>>>>>>>>>>> That said, perhaps os_naked_short_nanosleep should be removed, the only remaining usage is in spinYield, which turns to sleeping after giving up on yielding. For windows, it may be more appropriate to switch to Sleep(0) in that case. >>>>>>>>>>> >>>>>>>>>>> I definitely don't think we need to introduce another abstraction (though I'm okay with replacing one with a new one). >>>>>>>>>> >>>>>>>>>> Okay, then I'd prefer to drop the os::naked_short_nanosleep. As I see it, you would use SpinYield when waiting for something like a CAS race, and TimedYield when waiting for a thread rendezvous. >>>>>>>>>> >>>>>>>>>> I did try to make TimedYield into a different "policy" for the existing SpinYield class at first, but didn't quite feel like I found a nice API for it. So perhaps it's fine to just have them as separate classes. >>>>>>>>>> >>>>>>>>>>> It was probably discussed at the time but the naked_short_nanosleep on Windows definitely seems to have far too much overhead - even having to create a new waitable-timer each time. Do we know if the overhead is actually the resolution of the timer or whether it's all the extra work that method needs to do? I should try to dig up the email on that.
:) >>>>>>>>>> >>>>>>>>>> Yes, as far as I know the best possible sleep resolution you can obtain on Windows is one millisecond (perhaps 0.5ms in theory if you go directly for NtSetTimerResolution). The SetWaitableTimer approach is the best though to actually obtain 1ms resolution, plain Sleep(1) usually sleeps around 2ms. >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Robin >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Robin >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> David >>>>>>>>>>>>> ----- >>>>>>>>>>>>> >>>>>>>>>>>>>> Various benchmarks (specjbb2015, specjm2008) show better numbers for Windows and no regressions for Linux. >>>>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8220795 >>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rwestberg/8220795/webrev.00/ >>>>>>>>>>>>>> Testing: tier1 >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Robin >>> From david.holmes at oracle.com Thu Apr 25 10:48:45 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Apr 2019 20:48:45 +1000 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> Message-ID: <9d09a876-a520-6218-a781-d711cdf58bc1@oracle.com> Looks good Robbin! Nice to see things simplified. Thanks, David On 25/04/2019 6:53 pm, Robbin Ehn wrote: > Hi, > > The same patch as in 8222640 but with obsoleting of the flag also. 
> > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222637 > CSR: > https://bugs.openjdk.java.net/browse/JDK-8222639 > > The incremental change is thus: > http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html > > http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html > > > Full: > http://cr.openjdk.java.net/~rehn/8222637/webrev/ > > Dead and Coleen had previously review 8222640, so if they can > acknowledge this inc change. > > Thanks, Robbin > > On 4/24/19 1:49 AM, Robbin Ehn wrote: >> Thanks Coleen! >> >> /Robbin >> >> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>> +1? This looks good! >>> Coleen >>> >>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>> Thanks Dean! >>>> >>>> /Robbin >>>> >>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>> Yes, looks good! >>>>> >>>>> dl >>>>> >>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>> Hi Dean, >>>>>> >>>>>> Is this what you had in mind: >>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>> 09:58:55 2019 +0200 >>>>>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>> 21:32:00 2019 +0200 >>>>>> @@ -272,4 +272,6 @@ >>>>>> >>>>>> ?void frame::deoptimize(JavaThread* thread) { >>>>>> +? assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>> +???????? thread->frame_anchor()->walkable(), "must be"); >>>>>> ?? // Schedule deoptimization of an nmethod activation with this >>>>>> frame. >>>>>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>> >>>>>> Passes t1-5. 
>>>>>> >>>>>> v2: >>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>> Inc: >>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>> In frame::deoptimize(), can we assert that we have an anchor >>>>>>> frame and that it is walkable? >>>>>>> >>>>>>> dl >>>>>>> >>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>> Adding compiler. >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>> Hi all, please consider this change. >>>>>>>>> >>>>>>>>> The code for deopt suspend is no longer needed since today the >>>>>>>>> register window >>>>>>>>> is always flushed when this code executes. Exactly when this >>>>>>>>> code was needed is not clear, entered via duke changeset 1. I >>>>>>>>> did not dig since we no longer have such use case. >>>>>>>>> >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>> Issue: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>> >>>>>>>>> Passes t1-5. >>>>>>>>> >>>>>>>>> Thanks, Robbin >>>>>>> >>>>> >>> From robbin.ehn at oracle.com Thu Apr 25 12:05:28 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 14:05:28 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes Message-ID: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Hi all, please review. Let's deopt with handshakes. Removed VM op Deoptimize, instead we handshake. Locks needs to be inflate since we are not in a safepoint. Goes on top of: https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html Code: http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html Issue: https://bugs.openjdk.java.net/browse/JDK-8221734 Passes t1-7 and multiple t1-5 runs. A few startup benchmark see a small speedup. 
Thanks, Robbin From robbin.ehn at oracle.com Thu Apr 25 12:07:24 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 14:07:24 +0200 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: <9d09a876-a520-6218-a781-d711cdf58bc1@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> <9d09a876-a520-6218-a781-d711cdf58bc1@oracle.com> Message-ID: <2b64144a-62bb-bdb6-77d2-196bb2936668@oracle.com> Thanks Coleen! /Robbin Ops, s/Dead/Dean/ , sorry :) On 4/25/19 12:48 PM, wrote: > Looks good Robbin! > > Nice to see things simplified. > > Thanks, > David > > On 25/04/2019 6:53 pm, Robbin Ehn wrote: >> Hi, >> >> The same patch as in 8222640 but with obsoleting of the flag also. >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222637 >> CSR: >> https://bugs.openjdk.java.net/browse/JDK-8222639 >> >> The incremental change is thus: >> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html >> >> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html >> >> >> Full: >> http://cr.openjdk.java.net/~rehn/8222637/webrev/ >> >> Dead and Coleen had previously review 8222640, so if they can acknowledge this >> inc change. >> >> Thanks, Robbin >> >> On 4/24/19 1:49 AM, Robbin Ehn wrote: >>> Thanks Coleen! >>> >>> /Robbin >>> >>> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>>> +1? This looks good! >>>> Coleen >>>> >>>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>>> Thanks Dean! >>>>> >>>>> /Robbin >>>>> >>>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>>> Yes, looks good! 
>>>>>> >>>>>> dl >>>>>> >>>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>>> Hi Dean, >>>>>>> >>>>>>> Is this what you had in mind: >>>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 09:58:55 2019 >>>>>>> +0200 >>>>>>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 21:32:00 2019 >>>>>>> +0200 >>>>>>> @@ -272,4 +272,6 @@ >>>>>>> >>>>>>> ?void frame::deoptimize(JavaThread* thread) { >>>>>>> +? assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>>> +???????? thread->frame_anchor()->walkable(), "must be"); >>>>>>> ?? // Schedule deoptimization of an nmethod activation with this frame. >>>>>>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>>> >>>>>>> Passes t1-5. >>>>>>> >>>>>>> v2: >>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>>> Inc: >>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>>> In frame::deoptimize(), can we assert that we have an anchor frame and >>>>>>>> that it is walkable? >>>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>>> Adding compiler. >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>>> Hi all, please consider this change. >>>>>>>>>> >>>>>>>>>> The code for deopt suspend is no longer needed since today the >>>>>>>>>> register window >>>>>>>>>> is always flushed when this code executes. Exactly when this code was >>>>>>>>>> needed is not clear, entered via duke changeset 1. I did not dig since >>>>>>>>>> we no longer have such use case. >>>>>>>>>> >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>>> Issue: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>>> >>>>>>>>>> Passes t1-5. 
>>>>>>>>>> >>>>>>>>>> Thanks, Robbin >>>>>>>> >>>>>> >>>> From coleen.phillimore at oracle.com Thu Apr 25 12:10:45 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 25 Apr 2019 08:10:45 -0400 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: <2b64144a-62bb-bdb6-77d2-196bb2936668@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> <9d09a876-a520-6218-a781-d711cdf58bc1@oracle.com> <2b64144a-62bb-bdb6-77d2-196bb2936668@oracle.com> Message-ID: :) Looks awesome, Robbin! Thanks for fixing this! Coleen, not Dean or David On 4/25/19 8:07 AM, Robbin Ehn wrote: > Thanks Coleen! > > /Robbin > > Ops, s/Dead/Dean/ , sorry :) > > On 4/25/19 12:48 PM, wrote: >> Looks good Robbin! >> >> Nice to see things simplified. >> >> Thanks, >> David >> >> On 25/04/2019 6:53 pm, Robbin Ehn wrote: >>> Hi, >>> >>> The same patch as in 8222640 but with obsoleting of the flag also. >>> >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8222637 >>> CSR: >>> https://bugs.openjdk.java.net/browse/JDK-8222639 >>> >>> The incremental change is thus: >>> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html >>> >>> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html >>> >>> >>> Full: >>> http://cr.openjdk.java.net/~rehn/8222637/webrev/ >>> >>> Dead and Coleen had previously review 8222640, so if they can >>> acknowledge this inc change. >>> >>> Thanks, Robbin >>> >>> On 4/24/19 1:49 AM, Robbin Ehn wrote: >>>> Thanks Coleen! >>>> >>>> /Robbin >>>> >>>> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>>>> +1 This looks good!
>>>>> Coleen >>>>> >>>>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>>>> Thanks Dean! >>>>>> >>>>>> /Robbin >>>>>> >>>>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>>>> Yes, looks good! >>>>>>> >>>>>>> dl >>>>>>> >>>>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>>>> Hi Dean, >>>>>>>> >>>>>>>> Is this what you had in mind: >>>>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>>>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>>>> 09:58:55 2019 +0200 >>>>>>>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>>>> 21:32:00 2019 +0200 >>>>>>>> @@ -272,4 +272,6 @@ >>>>>>>> >>>>>>>> ?void frame::deoptimize(JavaThread* thread) { >>>>>>>> + assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>>>> +???????? thread->frame_anchor()->walkable(), "must be"); >>>>>>>> ?? // Schedule deoptimization of an nmethod activation with >>>>>>>> this frame. >>>>>>>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>>>> >>>>>>>> Passes t1-5. >>>>>>>> >>>>>>>> v2: >>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>>>> Inc: >>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>>>> In frame::deoptimize(), can we assert that we have an anchor >>>>>>>>> frame and that it is walkable? >>>>>>>>> >>>>>>>>> dl >>>>>>>>> >>>>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>>>> Adding compiler. >>>>>>>>>> >>>>>>>>>> /Robbin >>>>>>>>>> >>>>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>>>> Hi all, please consider this change. >>>>>>>>>>> >>>>>>>>>>> The code for deopt suspend is no longer needed since today >>>>>>>>>>> the register window >>>>>>>>>>> is always flushed when this code executes. Exactly when this >>>>>>>>>>> code was needed is not clear, entered via duke changeset 1. >>>>>>>>>>> I did not dig since we no longer have such use case. 
>>>>>>>>>>> >>>>>>>>>>> Webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>>>> Issue: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>>>> >>>>>>>>>>> Passes t1-5. >>>>>>>>>>> >>>>>>>>>>> Thanks, Robbin >>>>>>>>> >>>>>>> >>>>> From robbin.ehn at oracle.com Thu Apr 25 12:13:04 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 25 Apr 2019 14:13:04 +0200 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> <9d09a876-a520-6218-a781-d711cdf58bc1@oracle.com> <2b64144a-62bb-bdb6-77d2-196bb2936668@oracle.com> Message-ID: <2f3b9d8c-e45d-9e14-c83f-ae70b72e597f@oracle.com> Thanks Coleen! On 4/25/19 2:10 PM, coleen.phillimore at oracle.com wrote: > > :)? Looks awesome, Robbin! > Thanks for fixing this! > Coleen, not Dean or David Ah, not my day... > > On 4/25/19 8:07 AM, Robbin Ehn wrote: >> Thanks Coleen! s/Coleen/David :) Thanks for helping with CSR David! /Robbin >> >> /Robbin >> >> Ops, s/Dead/Dean/ , sorry :) >> >> On 4/25/19 12:48 PM,? wrote: >>> Looks good Robbin! >>> >>> Nice to see things simplified. >>> >>> Thanks, >>> David >>> >>> On 25/04/2019 6:53 pm, Robbin Ehn wrote: >>>> Hi, >>>> >>>> The same patch as in 8222640 but with obsoleting of the flag also. 
>>>> >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8222637 >>>> CSR: >>>> https://bugs.openjdk.java.net/browse/JDK-8222639 >>>> >>>> The incremental change is thus: >>>> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html >>>> >>>> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html >>>> >>>> >>>> Full: >>>> http://cr.openjdk.java.net/~rehn/8222637/webrev/ >>>> >>>> Dead and Coleen had previously review 8222640, so if they can acknowledge >>>> this inc change. >>>> >>>> Thanks, Robbin >>>> >>>> On 4/24/19 1:49 AM, Robbin Ehn wrote: >>>>> Thanks Coleen! >>>>> >>>>> /Robbin >>>>> >>>>> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>>>>> +1? This looks good! >>>>>> Coleen >>>>>> >>>>>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>>>>> Thanks Dean! >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>>>>> Yes, looks good! >>>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>>>>> Hi Dean, >>>>>>>>> >>>>>>>>> Is this what you had in mind: >>>>>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>>>>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 09:58:55 >>>>>>>>> 2019 +0200 >>>>>>>>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 21:32:00 >>>>>>>>> 2019 +0200 >>>>>>>>> @@ -272,4 +272,6 @@ >>>>>>>>> >>>>>>>>> ?void frame::deoptimize(JavaThread* thread) { >>>>>>>>> + assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>>>>> +???????? thread->frame_anchor()->walkable(), "must be"); >>>>>>>>> ?? // Schedule deoptimization of an nmethod activation with this frame. >>>>>>>>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>>>>> >>>>>>>>> Passes t1-5. 
>>>>>>>>> >>>>>>>>> v2: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>>>>> Inc: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>>>>> >>>>>>>>> Thanks, Robbin >>>>>>>>> >>>>>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>>>>> In frame::deoptimize(), can we assert that we have an anchor frame and >>>>>>>>>> that it is walkable? >>>>>>>>>> >>>>>>>>>> dl >>>>>>>>>> >>>>>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>>>>> Adding compiler. >>>>>>>>>>> >>>>>>>>>>> /Robbin >>>>>>>>>>> >>>>>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>>>>> Hi all, please consider this change. >>>>>>>>>>>> >>>>>>>>>>>> The code for deopt suspend is no longer needed since today the >>>>>>>>>>>> register window >>>>>>>>>>>> is always flushed when this code executes. Exactly when this code >>>>>>>>>>>> was needed is not clear, entered via duke changeset 1. I did not dig >>>>>>>>>>>> since we no longer have such use case. >>>>>>>>>>>> >>>>>>>>>>>> Webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>>>>> Issue: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>>>>> >>>>>>>>>>>> Passes t1-5. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, Robbin >>>>>>>>>> >>>>>>>> >>>>>> > From dean.long at oracle.com Thu Apr 25 14:49:57 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 25 Apr 2019 07:49:57 -0700 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> Message-ID: <4296a391-9bae-daa7-2190-4d28acaa1074@oracle.com> Looks good. 
dl On 4/25/19 1:53 AM, Robbin Ehn wrote: > Hi, > > The same patch as in 8222640 but with obsoleting of the flag also. > > Issue: > https://bugs.openjdk.java.net/browse/JDK-8222637 > CSR: > https://bugs.openjdk.java.net/browse/JDK-8222639 > > The incremental change is thus: > http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html > > http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html > > > Full: > http://cr.openjdk.java.net/~rehn/8222637/webrev/ > > Dead and Coleen had previously review 8222640, so if they can > acknowledge this inc change. > > Thanks, Robbin > > On 4/24/19 1:49 AM, Robbin Ehn wrote: >> Thanks Coleen! >> >> /Robbin >> >> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>> +1? This looks good! >>> Coleen >>> >>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>> Thanks Dean! >>>> >>>> /Robbin >>>> >>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>> Yes, looks good! >>>>> >>>>> dl >>>>> >>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>> Hi Dean, >>>>>> >>>>>> Is this what you had in mind: >>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>> --- a/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>> 09:58:55 2019 +0200 >>>>>> +++ b/src/hotspot/share/runtime/frame.cpp?????? Tue Apr 23 >>>>>> 21:32:00 2019 +0200 >>>>>> @@ -272,4 +272,6 @@ >>>>>> >>>>>> ?void frame::deoptimize(JavaThread* thread) { >>>>>> + assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>> +???????? thread->frame_anchor()->walkable(), "must be"); >>>>>> ?? // Schedule deoptimization of an nmethod activation with this >>>>>> frame. >>>>>> ?? assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>> >>>>>> Passes t1-5. 
>>>>>> >>>>>> v2: >>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>> Inc: >>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>> In frame::deoptimize(), can we assert that we have an anchor >>>>>>> frame and that it is walkable? >>>>>>> >>>>>>> dl >>>>>>> >>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>> Adding compiler. >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>> Hi all, please consider this change. >>>>>>>>> >>>>>>>>> The code for deopt suspend is no longer needed since today the >>>>>>>>> register window >>>>>>>>> is always flushed when this code executes. Exactly when this >>>>>>>>> code was needed is not clear, entered via duke changeset 1. I >>>>>>>>> did not dig since we no longer have such use case. >>>>>>>>> >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>> Issue: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>> >>>>>>>>> Passes t1-5. >>>>>>>>> >>>>>>>>> Thanks, Robbin >>>>>>> >>>>> >>> From mikhailo.seledtsov at oracle.com Thu Apr 25 15:27:07 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 25 Apr 2019 08:27:07 -0700 Subject: RFR(S): 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: References: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> Message-ID: <08b09cee-f436-750a-fef5-0d575992f14d@oracle.com> Hi Severin, Good catch. What I really meant to do is this: pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE); OutputAnalyzer out = new OutputAnalyzer(pb.start()) // <<==== start the process for new process builder with environment var defined .shouldHaveExitValue(0) .shouldContain("key = JAVA_HOME")
.shouldContain("value = /jdk") It was an unfortunate copy/paste error, thanks for catching it. I will run tests, and update a webrev shortly. Thank you, Misha On 4/25/19 1:17 AM, Severin Gehwolf wrote: > Hi Misha, > > Thanks for fixing this! > > On Wed, 2019-04-24 at 13:06 -0700, mikhailo.seledtsov at oracle.com wrote: >> Please review this change that makes a test more robust. The test >> originally relied on the fact that JAVA_MAIN_CLASS variable >> is always present when running jtreg tests. This assumption is wrong, >> hence I reworked the test to define its own unique >> environment variable. >> >> In order to introduce environment variable I reworked the >> DockerTestUtils a little bit, by splitting the >> dockerRunJava() method into 2: buildJavaCommand() to build the command; >> dockerRunJava() uses buildJavaCommand() >> then runs the command. This allowed me to introduce the test environment >> variable to the child process. >> The behavior of dockerRunJava() should not be affected by this change, >> it is a simple split. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ >> Testing: >> 1. Ran hotspot docker tests on Linux-x64 machine with docker >> engine configured. >> Ran both via jtreg directly and via make >> All PASS > This doesn't seem right to me: > > private static void testEnvironmentVariables() throws Exception { > Common.logNewTestCase("EnvironmentVariables"); > > - DockerTestUtils.dockerRunJava( > + List<String> cmd = DockerTestUtils.buildJavaCommand( > commonDockerOpts() > - .addClassOptions("jdk.InitialEnvironmentVariable")) > + .addClassOptions("jdk.InitialEnvironmentVariable")); > + > + ProcessBuilder pb = new ProcessBuilder(cmd); > + > + // Container has JAVA_HOME defined via the Dockerfile; make sure > + // it is reported by JFR event. > + // Environment variable set in host system should not be visible inside a container, > + // and should not be reported by JFR.
> + pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE); > + DockerTestUtils.execute(cmd) > .shouldHaveExitValue(0) > .shouldContain("key = JAVA_HOME") > - .shouldNotContain(getTestEnvironmentVariable()); > + .shouldContain("value = /jdk") > + .shouldNotContain(TEST_ENV_VARIABLE) > + .shouldNotContain(TEST_ENV_VALUE); > > So you are creating a new ProcessBuilder instance, add the environment > variable, but then never start it? This means that not even the docker > run cmd will have the environment. Let alone the JVM inside docker. > > Note that DockerTestUtils.execute() has it's own version of > ProcessBuilder. If I'm reading this right, that should be the function > that needs to conditionally add the environment. > > Am I missing something? > > Thanks, > Severin > From mikhailo.seledtsov at oracle.com Thu Apr 25 15:48:36 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 25 Apr 2019 08:48:36 -0700 Subject: RFR(S): 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined" In-Reply-To: <08b09cee-f436-750a-fef5-0d575992f14d@oracle.com> References: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> <08b09cee-f436-750a-fef5-0d575992f14d@oracle.com> Message-ID: <8574e691-d315-dfa7-dbad-ab1c3a0da8ed@oracle.com> Here is the updated webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.01/index.html Thank you, Misha On 4/25/19 8:27 AM, mikhailo.seledtsov at oracle.com wrote: > Hi Severin, > > ? Good catch. What I really meant to do is this: > > ??? pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE); > ??????? OutputAnalyzer out = new OutputAnalyzer(pb.start())? // <<==== > start the process for new process builder with environment var defined > ??????????? .shouldHaveExitValue(0) > ??????????? .shouldContain("key = JAVA_HOME") > ??????????? .shouldContain("value = /jdk") > > It was an unfortunate copy/paste error, thanks for catching it. 
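
The point raised in the review can be reproduced outside the docker tests: entries put into `pb.environment()` reach a child process only if that same `pb` is the one started. The following is a minimal sketch using plain `java.lang.ProcessBuilder`, not `DockerTestUtils` or `OutputAnalyzer`. The class name, the variable name and value, and the child command are hypothetical, and the example assumes a POSIX `sh` is on the path. It only illustrates host-side propagation; it says nothing about what is visible inside a container.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class EnvPropagationSketch {
    // Hypothetical names, chosen only for this illustration.
    static final String TEST_ENV_VARIABLE = "UNIQUE_VARIABLE_NAME_8222888";
    static final String TEST_ENV_VALUE = "unique_value_8222888";

    // Child command that echoes the variable; assumes a POSIX "sh" is available.
    static List<String> command() {
        return List.of("sh", "-c", "echo value=$" + TEST_ENV_VARIABLE);
    }

    // Starts the given builder and returns the child's trimmed output.
    static String run(ProcessBuilder pb) {
        try {
            pb.redirectErrorStream(true);
            Process p = pb.start();
            String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            return out.trim();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // The variable is set on the builder that is actually started,
    // so the child process sees it.
    static String startSameBuilder() {
        ProcessBuilder pb = new ProcessBuilder(command());
        pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE);
        return run(pb);
    }

    // The variable is set on one builder but a different one is started,
    // so the child process never sees it.
    static String startDifferentBuilder() {
        ProcessBuilder configured = new ProcessBuilder(command());
        configured.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE);
        return run(new ProcessBuilder(command()));
    }

    public static void main(String[] args) {
        System.out.println("same builder:      " + startSameBuilder());
        System.out.println("different builder: " + startDifferentBuilder());
    }
}
```

In the second method the configured builder is simply discarded, which mirrors the situation where a helper creates its own `ProcessBuilder` from the command list.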
> > I will run tests, and update a webrev shortly. > > > Thank you, > Misha > > > On 4/25/19 1:17 AM, Severin Gehwolf wrote: >> Hi Misha, >> >> Thanks for fixing this! >> >> On Wed, 2019-04-24 at 13:06 -0700, mikhailo.seledtsov at oracle.com wrote: >>> Please review this change that makes a test more robust. The test >>> originally relied on the fact that JAVA_MAIN_CLASS variable >>> is always present when running jtreg tests. This assumption is wrong, >>> hence I reworked the test to define its own unique >>> environment variable. >>> >>> In order to introduce environment variable I reworked the >>> DockerTestUtils a little bit, by splitting the >>> dockerRunJava() method into 2: buildJavaCommand() to build the command; >>> dockerRunJava() uses buildJavaCommand() >>> then runs the command. This allowed me to introduce the test >>> environment >>> variable to the child process. >>> The behavior of dockerRunJava() should not be affected by this change, >>> it is a simle split. >>> >>> ????? JBS: https://bugs.openjdk.java.net/browse/JDK-8222888 >>> ????? Webrev: http://cr.openjdk.java.net/~mseledtsov/8222888.00/ >>> ????? Testing: >>> ????????? 1. Ran hotspot docker tests on Linux-x64 machine with docker >>> enigne configured. >>> ???????????? Ran both via jtreg directly and via make >>> ???????????? All PASS >> This doesn't seem right to me: >> >> ????? private static void testEnvironmentVariables() throws Exception { >> ????????? Common.logNewTestCase("EnvironmentVariables"); >> ? -??????? DockerTestUtils.dockerRunJava( >> +??????? List cmd = DockerTestUtils.buildJavaCommand( >> ??????????????????????????????????????? commonDockerOpts() >> - .addClassOptions("jdk.InitialEnvironmentVariable")) >> + .addClassOptions("jdk.InitialEnvironmentVariable")); >> + >> +??????? ProcessBuilder pb = new ProcessBuilder(cmd); >> + >> +??????? // Container has JAVA_HOME defined via the Dockerfile; make >> sure >> +??????? // it is reported by JFR event. >> +??????? 
// Environment variable set in host system should not be >> visible inside a container, >> +??????? // and should not be reported by JFR. >> +??????? pb.environment().put(TEST_ENV_VARIABLE, TEST_ENV_VALUE); >> +??????? DockerTestUtils.execute(cmd) >> ????????????? .shouldHaveExitValue(0) >> ????????????? .shouldContain("key = JAVA_HOME") >> -??????????? .shouldNotContain(getTestEnvironmentVariable()); >> +??????????? .shouldContain("value = /jdk") >> +??????????? .shouldNotContain(TEST_ENV_VARIABLE) >> +??????????? .shouldNotContain(TEST_ENV_VALUE); >> >> So you are creating a new ProcessBuilder instance, add the environment >> variable, but then never start it? This means that not even the docker >> run cmd will have the environment. Let alone the JVM inside docker. >> >> Note that DockerTestUtils.execute() has it's own version of >> ProcessBuilder. If I'm reading this right, that should be the function >> that needs to conditionally add the environment. >> >> Am I missing something? >> >> Thanks, >> Severin >> > From karen.kinnear at oracle.com Thu Apr 25 15:48:54 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 25 Apr 2019 11:48:54 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBFA906.1030205@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> Message-ID: <1F703F87-5214-4410-A780-47166EFFC5E0@oracle.com> One follow-up question relative to Jiangli?s review questions please: > On Apr 23, 2019, at 8:08 PM, Calvin Cheung wrote: > > Hi Jiangli, > > Thanks a lot for your review! > > On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >> Hi Calvin, >> >> >> - src/hotspot/share/classfile/systemDictionaryShared.cpp >> >> 1218 if (DynamicDumpSharedSpaces) { >> 1219 return false; >> 1220 } else { >> >> The above case for DynamicDumpSharedSpaces needs to be examined >> carefully. Can you please ask Harold (and Coleen or Karen) to take a >> look? 
>> Also, a comment is needed to explain that we can complete all
>> verification checks at dynamic dumping time.
> I've added a comment. If it returns false, the caller will call VerificationType::resolve_and_check_assignability().

Just making sure I understand this correctly:

When we archive a class, we also archive all supertypes. We do not necessarily archive all classes in the verification constraints list. We always record all verification constraints, whether or not we actually continue to load and perform them.

When we run with an archive, we ensure that the supertypes that we use are also in the archive and are an exact match, otherwise we don't select the class from the archive.
We then check the verification constraints list, which might then cause additional class loading. If that fails, again, we don't select the class from the archive. This allows more flexibility in changes to the classes found at runtime, as long as their "isAssignableFrom" constraints are still met.

So - prior to the DynamicDumpSharedSpaces -
In SystemDictionaryShared::add_verification_constraint:
I agree with the logic that we add constraints for all archived classes.
I agree that for non-built-in class loaders, we could not complete the verification check at dump time, and only perform this at runtime using the archive.
For built-in class loaders - I agree that we could ALSO perform the verification check at dump time, and if it failed, eliminate archiving the class.
I assume that is intended behavior for DynamicDumpSharedSpaces?

Couple of questions:
1. Does what I said above for built-in class loaders match your understanding?
If so, perhaps this would be a good time to clarify the comment - since we will perform the verification check now AND when running with the archive.

2. For DynamicDumpSharedSpaces:
Since we do not perform any loading as part of the PopulateDynamicDumpSharedSpaces, I assume that this is all about recording verification constraints for all classes loaded while we run, before we get to the archive creation on exit. So your logic of creating constraints for all archived classes and then returning false so that the verification is also run during the first run makes sense to me.

And yes, it makes sense to add a comment that we should ALSO perform verification checks during the initial run in addition to creating the constraint for running later with the archive. I don't conceptually think of this as checking constraints "at dump time", but rather for the initial run before creating the dynamic archive.

thanks,
Karen

>>
>>
>> I only took a brief look at the test changes. Please ask Misha to
>> review the test changes as well.
>>
>> Thanks and regards,
>> Jiangli

From sgehwolf at redhat.com  Thu Apr 25 15:53:45 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 25 Apr 2019 17:53:45 +0200
Subject: RFR(S): 8222888: [TESTBUG] docker/TestJFREvents.java fails due to "RuntimeException: JAVA_MAIN_CLASS_* is not defined"
In-Reply-To: <8574e691-d315-dfa7-dbad-ab1c3a0da8ed@oracle.com>
References: <1ebce1c3-4a62-801e-8876-47e930dfea73@oracle.com> <08b09cee-f436-750a-fef5-0d575992f14d@oracle.com> <8574e691-d315-dfa7-dbad-ab1c3a0da8ed@oracle.com>
Message-ID:

On Thu, 2019-04-25 at 08:48 -0700, mikhailo.seledtsov at oracle.com wrote:
> Here is the updated webrev:
> http://cr.openjdk.java.net/~mseledtsov/8222888.01/index.html

This looks good to me.

Thanks,
Severin

From yumin.qi at gmail.com  Thu Apr 25 16:07:55 2019
From: yumin.qi at gmail.com (yumin qi)
Date: Thu, 25 Apr 2019 09:07:55 -0700
Subject: RFC: JWarmup precompile java hot methods at application startup
In-Reply-To:
References:
Message-ID:

Hi,

Apart from comments from compiler professionals, can I have comments from runtime as well?
The changes mostly land in runtime area. Thanks Yumin On Tue, Apr 16, 2019 at 11:27 AM yumin qi wrote: > HI, > > Did anyone have comments for this version? > > Thanks > Yumin > > On Tue, Apr 9, 2019 at 10:36 AM yumin qi wrote: > >> Alan, >> Thanks! Updated in same link: >> http://cr.openjdk.java.net/~minqi/8220692/webrev-02/ >> >> Removed non-boot loader branch in nativeLookup.cpp. >> Added jdk.jwarmup to boot loader list in make/common/Modules.gmk. >> Tested again to make sure the new changes. >> >> Thanks >> Yumin >> >> >> On Tue, Apr 9, 2019 at 4:48 AM Alan Bateman >> wrote: >> >>> On 09/04/2019 07:10, yumin qi wrote: >>> > >>> > Now the registerNatives is found when it looks up for native entry >>> > in lookupNative.cpp. I thought the class JWarmUp will be loaded by >>> > boot loader like Unsafe or WhiteBox, but I was wrong, it is loaded by >>> > app class loader so logic for obtaining its native entry put in both >>> > cases, boot loader and non boot loaders. >>> > >>> make/common/Modules.gmk is where BOOT_MODULES is defined with the list >>> of modules mapped to the boot loader. >>> >>> -Alan >>> >> From karen.kinnear at oracle.com Thu Apr 25 16:08:26 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 25 Apr 2019 12:08:26 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CBDF488.7000601@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> Message-ID: <773B4428-C2E5-4EEF-BA8C-DDDE33CA46EA@oracle.com> Calvin, Thank you for checking out this information. Summary is that running the dynamic archive can be slightly faster than without, which was the big question. So that is really good news. I understand that you may have found/fixed a bug and that Claes had an idea about how to make the startup faster - so I assume you will have better numbers soon. In the meantime, with applications with more than 2-3 classes, startup has improved, so overall good news. 
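
For reference, the 50-run startup figures and the archive sizes quoted further down in this message can be put side by side with a few lines of arithmetic. This is only a sketch: the class and method names are hypothetical, the inputs are copied from the quoted numbers, and the sizes are assumed to be in bytes. Computed this way, Netty and Spring take roughly 27% and 19% less time with the dynamic archive, the two no-op apps take roughly 13% to 15% more, and the default plus dynamic archives add up to 209891328, which is below the 242962432 reported for the single archive.

```java
class StartupNumbers {
    // Change of the dynamic-archive run relative to the base run, in percent.
    // A negative result means the dynamic-archive run took less time.
    static double percentChange(double baseSeconds, double testSeconds) {
        return (testSeconds - baseSeconds) / baseSeconds * 100.0;
    }

    // Combined size of the default CDS archive plus the dynamic archive.
    static long combinedArchiveBytes(long defaultArchive, long dynamicArchive) {
        return defaultArchive + dynamicArchive;
    }

    public static void main(String[] args) {
        // Elapsed times in seconds, copied from the quoted measurements.
        String[] apps = {"Lambda-noop", "Noop", "Netty", "Spring"};
        double[] base = {0.066441427, 0.057614537, 0.827013307, 2.376707358};
        double[] test = {0.075428824, 0.066061557, 0.604982805, 1.927618893};
        for (int i = 0; i < apps.length; i++) {
            System.out.printf("%-12s %+.1f%%%n", apps[i], percentChange(base[i], test[i]));
        }

        // Archive sizes for the LotsOfClasses test, copied from the quoted reply.
        long combined = combinedArchiveBytes(12365824L, 197525504L);
        System.out.println("default + dynamic = " + combined + ", single = " + 242962432L);
    }
}
```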
I don't know where/how this new option will be documented - I wanted to make sure that you set customer expectations here - that this flag is an ease-of-use flag, so that instead of step 1: create class list, step 2: create archive, step 3: run with archive, you can have the first real run create the archive on exit. And the expectation should be that after the run, the archive creation will take about the same amount of time it used to. Subsequent runs will be faster since they are using the archive.

[detailed questions embedded below]

> On Apr 22, 2019, at 1:06 PM, Calvin Cheung wrote:
>
> Hi Karen,
>
> Thanks for your review!
> Please see my replies in-line below.
>
> On 4/19/19, 9:29 AM, Karen Kinnear wrote:
>>
>> Calvin,
>>
>> Many thanks for all the work getting this ready, significantly enhancing the testing and bug fixes.
>>
>> I marked the CSR as reviewed-by - it looks great!
>>
>> I reviewed this set of changes - I did not review the tests - I assume you can get someone
>> else to do that. I am grateful that Jiangli and Ioi are going to review this also - they are much closer to
>> the details than I am.
>>
>> 1. Do you have any performance numbers?
>> 1a. Startup: does using a combined dynamic CDS archive + base archive give similar startup benefits
>> when you have the same classes in the archives?
> Below are some performance numbers from Eric, each number is for 50 runs: > (base: using the default CDS archive, > test: using the dynamic archive, > Eric will get some numbers with a single archive which I think that's what you're looking for) > > Lambda-noop: > base: > 0.066441427 seconds time elapsed > test: > 0.075428824 seconds time elapsed > > Noop: > base: > 0.057614537 seconds time elapsed > test: > 0.066061557 seconds time elapsed > > Netty: > base: > 0.827013307 seconds time elapsed > test: > 0.604982805 seconds time elapsed > > Spring: > base: > 2.376707358 seconds time elapsed > test: > 1.927618893 seconds time elapsed > > The first 2 apps only have 2 to 3 classes in the dynamic archive. So the overhead is likely due to having to open and map the dynamic archive and performs checking on header, etc. For small apps, I think it's better to use a single archive. The Netty app has around 1400 classes in the dynamic archive; the Spring app has about 3700 classes in the dynamic archive. > > I also used our LotsOfClasses test to collect some perf numbers. This is more like runtime performance, not startup performance. > > With dynamic archive (100 runs each): > real 2m37.191s > real 2m36.003s > Total loaded classes = 24254 > Loaded from base archive = 1186 > Loaded from top archive = 23042 > Loaded from jrt:/ (runtime module) = 26 > > With single archive (100 runs each): > real 2m38.346s > real 2m36.947s > Total loaded classes = 24254 > Loaded from archive = 24228 > Loaded from jrt:/ (runtime module) = 26 > >> >> 1b. Do you have samples of uses of the combined dynamic CDS archive + base archive vs. a single >> static archive built for an application? >> - how do the sets of archived classes differ? > Currently, the default CDS archive contains around 1187 classes. With the -XX:ArchiveClassesAtExit option, if the classes are not found in the default CDS archive, they will be archived in the dynamic archive. 
The above LotsOfClasses example shows some distributions between various archives. >> - one note was that the AtExit approach exclude list adds anything that has not yet linked - does that make a significant difference in the number of classes that are archived? Does that make a difference in either startup time or in application execution time? I could see that going either way. > As the above numbers indicated, there's not much difference in terms of execution time using a dynamic vs a single archive with a large number of classes loaded. The numbers from Netty and Spring apps show an improvement over default CDS archive. >> >> 1c. Any sense of performance cost for first run - how much time does it take to create an incremental archive? >> - is the time comparable to an existing dump for a single archive for the application? >> - this is an ease-of-use feature - so we are not expecting that to be fast >> - the point is to set expectations in our documentation > I did some rough measurements with the LotsOfClasses test with around 15000 classes in the classlist. > > Dynamic archive dumping (one run each): > real 0m19.756s > real 0m20.241s > > Static archive dumping (one run each): > real 0m17.725s > real 0m16.993s What are the two numbers from one run? >> >> 2. Footprint >> With two archives rather than one, is there a significant footprint difference? Obviously this will vary by app and archive. >> Once again, the point is to set expectations. > Sizes of the archives for the LotsOfClasses test in 1a. > > Single archive: 242962432 > Default CDS archive: 12365824 > Dynamic archive: 197525504 Is this accurate? So the combined sizes are smaller than the single archive? > >> >> 3. Runtime performance >> With two sets of archived dictionaries & symbolTables - is there any significant performance cost to larger benchmarks, e.g. for class loading lookup for classes that are not in the archives? Or symbol lookup? > I used the LotsOfClasses test again. 
This time archiving about half of the classes which will be loaded during runtime. > > Dynamic archive (10 runs each): > real 0m30.214s > real 0m29.633s > Loaded classes = 24254 > Loaded from dynamic archive: 13168 Question - is this loaded from both archives? Or just from the dynamic archive so you also loaded additional classes from the base archive? > > Single archive (10 runs each): > real 0m32.383s > real 0m32.905s > Loaded classes = 24254 > Loaded from single archive = 15063 >> >> 4. Platform support >> Which platforms is this supported on? >> Which ones did you test? For example, did you run the tests on Windows? > I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris, Windows). Many thanks, Karen > > thanks, > Calvin From karen.kinnear at oracle.com Thu Apr 25 16:08:34 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 25 Apr 2019 12:08:34 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> <5CBF74F2.9000206@oracle.com> Message-ID: <6553FAA0-5301-4472-A89A-74E9001C7E1A@oracle.com> Ioi and Calvin, Thank you for the thoughtful responses. I totally agree that keeping this in mind as part of where we are going longer term is the better approach. And I do like the idea of giving customers suggestions for their scripts - whether that is someone?s blog or ... thanks, Karen > On Apr 23, 2019, at 4:43 PM, Ioi Lam wrote: > > > > On 4/23/19 1:26 PM, Calvin Cheung wrote: >> >> >> On 4/23/19, 11:16 AM, Karen Kinnear wrote: >>> Calvin, >>> >>> I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags. >>> He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits >>> from the archive. 
>>>
>>> The comment was that it would be very helpful if the user only needed to change their scripts once - so
>>> a single command-line argument would create a dynamic archive if one did not exist, and use it if it
>>> already existed.
>>>
>>> Is there a way to evolve the ArchiveClassesAtExit= to have that functionality?
>> One drawback of this proposal is that the ArchiveClassesAtExit option has 2 meanings which I find confusing.
>> Maybe eventually we can do some kind of automatic CDS archive dumping without having to specify any command line option. Such as when a java app is run at the first time, there will be some CDS archive created with a unique name. Subsequent run of the same app will make use of the archive.
>
> When we scoped this JEP, we wanted to provide just the minimal building blocks, so a user could implement automation on top of the JVM. Something like
>
>     ARCHIVE=foo.jsa
>     if test -f $ARCHIVE; then
>         FLAG="-XX:SharedArchiveFile=$ARCHIVE"
>     else
>         FLAG="-XX:ArchiveClassesAtExit=$ARCHIVE"
>     fi
>
>     $JAVA_HOME/bin/java -cp foo.jar $FLAG FooApp
>
> Note that you also need to update the archive if the Java version has changed, so the test would be a little more complicated
>
>     ARCHIVE=foo.jsa
>     VERSION=foo.version
>     if test -f $ARCHIVE -a -f $VERSION && cmp -s $VERSION $JAVA_HOME/release; then
>         FLAG="-XX:SharedArchiveFile=$ARCHIVE"
>     else
>         FLAG="-XX:ArchiveClassesAtExit=$ARCHIVE"
>         cp -f $JAVA_HOME/release $VERSION
>     fi
>     $JAVA_HOME/bin/java -cp foo.jar $FLAG FooApp
>
> As Calvin mentioned, we are planning to make the archive management more automatic. So eventually you might be able to do something like
>
>     java -Xshare:reallyauto -cp foo.jar FooApp
>
> And the JSA file will be automatically generated if necessary. We probably need some logic to delete older archives to avoid filling up the disk.
>
> I think the automation feature needs to be carefully planned out, so we should do that in a follow-up RFE or JEP.
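
The same decision the second script makes can also be taken from a Java launcher wrapper. What follows is a rough sketch under stated assumptions, not part of the JEP: the class and method names are hypothetical, and only the two flags `-XX:SharedArchiveFile` and `-XX:ArchiveClassesAtExit` and the use of the JDK `release` file as a version marker come from the discussion above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class ArchiveFlagChooser {
    // Returns the JVM flag to pass for this run:
    //  - reuse the archive if it exists and was created by the same JDK
    //    (judged by comparing a saved copy of the JDK "release" file);
    //  - otherwise ask the JVM to (re)create the archive at exit and
    //    remember the current "release" file for the next run.
    static String chooseFlag(Path archive, Path savedRelease, Path currentRelease) {
        try {
            boolean upToDate = Files.isRegularFile(archive)
                    && Files.isRegularFile(savedRelease)
                    && Arrays.equals(Files.readAllBytes(savedRelease),
                                     Files.readAllBytes(currentRelease));
            if (upToDate) {
                return "-XX:SharedArchiveFile=" + archive;
            }
            Files.copy(currentRelease, savedRelease, StandardCopyOption.REPLACE_EXISTING);
            return "-XX:ArchiveClassesAtExit=" + archive;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumes the running JDK image has a "release" file under java.home.
        // Note: this writes foo.version into the current directory.
        Path release = Path.of(System.getProperty("java.home"), "release");
        System.out.println(chooseFlag(Path.of("foo.jsa"), Path.of("foo.version"), release));
    }
}
```

A wrapper would then append the returned flag to the `java` command line, much as `$FLAG` is used in the scripts.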
> > Thanks > - Ioi > > > >>> >>> thanks, >>> Karen >>> >>> p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP >>> comments >> We found a bug yesterday and will run more performance tests. I'll put the performance numberss in the implementation RFE. >> >> thanks, >> Calvin >>> >>>> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou >> wrote: >>>> >>>> Hi Calvin, >>>> >>>> Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section? >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung >> wrote: >>>> >>>> Hi Karen, >>>> >>>> Thanks for your review! >>>> Please see my replies in-line below. >>>> >>>> On 4/19/19, 9:29 AM, Karen Kinnear wrote: >>>> > Calvin, >>>> > >>>> > Many thanks for all the work getting this ready, significantly >>>> > enhancing the testing and bug fixes. >>>> > >>>> > I marked the CSR as reviewed-by - it looks great! >>>> > >>>> > I reviewed this set of changes - I did not review the tests - I >>>> assume >>>> > you can get someone >>>> > else to do that. I am grateful that Jiangli and Ioi are going to >>>> > review this also - they are much closer to >>>> > the details than I am. >>>> > >>>> > 1. Do you have any performance numbers? >>>> > 1a. Startup: does using a combined dynamic CDS archive + base >>>> archive >>>> > give similar startup benefits >>>> > when you have the same classes in the archives? 
>>>> Below are some performance numbers from Eric, each number is for >>>> 50 runs: >>>> (base: using the default CDS archive, >>>> test: using the dynamic archive, >>>> Eric will get some numbers with a single archive which I think that's >>>> what you're looking for) >>>> >>>> Lambda-noop: >>>> base: >>>> 0.066441427 seconds time elapsed >>>> test: >>>> 0.075428824 seconds time elapsed >>>> >>>> Noop: >>>> base: >>>> 0.057614537 seconds time elapsed >>>> test: >>>> 0.066061557 seconds time elapsed >>>> >>>> Netty: >>>> base: >>>> 0.827013307 seconds time elapsed >>>> test: >>>> 0.604982805 seconds time elapsed >>>> >>>> Spring: >>>> base: >>>> 2.376707358 seconds time elapsed >>>> test: >>>> 1.927618893 seconds time elapsed >>>> >>>> The first 2 apps only have 2 to 3 classes in the dynamic archive. >>>> So the >>>> overhead is likely due to having to open and map the dynamic >>>> archive and >>>> performs checking on header, etc. For small apps, I think it's >>>> better to >>>> use a single archive. The Netty app has around 1400 classes in the >>>> dynamic archive; the Spring app has about 3700 classes in the dynamic >>>> archive. >>>> >>>> I also used our LotsOfClasses test to collect some perf numbers. >>>> This is >>>> more like runtime performance, not startup performance. >>>> >>>> With dynamic archive (100 runs each): >>>> real 2m37.191s >>>> real 2m36.003s >>>> Total loaded classes = 24254 >>>> Loaded from base archive = 1186 >>>> Loaded from top archive = 23042 >>>> Loaded from jrt:/ (runtime module) = 26 >>>> >>>> With single archive (100 runs each): >>>> real 2m38.346s >>>> real 2m36.947s >>>> Total loaded classes = 24254 >>>> Loaded from archive = 24228 >>>> Loaded from jrt:/ (runtime module) = 26 >>>> >>>> > >>>> > 1b. Do you have samples of uses of the combined dynamic CDS >>>> archive + >>>> > base archive vs. a single >>>> > static archive built for an application? >>>> > - how do the sets of archived classes differ? 
>>>> Currently, the default CDS archive contains around 1187 classes. With >>>> the -XX:ArchiveClassesAtExit option, if the classes are not found >>>> in the >>>> default CDS archive, they will be archived in the dynamic >>>> archive. The >>>> above LotsOfClasses example shows some distributions between various >>>> archives. >>>> > - one note was that the AtExit approach exclude list adds >>>> anything >>>> > that has not yet linked - does that make a significant >>>> difference in >>>> > the number of classes that are archived? Does that make a >>>> difference >>>> > in either startup time or in application execution time? I >>>> could see >>>> > that going either way. >>>> As the above numbers indicated, there's not much difference in >>>> terms of >>>> execution time using a dynamic vs a single archive with a large >>>> number >>>> of classes loaded. The numbers from Netty and Spring apps show an >>>> improvement over default CDS archive. >>>> > >>>> > 1c. Any sense of performance cost for first run - how much time >>>> does >>>> > it take to create an incremental archive? >>>> > - is the time comparable to an existing dump for a single >>>> archive >>>> > for the application? >>>> > - this is an ease-of-use feature - so we are not expecting >>>> that to >>>> > be fast >>>> > - the point is to set expectations in our documentation >>>> I did some rough measurements with the LotsOfClasses test with around >>>> 15000 classes in the classlist. >>>> >>>> Dynamic archive dumping (one run each): >>>> real 0m19.756s >>>> real 0m20.241s >>>> >>>> Static archive dumping (one run each): >>>> real 0m17.725s >>>> real 0m16.993s >>>> > >>>> > 2. Footprint >>>> > With two archives rather than one, is there a significant footprint >>>> > difference? Obviously this will vary by app and archive. >>>> > Once again, the point is to set expectations. >>>> Sizes of the archives for the LotsOfClasses test in 1a. 
>>>> >>>> Single archive: 242962432 >>>> Default CDS archive: 12365824 >>>> Dynamic archive: 197525504 >>>> >>>> > >>>> > 3. Runtime performance >>>> > With two sets of archived dictionaries & symbolTables - is >>>> there any >>>> > significant performance cost to larger benchmarks, e.g. for class >>>> > loading lookup for classes that are not in the archives? Or symbol >>>> > lookup? >>>> I used the LotsOfClasses test again. This time archiving about >>>> half of >>>> the classes which will be loaded during runtime. >>>> >>>> Dynamic archive (10 runs each): >>>> real 0m30.214s >>>> real 0m29.633s >>>> Loaded classes = 24254 >>>> Loaded from dynamic archive: 13168 >>>> >>>> Single archive (10 runs each): >>>> real 0m32.383s >>>> real 0m32.905s >>>> Loaded classes = 24254 >>>> Loaded from single archive = 15063 >>>> > >>>> > 4. Platform support >>>> > Which platforms is this supported on? >>>> > Which ones did you test? For example, did you run the tests on >>>> Windows? >>>> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, >>>> Solaris, >>>> Windows). >>>> > >>>> > Detailed feedback on the code: Just minor comments - I don?t >>>> need to >>>> > see an updated webrev: >>>> I'm going to look into your detailed feedback below and may reply >>>> in a >>>> separate email. >>>> >>>> thanks, >>>> Calvin >>>> > >>>> > 1. metaSpaceShared.hpp >>>> > line 156: >>>> > what is the hardcoded -100 for? Should that be an enum? >>>> > >>>> > 2. jfrRecorder.cpp >>>> > So JFR recordings are disabled if DynamicDumpSharedSpaces? >>>> > why? >>>> > Is that a future rfe? >>>> > >>>> > 3. systemDictionaryShared.cpp >>>> > Could you possibly add a comment to add_verification_constraint >>>> > for if (DynamicDumpSharedSpaces) >>>> > return false >>>> > >>>> > -- I think the logic is: >>>> > because we have successfully linked any instanceKlass we archive >>>> > with DynamicDumpSharedSpaces, we have resolved all the >>>> constraint classes. 
>>>> > >>>> > -- I didn't check the order - is this called before or after >>>> > excluding? If after, then would it make sense to add an assertion >>>> > here is_linked? Then if you ever change how/when linking is >>>> done, this >>>> > might catch future errors. >>>> > >>>> > 4. systemDictionaryShared.cpp >>>> > EstimateSizeForArchive::do_entry >>>> > Is it the case that for info.is_builtin() there are no verification >>>> > constraints? So you could skip that calculation? Or did I >>>> misunderstand? >>>> > >>>> > 5. compactHashtable.cpp >>>> > serialize/header/calculate_header_size >>>> > -- could you dynamically determine size_of header so you don't need >>>> > to hardcode a 5? >>>> > >>>> > 6. classLoader.cpp >>>> > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are >>>> > mutually exclusive. >>>> > Can you clarify for me: >>>> > My memory of the base archive is that we do not allow the following >>>> > options at dump time - and these >>>> > are the same for the dynamic archive: ?limit-modules, >>>> > ?upgrade-module-path, ?patch-module. >>>> > >>>> > I have forgotten: >>>> > Today with UseSharedSpaces - do we allow these flags? Is that >>>> also the >>>> > same behavior with the dynamic >>>> > archive? >>>> > >>>> > 7. classLoaderExt.cpp >>>> > assert line 66: only used with -Xshare:dump >>>> > -> "only used at dump time" >>>> > >>>> > 8. symbolTable.cpp >>>> > line 473: comment // used by UseSharedArchived2 >>>> > ? command-line arg name has changed >>>> > >>>> > 9. filemap.cpp >>>> > Comment lines 529 ... >>>> > Is this true - that you can only support dynamic dumping with the >>>> > default CDS archive? Could you clarify what the restrictions are? >>>> > The CSR implies you can support ?a specific base CDS archive" >>>> > - so base layer can not have appended boot class path >>>> > - and base layer can't have a module path >>>> > >>>> > What can you specify for the dynamic dumping relative to the >>>> base archive? 
>>>> > - matching class path? >>>> > - appended class path? >>>> > in future - could it have a module path that matched the base >>>> archive? >>>> > >>>> > Should any of these restrictions be clarified in documentation/CSR >>>> > since they appear to be new? >>>> > >>>> > 10. filemap.cpp >>>> > check_archive >>>> > Do some of the return false paths skip performing os::close(fd)? >>>> > >>>> > and get_base_archive_name_from_header >>>> > Does the first return false path fail to os::free(dynamic_header) >>>> > >>>> > lines 753-754: two FIXME comments >>>> > >>>> > Could you delete commented out line 1087 in filemap.cpp ? >>>> > >>>> > 11. filemap.hpp >>>> > line 214: TODO left in >>>> > >>>> > 12. metaspace.cpp >>>> > line 1418 FIXME left in >>>> > >>>> > 13. java.cpp >>>> > FIXME: is this the right place? >>>> > For starting the DynamicArchive::dump >>>> > >>>> > Please check with David Holmes on that one >>>> > >>>> > 14. dynamicArchive.hpp >>>> > line 55 (and others): MetsapceObj -> MetaspaceObj >>>> > >>>> > 15. dynamicArchive.cpp >>>> > line 285 rel-ayout -> re-layout >>>> > >>>> > lines 277 && 412 >>>> > Do we archive array klasses in the base archive but not in the >>>> dynamic >>>> > archive? >>>> > Is that a potential RFE? >>>> > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be >>>> > called with an array class? >>>> > Same question for copy_impl? >>>> > >>>> > line 934: "no onger" -> "no longer" >>>> > >>>> > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a >>>> > potential future rfe? >>>> > Do you want to check in that code at this time? In product? >>>> > >>>> > thanks, >>>> > Karen >>>> > >>>> > >>>> >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung >>>> > >>>> >> >>>> >> wrote: >>>> >> >>>> >> This is a follow-up on the preliminary code review sent by >>>> Jiangli in >>>> >> January[1]. >>>> >> >>>> >> Highlights of changes since then: >>>> >> 1. 
New vm option for dumping a dynamic archive >>>> >> (-XX:ArchiveClassesAtExit=) and enhancement >>>> to the >>>> >> existing -XX:SharedArchiveFile option. Please refer to the >>>> >> corresponding CSR[2] for details. >>>> >> 2. New way to run existing AppCDS tests in dynamic CDS archive >>>> mode. >>>> >> At the jtreg command line, the user can run many existing AppCDS >>>> >> tests in dynamic CDS archive mode by specifying the following: >>>> >> -vmoptions:-Dtest.dynamic.cds.archive=true >>>> >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic >>>> >> We will have a follow-up RFE to determine in which tier the >>>> above >>>> >> tests should be run. >>>> >> 3. Added more tests. >>>> >> 4. Various bug fixes to improve stability. >>>> >> >>>> >> RFE:https://bugs.openjdk.java.net/browse/JDK-8207812 >>>> >> webrev: >>>> >>http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ >>>> > >>>> >> >>>> >>>> >> >>>> >> (The webrev is based on top of the following rev: >>>> >>http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) >>>> >> >>>> >> Testing: >>>> >> - mach5 tiers 1- 3 (including the new tests) >>>> >> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few >>>> >> tests require more investigation) >>>> >> >>>> >> thanks, >>>> >> Calvin >>>> >> >>>> >> [1] >>>> >>https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html >>>> >> [2]https://bugs.openjdk.java.net/browse/JDK-8221706 >>>> > From daniel.daugherty at oracle.com Thu Apr 25 16:38:37 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 25 Apr 2019 12:38:37 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR2/v2.02/5-for-jdk13) In-Reply-To: References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> Message-ID: <313e51c8-b672-bb1c-577a-49868f09e6c1@oracle.com> Greetings, I have a small but important bug fix for the Async Monitor Deflation project ready to go. 
It's also known as v2.02 (for those with the patches) and as webrev/5-for-jdk13 (for those with webrev URLs). Sorry for all the names... JDK-8222295 was pushed to jdk/jdk two days ago so that baseline patch is out of our hair. Main bug URL: JDK-8153224 Monitor deflation prolong safepoints https://bugs.openjdk.java.net/browse/JDK-8153224 The project is currently baselined on jdk-13+17. Here's the full webrev URL: http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/ Here's the incremental webrev URL (JDK-8153224): http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/ I still have to update the OpenJDK wiki to reflect the CR2 changes: https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation This version of the patch has been thru Mach5 tier[1-6] testing on Oracle's usual set of platforms. Mach5 tier[7-8] is running now. My stress kit is running on Solaris-X64 now. Kitchensink8H is running now on product, fastdebug, and slowdebug bits on Linux-X64, MacOSX and Solaris-X64. 12 hour Inflate2 runs are running now on product, fastdebug and slowdebug bits on Linux-X64, MacOSX and Solaris-X64. I'll start my stress kit on Linux-X64 sometime on Sunday (after my jdk-13+18 stress run is done). I'll do SPECjbb2015 baseline and CR2 runs after all the stress testing is done. Thanks, in advance, for any questions, comments or suggestions. Dan On 4/19/19 11:58 AM, Daniel D. Daugherty wrote: > Greetings, > > I finally have CR1 for the Async Monitor Deflation project ready to > go. It's also known as v2.01 (for those with the patches) and as > webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the > names... > > Main bug URL: > > JDK-8153224 Monitor deflation prolong safepoints > https://bugs.openjdk.java.net/browse/JDK-8153224 > > Baseline bug fixes URL: > > JDK-8222295 more baseline cleanups from Async Monitor Deflation > project >
https://bugs.openjdk.java.net/browse/JDK-8222295 > > The project is currently baselined on jdk-13+15. > > Here's the webrev for the latest baseline changes (JDK-8222295): > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295 > > Here's the full webrev URL (JDK-8153224 only): > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/ > > Here's the incremental webrev URL (JDK-8153224): > > http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/ > > So I'm looking for reviews for both JDK-8222295 and the latest version > of JDK-8153224... > > I still have to update the OpenJDK wiki to reflect the CR changes: > > https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation > > This version of the patch has been thru Mach5 tier[1-3] testing on > Oracle's usual set of platforms. Mach5 tier[4-6] is running now and > Mach5 tier[78] will be run later today. My stress kit on Solaris-X64 > is running now. Linux-X64 stress testing will start on Sunday. I'm > planning to do Kitchensink runs, SPECjbb2015 runs and my monitor > inflation stress tests on Linux-X64, MacOSX and Solaris-X64. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > > > On 3/24/19 9:57 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> Welcome to the OpenJDK review thread for my port of Carsten's work on: >> >> JDK-8153224 Monitor deflation prolong safepoints >> https://bugs.openjdk.java.net/browse/JDK-8153224 >> >> Here's a link to the OpenJDK wiki that describes my port: >> >> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation >> >> Here's the webrev URL: >> >> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/ >> >> Here's a link to Carsten's original webrev: >> >> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/ >> >> Earlier versions of this patch have been through several rounds of >> preliminary review.
Many thanks to Carsten, Coleen, Robbin, and >> Roman for their preliminary code review comments. A very special >> thanks to Robbin and Roman for building and testing the patch in >> their own environments (including specJBB2015). >> >> This version of the patch has been thru Mach5 tier[1-8] testing on >> Oracle's usual set of platforms. Earlier versions have been run >> through my stress kit on my Linux-X64 and Solaris-X64 servers >> (product, fastdebug, slowdebug).Earlier versions have run Kitchensink >> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug >> and slowdebug). Earlier versions have run my monitor inflation stress >> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, >> fastdebug and slowdebug). >> >> All of the testing done on earlier versions will be redone on the >> latest version of the patch. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> >> P.S. >> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java >> is currently failing in -Xcomp mode on Win* only. I've been trying >> to characterize/analyze this failure for more than a week now. At >> this point I'm convinced that Async Monitor Deflation is aggravating >> an existing bug. However, I plan to have a better handle on that >> failure before these bits are pushed to the jdk/jdk repo. >> > > From jianglizhou at google.com Thu Apr 25 16:40:44 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Thu, 25 Apr 2019 09:40:44 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> Message-ID: Karen, On Tue, Apr 23, 2019 at 11:16 AM Karen Kinnear wrote: > > Calvin, > > I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags. 
> He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits > from the archive. > > The comment was that it would be very helpful if the user only needed to change their scripts once - so > a single command-line argument would create a dynamic archive if one did not exist, and use it if it > already existed. This is a very plausible idea and is in the same direction as the multi-staged rollout of the end goal of dynamic archiving, which is making the archive generation/usage completely transparent and automatic (no command-line option is needed to generate and use the dynamic archive). Using a single command-line option to control the creation of the archive (when one doesn't exist) and the use of the archive (when one already exists) is needed for cases when users want more control. The following is copied from http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html. The behavior is unspecified if the underlying file changes after mmap is established. Currently, we are not handling the case. The dynamic archive simply follows the usage model of the existing static archive with separate steps for dump-time and runtime, which however doesn't preclude the case. In the current usage model, concurrent access is less common. With creation and use controlled by a single command-line option, there might be more concurrent accesses and we probably want to handle the issue. "If the size of the mapped file changes after the call to mmap() as a result of some other operation on the mapped file, the effect of references to portions of the mapped region that correspond to added or removed portions of the file is unspecified." The current flag, ArchiveClassesAtExit, doesn't scale and can't handle the case where a single option controls both creation and use. It would be a good idea to re-think the command-line option now and change it before the first integration, so it can scale. -Xshare:dump/on/auto/dynamic might be able to serve the purpose.
'dynamic' can be used to trigger just dynamic dumping for now. It can be augmented to support the use case that you described above. Best regards, Jiangli > > Is there a way to evolve the ArchiveClassesAtExit= to have that functionality? > > thanks, > Karen > > p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP > comments > > On Apr 22, 2019, at 5:16 PM, Jiangli Zhou wrote: > > Hi Calvin, > > Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section? > > Thanks, > Jiangli > > On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung wrote: >> >> Hi Karen, >> >> Thanks for your review! >> Please see my replies in-line below. >> >> On 4/19/19, 9:29 AM, Karen Kinnear wrote: >> > Calvin, >> > >> > Many thanks for all the work getting this ready, significantly >> > enhancing the testing and bug fixes. >> > >> > I marked the CSR as reviewed-by - it looks great! >> > >> > I reviewed this set of changes - I did not review the tests - I assume >> > you can get someone >> > else to do that. I am grateful that Jiangli and Ioi are going to >> > review this also - they are much closer to >> > the details than I am. >> > >> > 1. Do you have any performance numbers? >> > 1a. Startup: does using a combined dynamic CDS archive + base archive >> > give similar startup benefits >> > when you have the same classes in the archives? 
>> Below are some performance numbers from Eric, each number is for 50 runs: >> (base: using the default CDS archive, >> test: using the dynamic archive, >> Eric will get some numbers with a single archive which I think that's >> what you're looking for) >> >> Lambda-noop: >> base: >> 0.066441427 seconds time elapsed >> test: >> 0.075428824 seconds time elapsed >> >> Noop: >> base: >> 0.057614537 seconds time elapsed >> test: >> 0.066061557 seconds time elapsed >> >> Netty: >> base: >> 0.827013307 seconds time elapsed >> test: >> 0.604982805 seconds time elapsed >> >> Spring: >> base: >> 2.376707358 seconds time elapsed >> test: >> 1.927618893 seconds time elapsed >> >> The first 2 apps only have 2 to 3 classes in the dynamic archive. So the >> overhead is likely due to having to open and map the dynamic archive and >> performs checking on header, etc. For small apps, I think it's better to >> use a single archive. The Netty app has around 1400 classes in the >> dynamic archive; the Spring app has about 3700 classes in the dynamic >> archive. >> >> I also used our LotsOfClasses test to collect some perf numbers. This is >> more like runtime performance, not startup performance. >> >> With dynamic archive (100 runs each): >> real 2m37.191s >> real 2m36.003s >> Total loaded classes = 24254 >> Loaded from base archive = 1186 >> Loaded from top archive = 23042 >> Loaded from jrt:/ (runtime module) = 26 >> >> With single archive (100 runs each): >> real 2m38.346s >> real 2m36.947s >> Total loaded classes = 24254 >> Loaded from archive = 24228 >> Loaded from jrt:/ (runtime module) = 26 >> >> > >> > 1b. Do you have samples of uses of the combined dynamic CDS archive + >> > base archive vs. a single >> > static archive built for an application? >> > - how do the sets of archived classes differ? >> Currently, the default CDS archive contains around 1187 classes. 
With >> the -XX:ArchiveClassesAtExit option, if the classes are not found in the >> default CDS archive, they will be archived in the dynamic archive. The >> above LotsOfClasses example shows some distributions between various >> archives. >> > - one note was that the AtExit approach exclude list adds anything >> > that has not yet linked - does that make a significant difference in >> > the number of classes that are archived? Does that make a difference >> > in either startup time or in application execution time? I could see >> > that going either way. >> As the above numbers indicated, there's not much difference in terms of >> execution time using a dynamic vs a single archive with a large number >> of classes loaded. The numbers from Netty and Spring apps show an >> improvement over default CDS archive. >> > >> > 1c. Any sense of performance cost for first run - how much time does >> > it take to create an incremental archive? >> > - is the time comparable to an existing dump for a single archive >> > for the application? >> > - this is an ease-of-use feature - so we are not expecting that to >> > be fast >> > - the point is to set expectations in our documentation >> I did some rough measurements with the LotsOfClasses test with around >> 15000 classes in the classlist. >> >> Dynamic archive dumping (one run each): >> real 0m19.756s >> real 0m20.241s >> >> Static archive dumping (one run each): >> real 0m17.725s >> real 0m16.993s >> > >> > 2. Footprint >> > With two archives rather than one, is there a significant footprint >> > difference? Obviously this will vary by app and archive. >> > Once again, the point is to set expectations. >> Sizes of the archives for the LotsOfClasses test in 1a. >> >> Single archive: 242962432 >> Default CDS archive: 12365824 >> Dynamic archive: 197525504 >> >> > >> > 3. Runtime performance >> > With two sets of archived dictionaries & symbolTables - is there any >> > significant performance cost to larger benchmarks, e.g. 
for class >> > loading lookup for classes that are not in the archives? Or symbol >> > lookup? >> I used the LotsOfClasses test again. This time archiving about half of >> the classes which will be loaded during runtime. >> >> Dynamic archive (10 runs each): >> real 0m30.214s >> real 0m29.633s >> Loaded classes = 24254 >> Loaded from dynamic archive: 13168 >> >> Single archive (10 runs each): >> real 0m32.383s >> real 0m32.905s >> Loaded classes = 24254 >> Loaded from single archive = 15063 >> > >> > 4. Platform support >> > Which platforms is this supported on? >> > Which ones did you test? For example, did you run the tests on Windows? >> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris, >> Windows). >> > >> > Detailed feedback on the code: Just minor comments - I don't need to >> > see an updated webrev: >> I'm going to look into your detailed feedback below and may reply in a >> separate email. >> >> thanks, >> Calvin >> > >> > 1. metaSpaceShared.hpp >> > line 156: >> > what is the hardcoded -100 for? Should that be an enum? >> > >> > 2. jfrRecorder.cpp >> > So JFR recordings are disabled if DynamicDumpSharedSpaces? >> > why? >> > Is that a future rfe? >> > >> > 3. systemDictionaryShared.cpp >> > Could you possibly add a comment to add_verification_constraint >> > for if (DynamicDumpSharedSpaces) >> > return false >> > >> > -- I think the logic is: >> > because we have successfully linked any instanceKlass we archive >> > with DynamicDumpSharedSpaces, we have resolved all the constraint classes. >> > >> > -- I didn't check the order - is this called before or after >> > excluding? If after, then would it make sense to add an assertion >> > here is_linked? Then if you ever change how/when linking is done, this >> > might catch future errors. >> > >> > 4. systemDictionaryShared.cpp >> > EstimateSizeForArchive::do_entry >> > Is it the case that for info.is_builtin() there are no verification >> > constraints?
So you could skip that calculation? Or did I misunderstand? >> > >> > 5. compactHashtable.cpp >> > serialize/header/calculate_header_size >> > -- could you dynamically determine size_of header so you don't need >> > to hardcode a 5? >> > >> > 6. classLoader.cpp >> > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are >> > mutually exclusive. >> > Can you clarify for me: >> > My memory of the base archive is that we do not allow the following >> > options at dump time - and these >> > are the same for the dynamic archive: --limit-modules, >> > --upgrade-module-path, --patch-module. >> > >> > I have forgotten: >> > Today with UseSharedSpaces - do we allow these flags? Is that also the >> > same behavior with the dynamic >> > archive? >> > >> > 7. classLoaderExt.cpp >> > assert line 66: only used with -Xshare:dump >> > -> "only used at dump time" >> > >> > 8. symbolTable.cpp >> > line 473: comment // used by UseSharedArchived2 >> > - command-line arg name has changed >> > >> > 9. filemap.cpp >> > Comment lines 529 ... >> > Is this true - that you can only support dynamic dumping with the >> > default CDS archive? Could you clarify what the restrictions are? >> > The CSR implies you can support "a specific base CDS archive" >> > - so base layer can not have appended boot class path >> > - and base layer can't have a module path >> > >> > What can you specify for the dynamic dumping relative to the base archive? >> > - matching class path? >> > - appended class path? >> > in future - could it have a module path that matched the base archive? >> > >> > Should any of these restrictions be clarified in documentation/CSR >> > since they appear to be new? >> > >> > 10. filemap.cpp >> > check_archive >> > Do some of the return false paths skip performing os::close(fd)?
>> > >> > and get_base_archive_name_from_header >> > Does the first return false path fail to os::free(dynamic_header) >> > >> > lines 753-754: two FIXME comments >> > >> > Could you delete commented out line 1087 in filemap.cpp ? >> > >> > 11. filemap.hpp >> > line 214: TODO left in >> > >> > 12. metaspace.cpp >> > line 1418 FIXME left in >> > >> > 13. java.cpp >> > FIXME: is this the right place? >> > For starting the DynamicArchive::dump >> > >> > Please check with David Holmes on that one >> > >> > 14. dynamicArchive.hpp >> > line 55 (and others): MetsapceObj -> MetaspaceObj >> > >> > 15. dynamicArchive.cpp >> > line 285 rel-ayout -> re-layout >> > >> > lines 277 && 412 >> > Do we archive array klasses in the base archive but not in the dynamic >> > archive? >> > Is that a potential RFE? >> > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be >> > called with an array class? >> > Same question for copy_impl? >> > >> > line 934: "no onger" -> "no longer" >> > >> > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a >> > potential future rfe? >> > Do you want to check in that code at this time? In product? >> > >> > thanks, >> > Karen >> > >> > >> >> On Apr 11, 2019, at 5:18 PM, Calvin Cheung > >> > wrote: >> >> >> >> This is a follow-up on the preliminary code review sent by Jiangli in >> >> January[1]. >> >> >> >> Highlights of changes since then: >> >> 1. New vm option for dumping a dynamic archive >> >> (-XX:ArchiveClassesAtExit=) and enhancement to the >> >> existing -XX:SharedArchiveFile option. Please refer to the >> >> corresponding CSR[2] for details. >> >> 2. New way to run existing AppCDS tests in dynamic CDS archive mode. 
>> >> At the jtreg command line, the user can run many existing AppCDS >> >> tests in dynamic CDS archive mode by specifying the following: >> >> -vmoptions:-Dtest.dynamic.cds.archive=true >> >> /open/test/hotspot/jtreg:hotspot_appcds_dynamic >> >> We will have a follow-up RFE to determine in which tier the above >> >> tests should be run. >> >> 3. Added more tests. >> >> 4. Various bug fixes to improve stability. >> >> >> >> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812 >> >> webrev: >> >> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ >> >> >> >> >> >> (The webrev is based on top of the following rev: >> >> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) >> >> >> >> Testing: >> >> - mach5 tiers 1- 3 (including the new tests) >> >> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few >> >> tests require more investigation) >> >> >> >> thanks, >> >> Calvin >> >> >> >> [1] >> >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html >> >> [2] https://bugs.openjdk.java.net/browse/JDK-8221706 >> > > > From karen.kinnear at oracle.com Thu Apr 25 17:35:34 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Thu, 25 Apr 2019 13:35:34 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> Message-ID: <36EA339A-9C37-4E96-90E1-69268A5E2967@oracle.com> Jiangli, Thank you for thinking so carefully about this. You ask great questions. One detailed question that arose for me from your email was: What happens today if you specify -XX:ArchiveClassesAtExit and the archive already exists? Is it the case that we delete the existing file before open the new one? Just want to make sure we document that. Here is my understanding of where we are thinking of going - folks please correct anything - of course as we learn more this will evolve. 1. 
ArchiveClassesAtExit This is potentially an intermediate step, which as you point out is only here to create an archive at the end of an execution. We don't yet have enough experience with this and with a potential future continuous dumping mode to know if this mode will still be useful to customers when more advanced modes are available. My personal sense is that this could be a useful model even long-term. 2. I agree that we have talked about longer-term - a) incremental archiving of additional loaded classes - there are a lot of ways to do this - possibly creating a separate dynamic archive, archiving as you go rather than at exit - possibly creating another layer of dynamic archive - possibly updating the current dynamic archive and being the only one who can read/write it - I'm not sure if this is needed or the exit model is sufficient (and possibly more efficient?) - prototyping could help us learn this b) possibilities of sharing an archive while it is being updated - I confess I am not excited about the concurrency complexities here - so not sure it is worth it - less useful with current models of Cloud usage c) a single command that could create an archive if it does not exist and use it if one does d) possibly making the default be to generate a dynamic archive if it is not available and use it if it is - this could be built on archive at exit model or on incremental archive - I think this is a reasonable model to explore - customers today expect the first execution of applications to be slower and the dynamic linker's caching will make second runs faster - I think we need more field experience and user feedback before we go here I do appreciate your link to mmap. I agree we don't handle that today. I believe we don't need to support it unless we explicitly want concurrent reader/writers. Am I correctly hearing what you are saying?
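To make option (c) above concrete, here is a minimal sketch, assuming the create-or-use decision is made by a launcher wrapper outside the VM rather than by any new VM flag. The class name DynamicArchiveLauncher and the method buildCommand are hypothetical; the only VM options used are the two named in this thread (-XX:ArchiveClassesAtExit and -XX:SharedArchiveFile), and nothing here reflects how HotSpot itself would implement such a mode.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: picks the dump flag or the use flag for one archive path.
final class DynamicArchiveLauncher {

    // Returns the java command line for mainClass.
    // If the archive file is missing, ask the VM to dump it at exit;
    // if it is already there, ask the VM to map it.
    static List<String> buildCommand(Path archive, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        if (Files.exists(archive)) {
            cmd.add("-XX:SharedArchiveFile=" + archive);
        } else {
            cmd.add("-XX:ArchiveClassesAtExit=" + archive);
        }
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // Assumed example values; "app.jsa" and "HelloWorld" are placeholders.
        List<String> cmd = buildCommand(Paths.get("app.jsa"), "HelloWorld");
        System.out.println(String.join(" ", cmd));
    }
}
```

Note that the existence check and the later VM start are not atomic: two processes started at the same time could both decide to dump, or one could map the file while the other is still writing it. That is the same concurrency concern raised by the mmap reference, and this sketch does nothing to address it.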
So I think the model is to add ArchiveClassesAtExit now, and to reserve the more flexible command-line argument for when we move to more automation. I don't think we know yet whether that would use a dumpatexit or an incremental model. I would rather we save the automation model until we know more about where we are going, rather than use it for this step. At that point, try to find a command-line argument that has a create-if-does-not-exist/use-if-does-exist model (perhaps -Xshare:dynamic or -Xshare:reallyauto?). My translation here is that we are all aiming in the same direction; the discussion is really about how to phase this. Fair? I share a sadness that we are not already there. However, I am actually quite excited about where we are - many thanks to you and Calvin and Ioi for years of design and implementation! thanks, Karen > On Apr 25, 2019, at 12:40 PM, Jiangli Zhou wrote: > > Karen, > > On Tue, Apr 23, 2019 at 11:16 AM Karen Kinnear > wrote: >> >> Calvin, >> >> I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags. >> He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits >> from the archive. >> >> The comment was that it would be very helpful if the user only needed to change their scripts once - so >> a single command-line argument would create a dynamic archive if one did not exist, and use it if it >> already existed. > > This is a very plausible idea and is in the same direction as the > multi-staged rollout of the end goal of dynamic archiving, which is > making the archive generation/usage completely transparent and > automatic (no command-line option is needed to generate and use the > dynamic archive). > > Using a single command-line option to control the creation of the > archive (when one doesn't exist) and the use of the archive (when one > already exists) is needed for cases when users want more control.
> > The following is copied from > http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html . The > behavior is unspecified if the underlying file changes after mmap is > established. Currently, we are not handling the case. The dynamic > archive simply follow the usage model of the existing static archive > with separate steps for dump-time and runtime, which however doesn't > preclude the case. In the current usage model, concurrency is less > often. With the single command-line option controlled creation&use, > there might be more concurrences and we probably want to handle the > issue. > > "If the size of the mapped file changes after the call to mmap() as a > result of some other operation on the mapped file, the effect of > references to portions of the mapped region that correspond to added > or removed portions of the file is unspecified." > > The current flag, ArchiveClassesAtExit doesn't scale and can't handle > the single option controlled creation/use case. It would be a good > idea to re-think the command-line option now and change it before the > first integration, so it can scale. -Xshare:dump/on/auto/dynamic might > be able to serve the purpose. 'dynamic' > can be used to trigger just dynamic dumping for now. It can be > augmented to support the use case that you described above. > > Best regards, > Jiangli > >> >> Is there a way to evolve the ArchiveClassesAtExit= to have that functionality? >> >> thanks, >> Karen >> >> p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP >> comments >> >> On Apr 22, 2019, at 5:16 PM, Jiangli Zhou wrote: >> >> Hi Calvin, >> >> Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section? >> >> Thanks, >> Jiangli >> >> On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung > wrote: >>> >>> Hi Karen, >>> >>> Thanks for your review! >>> Please see my replies in-line below. 
>>> >>> On 4/19/19, 9:29 AM, Karen Kinnear wrote: >>>> Calvin, >>>> >>>> Many thanks for all the work getting this ready, significantly >>>> enhancing the testing and bug fixes. >>>> >>>> I marked the CSR as reviewed-by - it looks great! >>>> >>>> I reviewed this set of changes - I did not review the tests - I assume >>>> you can get someone >>>> else to do that. I am grateful that Jiangli and Ioi are going to >>>> review this also - they are much closer to >>>> the details than I am. >>>> >>>> 1. Do you have any performance numbers? >>>> 1a. Startup: does using a combined dynamic CDS archive + base archive >>>> give similar startup benefits >>>> when you have the same classes in the archives? >>> Below are some performance numbers from Eric, each number is for 50 runs: >>> (base: using the default CDS archive, >>> test: using the dynamic archive, >>> Eric will get some numbers with a single archive which I think that's >>> what you're looking for) >>> >>> Lambda-noop: >>> base: >>> 0.066441427 seconds time elapsed >>> test: >>> 0.075428824 seconds time elapsed >>> >>> Noop: >>> base: >>> 0.057614537 seconds time elapsed >>> test: >>> 0.066061557 seconds time elapsed >>> >>> Netty: >>> base: >>> 0.827013307 seconds time elapsed >>> test: >>> 0.604982805 seconds time elapsed >>> >>> Spring: >>> base: >>> 2.376707358 seconds time elapsed >>> test: >>> 1.927618893 seconds time elapsed >>> >>> The first 2 apps only have 2 to 3 classes in the dynamic archive. So the >>> overhead is likely due to having to open and map the dynamic archive and >>> performs checking on header, etc. For small apps, I think it's better to >>> use a single archive. The Netty app has around 1400 classes in the >>> dynamic archive; the Spring app has about 3700 classes in the dynamic >>> archive. >>> >>> I also used our LotsOfClasses test to collect some perf numbers. This is >>> more like runtime performance, not startup performance. 
>>> >>> With dynamic archive (100 runs each): >>> real 2m37.191s >>> real 2m36.003s >>> Total loaded classes = 24254 >>> Loaded from base archive = 1186 >>> Loaded from top archive = 23042 >>> Loaded from jrt:/ (runtime module) = 26 >>> >>> With single archive (100 runs each): >>> real 2m38.346s >>> real 2m36.947s >>> Total loaded classes = 24254 >>> Loaded from archive = 24228 >>> Loaded from jrt:/ (runtime module) = 26 >>> >>>> >>>> 1b. Do you have samples of uses of the combined dynamic CDS archive + >>>> base archive vs. a single >>>> static archive built for an application? >>>> - how do the sets of archived classes differ? >>> Currently, the default CDS archive contains around 1187 classes. With >>> the -XX:ArchiveClassesAtExit option, if the classes are not found in the >>> default CDS archive, they will be archived in the dynamic archive. The >>> above LotsOfClasses example shows some distributions between various >>> archives. >>>> - one note was that the AtExit approach exclude list adds anything >>>> that has not yet linked - does that make a significant difference in >>>> the number of classes that are archived? Does that make a difference >>>> in either startup time or in application execution time? I could see >>>> that going either way. >>> As the above numbers indicated, there's not much difference in terms of >>> execution time using a dynamic vs a single archive with a large number >>> of classes loaded. The numbers from Netty and Spring apps show an >>> improvement over default CDS archive. >>>> >>>> 1c. Any sense of performance cost for first run - how much time does >>>> it take to create an incremental archive? >>>> - is the time comparable to an existing dump for a single archive >>>> for the application? 
>>>> - this is an ease-of-use feature - so we are not expecting that to >>>> be fast >>>> - the point is to set expectations in our documentation >>> I did some rough measurements with the LotsOfClasses test with around >>> 15000 classes in the classlist. >>> >>> Dynamic archive dumping (one run each): >>> real 0m19.756s >>> real 0m20.241s >>> >>> Static archive dumping (one run each): >>> real 0m17.725s >>> real 0m16.993s >>>> >>>> 2. Footprint >>>> With two archives rather than one, is there a significant footprint >>>> difference? Obviously this will vary by app and archive. >>>> Once again, the point is to set expectations. >>> Sizes of the archives for the LotsOfClasses test in 1a. >>> >>> Single archive: 242962432 >>> Default CDS archive: 12365824 >>> Dynamic archive: 197525504 >>> >>>> >>>> 3. Runtime performance >>>> With two sets of archived dictionaries & symbolTables - is there any >>>> significant performance cost to larger benchmarks, e.g. for class >>>> loading lookup for classes that are not in the archives? Or symbol >>>> lookup? >>> I used the LotsOfClasses test again. This time archiving about half of >>> the classes which will be loaded during runtime. >>> >>> Dynamic archive (10 runs each): >>> real 0m30.214s >>> real 0m29.633s >>> Loaded classes = 24254 >>> Loaded from dynamic archive: 13168 >>> >>> Single archive (10 runs each): >>> real 0m32.383s >>> real 0m32.905s >>> Loaded classes = 24254 >>> Loaded from single archive = 15063 >>>> >>>> 4. Platform support >>>> Which platforms is this supported on? >>>> Which ones did you test? For example, did you run the tests on Windows? >>> I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris, >>> Windows). >>>> >>>> Detailed feedback on the code: Just minor comments - I don?t need to >>>> see an updated webrev: >>> I'm going to look into your detailed feedback below and may reply in a >>> separate email. >>> >>> thanks, >>> Calvin >>>> >>>> 1. 
metaSpaceShared.hpp >>>> line 156: >>>> what is the hardcoded -100 for? Should that be an enum? >>>> >>>> 2. jfrRecorder.cpp >>>> So JFR recordings are disabled if DynamicDumpSharedSpaces? >>>> why? >>>> Is that a future rfe? >>>> >>>> 3. systemDictionaryShared.cpp >>>> Could you possibly add a comment to add_verification_constraint >>>> for if (DynamicDumpSharedSpaces) >>>> return false >>>> >>>> -- I think the logic is: >>>> because we have successfully linked any instanceKlass we archive >>>> with DynamicDumpSharedSpaces, we have resolved all the constraint classes. >>>> >>>> -- I didn't check the order - is this called before or after >>>> excluding? If after, then would it make sense to add an assertion >>>> here is_linked? Then if you ever change how/when linking is done, this >>>> might catch future errors. >>>> >>>> 4. systemDictionaryShared.cpp >>>> EstimateSizeForArchive::do_entry >>>> Is it the case that for info.is_builtin() there are no verification >>>> constraints? So you could skip that calculation? Or did I misunderstand? >>>> >>>> 5. compactHashtable.cpp >>>> serialize/header/calculate_header_size >>>> -- could you dynamically determine size_of header so you don't need >>>> to hardcode a 5? >>>> >>>> 6. classLoader.cpp >>>> line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are >>>> mutually exclusive. >>>> Can you clarify for me: >>>> My memory of the base archive is that we do not allow the following >>>> options at dump time - and these >>>> are the same for the dynamic archive: --limit-modules, >>>> --upgrade-module-path, --patch-module. >>>> >>>> I have forgotten: >>>> Today with UseSharedSpaces - do we allow these flags? Is that also the >>>> same behavior with the dynamic >>>> archive? >>>> >>>> 7. classLoaderExt.cpp >>>> assert line 66: only used with -Xshare:dump >>>> -> "only used at dump time" >>>> >>>> 8. symbolTable.cpp >>>> line 473: comment // used by UseSharedArchived2 >>>> -- command-line arg name has changed >>>> >>>> 9.
filemap.cpp >>>> Comment lines 529 ... >>>> Is this true - that you can only support dynamic dumping with the >>>> default CDS archive? Could you clarify what the restrictions are? >>>> The CSR implies you can support ?a specific base CDS archive" >>>> - so base layer can not have appended boot class path >>>> - and base layer can't have a module path >>>> >>>> What can you specify for the dynamic dumping relative to the base archive? >>>> - matching class path? >>>> - appended class path? >>>> in future - could it have a module path that matched the base archive? >>>> >>>> Should any of these restrictions be clarified in documentation/CSR >>>> since they appear to be new? >>>> >>>> 10. filemap.cpp >>>> check_archive >>>> Do some of the return false paths skip performing os::close(fd)? >>>> >>>> and get_base_archive_name_from_header >>>> Does the first return false path fail to os::free(dynamic_header) >>>> >>>> lines 753-754: two FIXME comments >>>> >>>> Could you delete commented out line 1087 in filemap.cpp ? >>>> >>>> 11. filemap.hpp >>>> line 214: TODO left in >>>> >>>> 12. metaspace.cpp >>>> line 1418 FIXME left in >>>> >>>> 13. java.cpp >>>> FIXME: is this the right place? >>>> For starting the DynamicArchive::dump >>>> >>>> Please check with David Holmes on that one >>>> >>>> 14. dynamicArchive.hpp >>>> line 55 (and others): MetsapceObj -> MetaspaceObj >>>> >>>> 15. dynamicArchive.cpp >>>> line 285 rel-ayout -> re-layout >>>> >>>> lines 277 && 412 >>>> Do we archive array klasses in the base archive but not in the dynamic >>>> archive? >>>> Is that a potential RFE? >>>> Is it possible that GatherKlassesAndSymbols::do_unique_ref could be >>>> called with an array class? >>>> Same question for copy_impl? >>>> >>>> line 934: "no onger" -> "no longer" >>>> >>>> 16. What is AllowArchivingWithJavaAgent? Is that a hook for a >>>> potential future rfe? >>>> Do you want to check in that code at this time? In product? 
>>>> >>>> thanks, >>>> Karen >>>> >>>> >>>>> On Apr 11, 2019, at 5:18 PM, Calvin Cheung >>>>> >> wrote: >>>>> >>>>> This is a follow-up on the preliminary code review sent by Jiangli in >>>>> January[1]. >>>>> >>>>> Highlights of changes since then: >>>>> 1. New vm option for dumping a dynamic archive >>>>> (-XX:ArchiveClassesAtExit=) and enhancement to the >>>>> existing -XX:SharedArchiveFile option. Please refer to the >>>>> corresponding CSR[2] for details. >>>>> 2. New way to run existing AppCDS tests in dynamic CDS archive mode. >>>>> At the jtreg command line, the user can run many existing AppCDS >>>>> tests in dynamic CDS archive mode by specifying the following: >>>>> -vmoptions:-Dtest.dynamic.cds.archive=true >>>>> /open/test/hotspot/jtreg:hotspot_appcds_dynamic >>>>> We will have a follow-up RFE to determine in which tier the above >>>>> tests should be run. >>>>> 3. Added more tests. >>>>> 4. Various bug fixes to improve stability. >>>>> >>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8207812 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ >>>>> > >>>>> >>>>> (The webrev is based on top of the following rev: >>>>> http://hg.openjdk.java.net/jdk/jdk/rev/805584336738 ) >>>>> >>>>> Testing: >>>>> - mach5 tiers 1- 3 (including the new tests) >>>>> - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few >>>>> tests require more investigation) >>>>> >>>>> thanks, >>>>> Calvin >>>>> >>>>> [1] >>>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html >>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8221706 From markus.gronlund at oracle.com Thu Apr 25 20:25:22 2019 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Thu, 25 Apr 2019 13:25:22 -0700 (PDT) Subject: RFR(XS): 8221121: applications/microbenchmarks are encountering crashes in tier5 Message-ID: <1515916c-6187-46d4-8815-5e824284e04b@default> Greetings, Please review this small patch to address the 
following: Bug: https://bugs.openjdk.java.net/browse/JDK-8221121 Webrev: http://cr.openjdk.java.net/~mgronlun/8221121/webrev01/ Description: The applications/microbenchmarks added to tier5 are failing in some instances (debug builds), with, as an example, the following trace: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/scratch/opt/mach5/mesos/work_dir/slaves/2dd962d0-8988-479b-a804-57ab764ada59-S77631/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/d8f4cb38-0dec-4477-a89a-f62853433c56/runs/cb3f55fa-9e13-481c-a7d6-9f33d2b8b457/workspace/open/src/hotspot/share/jfr/recorder/storage/jfrMemorySpace.inline.hpp:85), pid=5748, tid=5770 # assert(t->identity() == __null) failed: invariant This occurred when JFR was running with the in-memory configuration where buffers are reused FIFO-style. In the implementation, an "age node" will manage a full buffer for its reclamation, and age nodes provides for a linked "full" (fifo) list. Issue: The age node was not expected to retain an identity after being added to the full list, where the assertion fired during the subsequent discard-reuse processing step. This situation only manifests with running JFR in-memory configurations and using debug builds. Fix is to release the age node before insertion onto full list. Thanks Markus From daniel.daugherty at oracle.com Thu Apr 25 20:43:22 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 25 Apr 2019 16:43:22 -0400 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Message-ID: <2536572e-158f-0913-889b-ef76d6122c79@oracle.com> On 4/25/19 8:05 AM, Robbin Ehn wrote: > Hi all, please review. > > Let's deopt with handshakes. > Removed VM op Deoptimize, instead we handshake. > Locks needs to be inflate since we are not in a safepoint. 
>
> Goes on top of:
> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>
>
> Code:
> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html

Robbin, you'll have to merge with Coleen's recent MutexLocker fix
(8222518).

src/hotspot/share/aot/aotCodeHeap.cpp
    No comments.

src/hotspot/share/aot/aotCompiledMethod.cpp
    L163: bool AOTCompiledMethod::make_not_entrant_helper(int new_state) {
    L207: bool AOTCompiledMethod::make_entrant() {
        So the Compiler team is on board with switching from the
        Patching_lock to the CompiledMethod_lock?

src/hotspot/share/code/codeCache.cpp
    No comments.

src/hotspot/share/code/nmethod.cpp
    L1180: bool nmethod::make_not_entrant_or_zombie(int state) {
    L2853: void nmethod::clear_jvmci_installed_code() {
    L2861: void nmethod::clear_speculation_log() {
    L2869: void nmethod::maybe_invalidate_installed_code() {
    L2904: void nmethod::invalidate_installed_code(Handle installedCode, TRAPS) {
        So the Compiler team is on board with switching from the
        Patching_lock to the CompiledMethod_lock?

src/hotspot/share/code/nmethod.hpp
    No comments.

src/hotspot/share/gc/z/zBarrierSetNMethod.cpp
    No comments (copyright year needs update).

src/hotspot/share/gc/z/zNMethod.cpp
    No comments (copyright year needs update).

src/hotspot/share/oops/markOop.hpp
    L180:   bool biased_locker_is(JavaThread* thread) const {
        Is this for a different project?

    L181:     if (!has_bias_pattern()) {
    L182:       return false;
    L183:     }
    L184:     // If current thread is not the owner it can be unbiased at anytime.
    L185:     JavaThread* jt = (JavaThread*) ((intptr_t) (mask_bits(value(), ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place))));
        So you don't want to use this:

          JavaThread* biased_locker() const {
            assert(has_bias_pattern(), "should not call this otherwise");
            return (JavaThread*) ((intptr_t) (mask_bits(value(), ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place))));
          }

        because of the assert().

        I think biased_locker() and biased_locker_is() both need to work
        with a copy of the markOop so that it can't change dynamically.
        Something like:

          JavaThread* biased_locker() const {
            markOop copy = this;
            assert(copy.has_bias_pattern(), "should not call this otherwise");
            return (JavaThread*) ((intptr_t) (copy.mask_bits(value(), ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place))));
          }

          bool biased_locker_is(JavaThread* thread) const {
            markOop copy = this;
            if (!copy.has_bias_pattern()) {
              return false;
            }
            return copy.biased_locker();
          }

src/hotspot/share/oops/method.cpp
    old L104:   clear_code(false /* don't need a lock */); // from_c/from_i get set to c2i/i2i
        Is the comment after '//' not useful?

    L946:     MutexLockerEx ml(CompiledMethod_lock->owned_by_self() ? NULL : CompiledMethod_lock, Mutex::_no_safepoint_check_flag);
        So the Compiler team is on board with switching from the
        Patching_lock to the CompiledMethod_lock?

src/hotspot/share/oops/method.hpp
    No comments.

src/hotspot/share/prims/jvmtiEventController.cpp
    No comments (copyright year needs update).

src/hotspot/share/prims/methodHandles.cpp
    No comments.

src/hotspot/share/prims/whitebox.cpp
    No comments.

src/hotspot/share/runtime/biasedLocking.cpp
src/hotspot/share/runtime/biasedLocking.hpp
    Hmmm... More Biased Locking changes. I didn't take a close
    look at these.

src/hotspot/share/runtime/deoptimization.cpp
    No comments (but some overlap with Biased Locking, ouch)

src/hotspot/share/runtime/deoptimization.hpp
    No comments.

src/hotspot/share/runtime/mutex.hpp
    old L65:        special        = tty            +   1,
    new L65:        special        = tty            +   2,
        Why?

src/hotspot/share/runtime/mutexLocker.cpp
    No comments.

src/hotspot/share/runtime/mutexLocker.hpp
    L34: extern Mutex*   Patching_lock;                   // a lock used to guard code patching of compiled code
    L35: extern Mutex*   CompiledMethod_lock;
        A comment is traditional here...

src/hotspot/share/runtime/synchronizer.cpp
    old L1317:          !SafepointSynchronize::is_at_safepoint(), "invariant");
    new L1317:          !Universe::heap()->is_gc_active(), "invariant");
        Why?

    L1446:         ResourceMark rm;
    L1496:       ResourceMark rm;
        Why drop 'Self'? That makes the ResourceMark more expensive..

src/hotspot/share/runtime/thread.cpp
    No comments.

src/hotspot/share/runtime/thread.hpp
    No comments.

src/hotspot/share/runtime/vmOperations.cpp
    No comments.

src/hotspot/share/runtime/vmOperations.hpp
    No comments.

src/hotspot/share/services/dtraceAttacher.cpp
    No comments (copyright year needs update).

I don't think I've found anything that's "must fix". Please let me
know if I should re-review:

    src/hotspot/share/runtime/biasedLocking.cpp
    src/hotspot/share/runtime/biasedLocking.hpp
    src/hotspot/share/runtime/deoptimization.cpp

because the Biased Locking changes are critical for this project.

Dan

> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8221734
>
> Passes t1-7 and multiple t1-5 runs.
>
> A few startup benchmark see a small speedup.
> > Thanks, Robbin From calvin.cheung at oracle.com Thu Apr 25 22:23:14 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 25 Apr 2019 15:23:14 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <1F703F87-5214-4410-A780-47166EFFC5E0@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> <1F703F87-5214-4410-A780-47166EFFC5E0@oracle.com> Message-ID: <5CC23352.4050706@oracle.com> Hi Karen, I've updated the comment to the following: if (DynamicDumpSharedSpaces) { // For dynamic dumping, we can resolve all the constraint classes for all class loaders during // the initial run prior to creating the archive before vm exit. We will also perform verification // check when running with the archive. return false; } else { if (is_builtin(k)) { // For builtin class loaders, we can try to complete the verification check at dump time, // because we can resolve all the constraint classes. We will also perform verification check // when running with the archive. return false; } thanks, Calvin On 4/25/19, 8:48 AM, Karen Kinnear wrote: > One follow-up question relative to Jiangli?s review questions please: > >> On Apr 23, 2019, at 8:08 PM, Calvin Cheung > > wrote: >> >> Hi Jiangli, >> >> Thanks a lot for your review! >> >> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> >>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>> >>> 1218 if (DynamicDumpSharedSpaces) { >>> 1219 return false; >>> 1220 } else { >>> >>> The above case for DynamicDumpSharedSpaces needs to be examined >>> carefully. Can you please ask Harold (and Coleen or Karen) to take a >>> look? Also, a comment is needed to explain that we can complete all >>> verification checks at dynamic dumping time. >> I've added a comment. If it return false, the caller will call >> VerificationType::resolve_and_check_assignability(). 
> Just making sure I understand this correctly: > > When we archive a class, we also archive all supertypes. We do not > necessarily archive all classes in > the verification constraints list. We always record all verification > constraints, whether or not we actually > continue to load and perform them. > > When we run with an archive, we ensure that the supertypes that we use > are also in the archive and are an exact match, > otherwise we don't select the class from the archive. > We then check the verification constraints list, which might then > cause additional class loading. If that fails, > again, we don't select the class from the archive. This allows more > flexibility in changes to the classes found > at runtime, as long as their "isAssignableFrom" constraints are still met. > > So - prior to the DynamicDumpSharedSpace - > > In SystemDictionaryShared::add_verification_constraint: > I agree with the logic that we add constraints for all archived classes. > > I agree that for non-built-in class loaders, we could not complete > the verification check at dump time, > and only perform this at runtime using the archive. > For built-in class loaders - I agree that we could ALSO perform the > verification check at dump time, and > if it failed, eliminate archiving the class. > > I assume that is intended behavior for DynamicDumpSharedSpaces? > > Couple of questions: > 1. Does what I said above for built-in class loaders match your > understanding? If so, perhaps this would > be a good time to clarify the comment - since we will perform the > verification check now AND > when running with the archive. > > 2. For DynamicDumpSharedSpaces > Since we do not perform any loading as part of the > PopulateDynamicDumpSharedSpaces, > I assume that this is all about recording verification constraints for > all classes loaded while > we run, before we get to the archive creation on exit.
> > So your logic of creating constraints for all archived classes and > then returning false so > that the verification is also run during the first run makes sense to me. > > And yes, it makes sense to add a comment that we should ALSO perform > verification checks during > the initial run in addition to creating the constraint for running > later with the archive. I don't conceptually > think of this as checking constraints "at dump time", but rather for > the initial run before creating > the dynamic archive. > > thanks, > Karen > From mikhailo.seledtsov at oracle.com Fri Apr 26 00:51:30 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 25 Apr 2019 17:51:30 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command Message-ID: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> Please review this change that uses a better platform independent way of obtaining IP address inside a docker container. This test worked for Oracle Linux, but did not work on Fedora, which is now fixed. The code for obtaining IP address in getLocalIp() was contributed by Severin, thank you Severin. JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ Testing: Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) - Passed Thank you, Misha From leonid.mesnik at oracle.com Fri Apr 26 01:25:46 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Thu, 25 Apr 2019 18:25:46 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command In-Reply-To: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> References: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> Message-ID: Hi The overall fix looks good. But I expect that there might be different docker environments. Would it not be more robust to verify that the address exists in the getLocalIp() list and fail if not?
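The check suggested here could be sketched roughly as follows. This is only an illustration and not the code in the webrev: the class `LocalIpCheck` and both method names are hypothetical, and it assumes the address reported by JFR is available as a plain string. It enumerates every local address through the standard `java.net.NetworkInterface` API and accepts the reported address if it equals at least one of them.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class LocalIpCheck {
    // Collects the textual form of every address bound to any local interface.
    static List<String> localAddresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
        if (nifs == null) {
            return result; // defensive: treat "no interfaces" as an empty list
        }
        for (NetworkInterface nif : Collections.list(nifs)) {
            for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                result.add(addr.getHostAddress());
            }
        }
        return result;
    }

    // True if the reported address equals at least one candidate address.
    static boolean matchesAny(String reported, List<String> candidates) {
        return reported != null && candidates.contains(reported);
    }

    public static void main(String[] args) throws SocketException {
        List<String> local = localAddresses();
        // Hypothetical input: first argument, or the IPv4 loopback address by default.
        String reported = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println("local addresses: " + local);
        System.out.println(reported + " matches: " + matchesAny(reported, local));
    }
}
```

A test built this way would fail when `matchesAny` returns false instead of comparing against a single address obtained from one specific command, which is the robustness concern raised above.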
Leonid > On Apr 25, 2019, at 5:51 PM, mikhailo.seledtsov at oracle.com wrote: > > Please review this change that uses a better platform independent way of obtaining IP address inside a docker container. > This test worked for Oracle Linux, but did not work on Fedora, which is now fixed. > The code for obtaining IP address in getLocalIp() was contributed by Severin, thank you Severin. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ > Testing: > Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) - Passed > > > Thank you, > Misha > From mikhailo.seledtsov at oracle.com Fri Apr 26 01:38:29 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 25 Apr 2019 18:38:29 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command In-Reply-To: References: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> Message-ID: Thank you Leonid, On 4/25/19 6:25 PM, Leonid Mesnik wrote: > Hi > > The overall fix looks good. But I expect that there might be different docker environments. > Would it not be more robust to verify that the address exists in the getLocalIp() list and fail if not? Sounds good. I will update the code to match the address reported by JFR against all the addresses returned by getLocalIp(). If at least a single match is found, the test continues. If not, the test fails. Misha > > Leonid > >> On Apr 25, 2019, at 5:51 PM, mikhailo.seledtsov at oracle.com wrote: >> >> Please review this change that uses a better platform independent way of obtaining IP address inside a docker container. >> This test worked for Oracle Linux, but did not work on Fedora, which is now fixed. >> The code for obtaining IP address in getLocalIp() was contributed by Severin, thank you Severin.
>> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ >> Testing: >> Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) - Passed >> >> >> Thank you, >> Misha >> From mikhailo.seledtsov at oracle.com Fri Apr 26 02:24:04 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 25 Apr 2019 19:24:04 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command In-Reply-To: References: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> Message-ID: Here is the updated webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.01/ On 4/25/19 6:38 PM, mikhailo.seledtsov at oracle.com wrote: > Thank you Leonid, > > > On 4/25/19 6:25 PM, Leonid Mesnik wrote: >> Hi >> >> The overall fix looks good. But I expect that there might be >> different docker environments. >> Would it not be more robust to verify that the address exists in the getLocalIp() >> list and fail if not? > Sounds good. I will update the code to match the address reported by > JFR against all the addresses returned by getLocalIp(). If at least a > single match is found, the test continues. If not, the test fails. > > Misha >> >> Leonid >> >>> On Apr 25, 2019, at 5:51 PM, mikhailo.seledtsov at oracle.com wrote: >>> >>> Please review this change that uses a better platform independent >>> way of obtaining IP address inside a docker container. >>> This test worked for Oracle Linux, but did not work on Fedora, which >>> is now fixed. >>> The code for obtaining IP address in getLocalIp() was contributed by >>> Severin, thank you Severin. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 >>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ >>> Testing: >>>
Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) >>> - Passed >>> >>> >>> Thank you, >>> Misha >>> > From david.holmes at oracle.com Fri Apr 26 04:57:57 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Apr 2019 14:57:57 +1000 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> Message-ID: <0d902766-235d-66ae-a69e-966d9da1548f@oracle.com> Thanks Dan! Extraneous ; culled. David On 25/04/2019 1:16 am, Daniel D. Daugherty wrote: > On 4/24/19 3:12 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 >> webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ > > src/hotspot/share/classfile/javaClasses.cpp > L1629: macro(_park_blocker_offset, k, "parkBlocker", > object_signature, false); > Line ends with a ';' and the previous last line did not. When the > THREAD_FIELDS_DO macro is called, it is already followed by a ';': > > L1635: THREAD_FIELDS_DO(FIELD_COMPUTE_OFFSET); > L1640: THREAD_FIELDS_DO(FIELD_SERIALIZE_OFFSET); > > src/hotspot/share/classfile/javaClasses.hpp > No comments. > > src/hotspot/share/prims/unsafe.cpp > No comments. > > src/java.base/share/classes/java/lang/Thread.java > No comments. > > Thumbs up. I don't need to see another webrev if you choose to remove > the ';' on L1629. > > Dan > >> >> The original implementation of Unsafe.unpark simply extracted the >> JavaThread reference from the java.lang.Thread oop and if non-null >> extracted the Parker instance from it and invoked unpark. This was >> racy however as the native JavaThread could terminate at any time and >> deallocate the Parker.
>> >> That logic was fixed by JDK-6271298 which used a combination of >> type-stable-memory "event" objects for the Parker, along with use of >> the Threads_lock to obtain the initial reference to the Parker (from a >> JavaThread guaranteed to be alive), together with caching the native >> Parker pointer in a field of java.lang.Thread. Even though the native >> thread may have terminated the Parker was still valid (even if >> associated with a different thread) and the unpark at worst was a >> spurious wakeup for that other thread. >> >> When JDK-8167108 introduced Thread Safe Memory Reclamation (SMR) the >> logic was updated to always use the safe mechanism - we grab a >> ThreadsListHandle then check the cached field, else look up the native >> thread to see if it is alive and locate the Parker instance that way. >> With SMR the caching of the Parker pointer no longer serves any >> purpose - we no longer have a lock-free use-the-cache path versus a >> lock-using populate-the-cache path. With SMR we've already "paid" for >> the ability to ensure the native thread can't terminate regardless of >> whether we look up the field from the java.lang.Thread or the >> JavaThread. So we can simplify the code and save a little footprint by >> removing the cache from java.lang.Thread: >> >> /* >> * JVM-private state that persists after native thread termination. >> */ >> private long nativeParkEventPointer; >> >> and the supporting code from unsafe.cpp and javaClasses.*pp in the JVM. >> >> I considered restoring the fast-path use of the cache without recourse >> to Thread-SMR but performance measurements failed to show any benefit >> in doing so. See bug report for details.
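The remark that a stray unpark is at worst a spurious wakeup relies on callers of park re-checking their condition after every return. A minimal Java-level sketch of that convention follows. It uses the public `java.util.concurrent.locks.LockSupport` API rather than the VM-internal Parker discussed in this thread, and the class `ParkLoop` with its `ready` flag is invented purely for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ParkLoop {
    // Hypothetical condition the waiter is blocked on.
    static final AtomicBoolean ready = new AtomicBoolean(false);

    // Blocks until 'ready' is set; a wakeup without the condition just loops again.
    static void awaitReady() {
        while (!ready.get()) {
            LockSupport.park();
        }
    }

    // Returns true if the waiter terminated after the real wakeup.
    static boolean demo() {
        ready.set(false);
        Thread waiter = new Thread(ParkLoop::awaitReady);
        waiter.start();
        LockSupport.unpark(waiter); // stray unpark: the loop re-checks the condition
        ready.set(true);            // publish the condition first ...
        LockSupport.unpark(waiter); // ... then issue the wakeup that counts
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + demo());
    }
}
```

Because the waiter only leaves the loop once the flag is set, an unpark delivered for an unrelated reason costs one extra pass through the loop and nothing more.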
>> >> Thanks, >> David > From david.holmes at oracle.com Fri Apr 26 05:30:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Apr 2019 15:30:08 +1000 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <9b063a96-1859-6280-7412-75d54c1a1fb6@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <9b063a96-1859-6280-7412-75d54c1a1fb6@oracle.com> Message-ID: Hi Robbin, On 25/04/2019 5:53 pm, Robbin Ehn wrote: > Hi David, > > Looks good. Thanks for the review. > Just a question: > It seems like we could just hold the ThreadsList over p->unpark() and > not rely on TSM ? Yes, now that it is done this way we could do the unpark while holding the TLH and avoid relying on TSM. > Not sure in how many places we do rely on it, but it would be nice to > remove TSM for parkers. TSM for Parkers was introduced by JDK-6271298 (there's a typo in the comment in park.hpp that transposes the last 2 numbers of the bug) so I think this is the only usage that relies on it. > The exiting thread would set parker to NULL before removing itself from > the threadslist and free it when it's off. I don't think we need that complexity. It should suffice to change:
JavaThread::~JavaThread() {
  // JSR166 -- return the parker to the free list
  Parker::Release(_parker);
  _parker = NULL;
to do "delete _parker" instead. I'll file a RFE for that. Thanks, David > Thanks, Robbin > > On 4/24/19 9:12 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 >> webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ >> >> The original implementation of Unsafe.unpark simply extracted the >> JavaThread reference from the java.lang.Thread oop and if non-null >> extracted the Parker instance from it and invoked unpark. This was >> racy however as the native JavaThread could terminate at any time and >> deallocate the Parker.
>> >> That logic was fixed by JDK-6271298 which used a combination of >> type-stable-memory "event" objects for the Parker, along with use of >> the Threads_lock to obtain the initial reference to the Parker (from a >> JavaThread guaranteed to be alive), together with caching the native >> Parker pointer in a field of java.lang.Thread. Even though the native >> thread may have terminated the Parker was still valid (even if >> associated with a different thread) and the unpark at worst was a >> spurious wakeup for that other thread. >> >> When JDK-8167108 introduced Thread Safe Memory Reclamation (SMR) the >> logic was updated to always use the safe mechanism - we grab a >> ThreadsListHandle then check the cached field, else look up the native >> thread to see if it is alive and locate the Parker instance that way. >> With SMR the caching of the Parker pointer no longer serves any >> purpose - we no longer have a lock-free use-the-cache path versus a >> lock-using populate-the-cache path. With SMR we've already "paid" for >> the ability to ensure the native thread can't terminate regardless of >> whether we look up the field from the java.lang.Thread or the >> JavaThread. So we can simplify the code and save a little footprint by >> removing the cache from java.lang.Thread: >> >> /* >> * JVM-private state that persists after native thread termination. >> */ >> private long nativeParkEventPointer; >> >> and the supporting code from unsafe.cpp and javaClasses.*pp in the JVM. >> >> I considered restoring the fast-path use of the cache without recourse >> to Thread-SMR but performance measurements failed to show any benefit >> in doing so. See bug report for details.
>> >> Thanks, >> David From david.holmes at oracle.com Fri Apr 26 05:32:17 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Apr 2019 15:32:17 +1000 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <0d902766-235d-66ae-a69e-966d9da1548f@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> <0d902766-235d-66ae-a69e-966d9da1548f@oracle.com> Message-ID: I pushed this today based on Dan and Robbin's reviews, but realized just after the act that I should have waited for any feedback from core-libs - apologies about that. If there are concerns I will roll it back. Thanks, David ----- On 26/04/2019 2:57 pm, David Holmes wrote: > Thanks Dan! Extraneous ; culled. > > David > > On 25/04/2019 1:16 am, Daniel D. Daugherty wrote: >> On 4/24/19 3:12 AM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 >>> webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ >> >> src/hotspot/share/classfile/javaClasses.cpp >> L1629: macro(_park_blocker_offset, k, "parkBlocker", >> object_signature, false); >> Line ends with a ';' and the previous last line did not. When >> the >> THREAD_FIELDS_DO macro is called, it is already followed by a >> ';': >> >> L1635: THREAD_FIELDS_DO(FIELD_COMPUTE_OFFSET); >> L1640: THREAD_FIELDS_DO(FIELD_SERIALIZE_OFFSET); >> >> src/hotspot/share/classfile/javaClasses.hpp >> No comments. >> >> src/hotspot/share/prims/unsafe.cpp >> No comments. >> >> src/java.base/share/classes/java/lang/Thread.java >> No comments. >> >> Thumbs up. I don't need to see another webrev if you choose to remove >> the ';' on L1629. >> >> Dan >> >>> >>> The original implementation of Unsafe.unpark simply extracted the >>> JavaThread reference from the java.lang.Thread oop and if non-null >>> extracted the Parker instance from it and invoked unpark. This was
This was >>> racy however as the native JavaThread could terminate at any time and >>> deallocate the Parker. >>> >>> That logic was fixed by JDK-6271298 which used a combination of >>> type-stable-memory "event" objects for the Parker, along with use of >>> the Threads_lock to obtain the initial reference to the Parker (from >>> a JavaThread guaranteed to be alive), together with caching the >>> native Parker pointer in a field of java.lang.Thread. Even though the >>> native thread may have terminated the Parker was still valid (even if >>> associated with a different thread) and the unpark at worst was a >>> spurious wakeup for that other thread. >>> >>> When JDK-8167108 introduced Thread Safe-Memory-Reclamation (SMR) the >>> logic was updated to always use the safe mechanism - we grab a >>> ThreadsListHandle then check the cached field, else look up the native >>> thread to see if it is alive and locate the Parker instance that way. >>> With SMR the caching of the Parker pointer no longer serves any >>> purpose - we no longer have a lock-free use-the-cache path versus a >>> lock-using populate-the-cache path. With SMR we've already "paid" for >>> the ability to ensure the native thread can't terminate regardless of >>> whether we look up the field from the java.lang.Thread or the >>> JavaThread. So we can simplify the code and save a little footprint >>> by removing the cache from java.lang.Thread: >>> >>>     /* >>>      * JVM-private state that persists after native thread termination. >>>      */ >>>     private long nativeParkEventPointer; >>> >>> and the supporting code from unsafe.cpp and javaClasses.*pp in the JVM. >>> >>> I considered restoring the fast-path use of the cache without >>> recourse to Thread-SMR but performance measurements failed to show >>> any benefit in doing so. See bug report for details.
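The remark that a misdirected unpark is "at worst a spurious wakeup" can be seen from the Java side. The following is a minimal sketch, assuming the usual LockSupport idiom of re-checking a condition in a loop around park(); the class and variable names (SpuriousUnparkSketch, ready) are hypothetical and are not taken from the webrev.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration only; not code from the JDK-8222518 webrev.
class SpuriousUnparkSketch {

    // Starts a waiter that parks until 'ready' is set. Returns true if the
    // waiter survived a stray unpark and only finished once 'ready' was true.
    static boolean run() {
        AtomicBoolean ready = new AtomicBoolean(false);
        AtomicBoolean sawReadyOnExit = new AtomicBoolean(false);

        Thread waiter = new Thread(() -> {
            // park() may return without a matching state change, so the
            // condition is always re-checked before leaving the loop.
            while (!ready.get()) {
                LockSupport.park();
            }
            sawReadyOnExit.set(ready.get());
        });
        waiter.start();

        try {
            // Stray unpark: no state has changed, so the waiter wakes (or
            // consumes the permit), re-checks 'ready', and parks again.
            LockSupport.unpark(waiter);
            Thread.sleep(50);
            boolean stillWaiting = waiter.isAlive();

            // Real wakeup: publish the condition first, then unpark.
            ready.set(true);
            LockSupport.unpark(waiter);
            waiter.join();

            return stillWaiting && sawReadyOnExit.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stray unpark was harmless: " + run());
    }
}
```

Code written this way does not care which thread a stale Parker happens to belong to, which is why the JDK-6271298 scheme could tolerate an unpark landing on a reused Parker.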
>>> >>> Thanks, >>> David >> From david.holmes at oracle.com Fri Apr 26 06:52:15 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Apr 2019 16:52:15 +1000 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: References: Message-ID: Hi Yumin, On 26/04/2019 2:07 am, yumin qi wrote: > Hi, > > Apart from comments from compiler professionals can I have comments from > runtime either? The changes mostly land in runtime area. I have to question why the changes mostly land in runtime area! The high-level description of this feature does not sound like it depends on the runtime at all. The "recording" feature should just come from the JITs data; and the actual warmup should just be an interaction during VM initialization with the JIT. I don't see anything in the JEP to explain the actual design, and why it impacts on the runtime so much. It also sounds like a selective Xcomp mode to me. It even sounds very similar to Initialization-Time-Compilation (ITC) that we employed in Java Real-Time System: https://docs.oracle.com/javase/realtime/doc_2.2u1/release/JavaRTSCompilation.html Cheers, David > Thanks > Yumin > > On Tue, Apr 16, 2019 at 11:27 AM yumin qi wrote: > >> HI, >> >> Did anyone have comments for this version? >> >> Thanks >> Yumin >> >> On Tue, Apr 9, 2019 at 10:36 AM yumin qi wrote: >> >>> Alan, >>> Thanks! Updated in same link: >>> http://cr.openjdk.java.net/~minqi/8220692/webrev-02/ >>> >>> Removed non-boot loader branch in nativeLookup.cpp. >>> Added jdk.jwarmup to boot loader list in make/common/Modules.gmk. >>> Tested again to make sure the new changes. >>> >>> Thanks >>> Yumin >>> >>> >>> On Tue, Apr 9, 2019 at 4:48 AM Alan Bateman >>> wrote: >>> >>>> On 09/04/2019 07:10, yumin qi wrote: >>>>> >>>>> Now the registerNatives is found when it looks up for native entry >>>>> in lookupNative.cpp. 
I thought the class JWarmUp would be loaded by the >>>>> boot loader like Unsafe or WhiteBox, but I was wrong: it is loaded by the >>>>> app class loader, so the logic for obtaining its native entry is put in both >>>>> cases, boot loader and non-boot loaders. >>>>> >>>> make/common/Modules.gmk is where BOOT_MODULES is defined with the list >>>> of modules mapped to the boot loader. >>>> >>>> -Alan >>>> >>> From robbin.ehn at oracle.com Fri Apr 26 08:14:30 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 26 Apr 2019 10:14:30 +0200 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <9b063a96-1859-6280-7412-75d54c1a1fb6@oracle.com> Message-ID: <1d4f26a1-a774-2fad-d0af-498665d37954@oracle.com> > I'll file a RFE for that. Great, thanks! /Robbin > > Thanks, > David > > > >> Thanks, Robbin >> >> On 4/24/19 9:12 AM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222518 >>> webrev: http://cr.openjdk.java.net/~dholmes/8222518/webrev/ >>> >>> The original implementation of Unsafe.unpark simply extracted the JavaThread >>> reference from the java.lang.Thread oop and if non-null extracted the Parker >>> instance from it and invoked unpark. This was racy however as the native >>> JavaThread could terminate at any time and deallocate the Parker. >>> >>> That logic was fixed by JDK-6271298 which used a combination of >>> type-stable-memory "event" objects for the Parker, along with use of the >>> Threads_lock to obtain the initial reference to the Parker (from a JavaThread >>> guaranteed to be alive), together with caching the native Parker pointer in a >>> field of java.lang.Thread. Even though the native thread may have terminated >>> the Parker was still valid (even if associated with a different thread) and >>> the unpark at worst was a spurious wakeup for that other thread.
>>> >>> When JDK-8167108 introduced Thread Safe-Memory-Reclamation (SMR) the logic >>> was updated to always use the safe mechanism - we grab a ThreadsListHandle >>> then check the cached field, else look up the native thread to see if it is >>> alive and locate the Parker instance that way. >>> With SMR the caching of the Parker pointer no longer serves any purpose - we >>> no longer have a lock-free use-the-cache path versus a lock-using >>> populate-the-cache path. With SMR we've already "paid" for the ability to >>> ensure the native thread can't terminate regardless of whether we look up the >>> field from the java.lang.Thread or the JavaThread. So we can simplify the >>> code and save a little footprint by removing the cache from java.lang.Thread: >>> >>>     /* >>>      * JVM-private state that persists after native thread termination. >>>      */ >>>     private long nativeParkEventPointer; >>> >>> and the supporting code from unsafe.cpp and javaClasses.*pp in the JVM. >>> >>> I considered restoring the fast-path use of the cache without recourse to >>> Thread-SMR but performance measurements failed to show any benefit in doing so. >>> See bug report for details. >>> >>> Thanks, >>> David From robbin.ehn at oracle.com Fri Apr 26 08:16:15 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 26 Apr 2019 10:16:15 +0200 Subject: RFR(s): 8222637: Obsolete NeedsDeoptSuspend (was RFR(s): 8222640: Remove deopt suspend) In-Reply-To: <4296a391-9bae-daa7-2190-4d28acaa1074@oracle.com> References: <2f4eeddd-1480-4dab-8c64-187282586348@oracle.com> <70e28808-14a2-d063-9dcf-cf49ca3fa7f6@oracle.com> <8d4cffa7-9980-dabd-b446-be8a07efca7f@oracle.com> <7010d33e-13d1-77a4-2a36-0cbe03209bed@oracle.com> <544b8a83-e5bb-eead-7797-70e08ca15ee8@oracle.com> <45681477-9a04-a623-6686-c72b36be6c38@oracle.com> <4296a391-9bae-daa7-2190-4d28acaa1074@oracle.com> Message-ID: <1fd15fbd-88b1-d044-169a-40c61d526002@oracle.com> Thanks Dean!
/Robbin On 4/25/19 4:49 PM, dean.long at oracle.com wrote: > Looks good. > > dl > > On 4/25/19 1:53 AM, Robbin Ehn wrote: >> Hi, >> >> The same patch as in 8222640 but with obsoleting of the flag also. >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8222637 >> CSR: >> https://bugs.openjdk.java.net/browse/JDK-8222639 >> >> The incremental change is thus: >> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/globals.hpp.sdiff.html >> >> http://cr.openjdk.java.net/~rehn/8222637/webrev/src/hotspot/share/runtime/arguments.cpp.sdiff.html >> >> >> Full: >> http://cr.openjdk.java.net/~rehn/8222637/webrev/ >> >> Dean and Coleen had previously reviewed 8222640, so it would be good if they could acknowledge this >> incremental change. >> >> Thanks, Robbin >> >> On 4/24/19 1:49 AM, Robbin Ehn wrote: >>> Thanks Coleen! >>> >>> /Robbin >>> >>> On 2019-04-24 00:47, coleen.phillimore at oracle.com wrote: >>>> +1 This looks good! >>>> Coleen >>>> >>>> On 4/23/19 5:32 PM, Robbin Ehn wrote: >>>>> Thanks Dean! >>>>> >>>>> /Robbin >>>>> >>>>> On 2019-04-23 23:17, dean.long at oracle.com wrote: >>>>>> Yes, looks good! >>>>>> >>>>>> dl >>>>>> >>>>>> On 4/23/19 12:38 PM, Robbin Ehn wrote: >>>>>>> Hi Dean, >>>>>>> >>>>>>> Is this what you had in mind: >>>>>>> diff -r 295029840379 src/hotspot/share/runtime/frame.cpp >>>>>>> --- a/src/hotspot/share/runtime/frame.cpp       Tue Apr 23 09:58:55 2019 +0200 >>>>>>> +++ b/src/hotspot/share/runtime/frame.cpp       Tue Apr 23 21:32:00 2019 +0200 >>>>>>> @@ -272,4 +272,6 @@ >>>>>>> >>>>>>>  void frame::deoptimize(JavaThread* thread) { >>>>>>> + assert(thread->frame_anchor()->has_last_Java_frame() && >>>>>>> +         thread->frame_anchor()->walkable(), "must be"); >>>>>>>    // Schedule deoptimization of an nmethod activation with this frame. >>>>>>>    assert(_cb != NULL && _cb->is_compiled(), "must be"); >>>>>>> >>>>>>> Passes t1-5.
>>>>>>> >>>>>>> v2: >>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/webrev/ >>>>>>> Inc: >>>>>>> http://cr.openjdk.java.net/~rehn/8222640/2/inc/webrev/ >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>> On 2019-04-18 06:22, dean.long at oracle.com wrote: >>>>>>>> In frame::deoptimize(), can we assert that we have an anchor frame and >>>>>>>> that it is walkable? >>>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>>> On 4/17/19 3:09 AM, Robbin Ehn wrote: >>>>>>>>> Adding compiler. >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>> On 4/17/19 10:35 AM, Robbin Ehn wrote: >>>>>>>>>> Hi all, please consider this change. >>>>>>>>>> >>>>>>>>>> The code for deopt suspend is no longer needed since today the >>>>>>>>>> register window >>>>>>>>>> is always flushed when this code executes. Exactly when this code was >>>>>>>>>> needed is not clear, entered via duke changeset 1. I did not dig since >>>>>>>>>> we no longer have such use case. >>>>>>>>>> >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8222640/webrev/ >>>>>>>>>> Issue: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222640 >>>>>>>>>> >>>>>>>>>> Passes t1-5. >>>>>>>>>> >>>>>>>>>> Thanks, Robbin >>>>>>>> >>>>>> >>>> > From Alan.Bateman at oracle.com Fri Apr 26 09:25:30 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 26 Apr 2019 10:25:30 +0100 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> <0d902766-235d-66ae-a69e-966d9da1548f@oracle.com> Message-ID: <3b23c713-6913-3d20-048c-2c51deace041@oracle.com> On 26/04/2019 06:32, David Holmes wrote: > I pushed this today based on Dan and Robbin's reviews, but realized > just after the act that I should have waited for any feedback from > core-libs - apologies about that. If there are concerns I will roll it > back. 
I don't think there are any concerns, it is of course very welcome to remove a field from Thread. -Alan From Alan.Bateman at oracle.com Fri Apr 26 10:56:00 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 26 Apr 2019 11:56:00 +0100 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: References: Message-ID: <028f6a4a-b0be-bd65-519e-d76b5054e0e8@oracle.com> On 26/04/2019 07:52, David Holmes wrote: > > I have to question why the changes mostly land in runtime area! The > high-level description of this feature does not sound like it depends > on the runtime at all. The "recording" feature should just come from > the JITs data; and the actual warmup should just be an interaction > during VM initialization with the JIT. I don't see anything in the JEP > to explain the actual design, and why it impacts on the runtime so much. In addition, the draft JEP may need updating to outline the mode that requires the application to "notify" the runtime via a JDK-specific API that it has completed initialization. If this mode is part of the proposal then it should be described in the JEP. -Alan From david.holmes at oracle.com Fri Apr 26 11:56:44 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Apr 2019 21:56:44 +1000 Subject: RFR(S): 8222518: Remove unnecessary caching of Parker object in java.lang.Thread In-Reply-To: <3b23c713-6913-3d20-048c-2c51deace041@oracle.com> References: <4d1ad4b6-95e9-e7c9-2064-4e9ff67edae8@oracle.com> <0db0ea2a-6192-0587-3cd2-41f0d718c449@oracle.com> <0d902766-235d-66ae-a69e-966d9da1548f@oracle.com> <3b23c713-6913-3d20-048c-2c51deace041@oracle.com> Message-ID: On 26/04/2019 7:25 pm, Alan Bateman wrote: > On 26/04/2019 06:32, David Holmes wrote: >> I pushed this today based on Dan and Robbin's reviews, but realized >> just after the act that I should have waited for any feedback from >> core-libs - apologies about that. If there are concerns I will roll it >> back. 
> I don't think there are any concerns, it is of course very welcome to > remove a field from Thread. Thanks Alan! David > -Alan From robbin.ehn at oracle.com Fri Apr 26 12:50:25 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 26 Apr 2019 14:50:25 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <2536572e-158f-0913-889b-ef76d6122c79@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <2536572e-158f-0913-889b-ef76d6122c79@oracle.com> Message-ID: <64233d26-fc9f-9eb4-2c83-34186a669832@oracle.com> Hi Dan, thanks for looking at this! On 4/25/19 10:43 PM, Daniel D. Daugherty wrote: >> Code: >> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html > > Robbin, you'll have to merge with Coleen's recent MutexLocker fix (8222518). Yes, done! > src/hotspot/share/aot/aotCompiledMethod.cpp > ??? L163: bool AOTCompiledMethod::make_not_entrant_helper(int new_state) { > ??? L207: bool AOTCompiledMethod::make_entrant() { > ??????? So the Compiler team is on board with switching from the > ??????? Patching_lock to the CompiledMethod_lock? Patching_lock originally intended to protect code patching AFAIK, is now used for a lot of different unrelated cases. To iterate the compiled methods we need to hold CodeCache_lock, which is a leaf lock. Sometimes we take CodeCache_lock while holding Patching_lock, so we can't take Patching_lock while iterating. By having a new leaf lock for the compiledmethod we can update them while iterating over them. > src/hotspot/share/gc/z/zBarrierSetNMethod.cpp > ??? No comments (copyright year needs update). Fixed. > > src/hotspot/share/gc/z/zNMethod.cpp > ??? No comments (copyright year needs update). Fixed. > > src/hotspot/share/oops/markOop.hpp > ??? L180: ? bool biased_locker_is(JavaThread* thread) const { > ??????? Is this for a different project? > > ??? L181: ??? if (!has_bias_pattern()) { > ??? L182: ????? return false; > ??? L183: ??? } > ??? L184: ??? 
// If current thread is not the owner it can be unbiased at anytime. >     L185:     JavaThread* jt = (JavaThread*) ((intptr_t) (mask_bits(value(), > ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place)))); >         So you don't want to use this: > >           JavaThread* biased_locker() const { >             assert(has_bias_pattern(), "should not call this otherwise"); >             return (JavaThread*) ((intptr_t) (mask_bits(value(), > ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place)))); >           } > >         because of the assert(). > >         I think biased_locker() and biased_locker_is() both need to work >         with a copy of the markOop so that it can't change dynamically. >         Something like: > >           JavaThread* biased_locker() const { >             markOop copy = this; >             assert(copy.has_bias_pattern(), "should not call this otherwise"); >             return (JavaThread*) ((intptr_t) (copy.mask_bits(value(), > ~(biased_lock_mask_in_place | age_mask_in_place | epoch_mask_in_place)))); >           } > >           bool biased_locker_is(JavaThread* thread) const { >             markOop copy = this; >             if (!copy.has_bias_pattern()) { >               return false; >             } >             return copy.biased_locker(); >           } Yes, thanks I did a slightly different fix, since we already have a copy. > > src/hotspot/share/oops/method.cpp >     old L104:   clear_code(false /* don't need a lock */); // from_c/from_i get > set to c2i/i2i >         Is the comment after '//' not useful? Added it back. > > src/hotspot/share/runtime/biasedLocking.cpp > src/hotspot/share/runtime/biasedLocking.hpp >     Hmmm... More Biased Locking changes. I didn't take a close >     look at these. Yes, please. > > src/hotspot/share/runtime/mutex.hpp >     old L65:        special        = tty            +   1, >     new L65:        special        = tty            +   2, >         Why?
CompiledMethod_lock must be under CodeCache_lock. There is no easy way to push locks up in rank, other than hoping that testing trips an assert. This way I only need to look at the new lock. Also the compiler locks are very coarse grained, there should be more locks in here :) > > src/hotspot/share/runtime/mutexLocker.cpp >     No comments. > > src/hotspot/share/runtime/mutexLocker.hpp >     L34: extern Mutex*   Patching_lock;                   // a lock used to > guard code patching of compiled code >     L35: extern Mutex*   CompiledMethod_lock; >         A comment is traditional here... > Fixed. > src/hotspot/share/runtime/synchronizer.cpp >     old L1317:          !SafepointSynchronize::is_at_safepoint(), "invariant"); >     new L1317:          !Universe::heap()->is_gc_active(), "invariant"); >         Why? If we use the handshake fallback path (obsolete in JDK 14) we execute the handshake inside a safepoint. Thus when inflating we can be at a safepoint. > >     L1446:         ResourceMark rm; >     L1496:       ResourceMark rm; >         Why drop 'Self'? That makes the ResourceMark more expensive.. During the handshake, the VM thread may execute the handshake on behalf of the JavaThread and thus inflate the monitor for that thread, meaning Self is not Thread::current(). > src/hotspot/share/services/dtraceAttacher.cpp >     No comments (copyright year needs update). Fixed. > > > I don't think I've found anything that's "must fix". Please let me know > if I should re-review: > >     src/hotspot/share/runtime/biasedLocking.cpp >     src/hotspot/share/runtime/biasedLocking.hpp >     src/hotspot/share/runtime/deoptimization.cpp > > because the Biased Locking changes are critical for this project. Yes, the inflation part is trickiest. More eyes are better! I have some small changes to biasedLocking.[h|c]pp which I'll send through some tiers of testing. Please hold off re-examining that until I send out the updated and tested code. Thanks!
/Robbin > > Dan > > >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8221734 >> >> Passes t1-7 and multiple t1-5 runs. >> >> A few startup benchmarks see a small speedup. >> >> Thanks, Robbin > From coleen.phillimore at oracle.com Fri Apr 26 13:06:03 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 Apr 2019 09:06:03 -0400 Subject: RFR(T): 8221738: ErrorFile option does not handle pre-existing error files of the same name In-Reply-To: References: Message-ID: I think this looks very reasonable. OutputAnalyzer now has a constructor that takes a file name as a parameter. You should use this instead for your test. Thanks, Coleen On 4/7/19 3:17 AM, Thomas Stüfe wrote: > Hi all, > > May I please have reviews for this small fix: > > bug: https://bugs.openjdk.java.net/browse/JDK-8221738 > cr: > http://cr.openjdk.java.net/~stuefe/webrevs/8221738-errorfile-option-does-not-handle-pre-existing-error-files-of-the-same-name/webrev.00/webrev/ > > Fixes a long-standing issue where -XX:ErrorFile= will only work > if the named file does not exist yet. If it does, the error file falls silently > back to "/hs_err_pid...". > > For more detailed discussions, please see the bug and the associated CSR. > > The fix now causes the error file to be overwritten. > > Thanks, Thomas From karen.kinnear at oracle.com Fri Apr 26 14:06:16 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Fri, 26 Apr 2019 10:06:16 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CC23352.4050706@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> <1F703F87-5214-4410-A780-47166EFFC5E0@oracle.com> <5CC23352.4050706@oracle.com> Message-ID: <12ED1B88-6ADD-46D8-B3FF-4C1E264AF080@oracle.com> Calvin, Many thanks - I think that makes it clearer for everyone.
thanks, Karen > On Apr 25, 2019, at 6:23 PM, Calvin Cheung wrote: > > Hi Karen, > > I've updated the comment to the following: > > if (DynamicDumpSharedSpaces) { > // For dynamic dumping, we can resolve all the constraint classes for all class loaders during > // the initial run prior to creating the archive before vm exit. We will also perform verification > // check when running with the archive. > return false; > } else { > if (is_builtin(k)) { > // For builtin class loaders, we can try to complete the verification check at dump time, > // because we can resolve all the constraint classes. We will also perform verification check > // when running with the archive. > return false; > } > > thanks, > Calvin > > On 4/25/19, 8:48 AM, Karen Kinnear wrote: >> >> One follow-up question relative to Jiangli?s review questions please: >> >>> On Apr 23, 2019, at 8:08 PM, Calvin Cheung > wrote: >>> >>> Hi Jiangli, >>> >>> Thanks a lot for your review! >>> >>> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> >>>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>>> >>>> 1218 if (DynamicDumpSharedSpaces) { >>>> 1219 return false; >>>> 1220 } else { >>>> >>>> The above case for DynamicDumpSharedSpaces needs to be examined >>>> carefully. Can you please ask Harold (and Coleen or Karen) to take a >>>> look? Also, a comment is needed to explain that we can complete all >>>> verification checks at dynamic dumping time. >>> I've added a comment. If it return false, the caller will call VerificationType::resolve_and_check_assignability(). >> Just making sure I understand this correctly: >> >> When we archive a class, we also archive all supertypes. We do not necessarily archive all classes in >> the verification constraints list. We always record all verification constraints, whether or not we actually >> continue to load and perform them. 
>> >> When we run with an archive, we ensure that the supertypes that we use are also in the archive and are an exact match, >> otherwise we don't select the class from the archive. >> We then check the verification constraints list, which might then cause additional class loading. If that fails, >> again, we don't select the class from the archive. This allows more flexibility in changes to the classes found >> at runtime, as long as their "isAssignableFrom" constraints are still met. >> >> So - prior to the DynamicDumpSharedSpaces - >> >> In SystemDictionaryShared::add_verification_constraint: >> I agree with the logic that we add constraints for all archived classes. >> >> I agree that for non-built-in class loaders, we could not complete the verification check at dump time, >> and only perform this at runtime using the archive. >> >> For built-in class loaders - I agree that we could ALSO perform the verification check at dump time, and >> if it failed, eliminate archiving the class. >> >> I assume that is intended behavior for DynamicDumpSharedSpaces? >> >> Couple of questions: >> 1. Does what I said above for built-in class loaders match your understanding? If so, perhaps this would >> be a good time to clarify the comment - since we will perform the verification check now AND >> when running with the archive. >> >> 2. For DynamicDumpSharedSpaces >> Since we do not perform any loading as part of the PopulateDynamicDumpSharedSpaces, >> I assume that this is all about recording verification constraints for all classes loaded while >> we run, before we get to the archive creation on exit. >> >> So your logic of creating constraints for all archived classes and then returning false so >> that the verification is also run during the first run makes sense to me. >> >> And yes, it makes sense to add a comment that we should ALSO perform verification checks during >> the initial run in addition to creating the constraint for running later with the archive.
I don't conceptually >> think of this as checking constraints "at dump time", but rather for the initial run before creating >> the dynamic archive. >> >> thanks, >> Karen >> From jianglizhou at google.com Fri Apr 26 14:36:15 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Fri, 26 Apr 2019 07:36:15 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <36EA339A-9C37-4E96-90E1-69268A5E2967@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBDF488.7000601@oracle.com> <3614850A-450A-470B-BF2D-3CEA21D03B95@oracle.com> <36EA339A-9C37-4E96-90E1-69268A5E2967@oracle.com> Message-ID: Karen, Please see comments below. On Thu, Apr 25, 2019 at 10:35 AM Karen Kinnear wrote: > > Jiangli, > > Thank you for thinking so carefully about this. You ask great questions. > > One detailed question that arose for me from your email was: > What happens today if you specify -XX:ArchiveClassesAtExit and the archive > already exists? Is it the case that we delete the existing file before opening the new one? Yes. > Just want to make sure we document that. > > Here is my understanding of where we are thinking of going - folks please correct anything - > of course as we learn more this will evolve. > > 1. ArchiveClassesAtExit > This is potentially an intermediate step, which as you point out is only here to create an archive > at the end of an execution. > We don't yet have enough experience with this and with a potential future continuous dumping > mode to know if this mode will still be useful to customers when more advanced modes are > available. My personal sense is that this could be a useful model even long-term. > > 2.
I agree that we have talked about longer-term - > a) incrementally archiving additional loaded classes > - there are a lot of ways to do this > - possibly creating a separate dynamic archive, archiving as you go rather than at exit > - possibly creating another layer of dynamic archive > - possibly updating the current dynamic archive and being the only one who can read/write it > - I'm not sure if this is needed or the exit model is sufficient (and possibly more efficient?) > - prototyping could help us learn this > b) possibilities of sharing an archive while it is being updated > - I confess I am not excited about the concurrency complexities here - so not sure it is worth it > - less useful with current models of Cloud usage > c) a single command that could create an archive if it does not exist and use it if one does > d) possibly making the default be to generate a dynamic archive if it is not available and use if it is > - this could be built on archive at exit model or on incremental archive > - I think this is a reasonable model to explore - customers today expect the first execution of applications > to be slower and the dynamic linker's caching will make second runs faster > - I think we need more field experience and user feedback before we go here > > I do appreciate your link to mmap. > I agree we don't handle that today. I believe we don't need to support it unless we explicitly want > concurrent reader/writers. Am I correctly hearing what you are saying? > > So I think the model is to add ArchiveClassesAtExit now, and to reserve the more flexible > command-line argument for when we move to more automation. > > I don't think we know yet whether that would use a dumpatexit or an incremental model. > I would rather we save the automation model until we know more about where we are going, rather > than use it for this step.
> > At that point, try to find a command-line argument that has a create-if-does-not-exist/use-if-does-exist model > (perhaps -Xshare:dynamic or -Xshare:reallyauto ?). > > My translation here is that we are all aiming in the same direction, the discussion is really about > how to phase this. > > Fair? Sounds reasonable. Thanks and regards, Jiangli > > I share a sadness that we are not already there. > > However I am actually quite excited about where we are - many thanks to you and Calvin and Ioi for years of design and implementation! > > thanks, > Karen > > On Apr 25, 2019, at 12:40 PM, Jiangli Zhou wrote: > > Karen, > > On Tue, Apr 23, 2019 at 11:16 AM Karen Kinnear wrote: > > > Calvin, > > I added to the CSR a comment from my favorite customer - relative to the user model for the command-line flags. > He likes the proposal to reduce the number of steps a customer has to perform to get startup and footprint benefits > from the archive. > > The comment was that it would be very helpful if the user only needed to change their scripts once - so > a single command-line argument would create a dynamic archive if one did not exist, and use it if it > already existed. > > > This is a very plausible idea and is in the same direction as the > multi-staged rollout of the end goal of dynamic archiving, which is > making the archive generation/usage completely transparent and > automatic (no command-line option is needed to generate and use the > dynamic archive). > > Using a single command-line option to control the creation of the > archive (when one doesn't exist) and the use of the archive (when one > already exists) is needed for cases when users want more control. > > The following is copied from > http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html. The > behavior is unspecified if the underlying file changes after mmap is > established. Currently, we are not handling the case.
The dynamic > archive simply follows the usage model of the existing static archive > with separate steps for dump-time and runtime, which however doesn't > preclude the case. In the current usage model, concurrency is less > frequent. With the single command-line option controlled creation&use, > there might be more concurrent accesses and we probably want to handle the > issue. > > "If the size of the mapped file changes after the call to mmap() as a > result of some other operation on the mapped file, the effect of > references to portions of the mapped region that correspond to added > or removed portions of the file is unspecified." > > The current flag, ArchiveClassesAtExit, doesn't scale and can't handle > the single option controlled creation/use case. It would be a good > idea to re-think the command-line option now and change it before the > first integration, so it can scale. -Xshare:dump/on/auto/dynamic might > be able to serve the purpose. 'dynamic' > can be used to trigger just dynamic dumping for now. It can be > augmented to support the use case that you described above. > > Best regards, > Jiangli > > > Is there a way to evolve the ArchiveClassesAtExit= to have that functionality? > > thanks, > Karen > > p.s. I think it makes more sense to put performance numbers in the implementation RFE comments rather than the JEP > comments > > On Apr 22, 2019, at 5:16 PM, Jiangli Zhou wrote: > > Hi Calvin, > > Can you please also publish the final performance numbers in the JEP 350 (or the implementation RFE) comment section? > > Thanks, > Jiangli > > On Mon, Apr 22, 2019 at 10:07 AM Calvin Cheung wrote: > > > Hi Karen, > > Thanks for your review! > Please see my replies in-line below. > > On 4/19/19, 9:29 AM, Karen Kinnear wrote: > > Calvin, > > Many thanks for all the work getting this ready, significantly > enhancing the testing and bug fixes. > > I marked the CSR as reviewed-by - it looks great!
> > I reviewed this set of changes - I did not review the tests - I assume > you can get someone > else to do that. I am grateful that Jiangli and Ioi are going to > review this also - they are much closer to > the details than I am. > > 1. Do you have any performance numbers? > 1a. Startup: does using a combined dynamic CDS archive + base archive > give similar startup benefits > when you have the same classes in the archives? > > Below are some performance numbers from Eric, each number is for 50 runs: > (base: using the default CDS archive, > test: using the dynamic archive, > Eric will get some numbers with a single archive which I think that's > what you're looking for) > > Lambda-noop: > base: > 0.066441427 seconds time elapsed > test: > 0.075428824 seconds time elapsed > > Noop: > base: > 0.057614537 seconds time elapsed > test: > 0.066061557 seconds time elapsed > > Netty: > base: > 0.827013307 seconds time elapsed > test: > 0.604982805 seconds time elapsed > > Spring: > base: > 2.376707358 seconds time elapsed > test: > 1.927618893 seconds time elapsed > > The first 2 apps only have 2 to 3 classes in the dynamic archive. So the > overhead is likely due to having to open and map the dynamic archive and > performs checking on header, etc. For small apps, I think it's better to > use a single archive. The Netty app has around 1400 classes in the > dynamic archive; the Spring app has about 3700 classes in the dynamic > archive. > > I also used our LotsOfClasses test to collect some perf numbers. This is > more like runtime performance, not startup performance. > > With dynamic archive (100 runs each): > real 2m37.191s > real 2m36.003s > Total loaded classes = 24254 > Loaded from base archive = 1186 > Loaded from top archive = 23042 > Loaded from jrt:/ (runtime module) = 26 > > With single archive (100 runs each): > real 2m38.346s > real 2m36.947s > Total loaded classes = 24254 > Loaded from archive = 24228 > Loaded from jrt:/ (runtime module) = 26 > > > 1b. 
Do you have samples of uses of the combined dynamic CDS archive + > base archive vs. a single > static archive built for an application? > - how do the sets of archived classes differ? > > Currently, the default CDS archive contains around 1187 classes. With > the -XX:ArchiveClassesAtExit option, if the classes are not found in the > default CDS archive, they will be archived in the dynamic archive. The > above LotsOfClasses example shows some distributions between various > archives. > > - one note was that the AtExit approach exclude list adds anything > that has not yet linked - does that make a significant difference in > the number of classes that are archived? Does that make a difference > in either startup time or in application execution time? I could see > that going either way. > > As the above numbers indicated, there's not much difference in terms of > execution time using a dynamic vs a single archive with a large number > of classes loaded. The numbers from Netty and Spring apps show an > improvement over default CDS archive. > > > 1c. Any sense of performance cost for first run - how much time does > it take to create an incremental archive? > - is the time comparable to an existing dump for a single archive > for the application? > - this is an ease-of-use feature - so we are not expecting that to > be fast > - the point is to set expectations in our documentation > > I did some rough measurements with the LotsOfClasses test with around > 15000 classes in the classlist. > > Dynamic archive dumping (one run each): > real 0m19.756s > real 0m20.241s > > Static archive dumping (one run each): > real 0m17.725s > real 0m16.993s > > > 2. Footprint > With two archives rather than one, is there a significant footprint > difference? Obviously this will vary by app and archive. > Once again, the point is to set expectations. > > Sizes of the archives for the LotsOfClasses test in 1a. 
> > Single archive: 242962432 > Default CDS archive: 12365824 > Dynamic archive: 197525504 > > > 3. Runtime performance > With two sets of archived dictionaries & symbolTables - is there any > significant performance cost to larger benchmarks, e.g. for class > loading lookup for classes that are not in the archives? Or symbol > lookup? > > I used the LotsOfClasses test again. This time archiving about half of > the classes which will be loaded during runtime. > > Dynamic archive (10 runs each): > real 0m30.214s > real 0m29.633s > Loaded classes = 24254 > Loaded from dynamic archive: 13168 > > Single archive (10 runs each): > real 0m32.383s > real 0m32.905s > Loaded classes = 24254 > Loaded from single archive = 15063 > > > 4. Platform support > Which platforms is this supported on? > Which ones did you test? For example, did you run the tests on Windows? > > I ran the jtreg tests via mach5 on all 4 platforms (Linux, Mac, Solaris, > Windows). > > > Detailed feedback on the code: Just minor comments - I don't need to > see an updated webrev: > > I'm going to look into your detailed feedback below and may reply in a > separate email. > > thanks, > Calvin > > > 1. metaSpaceShared.hpp > line 156: > what is the hardcoded -100 for? Should that be an enum? > > 2. jfrRecorder.cpp > So JFR recordings are disabled if DynamicDumpSharedSpaces? > why? > Is that a future rfe? > > 3. systemDictionaryShared.cpp > Could you possibly add a comment to add_verification_constraint > for if (DynamicDumpSharedSpaces) > return false > > -- I think the logic is: > because we have successfully linked any instanceKlass we archive > with DynamicDumpSharedSpaces, we have resolved all the constraint classes. > > -- I didn't check the order - is this called before or after > excluding? If after, then would it make sense to add an assertion > here is_linked? Then if you ever change how/when linking is done, this > might catch future errors. > > 4.
systemDictionaryShared.cpp > EstimateSizeForArchive::do_entry > Is it the case that for info.is_builtin() there are no verification > constraints? So you could skip that calculation? Or did I misunderstand? > > 5. compactHashtable.cpp > serialize/header/calculate_header_size > -- could you dynamically determine size_of header so you don't need > to hardcode a 5? > > 6. classLoader.cpp > line 1337: //FIXME: DynamicDumpSharedSpaces and --patch-modules are > mutually exclusive. > Can you clarify for me: > My memory of the base archive is that we do not allow the following > options at dump time - and these > are the same for the dynamic archive: --limit-modules, > --upgrade-module-path, --patch-module. > > I have forgotten: > Today with UseSharedSpaces - do we allow these flags? Is that also the > same behavior with the dynamic > archive? > > 7. classLoaderExt.cpp > assert line 66: only used with -Xshare:dump > -> "only used at dump time" > > 8. symbolTable.cpp > line 473: comment // used by UseSharedArchived2 > - command-line arg name has changed > > 9. filemap.cpp > Comment lines 529 ... > Is this true - that you can only support dynamic dumping with the > default CDS archive? Could you clarify what the restrictions are? > The CSR implies you can support "a specific base CDS archive" > - so base layer can not have appended boot class path > - and base layer can't have a module path > > What can you specify for the dynamic dumping relative to the base archive? > - matching class path? > - appended class path? > in future - could it have a module path that matched the base archive? > > Should any of these restrictions be clarified in documentation/CSR > since they appear to be new? > > 10. filemap.cpp > check_archive > Do some of the return false paths skip performing os::close(fd)?
> > and get_base_archive_name_from_header > Does the first return false path fail to os::free(dynamic_header) > > lines 753-754: two FIXME comments > > Could you delete commented out line 1087 in filemap.cpp ? > > 11. filemap.hpp > line 214: TODO left in > > 12. metaspace.cpp > line 1418 FIXME left in > > 13. java.cpp > FIXME: is this the right place? > For starting the DynamicArchive::dump > > Please check with David Holmes on that one > > 14. dynamicArchive.hpp > line 55 (and others): MetsapceObj -> MetaspaceObj > > 15. dynamicArchive.cpp > line 285 rel-ayout -> re-layout > > lines 277 && 412 > Do we archive array klasses in the base archive but not in the dynamic > archive? > Is that a potential RFE? > Is it possible that GatherKlassesAndSymbols::do_unique_ref could be > called with an array class? > Same question for copy_impl? > > line 934: "no onger" -> "no longer" > > 16. What is AllowArchivingWithJavaAgent? Is that a hook for a > potential future rfe? > Do you want to check in that code at this time? In product? > > thanks, > Karen > > > On Apr 11, 2019, at 5:18 PM, Calvin Cheung > wrote: > > This is a follow-up on the preliminary code review sent by Jiangli in > January[1]. > > Highlights of changes since then: > 1. New vm option for dumping a dynamic archive > (-XX:ArchiveClassesAtExit=) and enhancement to the > existing -XX:SharedArchiveFile option. Please refer to the > corresponding CSR[2] for details. > 2. New way to run existing AppCDS tests in dynamic CDS archive mode. > At the jtreg command line, the user can run many existing AppCDS > tests in dynamic CDS archive mode by specifying the following: > -vmoptions:-Dtest.dynamic.cds.archive=true > /open/test/hotspot/jtreg:hotspot_appcds_dynamic > We will have a follow-up RFE to determine in which tier the above > tests should be run. > 3. Added more tests. > 4. Various bug fixes to improve stability. 
> > RFE: https://bugs.openjdk.java.net/browse/JDK-8207812 > webrev: > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/webrev.00/ > > > (The webrev is based on top of the following rev: > http://hg.openjdk.java.net/jdk/jdk/rev/805584336738) > > Testing: > - mach5 tiers 1- 3 (including the new tests) > - AppCDS tests in dynamic CDS archive mode on linux-x64 (a few > tests require more investigation) > > thanks, > Calvin > > [1] > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-January/032176.html > [2] https://bugs.openjdk.java.net/browse/JDK-8221706 > > From thomas.stuefe at gmail.com Fri Apr 26 17:17:20 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 26 Apr 2019 19:17:20 +0200 Subject: RFR(s): 8222015: Small VM.metaspace improvements In-Reply-To: References: Message-ID: Hi all, Latest version, with changes requested by Jiangli: full: http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.01/webrev/ delta: http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev_delta.01/webrev/ May I have a second reviewer, please? Thank you. Best Regards, Thomas On Fri, Apr 5, 2019 at 12:06 PM Thomas Stüfe wrote: > Hi all, > > may I have please a review for this collection of small improvements to > the VM.metaspace diagnostic command? > > - it clearly marks now classes whose metadata reside in cds > - it shows the number of classes loaded, incl. those from cds, in the > overviews too.
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8222015 > cr: > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.00/webrev/ > > Example output: > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-by-spacetype.txt > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders.txt > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders-showclasses.txt (scroll > down -> cds classes in are now marked with 's') > > Thank you, > > Thomas > From yumin.qi at gmail.com Fri Apr 26 17:41:30 2019 From: yumin.qi at gmail.com (yumin qi) Date: Fri, 26 Apr 2019 10:41:30 -0700 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: <0f11a565-930f-4456-b2fa-3cb9bc476c16.kuaiwei.kw@alibaba-inc.com> References: <0f11a565-930f-4456-b2fa-3cb9bc476c16.kuaiwei.kw@alibaba-inc.com> Message-ID: David, Wei Thanks for your comment. Java RTS has an option -XX:PreInitList=my-preinit-file which stores the pre-init class list. Will the classes in that list be initialized in order? JWarmUp recorded the class init order in pre-run to prevent runtime unnecessary deoptimization due to class initialization out of order. The current design is wait for vm finished startup to startup warm up. We have tried compile when class loading and found many problems so decided go with current design. You said it is like selective -Xcomp, yes, it looks like in first, but next we will enhance with method profiling information to make more optimized code. The changes have been made to many runtime files, so need comments from runtime either. Thanks Yumin On Fri, Apr 26, 2019 at 5:53 AM Kuai Wei wrote: > Hi David, > > I try to add more info about JWarmup. Yumin may explain more detail > later. > > In record phase, JWarmup will record hot methods and the class > initialize order. 
We believe class order is important. Without it, most > warmup compilation will be failed by deopt. > > In warmup phase, JVM will check init order before warmup compilation. If > the recorded dependent classes are initialized, (the classes may not be > really dependent, we just check the init order), the methods will be warmup > compiled. So we delay warmup compilation after JVM startup, we need wait > JVM to load most classes. > > Thanks, > Kuai Wei > > ------------------------------------------------------------------ > From:David Holmes > Send Time:2019?4?26?(???) 14:54 > To:yumin qi ; hotspot-runtim. < > hotspot-runtime-dev at openjdk.java.net> > Cc:hotspot-dev > Subject:Re: RFC: JWarmup precompile java hot methods at application startup > > Hi Yumin, > > On 26/04/2019 2:07 am, yumin qi wrote: > > Hi, > > > > > Apart from comments from compiler professionals can I have comments from > > runtime either? The changes mostly land in runtime area. > > I have to question why the changes mostly land in runtime area! The > high-level description of this feature does not sound like it depends on > the runtime at all. The "recording" feature should just come from the > JITs data; and the actual warmup should just be an interaction during VM > initialization with the JIT. I don't see anything in the JEP to explain > the actual design, and why it impacts on the runtime so much. > > It also sounds like a selective Xcomp mode to me. > > It even sounds very similar to Initialization-Time-Compilation (ITC) > that we employed in Java Real-Time System: > > > https://docs.oracle.com/javase/realtime/doc_2.2u1/release/JavaRTSCompilation.html > > > Cheers, > David > > > Thanks > > Yumin > > > > On Tue, Apr 16, 2019 at 11:27 AM yumin qi wrote: > > > >> HI, > >> > >> Did anyone have comments for this version? > >> > >> Thanks > >> Yumin > >> > >> On Tue, Apr 9, 2019 at 10:36 AM yumin qi wrote: > >> > >>> Alan, > >>> Thanks! 
Updated in same link: > >>> http://cr.openjdk.java.net/~minqi/8220692/webrev-02/ > >>> > >>> Removed non-boot loader branch in nativeLookup.cpp. > >>> Added jdk.jwarmup to boot loader list in make/common/Modules.gmk. > >>> Tested again to make sure the new changes. > >>> > >>> Thanks > >>> Yumin > >>> > >>> > >>> On Tue, Apr 9, 2019 at 4:48 AM Alan Bateman > >>> wrote: > >>> > >>>> On 09/04/2019 07:10, yumin qi wrote: > >>>>> > >>>>> Now the registerNatives is found when it looks up for native entry > >>>>> in lookupNative.cpp. I thought the class JWarmUp will be loaded by > >>>>> boot loader like Unsafe or WhiteBox, but I was wrong, it is loaded by > >>>>> app class loader so logic for obtaining its native entry put in both > >>>>> cases, boot loader and non boot loaders. > >>>>> > >>>> make/common/Modules.gmk is where BOOT_MODULES is defined with the list > >>>> of modules mapped to the boot loader. > >>>> > >>>> -Alan > >>>> > >>> > > From yumin.qi at gmail.com Fri Apr 26 17:42:12 2019 From: yumin.qi at gmail.com (yumin qi) Date: Fri, 26 Apr 2019 10:42:12 -0700 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: <028f6a4a-b0be-bd65-519e-d76b5054e0e8@oracle.com> References: <028f6a4a-b0be-bd65-519e-d76b5054e0e8@oracle.com> Message-ID: Hi, Alan Thanks! I will update the JEP for this information. Thanks Yumin On Fri, Apr 26, 2019 at 3:56 AM Alan Bateman wrote: > On 26/04/2019 07:52, David Holmes wrote: > > > > I have to question why the changes mostly land in runtime area! The > > high-level description of this feature does not sound like it depends > > on the runtime at all. The "recording" feature should just come from > > the JITs data; and the actual warmup should just be an interaction > > during VM initialization with the JIT. I don't see anything in the JEP > > to explain the actual design, and why it impacts on the runtime so much. 
> In addition, the draft JEP may need updating to outline the mode that > requires the application to "notify" the runtime via a JDK-specific API > that it has completed initialization. If this mode is part of the > proposal then it should be described in the JEP. > > -Alan > From gerard.ziemski at oracle.com Fri Apr 26 18:00:08 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Fri, 26 Apr 2019 13:00:08 -0500 Subject: RFR 8211331: [Event Request] Events to track Unsafe memory allocations In-Reply-To: <0471A20B-654D-42FF-B28F-828E6B50FF59@me.com> References: <0471A20B-654D-42FF-B28F-828E6B50FF59@me.com> Message-ID: <0e9db086-4acc-e2a9-c88c-c825a12dfb7c@oracle.com> Hi all, Please review this feature, which adds JFR event tracking for Unsafe memory allocations/deallocations. I chose to use malloc_size() on Mac, malloc_usable_size() on Linux and _msize() on Windows to track pointer size, which allowed me to track the sizes during deallocations (as well as the actual sizes that the OS allocated). Other platforms have that feature disabled, but still track the (client) allocation sizes. Bug: https://bugs.openjdk.java.net/browse/JDK-8211331 Webrev: http://cr.openjdk.java.net/~gziemski/8211331_rev1 Testing: Mach5 tier1 passes, Mach5 tier1,2,3,4,5,6,7 in progress? cheers From erik.gahlin at oracle.com Fri Apr 26 18:06:35 2019 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Fri, 26 Apr 2019 20:06:35 +0200 Subject: RFR(XS): 8221121: applications/microbenchmarks are encountering crashes in tier5 In-Reply-To: <1515916c-6187-46d4-8815-5e824284e04b@default> References: <1515916c-6187-46d4-8815-5e824284e04b@default> Message-ID: <5CC348AB.7080704@oracle.com> Looks good. 
Erik > Greetings, > > Please review this small patch to address the following: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221121 > Webrev: http://cr.openjdk.java.net/~mgronlun/8221121/webrev01/ > > Description: > > The applications/microbenchmarks added to tier5 are failing in some instances (debug builds), with, as an example, the following trace: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/scratch/opt/mach5/mesos/work_dir/slaves/2dd962d0-8988-479b-a804-57ab764ada59-S77631/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/d8f4cb38-0dec-4477-a89a-f62853433c56/runs/cb3f55fa-9e13-481c-a7d6-9f33d2b8b457/workspace/open/src/hotspot/share/jfr/recorder/storage/jfrMemorySpace.inline.hpp:85), pid=5748, tid=5770 > # assert(t->identity() == __null) failed: invariant > > This occurred when JFR was running with the in-memory configuration where buffers are reused FIFO-style. > In the implementation, an "age node" will manage a full buffer for its reclamation, and age nodes provides for a linked "full" (fifo) list. > > Issue: > The age node was not expected to retain an identity after being added to the full list, where the assertion fired during the subsequent discard-reuse processing step. > This situation only manifests with running JFR in-memory configurations and using debug builds. > > Fix is to release the age node before insertion onto full list. > > Thanks > Markus From leonid.mesnik at oracle.com Fri Apr 26 18:57:28 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Fri, 26 Apr 2019 11:57:28 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command In-Reply-To: References: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> Message-ID: <76E9C867-C2A9-4B5B-A7D8-4620E8AC41CF@oracle.com> Looks good. 
Leonid > On Apr 25, 2019, at 7:24 PM, mikhailo.seledtsov at oracle.com wrote: > > Here is the updated webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.01/ > > > On 4/25/19 6:38 PM, mikhailo.seledtsov at oracle.com wrote: >> Thank you Leonid, >> >> >> On 4/25/19 6:25 PM, Leonid Mesnik wrote: >>> Hi >>> >>> The overall fix looks food. But I expect that there are might be different docker environments. >>> Would not be more robust to verify that address exist in getLocalIp() list and fails if not? >> Sounds good. I will update the code to match the address reported by JFR to all the addresses returned by getLocalIp(). If at least a single match found, test continues. If not, test fails. >> >> Misha >>> >>> Leonid >>> >>>> On Apr 25, 2019, at 5:51 PM, mikhailo.seledtsov at oracle.com wrote: >>>> >>>> Please review this change that uses a better platform independent way of obtaining IP address inside a docker container. >>>> This test worked for Oracle Linux, but did not work on Fedora, which is now fixed. >>>> The code for obtaining IP address in getLocalIp() was contributed by Severin, thank you Severin. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 >>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ >>>> Testing: >>>> Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) - Passed >>>> >>>> >>>> Thank you, >>>> Misha >>>> >> > From jianglizhou at google.com Sat Apr 27 00:48:02 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Fri, 26 Apr 2019 17:48:02 -0700 Subject: RFR(s): 8222015: Small VM.metaspace improvements In-Reply-To: References: Message-ID: Hi Thomas, The updates look good to me. 
Best, Jiangli On Fri, Apr 26, 2019 at 10:18 AM Thomas Stüfe wrote: > > Hi all, > > Latest version, with changes requested by Jiangli: > > full: > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.01/webrev/ > delta: > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev_delta.01/webrev/ > > May I have a second reviewer, please? > > Thank you. > > Best Regards, Thomas > > On Fri, Apr 5, 2019 at 12:06 PM Thomas Stüfe > wrote: > > > Hi all, > > > > may I have please a review for this collection of small improvements to > > the VM.metaspace diagnostic command? > > > > - it clearly marks now classes whose metadata reside in cds > > - it shows the number of classes loaded, incl. those from cds, in the > > overviews too. > > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8222015 > > cr: > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/webrev.00/webrev/ > > > > Example output: > > > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-by-spacetype.txt > > > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders.txt > > > > http://cr.openjdk.java.net/~stuefe/webrevs/8222015--small-vm.metaspace-improvements/example-showloaders-showclasses.txt (scroll > > down -> cds classes in are now marked with 's') > > > > Thank you, > > > > Thomas > > From david.holmes at oracle.com Mon Apr 29 01:48:37 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Apr 2019 11:48:37 +1000 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: References: <0f11a565-930f-4456-b2fa-3cb9bc476c16.kuaiwei.kw@alibaba-inc.com> Message-ID: <84654b41-d785-13e7-27cf-4cdc0a868982@oracle.com> On 27/04/2019 3:41 am, yumin qi wrote: > David, Wei > Thanks for your comment. >
Java RTS has an option -XX:PreInitList=my-preinit-file which stores > the pre-init class list. Will the classes in that list be initialized in > order? Yes they would. That was its purpose. > JWarmUp recorded the class init order in pre-run to prevent runtime > unnecessary deoptimization due to class initialization out of order. The I don't understand that sentence sorry. > current design is wait for vm finished startup to startup warm up. We > have tried compile when class loading and found many problems so decided > go with current design. I'm trying to understand the details of the current design and I'm afraid I'm not getting it at all. I would have expected the "pre-run" to generate a list of methods to compile, sorted by initialization order. So yes some tweaking to the runtime to track the initialization order. I would then expect the actual run to take that file and at some point during startup run through that list and load and initialize each class, then compile each method. David ----- > You said it is like selective -Xcomp, yes, it looks like in first, but > next we will enhance with method profiling information to make more > optimized code. > > The changes have been made to many runtime files, so need comments > from runtime either. > > Thanks > Yumin > > On Fri, Apr 26, 2019 at 5:53 AM Kuai Wei > wrote: > > Hi David, > > I try to add more info about JWarmup. Yumin may explain more > detail later. > > In record phase, JWarmup will record hot methods and the class > initialize order. We believe class order is important. Without it, > most warmup compilation will be failed by deopt. > > In warmup phase, JVM will check init order before warmup > compilation. If the recorded dependent classes are initialized, (the > classes may not be really dependent, we just check the init order), > the methods will be warmup compiled. So we delay warmup compilation > after JVM startup, we need wait JVM to load most classes.
> > Thanks, > Kuai Wei > > ------------------------------------------------------------------ > From:David Holmes > > Send Time:2019?4?26?(???) 14:54 > To:yumin qi >; > hotspot-runtim. > > Cc:hotspot-dev > > Subject:Re: RFC: JWarmup precompile java hot methods at > application startup > > Hi Yumin, > > On 26/04/2019 2:07 am, yumin qi wrote: > > Hi, > > > > Apart from comments from compiler professionals can I have comments from > > runtime either? The changes mostly land in runtime area. > > I have to question why the changes mostly land in runtime area! The > high-level description of this feature does not sound like it depends on > > the runtime at all. The "recording" feature should just come from the > > JITs data; and the actual warmup should just be an interaction during VM > > initialization with the JIT. I don't see anything in the JEP to explain > > the actual design, and why it impacts on the runtime so much. > > It also sounds like a selective Xcomp mode to me. > > It even sounds very similar to Initialization-Time-Compilation (ITC) > > that we employed in Java Real-Time System: > > https://docs.oracle.com/javase/realtime/doc_2.2u1/release/JavaRTSCompilation.html > > Cheers, > David > > > Thanks > > Yumin > > > > On Tue, Apr 16, 2019 at 11:27 AM yumin qi > wrote: > > > >> HI, > >> > >> Did anyone have comments for this version? > >> > >> Thanks > >> Yumin > >> > >> On Tue, Apr 9, 2019 at 10:36 AM yumin qi > wrote: > >> > >>> Alan, > >>> Thanks! Updated in same link: > >>> http://cr.openjdk.java.net/~minqi/8220692/webrev-02/ > >>> > >>> Removed non-boot loader branch in nativeLookup.cpp. > >>> Added jdk.jwarmup to boot loader list in make/common/Modules.gmk. > >>> Tested again to make sure the new changes.
> >>> > >>> Thanks > >>> Yumin > >>> > >>> > >>> On Tue, Apr 9, 2019 at 4:48 AM Alan Bateman > > >>> wrote: > >>> > >>>> On 09/04/2019 07:10, yumin qi wrote: > >>>>> > >>>>> Now the registerNatives is found when it looks up for native entry > >>>>> in lookupNative.cpp. I thought the class JWarmUp will be loaded by > >>>>> boot loader like Unsafe or WhiteBox, but I was wrong, it is loaded by > >>>>> app class loader so logic for obtaining its native entry put in both > >>>>> cases, boot loader and non boot loaders. > >>>>> > >>>> make/common/Modules.gmk is where BOOT_MODULES is defined with the list > >>>> of modules mapped to the boot loader. > >>>> > >>>> -Alan > >>>> > >>> > From stefan.karlsson at oracle.com Mon Apr 29 11:44:00 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 29 Apr 2019 13:44:00 +0200 Subject: RFR: 8223064: Minor cleanups in ResolvedMethodTable Message-ID: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> Hi all, Please review these minor cleanups in the ResolvedMethodTable. http://cr.openjdk.java.net/~stefank/8223064/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8223064 - Remove unused EXCEPTION_MARK - unnecessarily copied from StringTable - Reuse existing LogTarget - Remove unreachable return statement Thanks, StefanK From harold.seigel at oracle.com Mon Apr 29 14:26:03 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Mon, 29 Apr 2019 10:26:03 -0400 Subject: RFR: 8223064: Minor cleanups in ResolvedMethodTable In-Reply-To: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> References: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> Message-ID: <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> Hi Stefan, These changes look good. Harold On 4/29/2019 7:44 AM, Stefan Karlsson wrote: > Hi all, > > Please review these minor cleanups in the ResolvedMethodTable.
> > http://cr.openjdk.java.net/~stefank/8223064/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8223064 > > - Remove unused EXCEPTION_MARK - unnecessarily copied from StringTable > - Reuse existing LogTarget > - Remove unreachable return statement > > Thanks, > StefanK From coleen.phillimore at oracle.com Mon Apr 29 15:24:58 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 29 Apr 2019 11:24:58 -0400 Subject: RFR: 8223064: Minor cleanups in ResolvedMethodTable In-Reply-To: <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> References: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> Message-ID: <7b224b64-ff44-9693-8d53-b6e7b6762498@oracle.com> +1 thanks, Coleen On 4/29/19 10:26 AM, Harold Seigel wrote: > Hi Stefan, > > These changes look good. > > Harold > > On 4/29/2019 7:44 AM, Stefan Karlsson wrote: >> Hi all, >> >> Please review these minor cleanups in the ResolvedMethodTable. >> >> http://cr.openjdk.java.net/~stefank/8223064/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8223064 >> >> - Remove unused EXCEPTION_MARK - unnecessarily copied from StringTable >> - Reuse existing LogTarget >> - Remove unreachable return statement >> >> Thanks, >> StefanK From mikhailo.seledtsov at oracle.com Mon Apr 29 16:28:30 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Mon, 29 Apr 2019 09:28:30 -0700 Subject: RFR(S): 8222769: [TESTBUG] TestJFRNetworkEvents should not rely on hostname command In-Reply-To: <76E9C867-C2A9-4B5B-A7D8-4620E8AC41CF@oracle.com> References: <277d1da2-93fa-c516-fb7d-cf02a6d99dbd@oracle.com> <76E9C867-C2A9-4B5B-A7D8-4620E8AC41CF@oracle.com> Message-ID: <796ef303-9082-41dc-e5a8-ff9a2050ca25@oracle.com> Thank you for review Leonid. On 4/26/19 11:57 AM, Leonid Mesnik wrote: > Looks good. 
> > Leonid > >> On Apr 25, 2019, at 7:24 PM, mikhailo.seledtsov at oracle.com wrote: >> >> Here is the updated webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.01/ >> >> >> On 4/25/19 6:38 PM, mikhailo.seledtsov at oracle.com wrote: >>> Thank you Leonid, >>> >>> >>> On 4/25/19 6:25 PM, Leonid Mesnik wrote: >>>> Hi >>>> >>>> The overall fix looks food. But I expect that there are might be different docker environments. >>>> Would not be more robust to verify that address exist in getLocalIp() list and fails if not? >>> Sounds good. I will update the code to match the address reported by JFR to all the addresses returned by getLocalIp(). If at least a single match found, test continues. If not, test fails. >>> >>> Misha >>>> Leonid >>>> >>>>> On Apr 25, 2019, at 5:51 PM, mikhailo.seledtsov at oracle.com wrote: >>>>> >>>>> Please review this change that uses a better platform independent way of obtaining IP address inside a docker container. >>>>> This test worked for Oracle Linux, but did not work on Fedora, which is now fixed. >>>>> The code for obtaining IP address in getLocalIp() was contributed by Severin, thank you Severin. >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222769 >>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8222769.00/ >>>>> Testing: >>>>> Ran the affected test on Linux-x64 (Oracle Linux 7.3, 7.6) - Passed >>>>> >>>>> >>>>> Thank you, >>>>> Misha >>>>> From yumin.qi at gmail.com Mon Apr 29 16:35:15 2019 From: yumin.qi at gmail.com (yumin qi) Date: Mon, 29 Apr 2019 09:35:15 -0700 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: <84654b41-d785-13e7-27cf-4cdc0a868982@oracle.com> References: <0f11a565-930f-4456-b2fa-3cb9bc476c16.kuaiwei.kw@alibaba-inc.com> <84654b41-d785-13e7-27cf-4cdc0a868982@oracle.com> Message-ID: HI, David Thanks for comments. 
On Sun, Apr 28, 2019 at 6:50 PM David Holmes wrote: > > > JWarmUp recorded the class init order in pre-run to prevent runtime > > unnecessary deoptimization due to class initialization out of order. The > > I don't understand that sentence sorry. > > Sorry, that is my mistake, it should be 'compilation failure'. > > current design is wait for vm finished startup to startup warm up. We > > have tried compile when class loading and found many problems so decided > > go with current design. > > I'm trying to understand the details of the current design and I'm > afraid I'm not getting it at all. I would have expected the "pre-run" to > generate a list of methods to compile, sorted by initialization order. > So yes some tweaking to the runtime to track the initialization order. > > I would then expect the actual run to take that file and at some point > during startup run through that list and load and initialize each class, > then compile each method. > > The methods are already recorded in the file in time order. In regular compilation, all the classes referenced by the compilation are already resolved or can be resolved. Since JWarmUp pre-compiles those methods (in the same order as in the pre-run) at a point earlier than in the pre-run, some classes have not been loaded yet. We cannot assume which class loader will load a class, especially with custom class loaders, so in that case the compilation has to fail. We supply an API for the application to tell the VM when to start warmup compilation. At that point, most classes have already been loaded since the application started, so most of the methods can be compiled successfully, though there may be a very small number of failures. Application developers know the right point at which to call this API to start warmup compilation. Does this answer your question?
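The warmup flow described above can be sketched as follows. This is only an illustration under stated assumptions, not code from the JWarmup webrev: `RecordedMethod`, `WarmupResult` and `start_warmup` are hypothetical names, class loading is modeled as a set of class names, and "compiling" a method just means selecting it. The sketch walks the recorded list in its pre-run order and selects only the methods whose holder class is already loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical record of one hot method from the pre-run; not a HotSpot type.
struct RecordedMethod {
  std::string holder_class;  // class that declares the method
  std::string method_name;
};

// Hypothetical outcome of one warmup pass.
struct WarmupResult {
  std::vector<std::string> compiled;  // "Class.method" entries selected for compilation
  std::size_t skipped = 0;            // methods whose holder class was not loaded yet
};

// Walks the recorded list in its original (pre-run) order and selects only
// methods whose holder class is already loaded. No class is loaded here,
// reflecting the point that the VM cannot guess which loader would load it.
WarmupResult start_warmup(const std::vector<RecordedMethod>& recorded,
                          const std::unordered_set<std::string>& loaded_classes) {
  WarmupResult result;
  for (const RecordedMethod& m : recorded) {
    assert(!m.holder_class.empty());
    if (loaded_classes.count(m.holder_class) != 0) {
      result.compiled.push_back(m.holder_class + "." + m.method_name);
    } else {
      ++result.skipped;  // counted as a warmup compilation failure
    }
  }
  return result;
}
```

In this model, calling `start_warmup` later in the application's life (with a larger `loaded_classes` set) leaves fewer skipped methods, which is the reason given above for letting the application choose the start point.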
Thanks Yumin > David > ----- > > > You said it is like selective -Xcomp, yes, it looks like in first, but > > next we will enhance with method profiling information to make more > > optimized code. > > > > The changes have been made to many runtime files, so need comments > > from runtime either. > > > > Thanks > > Yumin > > > > On Fri, Apr 26, 2019 at 5:53 AM Kuai Wei > > wrote: > > > > Hi David, > > > > I try to add more info about JWarmup. Yumin may explain more > > detail later. > > > > In record phase, JWarmup will record hot methods and the class > > initialize order. We believe class order is important. Without it, > > most warmup compilation will be failed by deopt. > > > > In warmup phase, JVM will check init order before warmup > > compilation. If the recorded dependent classes are initialized, (the > > classes may not be really dependent, we just check the init order), > > the methods will be warmup compiled. So we delay warmup compilation > > after JVM startup, we need wait JVM to load most classes. > > > > Thanks, > > Kuai Wei > > > > > ------------------------------------------------------------------ > > From:David Holmes > > > > Send Time:2019-04-26 (Friday) 14:54 > > To:yumin qi >; > > hotspot-runtim. > > > > Cc:hotspot-dev > > > > Subject:Re: RFC: JWarmup precompile java hot methods at > > application startup > > > > Hi Yumin, > > > > On 26/04/2019 2:07 am, yumin qi wrote: > > > Hi, > > > > > > > Apart from comments from compiler professionals can I have comments from > > > runtime either? The changes mostly land in runtime area. > > > > > I have to question why the changes mostly land in runtime area! The > > > high-level description of this feature does not sound like it depends on > > > > > the runtime at all. The "recording" feature should just come from the > > > > > JITs data; and the actual warmup should just be an interaction during VM > > > > > initialization with the JIT.
I don't see anything in the JEP to explain > > > > the actual design, and why it impacts on the runtime so much. > > > > It also sounds like a selective Xcomp mode to me. > > > > > It even sounds very similar to Initialization-Time-Compilation (ITC) > > > > that we employed in Java Real-Time System: > > > > > https://docs.oracle.com/javase/realtime/doc_2.2u1/release/JavaRTSCompilation.html > > > > Cheers, > > David > > > > > Thanks > > > Yumin > > > > > > On Tue, Apr 16, 2019 at 11:27 AM yumin qi > > wrote: > > > > > >> HI, > > >> > > >> Did anyone have comments for this version? > > >> > > >> Thanks > > >> Yumin > > >> > > >> On Tue, Apr 9, 2019 at 10:36 AM yumin qi > > wrote: > > >> > > >>> Alan, > > >>> Thanks! Updated in same link: > > >>> http://cr.openjdk.java.net/~minqi/8220692/webrev-02/ > > >>> > > >>> Removed non-boot loader branch in nativeLookup.cpp. > > > >>> Added jdk.jwarmup to boot loader list in make/common/Modules.gmk. > > >>> Tested again to make sure the new changes. > > >>> > > >>> Thanks > > >>> Yumin > > >>> > > >>> > > >>> On Tue, Apr 9, 2019 at 4:48 AM Alan Bateman < > Alan.Bateman at oracle.com > > > >>> wrote: > > >>> > > >>>> On 09/04/2019 07:10, yumin qi wrote: > > >>>>> > > > >>>>> Now the registerNatives is found when it looks up for native entry > > > >>>>> in lookupNative.cpp. I thought the class JWarmUp will be loaded by > > > >>>>> boot loader like Unsafe or WhiteBox, but I was wrong, it is loaded by > > > >>>>> app class loader so logic for obtaining its native entry put in both > > >>>>> cases, boot loader and non boot loaders. > > >>>>> > > > >>>> make/common/Modules.gmk is where BOOT_MODULES is defined with the list > > >>>> of modules mapped to the boot loader. 
> > >>>> > > >>>> -Alan > > >>>> > > >>> > > > From calvin.cheung at oracle.com Mon Apr 29 17:13:38 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Mon, 29 Apr 2019 10:13:38 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> Message-ID: <5CC730C2.50800@oracle.com> Hi Jiangli, Thanks for the re-review. Please see my comments in-line below... On 4/24/19, 7:54 PM, Jiangli Zhou wrote: > Please see comments inlined. > > On Tue, Apr 23, 2019 at 5:08 PM Calvin Cheung wrote: >> Hi Jiangli, >> >> Thanks a lot for your review! >> >> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> Congrats on finalizing the dynamic archiving work and completing >>> testing. After the integration of the dynamic archiving, a follow-up >>> RFE can be done to merge the archiving/copying code in >>> dynamicArchive.* and metaspaceShared.* for better maintenance in the >>> future. As there are many duplicates between those two, having shared >>> implementation for both static and dynamic will be beneficial and >>> reduce the maintenance cost. >> I'll file an RFE for the above. >>> Here are my comments mainly for additional cleanups and some minor issues. >>> >>> - src/hotspot/share/classfile/classLoader.cpp >>> >>> 1337 // FIXME: DynamicDumpSharedSpaces and --patch-modules are >>> mutually exclusive >>> 1338 assert(!DynamicDumpSharedSpaces, "sanity"); >>> >>> I tagged the comment with 'FIXME' to serve as a reminder to add more >>> details. The reason DynamicDumpSharedSpaces is 'mutually exclusive' >>> with --patch-modules is that DynamicDumpSharedSpaces is only >>> enabled when UseSharedSpaces is also enabled. As --patch-modules is >>> not supported with UseSharedSpaces, it is not supported with >>> DynamicDumpSharedSpaces either. >> I've converted the FIXME to a comment.
>>> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); >>> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, >>> (ClassFileStream*)stream); >>> >>> Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above >>> code. With the new dynamic archiving capability, it's now able to >>> load/archive a class with user defined classloader via this call path. >>> A comment explaining this is also needed. >> I tried the assert but it didn't work. DynamicDumpSharedSpaces is not >> the only mode that goes through that code path. > I should be more clear. The new code is only intended for the > DynamicDumpSharedSpaces, since the shared_classpath_index is set to > UNREGISTERED_INDEX by ClassLoaderExt::load_class when loading class > with "source:" in the class list file at static dumping time. > > 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); > 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, > (ClassFileStream*)stream); > > After thinking more, it's probably better to remove the following > marked code from ClassLoaderExt::load_class. That avoids setting twice > in two different places during static dumping. It also makes the code > cleaner. > > InstanceKlass* ClassLoaderExt::load_class(Symbol* name, const char* > path, TRAPS) { > ... > result->set_shared_classpath_index(UNREGISTERED_INDEX);<<<<<<<<<<< > SystemDictionaryShared::set_shared_class_misc_info(result, stream); > <<<<<<<<<<<< http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/src/hotspot/share/classfile/classLoaderExt.cpp.html Only the first statement is still there. I agree that the set_shared_classpath_index() call can be removed. > >>> - src/hotspot/share/classfile/classLoaderExt.cpp >>> >>> 64 void ClassLoaderExt::setup_app_search_path() { >>> 65 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, >>> 66 "this function is only used with -Xshare:dump"); >>> >>> The above message needs to be updated to reflect the new command-line option. >> Done.
>>> 304 result->set_shared_classpath_index(UNREGISTERED_INDEX); >>> 305 SystemDictionaryShared::set_shared_class_misc_info(result, >>> stream);<<<<<<<<<< >>> >>> Why is the set_shared_class_misc_info call being removed? If this is a >>> bug fix for loading classes from the classlist for user defined >>> classloaders, it should be handled separately, and with a separate bug >>> ID as well. >> It is called in ClassLoader::record_result() from >> KlassFactory::create_from_stream(). > > Ok, this is related to the above comment. > >>> - src/hotspot/share/classfile/compactHashtable.cpp >>> >>> 207 size_t SimpleCompactHashtable::calculate_header_size() { >>> 208 // We have 5 fields. Each takes up sizeof(intptr_t). See >>> WriteClosure::do_u4 >>> 209 size_t bytes = sizeof(intptr_t) * 5; >>> 210 return bytes; >>> 211 } >>> >>> 212 >>> 213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) { >>> 214 // NOTE: if you change this function, you MUST change the number 5 in >>> 215 // calculate_header_size() accordingly. >>> ... >>> >>> As a cleanup, a better way to handle this is to calculate the size >>> within SimpleCompactHashtable::serialize_header while serializing the >>> data and store the size in a variable. >>> SimpleCompactHashtable::calculate_header_size() should simply retrieve >>> the value. A renaming of >>> SimpleCompactHashtable::calculate_header_size() can also be done. >> I've checked with Ioi on this one. The problem is >> calculate_header_size() needs to be called during size estimation, and >> serialize_header is called after size estimation. > Can you please file an RFE for this? The current code is okay for the > first integration. It deserves some effort to make it cleaner > (probably with a different solution) since it can be error-prone.
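One possible shape for such a cleanup, which avoids the hard-coded 5 while still letting the size be computed before serialization, is to run the same field walk with a visitor that only counts bytes. The sketch below is a standalone illustration and not HotSpot code: `FieldVisitor`, `CountingVisitor`, `TableHeader` and the five field names are hypothetical stand-ins for SerializeClosure and the SimpleCompactHashtable header, and it is not the solution tracked by the RFE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a serialization visitor; not HotSpot's SerializeClosure.
class FieldVisitor {
 public:
  virtual ~FieldVisitor() = default;
  virtual void do_field(std::intptr_t* field) = 0;
};

// Visitor that writes nothing and only accumulates the bytes a real
// writer would emit, one intptr_t-sized slot per visited field.
class CountingVisitor : public FieldVisitor {
 public:
  void do_field(std::intptr_t*) override { _bytes += sizeof(std::intptr_t); }
  std::size_t bytes() const { return _bytes; }
 private:
  std::size_t _bytes = 0;
};

// Hypothetical header with five fields (names invented for illustration).
struct TableHeader {
  std::intptr_t base_address = 0;
  std::intptr_t entry_count = 0;
  std::intptr_t bucket_count = 0;
  std::intptr_t buckets = 0;
  std::intptr_t entries = 0;

  // Single source of truth for which fields belong to the header.
  void serialize_header(FieldVisitor* v) {
    v->do_field(&base_address);
    v->do_field(&entry_count);
    v->do_field(&bucket_count);
    v->do_field(&buckets);
    v->do_field(&entries);
  }

  // Derives the size from the field walk, so adding a field to
  // serialize_header() updates the size without a second edit.
  std::size_t calculate_header_size() {
    CountingVisitor counter;
    serialize_header(&counter);
    return counter.bytes();
  }
};
```

Because the counting pass has no side effects on the output, it can run during size estimation, before the real serialization pass.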
I've filed: https://bugs.openjdk.java.net/browse/JDK-8223004 Avoid using a hard-coded number in SimpleCompactHashtable::calculate_header_size() > >>> - src/hotspot/share/classfile/dictionary.cpp >>> >>> 315 InstanceKlass* Dictionary::find_class(Symbol* name) { >>> 316 unsigned int hash = compute_hash(name); >>> 317 int index = hash_to_index(hash); >>> 318 return find_class(index, hash, name); >>> 319 } >>> >>> Looks like the new function is not referenced (unless I'm missing >>> something). Please remove the function. >>> >>> - src/hotspot/share/classfile/dictionary.hpp >>> >>> 65 InstanceKlass* find_class(Symbol* name); >>> >>> Same comment as the above. >> I've removed the function. >>> - src/hotspot/share/classfile/symbolTable.cpp. >>> >>> 473 Symbol* const _archived; // used by UseSharedArchived2 >>> >>> Please remove 'UseSharedArchived2'. The comment also needs more clarification. >>> >>> I couldn't find any references to SymbolTableCreateEntry. Can you >>> please point me to where it is being used? >> I've removed the entire SymbolTableCreateEntry class. It was left there >> probably due to a merge error. >>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>> >>> 1218 if (DynamicDumpSharedSpaces) { >>> 1219 return false; >>> 1220 } else { >>> >>> The above case for DynamicDumpSharedSpaces needs to be examined >>> carefully. Can you please ask Harold (and Coleen or Karen) to take a >>> look? Also, a comment is needed to explain that we can complete all >>> verification checks at dynamic dumping time. >> I've added a comment. If it returns false, the caller will call >> VerificationType::resolve_and_check_assignability(). >>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>> >>> 1279 ResourceMark rm; >>> >>> You can use 'ResourceMark rm(THREAD)'. >> Fixed. >>> - src/hotspot/share/memory/allocation.hpp >>> >>> 255 // >>> 256 // When CDS is not enabled, both pointers are set to NULL.
>>> 257 static void* _shared_metaspace_base; // (inclusive) low address >>> 258 static void* _shared_metaspace_top; // (exclusive) high addres >>> >>> Why was the comment at line 256 removed? >> I've added back the comment. >>> - src/hotspot/share/memory/filemap.cpp >>> >>> 101 void FileMapInfo::fail_continue(const char *msg, ...) { >>> 102 va_list ap; >>> 103 va_start(ap, msg); >>> 104 if (_runtime_dynamic_info == NULL) { >>> 105 MetaspaceShared::set_archive_loading_failed(); >>> 106 } else { >>> 107 DynamicArchive::disable(); >>> 108 } >>> >>> The above fail_continue only works if _runtime_dynamic_info is set up >>> after mapping the base archive. A comment should be added to explain >>> that. >> Comment added. >>> Can you please rename '_runtime_dynamic_info' so it's more >>> descriptive? Maybe use 'dynamic_archive_info'. >> Renamed to '_dynamic_archive_info'. >>> 587 bool FileMapInfo::same_files(const char* file1, const char* file2) { >>> >>> The usage of FileMapInfo::same_files is not necessary and should be >>> removed. The base archive's CRC checksum values are recorded in the >>> dynamic archive. The runtime verifies the CRC values to make sure the >>> same archive is used at dump time and runtime, regardless of the base >>> archive path or name. It is designed for all use cases: >> The same_files() function is also used in arguments.cpp: >> 3530 if (DynamicDumpSharedSpaces) { >> 3531 if (FileMapInfo::same_files(SharedArchiveFile, >> ArchiveClassesAtExit)) { >> 3532 vm_exit_during_initialization( >> 3533 "Cannot have the same archive file specified for >> -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", >> 3534 SharedArchiveFile); >> 3535 } >> 3536 } >> >> The function is also needed for the RFE: >> https://bugs.openjdk.java.net/browse/JDK-8211723 > Ok. It should be treated as a bug, not an RFE. > > The shared path table check does not verify the path ordering (also > including the case when new path components are inserted).
The bug > should be handled as a high priority task for dynamic archive. I saw that you and Karen have had some discussion in the bug report since you sent this review comment. I take it that you're fine with that. > >> We still verify the CRC values during runtime. >>> * base CDS archive is specified in the -XX:SharedArchiveFile at >>> dynamic dumping time >>> * -XX:SharedArchiveFile is not specified at dynamic dumping time, >>> default location for the default CDS archive is used >>> * default CDS archive is specified in the -XX:SharedArchiveFile at runtime >>> * default CDS archive is not specified in the -XX:SharedArchiveFile at >>> runtime, default location for the default CDS archive is used >> Regarding the fourth point above, the user could have a non-default base >> archive and only specify the top archive during runtime. > I would argue against it since it doesn't always work and adds extra > code. When the archive path/name is changed, the recorded one in the > dynamic archive would no longer work. Users still need to specify the > path/name in the command-line. The use case only works for the default > CDS archive. For a non-default CDS archive, specifying it on the > command line results in a cleaner design and less fragile code. If the base archive is moved, then the user has to modify the command-line anyway, whether the user initially specified (a) only the top archive, or (b) both archives. Requiring both archives to be specified will only hurt the users who never move the archives. We'd like to leave it as is. > >>> In all above cases, the base archive CRC values check is sufficient. >>> The use of path/name is fragile and should be avoided. That will allow >>> you to remove the _base_archive_name_size from the dynamic archive. >> We still need the _base_archive_name_size and the base archive name in >> the header because of the above reason. > Please see my comment above.
> >>> 752 if (is_static) { >>> 753 // FIXME check for dynamic header as well >>> 754 // FIXME Don't just check the last region -- check all regions! >>> >>> Can you please address the first FIXME at line 753? >>> >>> Checking the last region is sufficient since the archive is written in >>> sequential order. The second FIXME is not necessary. >> I've addressed the first FIXME and converted the second one to a comment. >>> - src/hotspot/share/memory/metaspace.cpp >>> >>> 1417 bool Metaspace::contains(const void* ptr) { >>> 1418 // FIXME: need to check the dynamic archive >>> >>> Can you please remove the above FIXME? There is no need for a separate check. >> Done. >>> - src/hotspot/share/memory/metaspaceShared.cpp >>> >>> 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive >>> >>> Can you please rename the function to fix_cpp_vtable_for_dynamic_archive? >> Done. >>> - src/hotspot/share/oops/klass.cpp >>> >>> 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces, >>> 528 "only called for DumpSharedSpaces"); >>> >>> 544 void Klass::remove_java_mirror() { >>> 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only >>> called for DumpSharedSpaces"); >>> >>> Please fix the messages above. >> Done. >>> - src/hotspot/share/prims/whitebox.cpp >>> >>> 2332 {CC"getResolvedReferences", >>> CC"(Ljava/lang/Class;)Ljava/lang/Object;", >>> (void*)&WB_GetResolvedReferences}, >>> 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V", >>> (void*)&WB_LinkClass}, >>> 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", >>> (void*)&WB_AreOpenArchiveHeapObjectsMapped}, >>> >>> Can you please align the indentation of line 2333 (to be the same as >>> line 2332 or 2334)? >> Aligned (void*) with line 2334.
(It doesn't show in the webrev since >> only blank space changes) >>> - src/hotspot/share/runtime/arguments.cpp >>> >>> 1491 bool Arguments::check_unsupported_cds_runtime_properties() { >>> 1492 assert(UseSharedSpaces, "this function is only used with >>> -Xshare:{on,auto}"); >>> 1493 assert(ARRAY_SIZE(unsupported_properties) == >>> ARRAY_SIZE(unsupported_options), "must be"); >>> 1494 if (ArchiveClassesAtExit != NULL) { >>> 1495 // dynamic dumping, just return false, >>> check_unsupported_dumping_properties() will be called >>> 1496 // in init_shared_archive_paths(). >>> 1497 return false; >>> 1498 } >>> >>> The check_unsupported_cds_runtime_properties() should be done for the >>> 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a >>> combination of both dump time and runtime. >> The 'ArchiveClassesAtExit != NULL' case means dumping a CDS archive from the >> user's point of view, that's why the comments in lines 1495 and 1496. >> During runtime, ArchiveClassesAtExit will be NULL, so the >> check_unsupported_cds_runtime_properties() will be called as usual. > During dynamic dumping, UseSharedSpaces is true. Dynamic dumping is a > special case of the 'runtime'; that's why dynamic dumping is a > combination of both dump time and runtime. So > check_unsupported_cds_runtime_properties() is also needed for dynamic > dumping. If the check_unsupported_cds_runtime_properties() is called for dynamic dumping, the user will see a warning message warning("CDS is disabled when the %s option is specified.", unsupported_options[i]); instead of the following error with my current patch: vm_exit_during_initialization("Cannot use the following option when dumping the shared archive", unsupported_options[i]); which gives the user the same feedback as in the static dumping case. We think it is important to give the correct error message to the user.
We'd like to leave the code as is but have updated the comment as follows: if (ArchiveClassesAtExit != NULL) { // dynamic dumping, just return false for now. // check_unsupported_dumping_properties() will be called later to check the same set of // properties, and will exit the VM with the correct error message if the unsupported properties // are used. return false; } Here's an updated delta webrev: http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02%2b/ Additional changes comparing with delta_01_02 include: - the mentioned one-line removal in classLoaderExt.cpp; - updated comment in arguments.cpp; - small fixes to handle tests run in -Xshare:off mode; - enable building on the zero platform. Comparing with the delta_01_02 webrev, the additional changed files are: > make/hotspot/lib/JvmFeatures.gmk > src/hotspot/share/classfile/symbolTable.hpp > src/hotspot/share/runtime/arguments.hpp > test/hotspot/jtreg/runtime/appcds/dynamicArchive/DynamicArchiveTestBase.java > test/hotspot/jtreg/runtime/appcds/dynamicArchive/HelloDynamicCustom.java Thanks again for your review and contribution to this RFE. Calvin >>> 2729 // -Xshare:auto || -Xshare:dynamicDump >>> >>> As you've renamed the command-line argument for dynamic dumping >>> support, the comment needs to be fixed. >> Fixed. >>> 3125 // Compiler threads may concurrently update the class >>> metadata (such as method entries), so it's >>> 3126 // unsafe with DumpSharedSpaces (which modifies the class >>> metadata in place). Let's disable >>> 3127 // compiler just to be safe. >>> 3128 // >>> 3129 // Note: this is not a concern for DynamicDumpSharedSpaces, >>> which makes a copy of the class metadata >>> 3130 // instead of modifying them in place. The copy is >>> inaccessible to the compiler. >>> 3131 set_mode_flags(_int); >>> >>> We need to come back to revisit the above for the 'static' archive >>> dumping at one point. There is a RFE filed for that, if I remember >>> correctly. 
Could you please add a 'TODO' note to the above comment? >> Added TODO. >>> A check should be done in arguments.cpp to make sure >>> DynamicDumpSharedSpaces is not manipulated from the command-line >>> directly. DynamicDumpSharedSpaces should not be enabled in the >>> command-line without ArchiveClassesAtExit being specified. >> Done. >>> - src/hotspot/share/runtime/java.cpp >>> >>> 509 >>> 510 // FIXME: is this the right place? >>> 511 if (DynamicDumpSharedSpaces) { >>> 512 DynamicArchive::dump(); >>> 513 } >>> >>> Again, the above 'FIXME' serves as a cleanup reminder. Please get >>> opinions from others on this change. If the calling place is okay, >>> please remove the FIXME. >> Removed the FIXME for now. Checked with David H. He indicated there's no >> easy answer for this. Just need to do a lot of testing. >>> - test >>> >>> Could you please add a test case for setting DynamicDumpSharedSpaces >>> from the command line? >> Here's an incremental webrev which contains a new test: >> >> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/ >> >> thanks, >> Calvin >>> I only took a brief look at the test changes. Please ask Misha to >>> review the test changes as well. >>> >>> Thanks and regards, >>> Jiangli > Thanks, > Jiangli From stefan.karlsson at oracle.com Mon Apr 29 17:41:31 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 29 Apr 2019 19:41:31 +0200 Subject: RFR: 8223064: Minor cleanups in ResolvedMethodTable In-Reply-To: <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> References: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> Message-ID: Thanks, Harold. StefanK On 2019-04-29 16:26, Harold Seigel wrote: > Hi Stefan, > > These changes look good. > > Harold > > On 4/29/2019 7:44 AM, Stefan Karlsson wrote: >> Hi all, >> >> Please review these minor cleanups in the ResolvedMethodTable.
>> >> http://cr.openjdk.java.net/~stefank/8223064/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8223064 >> >> - Remove unused EXCEPTION_MARK - unnecessarily copied from StringTable >> - Reuse existing LogTarget >> - Remove unreachable return statement >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Mon Apr 29 17:41:48 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 29 Apr 2019 19:41:48 +0200 Subject: RFR: 8223064: Minor cleanups in ResolvedMethodTable In-Reply-To: <7b224b64-ff44-9693-8d53-b6e7b6762498@oracle.com> References: <5dd2ff6b-c16c-e2a6-cecf-b990852a5992@oracle.com> <897f4968-8710-6344-e07c-fbc3a71d86a7@oracle.com> <7b224b64-ff44-9693-8d53-b6e7b6762498@oracle.com> Message-ID: <176cf475-386e-bec6-b972-ae423b1209e3@oracle.com> Thanks, Coleen. StefanK On 2019-04-29 17:24, coleen.phillimore at oracle.com wrote: > +1 > thanks, > Coleen > > On 4/29/19 10:26 AM, Harold Seigel wrote: >> Hi Stefan, >> >> These changes look good. >> >> Harold >> >> On 4/29/2019 7:44 AM, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review these minor cleanups in the ResolvedMethodTable. >>> >>> http://cr.openjdk.java.net/~stefank/8223064/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8223064 >>> >>> - Remove unused EXCEPTION_MARK - unnecessarily copied from StringTable >>> - Reuse existing LogTarget >>> - Remove unreachable return statement >>> >>> Thanks, >>> StefanK > From daniel.daugherty at oracle.com Mon Apr 29 18:17:14 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 29 Apr 2019 14:17:14 -0400 Subject: RFR(L) 8153224 Monitor deflation prolong safepoints (CR2/v2.02/5-for-jdk13) In-Reply-To: <313e51c8-b672-bb1c-577a-49868f09e6c1@oracle.com> References: <62729044-8a22-0e20-0eda-04d47c9ea23c@oracle.com> <313e51c8-b672-bb1c-577a-49868f09e6c1@oracle.com> Message-ID: <188efed2-9ac3-6636-efb4-a9f9347bcd3e@oracle.com> Greetings, Just wanted to report the current test results on these bits. 
Baseline: jdk-13+17
Exp:      CR2/v2.02/5-for-jdk13

Average SPECjbb2015 Results for Each OS (CR2/v2.02/5-for-jdk13)

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
       28266.70   24681.70  22740.40        8648.00  Linux-X64 base
       26907.00   23873.90  22576.20        8146.70  Linux-X64 exp
        5621.00    4701.00   4721.80        1499.30  MacOSX base
        5841.80    4885.00   4707.80        1506.70  MacOSX exp
       16584.00   14344.50  13184.10        2765.20  Solaris-X64 base
       16584.00   14218.50  13267.00        2752.60  Solaris-X64 exp

For this round of SPECjbb2015 testing, I rebooted each machine before
starting the 20 runs (10 base and 10 exp) of SPECjbb2015 testing. The
MacOSX and the Solaris-X64 results are very similar. The Linux-X64
results are a couple of _thousand_ critical-jOPS higher for both base
and exp results! I have no explanation and even less confidence in my
use of SPECjbb2015. Here's the previous SPECjbb2015 results:

Average SPECjbb2015 Results for Each OS (CR0/v2.00+/3-for-jdk13)

     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
       23838.00   22446.90  20738.80        6166.70  Linux-X64 base
       23838.00   22279.40  20262.00        5891.50  Linux-X64 exp
        5841.80    4885.00   4764.00        1495.10  MacOSX base
        5621.00    4701.00   4778.00        1492.10  MacOSX exp
       16125.20   13852.30  12780.50        2791.90  Solaris-X64 base
       15788.70   13861.40  12551.40        2665.90  Solaris-X64 exp

Linux-X64 Machine:
  - Ubuntu 16.04, Dell T7600, 64GB RAM
  - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads

MacOSX Machine:
  - MacOS 10.13.6, Mac Mini, mid 2011, 16GB RAM
  - 2 GHz Intel Core i7 (I7-2635QM), 1 CPU x 4 cores x 2 threads

Solaris-X64 Machine:
  - Solaris 11.2 SRU5.5, Dell T7600, 64GB RAM
  - Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 2 CPUs x 8 cores x 2 threads

The results for each of the SPECjbb2015 runs are shown below.

Solaris-X64 stress run - no unexpected failures
Linux-X64 stress run - in progress

Kitchensink8H {release, fastdebug, slowdebug} x {MacOSX, Lin-X64, Sol-X64}
  - no crashes or assertions; average time failures on Sol-X64 due to running
    in parallel with other testing

Inflate2 12H {release, fastdebug, slowdebug} x {MacOSX, Lin-X64, Sol-X64}
  - no failures

Mach5 Tier[1-8]
  - gc/g1/humongousObjects/TestHumongousClassLoader.java still fails
  - did not see any failures to the known C2 race this round

For Solaris-X64, I ran my stress kit, Kitchensink8H and Inflate2 all in
parallel: no surprises.

CR2 is back to the same stability as CR0

Dan

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.MacOSX.base.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
           5621       4701      4778           1548  SPECjbb2015.MacOSX.base.01
           5621       4701      4778           1509  SPECjbb2015.MacOSX.base.02
           5621       4701      4497           1435  SPECjbb2015.MacOSX.base.03
           5621       4701      4778           1509  SPECjbb2015.MacOSX.base.04
           5621       4701      4778           1484  SPECjbb2015.MacOSX.base.05
           5621       4701      4778           1556  SPECjbb2015.MacOSX.base.06
           5621       4701      4778           1473  SPECjbb2015.MacOSX.base.07
           5621       4701      4497           1435  SPECjbb2015.MacOSX.base.08
           5621       4701      4778           1484  SPECjbb2015.MacOSX.base.09
           5621       4701      4778           1560  SPECjbb2015.MacOSX.base.10
---------------  ---------  --------  -------------  --------
        5621.00    4701.00   4721.80        1499.30  average of values

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.MacOSX.exp.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
           6725       5621      4708           1543  SPECjbb2015.MacOSX.exp.01
           5621       4701      4497           1484  SPECjbb2015.MacOSX.exp.02
           5621       4701      4778           1560  SPECjbb2015.MacOSX.exp.03
           6725       5621      4708           1588  SPECjbb2015.MacOSX.exp.04
           5621       4701      4497           1521  SPECjbb2015.MacOSX.exp.05
           5621       4701      4778           1521  SPECjbb2015.MacOSX.exp.06
           5621       4701      4778           1326  SPECjbb2015.MacOSX.exp.07
           5621       4701      4778           1556  SPECjbb2015.MacOSX.exp.08
           5621       4701      4778           1521  SPECjbb2015.MacOSX.exp.09
           5621       4701      4778           1447  SPECjbb2015.MacOSX.exp.10
---------------  ---------  --------  -------------  --------
        5841.80    4885.00   4707.80        1506.70  average of values

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.Sol-X64.base.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          16584      14453     13267           3142  SPECjbb2015.Sol-X64.base.01
          16584      14353     13267           2667  SPECjbb2015.Sol-X64.base.02
          16584      14580     13267           2995  SPECjbb2015.Sol-X64.base.03
          16584      14136     13267           2032  SPECjbb2015.Sol-X64.base.04
          16584      14195     12438           3308  SPECjbb2015.Sol-X64.base.05
          16584      14353     13267           2807  SPECjbb2015.Sol-X64.base.06
          16584      13837     13267           1928  SPECjbb2015.Sol-X64.base.07
          16584      14456     13267           3070  SPECjbb2015.Sol-X64.base.08
          16584      14729     13267           3107  SPECjbb2015.Sol-X64.base.09
          16584      14353     13267           2596  SPECjbb2015.Sol-X64.base.10
---------------  ---------  --------  -------------  --------
       16584.00   14344.50  13184.10        2765.20  average of values

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.Sol-X64.exp.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          16584      14295     13267           3253  SPECjbb2015.Sol-X64.exp.01
          16584      14580     13267           2077  SPECjbb2015.Sol-X64.exp.02
          16584      13837     13267           3000  SPECjbb2015.Sol-X64.exp.03
          16584      14456     13267           2537  SPECjbb2015.Sol-X64.exp.04
          16584      14456     13267           3251  SPECjbb2015.Sol-X64.exp.05
          16584      13837     13267           2219  SPECjbb2015.Sol-X64.exp.06
          16584      14267     13267           2667  SPECjbb2015.Sol-X64.exp.07
          16584      14353     13267           2942  SPECjbb2015.Sol-X64.exp.08
          16584      14267     13267           3475  SPECjbb2015.Sol-X64.exp.09
          16584      13837     13267           2105  SPECjbb2015.Sol-X64.exp.10
---------------  ---------  --------  -------------  --------
       16584.00   14218.50  13267.00        2752.60  average of values

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.Lin-X64.base.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          28585      24908     22868           8734  SPECjbb2015.Lin-X64.base.01
          28585      25122     22868           8734  SPECjbb2015.Lin-X64.base.02
          28585      24045     22868           8707  SPECjbb2015.Lin-X64.base.03
          25402      24482     21592           8797  SPECjbb2015.Lin-X64.base.04
          28585      24730     22868           9007  SPECjbb2015.Lin-X64.base.05
          28585      24581     22868           8007  SPECjbb2015.Lin-X64.base.06
          28585      24730     22868           8707  SPECjbb2015.Lin-X64.base.07
          28585      24581     22868           8976  SPECjbb2015.Lin-X64.base.08
          28585      24730     22868           8390  SPECjbb2015.Lin-X64.base.09
          28585      24908     22868           8421  SPECjbb2015.Lin-X64.base.10
---------------  ---------  --------  -------------  --------
       28266.70   24681.70  22740.40        8648.00  average of values

$ ksh summarize_SPECjbb2015_results.ksh SPECjbb2015.Lin-X64.exp.*
     hbIR           hbIR
(max attempted)  (settled)  max-jOPS  critical-jOPS  run_name
---------------  ---------  --------  -------------  --------
          28585      23838     21439           7550  SPECjbb2015.Lin-X64.exp.01
          28585      24581     22868           9088  SPECjbb2015.Lin-X64.exp.02
          28585      23838     21439           8352  SPECjbb2015.Lin-X64.exp.03
          23838      23076     22646           7669  SPECjbb2015.Lin-X64.exp.04
          28585      23838     22868           7613  SPECjbb2015.Lin-X64.exp.05
          24482      23715     23258           7588  SPECjbb2015.Lin-X64.exp.06
          25402      24482     22862           7762  SPECjbb2015.Lin-X64.exp.07
          23838      23695     22646           8013  SPECjbb2015.Lin-X64.exp.08
          28585      23838     22868           8565  SPECjbb2015.Lin-X64.exp.09
          28585      23838     22868           9267  SPECjbb2015.Lin-X64.exp.10
---------------  ---------  --------  -------------  --------
       26907.00   23873.90  22576.20        8146.70  average of values

On 4/25/19 12:38 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a small but important bug fix for the Async Monitor Deflation
> project ready to go.
> It's also known as v2.02 (for those with the
> patches) and as webrev/5-for-jdk13 (for those with webrev URLs). Sorry
> for all the names...
>
> JDK-8222295 was pushed to jdk/jdk two days ago so that baseline patch
> is out of our hair.
>
> Main bug URL:
>
>     JDK-8153224 Monitor deflation prolong safepoints
>     https://bugs.openjdk.java.net/browse/JDK-8153224
>
> The project is currently baselined on jdk-13+17.
>
> Here's the full webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/
>
> Here's the incremental webrev URL (JDK-8153224):
>
> http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/
>
> I still have to update the OpenJDK wiki to reflect the CR2 changes:
>
> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>
> This version of the patch has been thru Mach5 tier[1-6] testing on
> Oracle's usual set of platforms. Mach5 tier[7-8] is running now.
> My stress kit is running on Solaris-X64 now. Kitchensink8H is running
> now on product, fastdebug, and slowdebug bits on Linux-X64, MacOSX
> and Solaris-X64. 12 hour Inflate2 runs are running now on product,
> fastdebug and slowdebug bits on Linux-X64, MacOSX and Solaris-X64.
> I'll start my stress kit on Linux-X64 sometime on Sunday (after
> my jdk-13+18 stress run is done).
>
> I'll do SPECjbb2015 baseline and CR2 runs after all the stress
> testing is done.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>
>
> On 4/19/19 11:58 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I finally have CR1 for the Async Monitor Deflation project ready to
>> go. It's also known as v2.01 (for those with the patches) and as
>> webrev/4-for-jdk13 (for those with webrev URLs). Sorry for all the
>> names...
>>
>> Main bug URL:
>>
>>     JDK-8153224 Monitor deflation prolong safepoints
>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>
>> Baseline bug fixes URL:
>>
>>     JDK-8222295 more baseline cleanups from Async Monitor Deflation
>> project
>>     https://bugs.openjdk.java.net/browse/JDK-8222295
>>
>> The project is currently baselined on jdk-13+15.
>>
>> Here's the webrev for the latest baseline changes (JDK-8222295):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.8222295
>>
>> Here's the full webrev URL (JDK-8153224 only):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.full/
>>
>> Here's the incremental webrev URL (JDK-8153224):
>>
>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/4-for-jdk13.inc/
>>
>> So I'm looking for reviews for both JDK-8222295 and the latest version
>> of JDK-8153224...
>>
>> I still have to update the OpenJDK wiki to reflect the CR changes:
>>
>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>
>> This version of the patch has been thru Mach5 tier[1-3] testing on
>> Oracle's usual set of platforms. Mach5 tier[4-6] is running now and
>> Mach5 tier[78] will be run later today. My stress kit on Solaris-X64
>> is running now. Linux-X64 stress testing will start on Sunday. I'm
>> planning to do Kitchensink runs, SPECjbb2015 runs and my monitor
>> inflation stress tests on Linux-X64, MacOSX and Solaris-X64.
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>>
>> On 3/24/19 9:57 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Welcome to the OpenJDK review thread for my port of Carsten's work on:
>>>
>>>     JDK-8153224 Monitor deflation prolong safepoints
>>>     https://bugs.openjdk.java.net/browse/JDK-8153224
>>>
>>> Here's a link to the OpenJDK wiki that describes my port:
>>>
>>> https://wiki.openjdk.java.net/display/HotSpot/Async+Monitor+Deflation
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13/
>>>
>>> Here's a link to Carsten's original webrev:
>>>
>>> http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
>>>
>>> Earlier versions of this patch have been through several rounds of
>>> preliminary review. Many thanks to Carsten, Coleen, Robbin, and
>>> Roman for their preliminary code review comments. A very special
>>> thanks to Robbin and Roman for building and testing the patch in
>>> their own environments (including specJBB2015).
>>>
>>> This version of the patch has been thru Mach5 tier[1-8] testing on
>>> Oracle's usual set of platforms. Earlier versions have been run
>>> through my stress kit on my Linux-X64 and Solaris-X64 servers
>>> (product, fastdebug, slowdebug). Earlier versions have run Kitchensink
>>> for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product, fastdebug
>>> and slowdebug). Earlier versions have run my monitor inflation stress
>>> tests for 12 hours on MacOSX, Linux-X64 and Solaris-X64 (product,
>>> fastdebug and slowdebug).
>>>
>>> All of the testing done on earlier versions will be redone on the
>>> latest version of the patch.
>>>
>>> Thanks, in advance, for any questions, comments or suggestions.
>>>
>>> Dan
>>>
>>> P.S.
>>> One subtest in gc/g1/humongousObjects/TestHumongousClassLoader.java
>>> is currently failing in -Xcomp mode on Win* only. I've been trying
>>> to characterize/analyze this failure for more than a week now. At
>>> this point I'm convinced that Async Monitor Deflation is aggravating
>>> an existing bug. However, I plan to have a better handle on that
>>> failure before these bits are pushed to the jdk/jdk repo.
>>>
>>
>>
>
>

From eric.caspole at oracle.com Mon Apr 29 22:07:17 2019
From: eric.caspole at oracle.com (Eric Caspole)
Date: Mon, 29 Apr 2019 18:07:17 -0400
Subject: RFR (XS) 8222818: NMT summary could show the GC in use
In-Reply-To:
References: <4ebbfa55-c17c-1261-22e2-13539a385d40@oracle.com> <8d8be2f1-240c-1cbb-64b7-b427be23d140@redhat.com> <7961d936-6eeb-9e78-ddbe-bc56e271ff5d@redhat.com> <955ef400-511d-1c03-0c34-a10f7834591d@oracle.com>
Message-ID:

Hi everybody,

I got busy with other minor emergencies and lost track of this
discussion about what the gc names should be and where it should go,
so I'll cancel this PR and let someone else take it up later if there
is interest.

Eric


On 4/24/19 05:25, Stefan Karlsson wrote:
> On 2019-04-24 11:18, David Holmes wrote:
>> Hi Stefan,
>>
>> On 24/04/2019 6:47 pm, Stefan Karlsson wrote:
>>>
>>>
>>> On 2019-04-24 00:17, David Holmes wrote:
>>>> On 24/04/2019 6:50 am, Zhengyu Gu wrote:
>>>>>
>>>>>
>>>>> On 4/23/19 2:53 PM, Eric Caspole wrote:
>>>>>> Hi Zhengyu,
>>>>>> Hopefully this email comes through in monospace, the alignment is
>>>>>> OK for me:
>>>>>>
>>>>>>
>>>>>> currently:
>>>>>>
>>>>>> -                        GC (reserved=379056KB, committed=93220KB)
>>>>>>                             (malloc=39184KB #2159)
>>>>>>                             (mmap: reserved=339872KB,
>>>>>> committed=54036KB)
>>>>>>
>>>>>>
>>>>>> My version:
>>>>>>
>>>>>>
>>>>>> -                GC - g1 gc (reserved=379090KB, committed=93254KB)
>>>>>>                             (malloc=39218KB #2194)
>>>>>>                             (mmap: reserved=339872KB,
>>>>>> committed=54036KB)
>>>>>>
>>>>>>
>>>>>> so it is aligned going to the left off the parenthesis like the
>>>>>> current version. Is that what you mean? I like the way the GC
>>>>>> stands out like this but it is OK to put it in the parentheses on
>>>>>> the right.
>>>>>
>>>>> Different GCs have different names, so it is hard to get them all
>>>>> aligned right, and it is not worth the effort.
>>>>
>>>> AFAICS The code already "reserves" 26 characters just to print "GC",
>>>> which is right-aligned. So all this does is take some of the 24
>>>> existing spaces and fill them in with the GC name so you end up with:
>>>>
>>>>                  GC - g1 gc (reserved=379056KB, committed=93220KB)
>>>>                             (malloc=39184KB #2159)
>>>>                             (mmap: reserved=339872KB,
>>>> committed=54036KB)
>>>>
>>>> or
>>>>          GC - shenandoah gc (reserved=379056KB, committed=93220KB)
>>>>                             (malloc=39184KB #2159)
>>>>                             (mmap: reserved=339872KB,
>>>> committed=54036KB)
>>>>
>>>> etc. Only nit is that 26 seems too small for "concurrent mark sweep
>>>> gc". Also the alignment of 26 could be specified dynamically based
>>>> on the length of the hs_err_name() string if needed.
>>>
>>>
>>> FYI, the name of the function hs_err_name() was chosen to deter
>>> people from using it in other places. Maybe it's time to add another
>>> function in GCConfig that returns better names? Maybe use names that
>>> match our -XX:UseGC flags?
>>
>> I was thinking more simply that we might rename hs_err_name() to
>> name() and make the strings a little nicer e.g. "G1 GC" instead of "g1
>> gc". But if you want to add more general GC functionality here that's
>> up to you :)
>
> Either way works for me.
>
> We tried to change the name before, but it was met with resistance:
> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-April/021937.html
>
>
> StefanK
>
>>
>> Cheers,
>> David
>>
>>> If you would find that useful, I've created a patch that exposes
>>> three new functions that return the flag names, or parts of them:
>>> http://cr.openjdk.java.net/~stefank/8222818/gcFlagNames/webrev.01/
>>>
>>> GCConfig::flag_name() gives:
>>>   UseConcMarkSweepGC
>>>   UseEpsilonGC
>>>   UseG1GC
>>>   UseParallelGC
>>>   UseSerialGC
>>>   UseShenandoahGC
>>>   UseZGC
>>>
>>> GCConfig::flag_name_no_use() gives:
>>>   ConcMarkSweepGC
>>>   EpsilonGC
>>>   G1GC
>>>   ParallelGC
>>>   SerialGC
>>>   ShenandoahGC
>>>   ZGC
>>>
>>> GCConfig::flag_name_no_use_no_gc() gives:
>>>   ConcMarkSweep
>>>   Epsilon
>>>   G1
>>>   Parallel
>>>   Serial
>>>   Shenandoah
>>>   Z
>>>
>>> The universe.cpp changes are only temporary changes to test the output.
>>>
>>> StefanK
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>>
>>>>> So, my suggestion is to place GC name inside parentheses, and you
>>>>> don't have to deal with indents to the left.
>>>>>
>>>>> e.g.
>>>>>
>>>>>
>>>>> -                        GC (reserved=379056KB, committed=93220KB by
>>>>> g1 gc)
>>>>>                             (malloc=39184KB #2159)
>>>>>                             (mmap: reserved=339872KB,
>>>>> committed=54036KB)
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Zhengyu
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 4/22/19 21:57, Zhengyu Gu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 4/22/19 8:19 PM, David Holmes wrote:
>>>>>>>> Hi Eric,
>>>>>>>>
>>>>>>>> On 23/04/2019 8:13 am, Eric Caspole wrote:
>>>>>>>>> Hi, could I have reviews and any opinions on this little change
>>>>>>>>> to show the GC name in the NMT output, as this helps us to more
>>>>>>>>> easily triage performance data.
>>>>>>>>
>>>>>>>> The idea seems fine.
>>>>>>>
>>>>>>>>
>>>>>>>> For the implementation wouldn't it be simpler to do something like:
>>>>>>>>
>>>>>>>> if (flag == mtGC) {
>>>>>>>>    out->print("%s - %s (", NMTUtil::flag_to_name(flag),
>>>>>>>>                            GCConfig::hs_err_name());
>>>>>>>> } else {
>>>>>>>>    out->print("-%26s (", NMTUtil::flag_to_name(flag));
>>>>>>>> }
>>>>>>>>
>>>>>>> Yes, this is simpler.
>>>>>>>
>>>>>>> I don't like where the name is placed, it screws up section
>>>>>>> alignments. I would prefer to place the name inside the
>>>>>>> parentheses, e.g.
>>>>>>>
>>>>>>> - GC (g1 gc reserved=379056KB, committed=93220KB)
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -Zhengyu
>>>>>>>
>>>>>>>> and skip the need for a local buffer and snprintf?
>>>>>>>>
>>>>>>>> Aside: it's probably used in enough different contexts that
>>>>>>>> GCConfig::hs_err_name should be renamed.
>>>>>>>>
>>>>>>>> Also if the VM terminates during initialization is it possible
>>>>>>>> for this code to be executed before the GCConfig has been set up?
>>>>>>>> And if so how will it behave?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> This passed tier 1 and 2.
>>>>>>>>> Thanks,
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> JBS:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222818
>>>>>>>>>
>>>>>>>>> webrev:
>>>>>>>>> http://cr.openjdk.java.net/~ecaspole/JDK-8222818/02/webrev/

From jianglizhou at google.com Tue Apr 30 02:42:18 2019
From: jianglizhou at google.com (Jiangli Zhou)
Date: Mon, 29 Apr 2019 19:42:18 -0700
Subject: RFR: 8207812: Implement Dynamic CDS Archive
In-Reply-To: <5CC730C2.50800@oracle.com>
References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> <5CC730C2.50800@oracle.com>
Message-ID:

Hi Calvin,

The updates look ok. Please make sure to test with a minimum build as well.

Best,
Jiangli

On Mon, Apr 29, 2019 at 10:13 AM Calvin Cheung wrote:
>
> Hi Jiangli,
>
> Thanks for the re-review.
> Please see my comments in-line below...
> > On 4/24/19, 7:54 PM, Jiangli Zhou wrote: > > Please see comments inlined. > > > > On Tue, Apr 23, 2019 at 5:08 PM Calvin Cheung wrote: > >> Hi Jiangli, > >> > >> Thanks a lot for your review! > >> > >> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: > >>> Hi Calvin, > >>> > >>> Congrats on finalizing the dynamic archiving work and completing > >>> testing. After the integration of the dynamic archiving, a follow-up > >>> RFE can be done to merge the archiving/copying code in > >>> dynamicArchive.* and metaspaceShared.* for better maintenance in the > >>> future. As there are many duplicates between those two, having shared > >>> implementation for both static and dynamic will be beneficial and > >>> reduce the maintenance cost. > >> I'll file an RFE for the above. > >>> Here are my comments mainly for additional cleanups and some minor issues. > >>> > >>> - src/hotspot/share/classfile/classLoader.cpp > >>> > >>> 1337 // FIXME: DynamicDumpSharedSpaces and --patch-modules are > >>> mutually exclusive > >>> 1338 assert(!DynamicDumpSharedSpaces, "sanity"); > >>> > >>> I tagged the comment with 'FIXME' to serve as a reminder to add more > >>> details. The reason DynamicDumpSharedSpaces is 'mutually exclusive' > >>> with with --patch-modules because DynamicDumpSharedSpaces is only > >>> enabled when UseSharedSpaces is also enabled. As --patch-modules is > >>> not supported with UseSharedSpaces, it is not supported with > >>> DynamicDumpSharedSpaces either. > >> I've converted the FIXME to a comment. > >>> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); > >>> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, > >>> (ClassFileStream*)stream); > >>> > >>> Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above > >>> code. With the new dynamic archiving capability, it's now able to > >>> load/archive a class with user defined classloader via this call path. > >>> A comment explaining this is also needed. > >> I tried the assert but it didn't work. 
Not only DynamicDumpSharedSpaces > >> will go through that code path. > > I should be more clear. The new code is only intended for the > > DynamicDumpSharedSpaces, since the shared_classpath_index is set to > > UNREGISTERED_INDEX by ClassLoaderExt::load_class when loading class > > with "source:" in the class list file at static dumping time. > > > > 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); > > 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, > > (ClassFileStream*)stream); > > > > After thinking more, it's probably better to remove the following > > marked code from ClassLoaderExt::load_class. That avoids setting twice > > in two different places during static dumping. It also makes the code > > cleaner. > > > > InstanceKlass* ClassLoaderExt::load_class(Symbol* name, const char* > > path, TRAPS) { > > ... > > result->set_shared_classpath_index(UNREGISTERED_INDEX);<<<<<<<<<<< > > SystemDictionaryShared::set_shared_class_misc_info(result, stream); > > <<<<<<<<<<<< > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/src/hotspot/share/classfile/classLoaderExt.cpp.html > > Only the first statement is still there. I agree that the > set_shared_classpath_indexd() can be removed. > > > >>> - src/hotspot/share/classfile/classLoaderExt.cpp > >>> > >>> 64 void ClassLoaderExt::setup_app_search_path() { > >>> 65 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, > >>> 66 "this function is only used with -Xshare:dump"); > >>> > >>> The above message needs to be updated to reflect the new command-line option. > >> Done. > >>> 304 result->set_shared_classpath_index(UNREGISTERED_INDEX); > >>> 305 SystemDictionaryShared::set_shared_class_misc_info(result, > >>> stream);<<<<<<<<<< > >>> > >>> Why is the set_shared_class_misc_info call being removed? If this is a > >>> bug fix for loading classes from the classlist for user defined > >>> classloaders, it should be handled separately, and with a separate bug > >>> ID as well. 
> >> It is called in ClassLoader::record_result() from > >> KlassFactory::create_from_stream(). > > > > Ok, this is related to the above comment. > > > >>> - src/hotspot/share/classfile/compactHashtable.cpp > >>> > >>> 207 size_t SimpleCompactHashtable::calculate_header_size() { > >>> 208 // We have 5 fields. Each takes up sizeof(intptr_t). See > >>> WriteClosure::do_u4 > >>> 209 size_t bytes = sizeof(intptr_t) * 5; > >>> 210 return bytes; > >>> 211 } > >>> > >>> 212 > >>> 213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) { > >>> 214 // NOTE: if you change this function, you MUST change the number 5 in > >>> 215 // calculate_header_size() accordingly. > >>> ... > >>> > >>> As a cleanup, a better way to handle this is to calculate the size > >>> within SimpleCompactHashtable::serialize_header during serializing the > >>> data and set the size value in a valuable. > >>> SimpleCompactHashtable::calculate_header_size() should simply retrieve > >>> the value. A renaming of > >>> SimpleCompactHashtable::calculate_header_size() can also be done. > >> I've checked with Ioi on this one. The problem is > >> calculate_header_size() needs to be called during size estimation, and > >> serialize_header is called after size estimation. > > Can you please file a RFE for this? The current code is okay for the > > first integration. It deserves some efforts to make it cleaner > > (probably with a different solution) since it can be error-prone. > I've filed: > https://bugs.openjdk.java.net/browse/JDK-8223004 > Avoid using a hard-coded number in > SimpleCompactHashtable::calculate_header_size() > > > >>> - src/hotspot/share/classfile/dictionary.cpp > >>> > >>> 315 InstanceKlass* Dictionary::find_class(Symbol* name) { > >>> 316 unsigned int hash = compute_hash(name); > >>> 317 int index = hash_to_index(hash); > >>> 318 return find_class(index, hash, name); > >>> 319 } > >>> > >>> Looks like the new function is not references (unless I'm missing > >>> something). 
Please remove the function. > >>> > >>> - src/hotspot/share/classfile/dictionary.hpp > >>> > >>> 65 InstanceKlass* find_class(Symbol* name); > >>> > >>> Same comment as the above. > >> I've removed the function. > >>> - src/hotspot/share/classfile/symbolTable.cpp. > >>> > >>> 473 Symbol* const _archived; // used by UseSharedArchived2 > >>> > >>> Please removed 'UseSharedArchived2'. The comment also needs more clarifications. > >>> > >>> I couldn't find any references to SymbolTableCreateEntry. Can you > >>> please point to me where it is being used? > >> I've removed the entire SymbolTableCreateEntry class. It was left there > >> probably due to merge error. > >>> - src/hotspot/share/classfile/systemDictionaryShared.cpp > >>> > >>> 1218 if (DynamicDumpSharedSpaces) { > >>> 1219 return false; > >>> 1220 } else { > >>> > >>> The above case for DynamicDumpSharedSpaces needs to be examined > >>> carefully. Can you please ask Harold (and Coleen or Karen) to take a > >>> look? Also, a comment is needed to explain that we can complete all > >>> verification checks at dynamic dumping time. > >> I've added a comment. If it return false, the caller will call > >> VerificationType::resolve_and_check_assignability(). > >>> - src/hotspot/share/classfile/systemDictionaryShared.cpp > >>> > >>> 1279 ResourceMark rm; > >>> > >>> You can use 'ResourceMark rm(THREAD)'. > >> Fixed. > >>> - src/hotspot/share/memory/allocation.hpp > >>> > >>> 255 // > >>> 256 // When CDS is not enabled, both pointers are set to NULL. > >>> 257 static void* _shared_metaspace_base; // (inclusive) low address > >>> 258 static void* _shared_metaspace_top; // (exclusive) high addres > >>> > >>> Why the comment at line 256 was removed? > >> I've added back the comment. > >>> - src/hotspot/share/memory/filemap.cpp > >>> > >>> 101 void FileMapInfo::fail_continue(const char *msg, ...) 
{ > >>> 102 va_list ap; > >>> 103 va_start(ap, msg); > >>> 104 if (_runtime_dynamic_info == NULL) { > >>> 105 MetaspaceShared::set_archive_loading_failed(); > >>> 106 } else { > >>> 107 DynamicArchive::disable(); > >>> 108 } > >>> > >>> The above fail_continue only works if _runtime_dynamic_info is setup > >>> after the mapping the base archive. Comments should be add to explain > >>> that. > >> Comment added. > >>> Can you please rename '_runtime_dynamic_info' so it's more > >>> descriptive? Maybe use 'dynamic_archive_info'. > >> Renamed to '_dynamic_archive_info'. > >>> 587 bool FileMapInfo::same_files(const char* file1, const char* file2) { > >>> > >>> The usage of FileMapInfo::same_files is not necessary and should be > >>> removed. The base archive's CRC checksum values are recorded in the > >>> dynamic archive. The runtime verifies the CRC values to make sure the > >>> same archive is used at dump time and runtime, regardless of the base > >>> archive path or name. It is designed for all use cases: > >> The same_files() function is also used in arguments.cpp: > >> 3530 if (DynamicDumpSharedSpaces) { > >> 3531 if (FileMapInfo::same_files(SharedArchiveFile, > >> ArchiveClassesAtExit)) { > >> 3532 vm_exit_during_initialization( > >> 3533 "Cannot have the same archive file specified for > >> -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", > >> 3534 SharedArchiveFile); > >> 3535 } > >> 3536 } > >> > >> The function is also needed for the RFE: > >> https://bugs.openjdk.java.net/browse/JDK-8211723 > > Ok. It should be treated a bug, not a RFE. > > > > The shared path table check does not verify the path ordering (also > > including the case when new path components are inserted). The bug > > should be handled as a high priority task for dynamic archive. > I saw that you and Karen have had some discussion in the bug report > since you sent this review comment. I take that you're fine with that. > > > >> We still verify the CRC values during runtime. 
> >>> * base CDS archive is specified in the -XX:SharedArchiveFile at > >>> dynamic dumping time > >>> * -XX:SharedArchiveFile is not specified at dynamic dumping time, > >>> default location for the default CDS archive is used > >>> * default CDS archive is specified in the -XX:SharedArchiveFile at runtime > >>> * default CDS archive is not specified in the -XX:SharedArchiveFile at > >>> runtime, default location for the default CDS archive is used > >> Regarding the fourth point above, the user could have a non-default base > >> archive and only specify the top archive during runtime. > > I would argue against it since it doesn't always work and adds extra > > code. When the archive path/name is changed, the recorded one in the > > dynamic archive would no longer work. User still need to specify the > > path/name in the command-line. The use case only works for the default > > CDS archive. For non-default CDS archive, specifying in the > > command-line option results a cleaner design and less fragile code. > If the base archive is moved, then the user has to modify the > command-line anyway, whether the user initially specified (a) only the > top archive, or (b) both archives. > Requiring both archives to be specified will only hurt the users who > never move the archives. > > We'd like to leave it as is. > > > > >>> In all above cases, the base archive CRC values check is sufficient. > >>> The use of path/name is fragile and should be avoided. That will allow > >>> you to remove the _base_archive_name_size from the dynamic archive. > >> We still need the _base_archive_name_size and the base archive name in > >> the header because of the above reason. > > Please see my comment above. > > > >>> 752 if (is_static) { > >>> 753 // FIXME check for dynamic header as well > >>> 754 // FIXME Don't just check the last region -- check all regions! > >>> > >>> Can you please address the first FIXME at line 753? 
> >>> > >>> Checking the last region is sufficient since the archive is written is > >>> sequential order. The second FIXME is not necessary. > >> I've addressed the first FIXME and converted the second one to a comment. > >>> - src/hotspot/share/memory/metaspace.cpp > >>> > >>> 1417 bool Metaspace::contains(const void* ptr) { > >>> 1418 // FIXME: need to check the dynamic archive > >>> > >>> Can you please remove the above FIXME? There is no need for a separate check. > >> Done. > >>> - src/hotspot/share/memory/metaspaceShared.cpp > >>> > >>> 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive > >>> > >>> Can you please rename the function to fix_cpp_vtable_for_dynamic_archive? > >> Done. > >>> - src/hotspot/share/oops/klass.cpp > >>> > >>> 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces, > >>> 528 "only called for DumpSharedSpaces"); > >>> > >>> 544 void Klass::remove_java_mirror() { > >>> 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only > >>> called for DumpSharedSpaces"); > >>> > >>> Please fix the messages above. > >> Done. > >>> - src/hotspot/share/prims/whitebox.cpp > >>> > >>> 2332 {CC"getResolvedReferences", > >>> CC"(Ljava/lang/Class;)Ljava/lang/Object;", > >>> (void*)&WB_GetResolvedReferences}, > >>> 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V", > >>> (void*)&WB_LinkClass}, > >>> 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", > >>> (void*)&WB_AreOpenArchiveHeapObjectsMapped}, > >>> > >>> Can you please align the indentation of line 2333 (to be the same as > >>> line 2332 or 2334)? > >> Aligned (void*) with line 2334. 
(It doesn't show in the webrev since > >> only blank space changes) > >>> - src/hotspot/share/runtime/arguments.cpp > >>> > >>> 1491 bool Arguments::check_unsupported_cds_runtime_properties() { > >>> 1492 assert(UseSharedSpaces, "this function is only used with > >>> -Xshare:{on,auto}"); > >>> 1493 assert(ARRAY_SIZE(unsupported_properties) == > >>> ARRAY_SIZE(unsupported_options), "must be"); > >>> 1494 if (ArchiveClassesAtExit != NULL) { > >>> 1495 // dynamic dumping, just return false, > >>> check_unsupported_dumping_properties() will be called > >>> 1496 // in init_shared_archive_paths(). > >>> 1497 return false; > >>> 1498 } > >>> > >>> The check_unsupported_cds_runtime_properties() should be done for the > >>> 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a > >>> combination of both dump time and runtime. > >> The 'ArchiveClassesAtExit != NULL' is for dumping CDS archive to the > >> user's point of view, that's why the comments in lines 1495 and 1496. > >> During runtime, ArchiveClassesAtExit will be NULL, so the > >> check_unsupported_cds_runtime_properties() will be called as usual. > > During dynamic dumping, UseSharedSpace is true. Dynamic dumping is > > special case of the 'runtime', that's why Dynamic dumping it is a > > combination of both dump time and runtime. So > > check_unsupported_cds_runtime_properties() is also need for dynamic > > dumping. > If the check_unsupported_cds_runtime_properties() is called for dynamic > dumping, the user will see a warning message > warning("CDS is disabled when the %s option is specified.", > unsupported_options[i]); > instead of the following error with my current patch: > vm_exit_during_initialization("Cannot use the following option > when dumping the shared archive", unsupported_options[i]); > which gives the user the same feedback on the static dumping case. > > We think it is important to give correct error message to the user. 
We'd > like to leave the code as is but have updated the comment as follows: > > if (ArchiveClassesAtExit != NULL) { > // dynamic dumping, just return false for now. > // check_unsupported_dumping_properties() will be called later to > check the same set of > // properties, and will exit the VM with the correct error message > if the unsupported properties > // are used. > return false; > } > > Here's an updated delta webrev: > > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02%2b/ > > Additional changes comparing with delta_01_02 include: > - the mentioned one-line removal in classLoaderExt.cpp; > - updated comment in arguments.cpp; > - small fixes to handle tests run in -Xshare:off mode; > - enable building on the zero platform. > > Comparing with the delta_01_02 webrev, the additional changed files are: > > make/hotspot/lib/JvmFeatures.gmk > > src/hotspot/share/classfile/symbolTable.hpp > > src/hotspot/share/runtime/arguments.hpp > > > test/hotspot/jtreg/runtime/appcds/dynamicArchive/DynamicArchiveTestBase.java > > test/hotspot/jtreg/runtime/appcds/dynamicArchive/HelloDynamicCustom.java > > Thanks again for your review and contribution to this RFE. > > Calvin > >>> 2729 // -Xshare:auto || -Xshare:dynamicDump > >>> > >>> As you've renamed the command-line argument for dynamic dumping > >>> support, the comment needs to be fixed. > >> Fixed. > >>> 3125 // Compiler threads may concurrently update the class > >>> metadata (such as method entries), so it's > >>> 3126 // unsafe with DumpSharedSpaces (which modifies the class > >>> metadata in place). Let's disable > >>> 3127 // compiler just to be safe. > >>> 3128 // > >>> 3129 // Note: this is not a concern for DynamicDumpSharedSpaces, > >>> which makes a copy of the class metadata > >>> 3130 // instead of modifying them in place. The copy is > >>> inaccessible to the compiler. 
> >>> 3131 set_mode_flags(_int); > >>> > >>> We need to come back to revisit the above for the 'static' archive > >>> dumping at one point. There is a RFE filed for that, if I remember > >>> correctly. Could you please add a 'TODO' notes in the above comment. > >> Added TODO. > >>> A check should be done in arguments.cpp to make sure > >>> DynamicDumpSharedSpaces is not manipulated from the command-line > >>> directly. DynamicDumpSharedSpaces should not be enabled in the > >>> command-line without ArchiveClassesAtExit being specified. > >> Done. > >>> - src/hotspot/share/runtime/java.cpp > >>> > >>> 509 > >>> 510 // FIXME: is this the right place? > >>> 511 if (DynamicDumpSharedSpaces) { > >>> 512 DynamicArchive::dump(); > >>> 513 } > >>> > >>> Again, the above 'FIXME' is served as a cleanup reminder. Please get > >>> opinions from others on this change. If the calling place is okay, > >>> please remove the FIXME. > >> Removed the FIXME for now. Checked with David H. He indicated there's no > >> easy answer for this. Just need to do a lot of testing. > >>> - test > >>> > >>> Could you please add a test case for setting DynamicDumpSharedSpaces > >>> from command-line? > >> Here's an incremental webrev which contains a new test: > >> > >> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/ > >> > >> thanks, > >> Calvin > >>> I only took a brief look of the test changes. Please ask Misha to > >>> review the test changes as well. 
> >>> > >>> Thanks and regards, > >>> Jiangli > > Thanks, > > Jiangli From karen.kinnear at oracle.com Tue Apr 30 16:00:48 2019 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Tue, 30 Apr 2019 12:00:48 -0400 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <5CC730C2.50800@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> <5CC730C2.50800@oracle.com> Message-ID: <6BB7EA19-F3C5-44C6-8C0B-137361EE9C70@oracle.com> Code changes look good to go for me. Many thanks for the updates and additional testing. thanks, Karen p.s. minor note: dynamicArchive.hpp line 57 "an copy" -> "a copy" > On Apr 29, 2019, at 1:13 PM, Calvin Cheung wrote: > > Hi Jiangli, > > Thanks for the re-review. > Please see my comments in-line below... > > On 4/24/19, 7:54 PM, Jiangli Zhou wrote: >> Please see comments inlined. >> >> On Tue, Apr 23, 2019 at 5:08 PM Calvin Cheung wrote: >>> Hi Jiangli, >>> >>> Thanks a lot for your review! >>> >>> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >>>> Hi Calvin, >>>> >>>> Congrats on finalizing the dynamic archiving work and completing >>>> testing. After the integration of the dynamic archiving, a follow-up >>>> RFE can be done to merge the archiving/copying code in >>>> dynamicArchive.* and metaspaceShared.* for better maintenance in the >>>> future. As there are many duplicates between those two, having shared >>>> implementation for both static and dynamic will be beneficial and >>>> reduce the maintenance cost. >>> I'll file an RFE for the above. >>>> Here are my comments mainly for additional cleanups and some minor issues. >>>> >>>> - src/hotspot/share/classfile/classLoader.cpp >>>> >>>> 1337 // FIXME: DynamicDumpSharedSpaces and --patch-modules are >>>> mutually exclusive >>>> 1338 assert(!DynamicDumpSharedSpaces, "sanity"); >>>> >>>> I tagged the comment with 'FIXME' to serve as a reminder to add more >>>> details.
The reason DynamicDumpSharedSpaces is 'mutually exclusive' >>>> with with --patch-modules because DynamicDumpSharedSpaces is only >>>> enabled when UseSharedSpaces is also enabled. As --patch-modules is >>>> not supported with UseSharedSpaces, it is not supported with >>>> DynamicDumpSharedSpaces either. >>> I've converted the FIXME to a comment. >>>> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); >>>> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, >>>> (ClassFileStream*)stream); >>>> >>>> Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above >>>> code. With the new dynamic archiving capability, it's now able to >>>> load/archive a class with user defined classloader via this call path. >>>> A comment explaining this is also needed. >>> I tried the assert but it didn't work. Not only DynamicDumpSharedSpaces >>> will go through that code path. >> I should be more clear. The new code is only intended for the >> DynamicDumpSharedSpaces, since the shared_classpath_index is set to >> UNREGISTERED_INDEX by ClassLoaderExt::load_class when loading class >> with "source:" in the class list file at static dumping time. >> >> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); >> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, >> (ClassFileStream*)stream); >> >> After thinking more, it's probably better to remove the following >> marked code from ClassLoaderExt::load_class. That avoids setting twice >> in two different places during static dumping. It also makes the code >> cleaner. >> >> InstanceKlass* ClassLoaderExt::load_class(Symbol* name, const char* >> path, TRAPS) { >> ... >> result->set_shared_classpath_index(UNREGISTERED_INDEX);<<<<<<<<<<< >> SystemDictionaryShared::set_shared_class_misc_info(result, stream); >> <<<<<<<<<<<< > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/src/hotspot/share/classfile/classLoaderExt.cpp.html > > Only the first statement is still there. 
I agree that the set_shared_classpath_index() can be removed. >> >>>> - src/hotspot/share/classfile/classLoaderExt.cpp >>>> >>>> 64 void ClassLoaderExt::setup_app_search_path() { >>>> 65 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, >>>> 66 "this function is only used with -Xshare:dump"); >>>> >>>> The above message needs to be updated to reflect the new command-line option. >>> Done. >>>> 304 result->set_shared_classpath_index(UNREGISTERED_INDEX); >>>> 305 SystemDictionaryShared::set_shared_class_misc_info(result, >>>> stream);<<<<<<<<<< >>>> >>>> Why is the set_shared_class_misc_info call being removed? If this is a >>>> bug fix for loading classes from the classlist for user defined >>>> classloaders, it should be handled separately, and with a separate bug >>>> ID as well. >>> It is called in ClassLoader::record_result() from >>> KlassFactory::create_from_stream(). >> >> Ok, this is related to the above comment. >> >>>> - src/hotspot/share/classfile/compactHashtable.cpp >>>> >>>> 207 size_t SimpleCompactHashtable::calculate_header_size() { >>>> 208 // We have 5 fields. Each takes up sizeof(intptr_t). See >>>> WriteClosure::do_u4 >>>> 209 size_t bytes = sizeof(intptr_t) * 5; >>>> 210 return bytes; >>>> 211 } >>>> >>>> 212 >>>> 213 void SimpleCompactHashtable::serialize_header(SerializeClosure* soc) { >>>> 214 // NOTE: if you change this function, you MUST change the number 5 in >>>> 215 // calculate_header_size() accordingly. >>>> ... >>>> >>>> As a cleanup, a better way to handle this is to calculate the size >>>> within SimpleCompactHashtable::serialize_header while serializing the >>>> data and store the size value in a variable. >>>> SimpleCompactHashtable::calculate_header_size() should simply retrieve >>>> the value. A renaming of >>>> SimpleCompactHashtable::calculate_header_size() can also be done. >>> I've checked with Ioi on this one.
The problem is >>> calculate_header_size() needs to be called during size estimation, and >>> serialize_header is called after size estimation. >> Can you please file a RFE for this? The current code is okay for the >> first integration. It deserves some efforts to make it cleaner >> (probably with a different solution) since it can be error-prone. > I've filed: > https://bugs.openjdk.java.net/browse/JDK-8223004 > Avoid using a hard-coded number in SimpleCompactHashtable::calculate_header_size() >> >>>> - src/hotspot/share/classfile/dictionary.cpp >>>> >>>> 315 InstanceKlass* Dictionary::find_class(Symbol* name) { >>>> 316 unsigned int hash = compute_hash(name); >>>> 317 int index = hash_to_index(hash); >>>> 318 return find_class(index, hash, name); >>>> 319 } >>>> >>>> Looks like the new function is not references (unless I'm missing >>>> something). Please remove the function. >>>> >>>> - src/hotspot/share/classfile/dictionary.hpp >>>> >>>> 65 InstanceKlass* find_class(Symbol* name); >>>> >>>> Same comment as the above. >>> I've removed the function. >>>> - src/hotspot/share/classfile/symbolTable.cpp. >>>> >>>> 473 Symbol* const _archived; // used by UseSharedArchived2 >>>> >>>> Please removed 'UseSharedArchived2'. The comment also needs more clarifications. >>>> >>>> I couldn't find any references to SymbolTableCreateEntry. Can you >>>> please point to me where it is being used? >>> I've removed the entire SymbolTableCreateEntry class. It was left there >>> probably due to merge error. >>>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>>> >>>> 1218 if (DynamicDumpSharedSpaces) { >>>> 1219 return false; >>>> 1220 } else { >>>> >>>> The above case for DynamicDumpSharedSpaces needs to be examined >>>> carefully. Can you please ask Harold (and Coleen or Karen) to take a >>>> look? Also, a comment is needed to explain that we can complete all >>>> verification checks at dynamic dumping time. >>> I've added a comment. 
If it return false, the caller will call >>> VerificationType::resolve_and_check_assignability(). >>>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>>> >>>> 1279 ResourceMark rm; >>>> >>>> You can use 'ResourceMark rm(THREAD)'. >>> Fixed. >>>> - src/hotspot/share/memory/allocation.hpp >>>> >>>> 255 // >>>> 256 // When CDS is not enabled, both pointers are set to NULL. >>>> 257 static void* _shared_metaspace_base; // (inclusive) low address >>>> 258 static void* _shared_metaspace_top; // (exclusive) high addres >>>> >>>> Why the comment at line 256 was removed? >>> I've added back the comment. >>>> - src/hotspot/share/memory/filemap.cpp >>>> >>>> 101 void FileMapInfo::fail_continue(const char *msg, ...) { >>>> 102 va_list ap; >>>> 103 va_start(ap, msg); >>>> 104 if (_runtime_dynamic_info == NULL) { >>>> 105 MetaspaceShared::set_archive_loading_failed(); >>>> 106 } else { >>>> 107 DynamicArchive::disable(); >>>> 108 } >>>> >>>> The above fail_continue only works if _runtime_dynamic_info is setup >>>> after the mapping the base archive. Comments should be add to explain >>>> that. >>> Comment added. >>>> Can you please rename '_runtime_dynamic_info' so it's more >>>> descriptive? Maybe use 'dynamic_archive_info'. >>> Renamed to '_dynamic_archive_info'. >>>> 587 bool FileMapInfo::same_files(const char* file1, const char* file2) { >>>> >>>> The usage of FileMapInfo::same_files is not necessary and should be >>>> removed. The base archive's CRC checksum values are recorded in the >>>> dynamic archive. The runtime verifies the CRC values to make sure the >>>> same archive is used at dump time and runtime, regardless of the base >>>> archive path or name. 
It is designed for all use cases: >>> The same_files() function is also used in arguments.cpp: >>> 3530 if (DynamicDumpSharedSpaces) { >>> 3531 if (FileMapInfo::same_files(SharedArchiveFile, >>> ArchiveClassesAtExit)) { >>> 3532 vm_exit_during_initialization( >>> 3533 "Cannot have the same archive file specified for >>> -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", >>> 3534 SharedArchiveFile); >>> 3535 } >>> 3536 } >>> >>> The function is also needed for the RFE: >>> https://bugs.openjdk.java.net/browse/JDK-8211723 >> Ok. It should be treated a bug, not a RFE. >> >> The shared path table check does not verify the path ordering (also >> including the case when new path components are inserted). The bug >> should be handled as a high priority task for dynamic archive. > I saw that you and Karen have had some discussion in the bug report since you sent this review comment. I take that you're fine with that. >> >>> We still verify the CRC values during runtime. >>>> * base CDS archive is specified in the -XX:SharedArchiveFile at >>>> dynamic dumping time >>>> * -XX:SharedArchiveFile is not specified at dynamic dumping time, >>>> default location for the default CDS archive is used >>>> * default CDS archive is specified in the -XX:SharedArchiveFile at runtime >>>> * default CDS archive is not specified in the -XX:SharedArchiveFile at >>>> runtime, default location for the default CDS archive is used >>> Regarding the fourth point above, the user could have a non-default base >>> archive and only specify the top archive during runtime. >> I would argue against it since it doesn't always work and adds extra >> code. When the archive path/name is changed, the recorded one in the >> dynamic archive would no longer work. User still need to specify the >> path/name in the command-line. The use case only works for the default >> CDS archive. For non-default CDS archive, specifying in the >> command-line option results a cleaner design and less fragile code. 
> If the base archive is moved, then the user has to modify the command-line anyway, whether the user initially specified (a) only the top archive, or (b) both archives. > Requiring both archives to be specified will only hurt the users who never move the archives. > > We'd like to leave it as is. > >> >>>> In all above cases, the base archive CRC values check is sufficient. >>>> The use of path/name is fragile and should be avoided. That will allow >>>> you to remove the _base_archive_name_size from the dynamic archive. >>> We still need the _base_archive_name_size and the base archive name in >>> the header because of the above reason. >> Please see my comment above. >> >>>> 752 if (is_static) { >>>> 753 // FIXME check for dynamic header as well >>>> 754 // FIXME Don't just check the last region -- check all regions! >>>> >>>> Can you please address the first FIXME at line 753? >>>> >>>> Checking the last region is sufficient since the archive is written is >>>> sequential order. The second FIXME is not necessary. >>> I've addressed the first FIXME and converted the second one to a comment. >>>> - src/hotspot/share/memory/metaspace.cpp >>>> >>>> 1417 bool Metaspace::contains(const void* ptr) { >>>> 1418 // FIXME: need to check the dynamic archive >>>> >>>> Can you please remove the above FIXME? There is no need for a separate check. >>> Done. >>>> - src/hotspot/share/memory/metaspaceShared.cpp >>>> >>>> 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive >>>> >>>> Can you please rename the function to fix_cpp_vtable_for_dynamic_archive? >>> Done. >>>> - src/hotspot/share/oops/klass.cpp >>>> >>>> 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces, >>>> 528 "only called for DumpSharedSpaces"); >>>> >>>> 544 void Klass::remove_java_mirror() { >>>> 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only >>>> called for DumpSharedSpaces"); >>>> >>>> Please fix the messages above. >>> Done. 
>>>> - src/hotspot/share/prims/whitebox.cpp >>>> >>>> 2332 {CC"getResolvedReferences", >>>> CC"(Ljava/lang/Class;)Ljava/lang/Object;", >>>> (void*)&WB_GetResolvedReferences}, >>>> 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V", >>>> (void*)&WB_LinkClass}, >>>> 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z", >>>> (void*)&WB_AreOpenArchiveHeapObjectsMapped}, >>>> >>>> Can you please align the indentation of line 2333 (to be the same as >>>> line 2332 or 2334)? >>> Aligned (void*) with line 2334. (It doesn't show in the webrev since >>> only blank space changes) >>>> - src/hotspot/share/runtime/arguments.cpp >>>> >>>> 1491 bool Arguments::check_unsupported_cds_runtime_properties() { >>>> 1492 assert(UseSharedSpaces, "this function is only used with >>>> -Xshare:{on,auto}"); >>>> 1493 assert(ARRAY_SIZE(unsupported_properties) == >>>> ARRAY_SIZE(unsupported_options), "must be"); >>>> 1494 if (ArchiveClassesAtExit != NULL) { >>>> 1495 // dynamic dumping, just return false, >>>> check_unsupported_dumping_properties() will be called >>>> 1496 // in init_shared_archive_paths(). >>>> 1497 return false; >>>> 1498 } >>>> >>>> The check_unsupported_cds_runtime_properties() should be done for the >>>> 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a >>>> combination of both dump time and runtime. >>> The 'ArchiveClassesAtExit != NULL' is for dumping CDS archive to the >>> user's point of view, that's why the comments in lines 1495 and 1496. >>> During runtime, ArchiveClassesAtExit will be NULL, so the >>> check_unsupported_cds_runtime_properties() will be called as usual. >> During dynamic dumping, UseSharedSpace is true. Dynamic dumping is >> special case of the 'runtime', that's why Dynamic dumping it is a >> combination of both dump time and runtime. So >> check_unsupported_cds_runtime_properties() is also need for dynamic >> dumping. 
> If the check_unsupported_cds_runtime_properties() is called for dynamic dumping, the user will see a warning message > warning("CDS is disabled when the %s option is specified.", unsupported_options[i]); > instead of the following error with my current patch: > vm_exit_during_initialization("Cannot use the following option when dumping the shared archive", unsupported_options[i]); > which gives the user the same feedback on the static dumping case. > > We think it is important to give correct error message to the user. We'd like to leave the code as is but have updated the comment as follows: > > if (ArchiveClassesAtExit != NULL) { > // dynamic dumping, just return false for now. > // check_unsupported_dumping_properties() will be called later to check the same set of > // properties, and will exit the VM with the correct error message if the unsupported properties > // are used. > return false; > } > > Here's an updated delta webrev: > http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02%2b/ > > Additional changes comparing with delta_01_02 include: > - the mentioned one-line removal in classLoaderExt.cpp; > - updated comment in arguments.cpp; > - small fixes to handle tests run in -Xshare:off mode; > - enable building on the zero platform. > > Comparing with the delta_01_02 webrev, the additional changed files are: > > make/hotspot/lib/JvmFeatures.gmk > > src/hotspot/share/classfile/symbolTable.hpp > > src/hotspot/share/runtime/arguments.hpp > > test/hotspot/jtreg/runtime/appcds/dynamicArchive/DynamicArchiveTestBase.java > > test/hotspot/jtreg/runtime/appcds/dynamicArchive/HelloDynamicCustom.java > > Thanks again for your review and contribution to this RFE. > > Calvin >>>> 2729 // -Xshare:auto || -Xshare:dynamicDump >>>> >>>> As you've renamed the command-line argument for dynamic dumping >>>> support, the comment needs to be fixed. >>> Fixed. 
>>>> 3125 // Compiler threads may concurrently update the class >>>> metadata (such as method entries), so it's >>>> 3126 // unsafe with DumpSharedSpaces (which modifies the class >>>> metadata in place). Let's disable >>>> 3127 // compiler just to be safe. >>>> 3128 // >>>> 3129 // Note: this is not a concern for DynamicDumpSharedSpaces, >>>> which makes a copy of the class metadata >>>> 3130 // instead of modifying them in place. The copy is >>>> inaccessible to the compiler. >>>> 3131 set_mode_flags(_int); >>>> >>>> We need to come back to revisit the above for the 'static' archive >>>> dumping at one point. There is a RFE filed for that, if I remember >>>> correctly. Could you please add a 'TODO' notes in the above comment. >>> Added TODO. >>>> A check should be done in arguments.cpp to make sure >>>> DynamicDumpSharedSpaces is not manipulated from the command-line >>>> directly. DynamicDumpSharedSpaces should not be enabled in the >>>> command-line without ArchiveClassesAtExit being specified. >>> Done. >>>> - src/hotspot/share/runtime/java.cpp >>>> >>>> 509 >>>> 510 // FIXME: is this the right place? >>>> 511 if (DynamicDumpSharedSpaces) { >>>> 512 DynamicArchive::dump(); >>>> 513 } >>>> >>>> Again, the above 'FIXME' is served as a cleanup reminder. Please get >>>> opinions from others on this change. If the calling place is okay, >>>> please remove the FIXME. >>> Removed the FIXME for now. Checked with David H. He indicated there's no >>> easy answer for this. Just need to do a lot of testing. >>>> - test >>>> >>>> Could you please add a test case for setting DynamicDumpSharedSpaces >>>> from command-line? >>> Here's an incremental webrev which contains a new test: >>> >>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/ >>> >>> thanks, >>> Calvin >>>> I only took a brief look of the test changes. Please ask Misha to >>>> review the test changes as well. 
>>>> >>>> Thanks and regards, >>>> Jiangli >> Thanks, >> Jiangli From calvin.cheung at oracle.com Tue Apr 30 17:46:42 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 30 Apr 2019 10:46:42 -0700 Subject: RFR: 8207812: Implement Dynamic CDS Archive In-Reply-To: <6BB7EA19-F3C5-44C6-8C0B-137361EE9C70@oracle.com> References: <5CAFAF21.3030007@oracle.com> <5CBA333F.20702@oracle.com> <5CBFA906.1030205@oracle.com> <5CC730C2.50800@oracle.com> <6BB7EA19-F3C5-44C6-8C0B-137361EE9C70@oracle.com> Message-ID: <5CC88A02.1010703@oracle.com> Karen, Thanks for taking another look. I'll take care of the typo you mentioned below. thanks, Calvin On 4/30/19, 9:00 AM, Karen Kinnear wrote: > Code changes look good to go for me. > Many thanks for the updates and additional testing. > > thanks, > Karen > > p.s. minor note: dynamicArchive.hpp line 57 "an copy" -> "a copy" > >> On Apr 29, 2019, at 1:13 PM, Calvin Cheung wrote: >> >> Hi Jiangli, >> >> Thanks for the re-review. >> Please see my comments in-line below... >> >> On 4/24/19, 7:54 PM, Jiangli Zhou wrote: >>> Please see comments inlined. >>> >>> On Tue, Apr 23, 2019 at 5:08 PM Calvin Cheung wrote: >>>> Hi Jiangli, >>>> >>>> Thanks a lot for your review! >>>> >>>> On 4/22/19, 2:07 PM, Jiangli Zhou wrote: >>>>> Hi Calvin, >>>>> >>>>> Congrats on finalizing the dynamic archiving work and completing >>>>> testing. After the integration of the dynamic archiving, a follow-up >>>>> RFE can be done to merge the archiving/copying code in >>>>> dynamicArchive.* and metaspaceShared.* for better maintenance in the >>>>> future. As there are many duplicates between those two, having shared >>>>> implementation for both static and dynamic will be beneficial and >>>>> reduce the maintenance cost. >>>> I'll file an RFE for the above. >>>>> Here are my comments mainly for additional cleanups and some minor >>>>> issues.
>>>>> >>>>> - src/hotspot/share/classfile/classLoader.cpp >>>>> >>>>> 1337 // FIXME: DynamicDumpSharedSpaces and --patch-modules are >>>>> mutually exclusive >>>>> 1338 assert(!DynamicDumpSharedSpaces, "sanity"); >>>>> >>>>> I tagged the comment with 'FIXME' to serve as a reminder to add more >>>>> details. The reason DynamicDumpSharedSpaces is 'mutually exclusive' >>>>> with with --patch-modules because DynamicDumpSharedSpaces is only >>>>> enabled when UseSharedSpaces is also enabled. As --patch-modules is >>>>> not supported with UseSharedSpaces, it is not supported with >>>>> DynamicDumpSharedSpaces either. >>>> I've converted the FIXME to a comment. >>>>> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); >>>>> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, >>>>> (ClassFileStream*)stream); >>>>> >>>>> Please add assert(DynamicDumpSharedSpaces, "sanity"); to the above >>>>> code. With the new dynamic archiving capability, it's now able to >>>>> load/archive a class with user defined classloader via this call path. >>>>> A comment explaining this is also needed. >>>> I tried the assert but it didn't work. Not only DynamicDumpSharedSpaces >>>> will go through that code path. >>> I should be more clear. The new code is only intended for the >>> DynamicDumpSharedSpaces, since the shared_classpath_index is set to >>> UNREGISTERED_INDEX by ClassLoaderExt::load_class when loading class >>> with "source:" in the class list file at static dumping time. >>> >>> 1518 ik->set_shared_classpath_index(UNREGISTERED_INDEX); >>> 1519 SystemDictionaryShared::set_shared_class_misc_info(ik, >>> (ClassFileStream*)stream); >>> >>> After thinking more, it's probably better to remove the following >>> marked code from ClassLoaderExt::load_class. That avoids setting twice >>> in two different places during static dumping. It also makes the code >>> cleaner. >>> >>> InstanceKlass* ClassLoaderExt::load_class(Symbol* name, const char* >>> path, TRAPS) { >>> ... 
>>> result->set_shared_classpath_index(UNREGISTERED_INDEX);<<<<<<<<<<< >>> SystemDictionaryShared::set_shared_class_misc_info(result, stream); >>> <<<<<<<<<<<< >> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/src/hotspot/share/classfile/classLoaderExt.cpp.html >> >> >> Only the first statement is still there. I agree that the >> set_shared_classpath_index() can be removed. >>> >>>>> - src/hotspot/share/classfile/classLoaderExt.cpp >>>>> >>>>> 64 void ClassLoaderExt::setup_app_search_path() { >>>>> 65 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, >>>>> 66 "this function is only used with -Xshare:dump"); >>>>> >>>>> The above message needs to be updated to reflect the new >>>>> command-line option. >>>> Done. >>>>> 304 result->set_shared_classpath_index(UNREGISTERED_INDEX); >>>>> 305 SystemDictionaryShared::set_shared_class_misc_info(result, >>>>> stream);<<<<<<<<<< >>>>> >>>>> Why is the set_shared_class_misc_info call being removed? If this is a >>>>> bug fix for loading classes from the classlist for user defined >>>>> classloaders, it should be handled separately, and with a separate bug >>>>> ID as well. >>>> It is called in ClassLoader::record_result() from >>>> KlassFactory::create_from_stream(). >>> >>> Ok, this is related to the above comment. >>> >>>>> - src/hotspot/share/classfile/compactHashtable.cpp >>>>> >>>>> 207 size_t SimpleCompactHashtable::calculate_header_size() { >>>>> 208 // We have 5 fields. Each takes up sizeof(intptr_t). See >>>>> WriteClosure::do_u4 >>>>> 209 size_t bytes = sizeof(intptr_t) * 5; >>>>> 210 return bytes; >>>>> 211 } >>>>> >>>>> 212 >>>>> 213 void >>>>> SimpleCompactHashtable::serialize_header(SerializeClosure* soc) { >>>>> 214 // NOTE: if you change this function, you MUST change the >>>>> number 5 in >>>>> 215 // calculate_header_size() accordingly. >>>>> ...
>>>>> >>>>> As a cleanup, a better way to handle this is to calculate the size >>>>> within SimpleCompactHashtable::serialize_header during serializing the >>>>> data and set the size value in a valuable. >>>>> SimpleCompactHashtable::calculate_header_size() should simply retrieve >>>>> the value. A renaming of >>>>> SimpleCompactHashtable::calculate_header_size() can also be done. >>>> I've checked with Ioi on this one. The problem is >>>> calculate_header_size() needs to be called during size estimation, and >>>> serialize_header is called after size estimation. >>> Can you please file a RFE for this? The current code is okay for the >>> first integration. It deserves some efforts to make it cleaner >>> (probably with a different solution) since it can be error-prone. >> I've filed: >> https://bugs.openjdk.java.net/browse/JDK-8223004 >> Avoid using a hard-coded number in >> SimpleCompactHashtable::calculate_header_size() >>> >>>>> - src/hotspot/share/classfile/dictionary.cpp >>>>> >>>>> 315 InstanceKlass* Dictionary::find_class(Symbol* name) { >>>>> 316 unsigned int hash = compute_hash(name); >>>>> 317 int index = hash_to_index(hash); >>>>> 318 return find_class(index, hash, name); >>>>> 319 } >>>>> >>>>> Looks like the new function is not references (unless I'm missing >>>>> something). Please remove the function. >>>>> >>>>> - src/hotspot/share/classfile/dictionary.hpp >>>>> >>>>> 65 InstanceKlass* find_class(Symbol* name); >>>>> >>>>> Same comment as the above. >>>> I've removed the function. >>>>> - src/hotspot/share/classfile/symbolTable.cpp. >>>>> >>>>> 473 Symbol* const _archived; // used by UseSharedArchived2 >>>>> >>>>> Please removed 'UseSharedArchived2'. The comment also needs more >>>>> clarifications. >>>>> >>>>> I couldn't find any references to SymbolTableCreateEntry. Can you >>>>> please point to me where it is being used? >>>> I've removed the entire SymbolTableCreateEntry class. It was left there >>>> probably due to merge error. 
>>>>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>>>> >>>>> 1218 if (DynamicDumpSharedSpaces) { >>>>> 1219 return false; >>>>> 1220 } else { >>>>> >>>>> The above case for DynamicDumpSharedSpaces needs to be examined >>>>> carefully. Can you please ask Harold (and Coleen or Karen) to take a >>>>> look? Also, a comment is needed to explain that we can complete all >>>>> verification checks at dynamic dumping time. >>>> I've added a comment. If it return false, the caller will call >>>> VerificationType::resolve_and_check_assignability(). >>>>> - src/hotspot/share/classfile/systemDictionaryShared.cpp >>>>> >>>>> 1279 ResourceMark rm; >>>>> >>>>> You can use 'ResourceMark rm(THREAD)'. >>>> Fixed. >>>>> - src/hotspot/share/memory/allocation.hpp >>>>> >>>>> 255 // >>>>> 256 // When CDS is not enabled, both pointers are set to NULL. >>>>> 257 static void* _shared_metaspace_base; // (inclusive) low >>>>> address >>>>> 258 static void* _shared_metaspace_top; // (exclusive) high >>>>> addres >>>>> >>>>> Why the comment at line 256 was removed? >>>> I've added back the comment. >>>>> - src/hotspot/share/memory/filemap.cpp >>>>> >>>>> 101 void FileMapInfo::fail_continue(const char *msg, ...) { >>>>> 102 va_list ap; >>>>> 103 va_start(ap, msg); >>>>> 104 if (_runtime_dynamic_info == NULL) { >>>>> 105 MetaspaceShared::set_archive_loading_failed(); >>>>> 106 } else { >>>>> 107 DynamicArchive::disable(); >>>>> 108 } >>>>> >>>>> The above fail_continue only works if _runtime_dynamic_info is setup >>>>> after the mapping the base archive. Comments should be add to explain >>>>> that. >>>> Comment added. >>>>> Can you please rename '_runtime_dynamic_info' so it's more >>>>> descriptive? Maybe use 'dynamic_archive_info'. >>>> Renamed to '_dynamic_archive_info'. >>>>> 587 bool FileMapInfo::same_files(const char* file1, const char* >>>>> file2) { >>>>> >>>>> The usage of FileMapInfo::same_files is not necessary and should be >>>>> removed. 
The base archive's CRC checksum values are recorded in the >>>>> dynamic archive. The runtime verifies the CRC values to make sure the >>>>> same archive is used at dump time and runtime, regardless of the base >>>>> archive path or name. It is designed for all use cases: >>>> The same_files() function is also used in arguments.cpp: >>>> 3530 if (DynamicDumpSharedSpaces) { >>>> 3531 if (FileMapInfo::same_files(SharedArchiveFile, >>>> ArchiveClassesAtExit)) { >>>> 3532 vm_exit_during_initialization( >>>> 3533 "Cannot have the same archive file specified for >>>> -XX:SharedArchiveFile and -XX:ArchiveClassesAtExit", >>>> 3534 SharedArchiveFile); >>>> 3535 } >>>> 3536 } >>>> >>>> The function is also needed for the RFE: >>>> https://bugs.openjdk.java.net/browse/JDK-8211723 >>> Ok. It should be treated a bug, not a RFE. >>> >>> The shared path table check does not verify the path ordering (also >>> including the case when new path components are inserted). The bug >>> should be handled as a high priority task for dynamic archive. >> I saw that you and Karen have had some discussion in the bug report >> since you sent this review comment. I take that you're fine with that. >>> >>>> We still verify the CRC values during runtime. >>>>> * base CDS archive is specified in the -XX:SharedArchiveFile at >>>>> dynamic dumping time >>>>> * -XX:SharedArchiveFile is not specified at dynamic dumping time, >>>>> default location for the default CDS archive is used >>>>> * default CDS archive is specified in the -XX:SharedArchiveFile at >>>>> runtime >>>>> * default CDS archive is not specified in the -XX:SharedArchiveFile at >>>>> runtime, default location for the default CDS archive is used >>>> Regarding the fourth point above, the user could have a non-default >>>> base >>>> archive and only specify the top archive during runtime. >>> I would argue against it since it doesn't always work and adds extra >>> code. 
>>> When the archive path/name is changed, the recorded one in the
>>> dynamic archive would no longer work. User still need to specify the
>>> path/name in the command-line. The use case only works for the default
>>> CDS archive. For non-default CDS archive, specifying in the
>>> command-line option results a cleaner design and less fragile code.
>> If the base archive is moved, then the user has to modify the
>> command-line anyway, whether the user initially specified (a) only
>> the top archive, or (b) both archives.
>> Requiring both archives to be specified will only hurt the users who
>> never move the archives.
>>
>> We'd like to leave it as is.
>>
>>>
>>>>> In all above cases, the base archive CRC values check is sufficient.
>>>>> The use of path/name is fragile and should be avoided. That will allow
>>>>> you to remove the _base_archive_name_size from the dynamic archive.
>>>> We still need the _base_archive_name_size and the base archive name in
>>>> the header because of the above reason.
>>> Please see my comment above.
>>>
>>>>> 752 if (is_static) {
>>>>> 753 // FIXME check for dynamic header as well
>>>>> 754 // FIXME Don't just check the last region -- check all
>>>>> regions!
>>>>>
>>>>> Can you please address the first FIXME at line 753?
>>>>>
>>>>> Checking the last region is sufficient since the archive is written is
>>>>> sequential order. The second FIXME is not necessary.
>>>> I've addressed the first FIXME and converted the second one to a
>>>> comment.
>>>>> - src/hotspot/share/memory/metaspace.cpp
>>>>>
>>>>> 1417 bool Metaspace::contains(const void* ptr) {
>>>>> 1418 // FIXME: need to check the dynamic archive
>>>>>
>>>>> Can you please remove the above FIXME? There is no need for a
>>>>> separate check.
>>>> Done.
>>>>> - src/hotspot/share/memory/metaspaceShared.cpp
>>>>>
>>>>> 830 intptr_t* MetaspaceShared::fix_cpp_vtable_for_second_archive
>>>>>
>>>>> Can you please rename the function to
>>>>> fix_cpp_vtable_for_dynamic_archive?
>>>> Done.
>>>>> - src/hotspot/share/oops/klass.cpp
>>>>>
>>>>> 527 assert (DumpSharedSpaces || DynamicDumpSharedSpaces,
>>>>> 528 "only called for DumpSharedSpaces");
>>>>>
>>>>> 544 void Klass::remove_java_mirror() {
>>>>> 545 assert(DumpSharedSpaces || DynamicDumpSharedSpaces, "only
>>>>> called for DumpSharedSpaces");
>>>>>
>>>>> Please fix the messages above.
>>>> Done.
>>>>> - src/hotspot/share/prims/whitebox.cpp
>>>>>
>>>>> 2332 {CC"getResolvedReferences",
>>>>> CC"(Ljava/lang/Class;)Ljava/lang/Object;",
>>>>> (void*)&WB_GetResolvedReferences},
>>>>> 2333 {CC"linkClass", CC"(Ljava/lang/Class;)V",
>>>>> (void*)&WB_LinkClass},
>>>>> 2334 {CC"areOpenArchiveHeapObjectsMapped", CC"()Z",
>>>>> (void*)&WB_AreOpenArchiveHeapObjectsMapped},
>>>>>
>>>>> Can you please align the indentation of line 2333 (to be the same as
>>>>> line 2332 or 2334)?
>>>> Aligned (void*) with line 2334. (It doesn't show in the webrev since
>>>> only blank space changes)
>>>>> - src/hotspot/share/runtime/arguments.cpp
>>>>>
>>>>> 1491 bool Arguments::check_unsupported_cds_runtime_properties() {
>>>>> 1492 assert(UseSharedSpaces, "this function is only used with
>>>>> -Xshare:{on,auto}");
>>>>> 1493 assert(ARRAY_SIZE(unsupported_properties) ==
>>>>> ARRAY_SIZE(unsupported_options), "must be");
>>>>> 1494 if (ArchiveClassesAtExit != NULL) {
>>>>> 1495 // dynamic dumping, just return false,
>>>>> check_unsupported_dumping_properties() will be called
>>>>> 1496 // in init_shared_archive_paths().
>>>>> 1497 return false;
>>>>> 1498 }
>>>>>
>>>>> The check_unsupported_cds_runtime_properties() should be done for the
>>>>> 'ArchiveClassesAtExit != NULL' case as well. Dynamic dumping is a
>>>>> combination of both dump time and runtime.
>>>> The 'ArchiveClassesAtExit != NULL' is for dumping CDS archive to the
>>>> user's point of view, that's why the comments in lines 1495 and 1496.
>>>> During runtime, ArchiveClassesAtExit will be NULL, so the
>>>> check_unsupported_cds_runtime_properties() will be called as usual.
>>> During dynamic dumping, UseSharedSpace is true. Dynamic dumping is
>>> special case of the 'runtime', that's why Dynamic dumping it is a
>>> combination of both dump time and runtime. So
>>> check_unsupported_cds_runtime_properties() is also need for dynamic
>>> dumping.
>> If the check_unsupported_cds_runtime_properties() is called for
>> dynamic dumping, the user will see a warning message
>> warning("CDS is disabled when the %s option is specified.",
>> unsupported_options[i]);
>> instead of the following error with my current patch:
>> vm_exit_during_initialization("Cannot use the following option
>> when dumping the shared archive", unsupported_options[i]);
>> which gives the user the same feedback on the static dumping case.
>>
>> We think it is important to give correct error message to the user.
>> We'd like to leave the code as is but have updated the comment as
>> follows:
>>
>> if (ArchiveClassesAtExit != NULL) {
>> // dynamic dumping, just return false for now.
>> // check_unsupported_dumping_properties() will be called later to
>> check the same set of
>> // properties, and will exit the VM with the correct error message
>> if the unsupported properties
>> // are used.
>> return false;
>> }
>>
>> Here's an updated delta webrev:
>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02%2b/
>>
>>
>> Additional changes comparing with delta_01_02 include:
>> - the mentioned one-line removal in classLoaderExt.cpp;
>> - updated comment in arguments.cpp;
>> - small fixes to handle tests run in -Xshare:off mode;
>> - enable building on the zero platform.
>>
>> Comparing with the delta_01_02 webrev, the additional changed files are:
>> > make/hotspot/lib/JvmFeatures.gmk
>> > src/hotspot/share/classfile/symbolTable.hpp
>> > src/hotspot/share/runtime/arguments.hpp
>> > test/hotspot/jtreg/runtime/appcds/dynamicArchive/DynamicArchiveTestBase.java
>> > test/hotspot/jtreg/runtime/appcds/dynamicArchive/HelloDynamicCustom.java
>>
>> Thanks again for your review and contribution to this RFE.
>>
>> Calvin
>>>>> 2729 // -Xshare:auto || -Xshare:dynamicDump
>>>>>
>>>>> As you've renamed the command-line argument for dynamic dumping
>>>>> support, the comment needs to be fixed.
>>>> Fixed.
>>>>> 3125 // Compiler threads may concurrently update the class
>>>>> metadata (such as method entries), so it's
>>>>> 3126 // unsafe with DumpSharedSpaces (which modifies the class
>>>>> metadata in place). Let's disable
>>>>> 3127 // compiler just to be safe.
>>>>> 3128 //
>>>>> 3129 // Note: this is not a concern for DynamicDumpSharedSpaces,
>>>>> which makes a copy of the class metadata
>>>>> 3130 // instead of modifying them in place. The copy is
>>>>> inaccessible to the compiler.
>>>>> 3131 set_mode_flags(_int);
>>>>>
>>>>> We need to come back to revisit the above for the 'static' archive
>>>>> dumping at one point. There is a RFE filed for that, if I remember
>>>>> correctly. Could you please add a 'TODO' notes in the above comment.
>>>> Added TODO.
>>>>> A check should be done in arguments.cpp to make sure
>>>>> DynamicDumpSharedSpaces is not manipulated from the command-line
>>>>> directly. DynamicDumpSharedSpaces should not be enabled in the
>>>>> command-line without ArchiveClassesAtExit being specified.
>>>> Done.
>>>>> - src/hotspot/share/runtime/java.cpp
>>>>>
>>>>> 509
>>>>> 510 // FIXME: is this the right place?
>>>>> 511 if (DynamicDumpSharedSpaces) {
>>>>> 512 DynamicArchive::dump();
>>>>> 513 }
>>>>>
>>>>> Again, the above 'FIXME' is served as a cleanup reminder. Please get
>>>>> opinions from others on this change.
>>>>> If the calling place is okay,
>>>>> please remove the FIXME.
>>>> Removed the FIXME for now. Checked with David H. He indicated
>>>> there's no
>>>> easy answer for this. Just need to do a lot of testing.
>>>>> - test
>>>>>
>>>>> Could you please add a test case for setting DynamicDumpSharedSpaces
>>>>> from command-line?
>>>> Here's an incremental webrev which contains a new test:
>>>>
>>>> http://cr.openjdk.java.net/~ccheung/8207812_dynamic_cds_archive/delta_01_02/
>>>>
>>>>
>>>> thanks,
>>>> Calvin
>>>>> I only took a brief look of the test changes. Please ask Misha to
>>>>> review the test changes as well.
>>>>>
>>>>> Thanks and regards,
>>>>> Jiangli
>>> Thanks,
>>> Jiangli
>

From coleen.phillimore at oracle.com Tue Apr 30 20:52:11 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 30 Apr 2019 16:52:11 -0400
Subject: RFR (T) 8213399: DecoderLocker is unused
Message-ID: <79ef30c5-8906-c71a-75cf-d99f30ce64dd@oracle.com>

Summary: remove DecoderLocker

See bug for history and more info. Ran mach5 tier1 and
runtime/ErrorHandler tests on linux and windows.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213399.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8213399

Thanks,
Coleen

From ioi.lam at oracle.com Tue Apr 30 21:00:59 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 30 Apr 2019 14:00:59 -0700
Subject: RFR (T) 8213399: DecoderLocker is unused
In-Reply-To: <79ef30c5-8906-c71a-75cf-d99f30ce64dd@oracle.com>
References: <79ef30c5-8906-c71a-75cf-d99f30ce64dd@oracle.com>
Message-ID: <769106be-5985-f84b-da4e-5b8372d8051c@oracle.com>

Hi Coleen,

Looks good to me.

Thanks
- Ioi

On 4/30/19 1:52 PM, coleen.phillimore at oracle.com wrote:
> Summary: remove DecoderLocker
>
> See bug for history and more info. Ran mach5 tier1 and
> runtime/ErrorHandler tests on linux and windows.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213399.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8213399
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Tue Apr 30 21:25:23 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 30 Apr 2019 17:25:23 -0400
Subject: RFR (T) 8213399: DecoderLocker is unused
In-Reply-To: <769106be-5985-f84b-da4e-5b8372d8051c@oracle.com>
References: <79ef30c5-8906-c71a-75cf-d99f30ce64dd@oracle.com> <769106be-5985-f84b-da4e-5b8372d8051c@oracle.com>
Message-ID: <9c92f020-8608-f304-9b1a-49c756d27c6f@oracle.com>

Thanks Ioi!
Coleen

On 4/30/19 5:00 PM, Ioi Lam wrote:
> Hi Coleen,
>
> Looks good to me.
>
> Thanks
> - Ioi
>
>
> On 4/30/19 1:52 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove DecoderLocker
>>
>> See bug for history and more info. Ran mach5 tier1 and
>> runtime/ErrorHandler tests on linux and windows.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8213399.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8213399
>>
>> Thanks,
>> Coleen
>