From goetz.lindenmaier at sap.com Tue Aug 1 14:20:00 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 1 Aug 2017 14:20:00 +0000 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: <597F90A0.4030706@oracle.com> References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> Message-ID: Hi, I made new webrevs implementing the change with @requires: http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02-hs/ I also changed the bug description and synopsis. For the jtreg runner I would propose to set the property test.jdk so that it is available in VMProps. Igor also ran into this issue. Best regards, Goetz. > -----Original Message----- > From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > Sent: Montag, 31. Juli 2017 22:19 > To: Lindenmaier, Goetz > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests > > Hi Goetz, > > I have an idea on how to address your second use case. > The idea is to define a special test property (e.g. > test.cds.disable.cds.support) which will override logic inside the > VMProps.vmCDSSupported(). If this property is defined to "true" in test > invocation command then vmCDSSupported() returns false (CDS is disabled, > not supported), and all tests marked with "@requires vm.cds.supported" > will be skipped. > > How to use it: > jtreg -Dtest.cds.disable.cds.support=true > E.g.: jtreg -Dtest.cds.disable.cds.support=true > hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java > > I prototyped this approach, it works for me. I have attached the diff. > Let me know whether this works for your use case, or if you have any > questions. 
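For illustration, the override described in the quoted proposal could be sketched roughly as follows. This is not the attached diff: the class name CdsOverrideSketch, the method signature, and the detectedByVm parameter (a stand-in for whatever probing VMProps really performs) are assumptions made for the example. Only the property name test.cds.disable.cds.support is taken from the message.

```java
public class CdsOverrideSketch {
    // Property name as proposed in the message above.
    static final String DISABLE_PROP = "test.cds.disable.cds.support";

    /**
     * Returns the value for a "@requires vm.cds.supported"-style property.
     * detectedByVm is a hypothetical stand-in for the real CDS detection.
     */
    static String vmCdsSupported(boolean detectedByVm) {
        // Boolean.getBoolean is true only if the system property is set to "true".
        if (Boolean.getBoolean(DISABLE_PROP)) {
            return "false"; // forced off from the command line
        }
        return Boolean.toString(detectedByVm);
    }

    public static void main(String[] args) {
        System.out.println("vm.cds.supported=" + vmCdsSupported(true));
    }
}
```

Run with -Dtest.cds.disable.cds.support=true, the sketch reports "false" regardless of what detection found, which is the behaviour the proposal relies on to skip all tests that require CDS.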
> > > Thank you, > Mikhailo > > > On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: > > Hi Mikhailo, > > > > Basically I'm fine with using the @requires property. > > But is there a way to overrule the outcome of the method > > implemented In VMProps.java computing the property? > > I have two use cases for the key I want to introduce. > > > > First, our internal VM (we are Oracle licensees) is compiled without > > CDS support. Thus we don't want to run the CDS tests. Currently > > we have them all listed in the ProblemList, but that's not nice, especially > > because we have to adapt it whenever a new test is added. > > As I understand, the @requires property works fine, here. > > > > Second, we also test the two ports we contributed (ppc and s390). These > contain > > rudimentary cds support and so far passed all tests. Unfortunately it broke > > lately in jdk10. Instead of fixing it (our people are working on finishing our > > internal Java 9 port) I would like to switch off all cds tests. > > As I can set the key on the command line of jtreg, I easily can do that. > > Is there a way to do similar with the @requires property? > > > > Best regards, > > Goetz. > > > > > > > >> -----Original Message----- > >> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >> Sent: Freitag, 28. Juli 2017 23:53 > >> To: Lindenmaier, Goetz > >> Cc: hotspot-runtime-dev at openjdk.java.net > >> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds > tests > >> > >> Hi Goetz, > >> > >> I am a HotSpot SQE Engineer at Oracle. I have discussed your proposed > >> fix with Igor Ignatyev (also VM SQE Engineer), and we have the following > >> feedback on this change. > >> > >> 1. As part of streamlining and simplifying SQE process and the use of > >> test tools we have narrowed down the test selection mechanisms. > >> > >> 2. Our preferred test selection mechanism is use of "@requires" and a > >> corresponding test/jtreg-ext/requires/VMProps.java. 
Even though JTREG > >> supports use of "@key", we prefer the use of "@requires" as a first > choice. > >> > >> 3. If it is not possible to use "@requires" for a given situation then > >> use "@key" mechanism. We would ask you if you could explore the > >> possibility of implementing this change via @requires first. > >> > >> > >> Here are several hints that may help: > >> > >> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The value > >> of a given "requires property" is evaluated inside this file and placed > >> into a map (see public call() method). Add your evaluation code here, > >> and then follow the pattern used for other properties. Create a property > >> (e.g. vm.cds.supported, with values of true/false). Create a method that > >> evaluates the property value (e.g. isCDSSupported() or similar). > >> > >> 2. The method could use several options to evaluate whether CDS is > >> supported. > >> A. WhiteBox API. Create a new WB test API method which can return > >> true if CDS_ compiler flag is defined, otherwise false. > >> Call WB API from VMProps.java. See > >> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create your > own > >> WB.isCDSSupported() > >> WhiteBox.java resides in test/lib/sun/hotspot/WhiteBox.java > >> > >> B. Another options is to evaluate by running VM with sharing on and > >> checking the return (may be not as reliable as option A) > >> C. Other ideas welcome. > >> > >> 3. Include "@requres vm.cds.supported == true" to the appropriate tests. > >> > >> Let me know if you have any questions. > >> > >> > >> Best regards, > >> Mikhailo > >> > >> > >> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: > >>> Hi > >>> > >>> we compile the VM without CDS support. Thus the CDS tests > >>> fail. This change introduces a keyword 'cds' and marks > >>> the tests accordingly. > >>> This change also fixes the keywords specified in > >> gc/g1/TestSharedArchiveWithPreTouch.java. 
> >>> There may only be one @key keyword in the test specification. > >>> In runtime/CompressedOops/CompressedClassPointers.java only one > test > >>> case required CDS. I changed this sub case to succeed if CDS is not > >>> available. > >>> > >>> Please review this change. I please need a sponsor. > >>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.01/ > >>> > >>> Best regards, > >>> Goetz. From coleen.phillimore at oracle.com Tue Aug 1 14:36:02 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 1 Aug 2017 10:36:02 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd Message-ID: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> Summary: added dcmd for printing system dictionary like the stringtable and symboltable and making print functions go to outputstream rather than tty Tested with tier1 on linux x64, and runThese with jcmd to query systemdictionaries (lots of class loaders). open webrev at http://cr.openjdk.java.net/~coleenp/8184994.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8184994 Thanks, Coleen From harold.seigel at oracle.com Tue Aug 1 14:36:04 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 1 Aug 2017 10:36:04 -0400 Subject: RFR 8180627: gc/gctests/Steal/steal001: guarantee(cp->cache() == NULL) failed Message-ID: Hi, Please review this JDK-10 fix for JDK-8180627. Test gctests/Steal/steal001 was occasionally failing when an OutOfMemoryError exception happened to get thrown while linking a class. The exception caused the class's linking to fail, but the JVM did not properly clean up the constant pool cache of the partially linked class. This caused the verifier to assert when the test tried again to link the class because the verifier did not expect the unlinked class to have an existing constant pool cache. 
Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8180627/webrev/ JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8180627 The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util and other tests, the co-located NSK tests, RBT tier2 - tier5 tests, and with JPRT. Additionally, the fix was tested by temporarily throwing an OutOfMemoryError exception in ConstantPool::initialize_resolved_references() and then checking that the verifier stopped asserting once the fix was included in the JVM build. (Thanks to Coleen for suggesting the fix.) Thanks, Harold From shade at redhat.com Tue Aug 1 14:53:07 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 1 Aug 2017 16:53:07 +0200 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> Message-ID: On 08/01/2017 04:36 PM, coleen.phillimore at oracle.com wrote: > Summary: added dcmd for printing system dictionary like the stringtable and symboltable and making > print functions go to outputstream rather than tty > > Tested with tier1 on linux x64, and runThese with jcmd to query systemdictionaries (lots of class > loaders). > > open webrev at http://cr.openjdk.java.net/~coleenp/8184994.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8184994 Cursory review: *) Not entirely clear why this include in compactHashtable.cpp: 32 #include "runtime/vmThread.hpp" *) I guess these pairs of lines may be coalesced in dictionary.cpp: 444 st->print_cr("^ indicates that initiating loader is different from " 445 "defining loader"); 454 st->print("%4d: %s%s", index, is_defining_class ? " " : "^", e->external_name()); 455 st->print(", loader "); *) dictionary.hpp: stray whitespace at the end 106 void print_on(outputStream* st) const ; *) placeholders.cpp: while we are here, probably worth changing print("\n") to cr()? 
*) placeholders.cpp: coalesce?

 232 st->print("%4d: ", pindex);
 233 st->print("placeholder ");

*) placeholders.hpp: method name is camel-cased, can rename it?

 130 void printActionQ(outputStream* st) {

*) systemDictionary.cpp: in SystemDictionary::dump, the if(verbose) condition seems inverted. Should
dump tables when verbose?

Thanks,
-Aleksey

From coleen.phillimore at oracle.com Tue Aug 1 17:14:41 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 1 Aug 2017 13:14:41 -0400
Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd
In-Reply-To:
References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com>
Message-ID:

On 8/1/17 10:53 AM, Aleksey Shipilev wrote:
> On 08/01/2017 04:36 PM, coleen.phillimore at oracle.com wrote:
>> Summary: added dcmd for printing system dictionary like the stringtable and symboltable and making
>> print functions go to outputstream rather than tty
>>
>> Tested with tier1 on linux x64, and runThese with jcmd to query systemdictionaries (lots of class
>> loaders).
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8184994.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8184994
> Cursory review:
>
> *) Not entirely clear why this include in compactHashtable.cpp:
>
> 32 #include "runtime/vmThread.hpp"

There's a VMThread::vm_thread() call in compactHashtable.cpp. I took out #include diagnosticCommand.hpp from compactHashtable.hpp, which transitively included it. I should have mentioned that a lot of this change was to include files that are no longer transitively included now that the dcmds are removed from compactHashtable.hpp.

>
> *) I guess these pairs of lines may be coalesced in dictionary.cpp:
>
> 444 st->print_cr("^ indicates that initiating loader is different from "
> 445 "defining loader");
>
> 454 st->print("%4d: %s%s", index, is_defining_class ? " " : "^", e->external_name());
> 455 st->print(", loader ");
>

Yes, that makes sense.

> *) dictionary.hpp: stray whitespace at the end > > 106 void print_on(outputStream* st) const ; Fixed. > *) placeholders.cpp: while we are here, probably worth changing print("\n") to cr()? Yes, I should have looked at this more carefully to fix these and coalesce these lines below. > > *) placeholders.cpp: coalesce? > > 232 st->print("%4d: ", pindex); > 233 st->print("placeholder "); fixed. > > *) placeholders.hpp: method name is camel-cased, can rename it? > > 130 void printActionQ(outputStream* st) { Sure. I'll rename it print_action_queue(). > > *) systemDictionary.cpp: in SystemDictionary::dump, the if(verbose) condition seems inverted. Should > dump tables when verbose? I kept the name dump_table from the VM.stringtable and VM.symboltable dcmd code, when the function really prints hashtable statistics. I could rename that function to be print_table_statistics() instead. And change ClassLoaderDataGraph::dump_dictionary to print_dictionary_statistics. If that makes sense. Thanks, Coleen > > > Thanks, > -Aleksey > From shade at redhat.com Tue Aug 1 17:19:26 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 1 Aug 2017 19:19:26 +0200 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> Message-ID: <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> On 08/01/2017 07:14 PM, coleen.phillimore at oracle.com wrote: >> *) systemDictionary.cpp: in SystemDictionary::dump, the if(verbose) condition seems inverted. Should >> dump tables when verbose? > > I kept the name dump_table from the VM.stringtable and VM.symboltable dcmd code, when the function > really prints hashtable statistics. I could rename that function to be print_table_statistics() > instead. And change ClassLoaderDataGraph::dump_dictionary to print_dictionary_statistics. If that > makes sense. Yeah, that would make sense. 
Thanks, -Aleksey From coleen.phillimore at oracle.com Tue Aug 1 17:32:33 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 1 Aug 2017 13:32:33 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> Message-ID: <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> On 8/1/17 1:19 PM, Aleksey Shipilev wrote: > On 08/01/2017 07:14 PM, coleen.phillimore at oracle.com wrote: >>> *) systemDictionary.cpp: in SystemDictionary::dump, the if(verbose) condition seems inverted. Should >>> dump tables when verbose? >> I kept the name dump_table from the VM.stringtable and VM.symboltable dcmd code, when the function >> really prints hashtable statistics. I could rename that function to be print_table_statistics() >> instead. And change ClassLoaderDataGraph::dump_dictionary to print_dictionary_statistics. If that >> makes sense. > Yeah, that would make sense. Thanks! Here's a webrev with the renaming: open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev and tested. Thanks, Coleen > > Thanks, > -Aleksey > From shade at redhat.com Tue Aug 1 17:44:48 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 1 Aug 2017 19:44:48 +0200 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> Message-ID: <835b9626-fe12-6da4-cd61-1c34c723854b@redhat.com> On 08/01/2017 07:32 PM, coleen.phillimore at oracle.com wrote: > open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev Looks good to me, except for print("\n")-s in PlaceholderEntry::print_entry. 
-Aleksey From coleen.phillimore at oracle.com Tue Aug 1 17:57:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 1 Aug 2017 13:57:19 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <835b9626-fe12-6da4-cd61-1c34c723854b@redhat.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> <835b9626-fe12-6da4-cd61-1c34c723854b@redhat.com> Message-ID: <0478244b-1557-74c8-eee5-7882c99cc3dc@oracle.com> On 8/1/17 1:44 PM, Aleksey Shipilev wrote: > On 08/01/2017 07:32 PM, coleen.phillimore at oracle.com wrote: >> open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev > Looks good to me, except for print("\n")-s in PlaceholderEntry::print_entry. Really fixed it now. Thanks! Coleen > > -Aleksey > From mikhailo.seledtsov at oracle.com Tue Aug 1 21:48:37 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Tue, 1 Aug 2017 14:48:37 -0700 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> Message-ID: Hi Goetz, I have reviewed your updated changes, and they overall look good to me. However, I have some comments + concerns regarding VMProps.vmCDS(): 1. Throwing exceptions from within the vmCDS() method. The VMProps properties are evaluated at the start of each run. If the exception is thrown here the whole test run will fail (not just the test that uses '@requires vm.cds'). The failure will occur shortly after the start of jtreg test run with a message: "java.lang.RuntimeException: Can not start VM to test to find out it's features. Switching off class data sharing (CDS)." Your method has 2 throw statements: "new RuntimeException("Can not start VM..." 
and "java.lang.RuntimeException: Can not start VM to test to...".
I would recommend a more graceful way to fail, e.g. to print the message and to return "false" instead. This way the rest of the test run will continue, but the tests requiring vm.cds will be skipped with qualification of "not selected".

2. The check for "An error has occurred while processing the shared archive file." assumes that the archive was not created prior to the execution of this evaluation code. However, there are test modes where the archive is created prior to the test run. We use such a mode on a regular basis. In such cases the code will fail. I recommend running "-Xshare:on -version" and checking for the following match, which would result in a return of "true":
    "Java HotSpot.*sharing"

3. On occasion the mapping of the shared archive region to a specified address will fail (due to system configuration, space already occupied, ASLR, etc.). Hence I recommend checking for such conditions as well:

    if (output.firstMatch("Unable to map") != null) {
        System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
        return "true";
    }

I am returning true here because seeing this output means that the CDS feature is supported; in this particular instance, however, the archive failed to map.

The rest of the changes look good to me. See my version of VMProps.vmCDS() below.
Let me know what you think.

Thank you,
Mikhailo

================== my update of VMProps.vmCDS()

    protected String vmCDS() {
        System.setProperty("test.jdk", System.getProperty("java.home"));
        ProcessBuilder pb = ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version");
        OutputAnalyzer output;

        try {
            output = new OutputAnalyzer(pb.start());
        } catch (IOException e) {
            // Do not throw: an exception here would fail the whole test run.
            System.err.println(
                "Can not start VM to test to find out it's features. "
                + "Switching off class data sharing (CDS)."
                + e);
            return "false";
        }

        // VM was built without CDS.
        if (output.firstMatch("Shared spaces are not supported in this VM") != null) {
            return "false";
        }
        // CDS is supported, but no archive is available.
        if (output.firstMatch("An error has occurred while processing the shared archive file.") != null) {
            return "true";
        }
        // Archive already exists and sharing is active.
        if (output.firstMatch("Java HotSpot.*sharing") != null) {
            return "true";
        }
        // CDS is supported, but the archive failed to map this time.
        if (output.firstMatch("Unable to map") != null) {
            System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
            return "true";
        }

        return "false";
    }

==================

On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> I made new webrevs implementing the change with @requires:
> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/
> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02-hs/
>
> I also changed the bug description and synopsis.
>
> For the jtreg runner I would propose to set the property test.jdk
> so that it is available in VMProps. Igor also ran into this issue.
>
> Best regards,
> Goetz.
>
>
>> -----Original Message-----
>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com]
>> Sent: Montag, 31. Juli 2017 22:19
>> To: Lindenmaier, Goetz
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
>>
>> Hi Goetz,
>>
>> I have an idea on how to address your second use case.
>> The idea is to define a special test property (e.g.
>> test.cds.disable.cds.support) which will override logic inside the
>> VMProps.vmCDSSupported(). If this property is defined to "true" in test
>> invocation command then vmCDSSupported() returns false (CDS is disabled,
>> not supported), and all tests marked with "@requires vm.cds.supported"
>> will be skipped.
>>
>> How to use it:
>> jtreg -Dtest.cds.disable.cds.support=true
>> E.g.: jtreg -Dtest.cds.disable.cds.support=true
>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java
>>
>> I prototyped this approach, it works for me.
I have attached the diff. >> Let me know whether this works for your use case, or if you have any >> questions. >> >> >> Thank you, >> Mikhailo >> >> >> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>> Hi Mikhailo, >>> >>> Basically I'm fine with using the @requires property. >>> But is there a way to overrule the outcome of the method >>> implemented In VMProps.java computing the property? >>> I have two use cases for the key I want to introduce. >>> >>> First, our internal VM (we are Oracle licensees) is compiled without >>> CDS support. Thus we don't want to run the CDS tests. Currently >>> we have them all listed in the ProblemList, but that's not nice, especially >>> because we have to adapt it whenever a new test is added. >>> As I understand, the @requires property works fine, here. >>> >>> Second, we also test the two ports we contributed (ppc and s390). These >> contain >>> rudimentary cds support and so far passed all tests. Unfortunately it broke >>> lately in jdk10. Instead of fixing it (our people are working on finishing our >>> internal Java 9 port) I would like to switch off all cds tests. >>> As I can set the key on the command line of jtreg, I easily can do that. >>> Is there a way to do similar with the @requires property? >>> >>> Best regards, >>> Goetz. >>> >>> >>> >>>> -----Original Message----- >>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>> Sent: Freitag, 28. Juli 2017 23:53 >>>> To: Lindenmaier, Goetz >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds >> tests >>>> Hi Goetz, >>>> >>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your proposed >>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the following >>>> feedback on this change. >>>> >>>> 1. As part of streamlining and simplifying SQE process and the use of >>>> test tools we have narrowed down the test selection mechanisms. >>>> >>>> 2. 
Our preferred test selection mechanism is use of "@requires" and a >>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though JTREG >>>> supports use of "@key", we prefer the use of "@requires" as a first >> choice. >>>> 3. If it is not possible to use "@requires" for a given situation then >>>> use "@key" mechanism. We would ask you if you could explore the >>>> possibility of implementing this change via @requires first. >>>> >>>> >>>> Here are several hints that may help: >>>> >>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The value >>>> of a given "requires property" is evaluated inside this file and placed >>>> into a map (see public call() method). Add your evaluation code here, >>>> and then follow the pattern used for other properties. Create a property >>>> (e.g. vm.cds.supported, with values of true/false). Create a method that >>>> evaluates the property value (e.g. isCDSSupported() or similar). >>>> >>>> 2. The method could use several options to evaluate whether CDS is >>>> supported. >>>> A. WhiteBox API. Create a new WB test API method which can return >>>> true if CDS_ compiler flag is defined, otherwise false. >>>> Call WB API from VMProps.java. See >>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create your >> own >>>> WB.isCDSSupported() >>>> WhiteBox.java resides in test/lib/sun/hotspot/WhiteBox.java >>>> >>>> B. Another options is to evaluate by running VM with sharing on and >>>> checking the return (may be not as reliable as option A) >>>> C. Other ideas welcome. >>>> >>>> 3. Include "@requres vm.cds.supported == true" to the appropriate tests. >>>> >>>> Let me know if you have any questions. >>>> >>>> >>>> Best regards, >>>> Mikhailo >>>> >>>> >>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>> Hi >>>>> >>>>> we compile the VM without CDS support. Thus the CDS tests >>>>> fail. This change introduces a keyword 'cds' and marks >>>>> the tests accordingly. 
>>>>> This change also fixes the keywords specified in >>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>> There may only be one @key keyword in the test specification. >>>>> In runtime/CompressedOops/CompressedClassPointers.java only one >> test >>>>> case required CDS. I changed this sub case to succeed if CDS is not >>>>> available. >>>>> >>>>> Please review this change. I please need a sponsor. >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.01/ >>>>> >>>>> Best regards, >>>>> Goetz. From coleen.phillimore at oracle.com Tue Aug 1 21:57:57 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 1 Aug 2017 17:57:57 -0400 Subject: RFR 8180627: gc/gctests/Steal/steal001: guarantee(cp->cache() == NULL) failed In-Reply-To: References: Message-ID: <63e69627-1908-e38e-1478-00dd16c95337@oracle.com> It looks good to me. Thanks! Coleen On 8/1/17 10:36 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 fix for JDK-8180627. Test > gctests/Steal/steal001 was occasionally failing when an > OutOfMemoryError exception happened to get thrown while linking a > class. The exception caused the class's linking to fail, but the JVM > did not properly clean up the constant pool cache of the partially > linked class. This caused the verifier to assert when the test tried > again to link the class because the verifier did not expect the > unlinked class to have an existing constant pool cache. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8180627/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8180627 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, RBT tier2 - tier5 tests, and with JPRT. 
> > Additionally, the fix was tested by temporarily throwing an > OutOfMemoryError exception in > ConstantPool::initialize_resolved_references() and then checking that > the verifier stopped asserting once the fix was included in the JVM > build. > > (Thanks to Coleen for suggesting the fix.) > > Thanks, Harold > From george.triantafillou at oracle.com Tue Aug 1 22:30:55 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 1 Aug 2017 18:30:55 -0400 Subject: RFR 8180627: gc/gctests/Steal/steal001: guarantee(cp->cache() == NULL) failed In-Reply-To: References: Message-ID: <618c568b-f187-3cbc-e8a6-5ad4909d349b@oracle.com> Hi Harold, Looks good. Thanks for fixing this. -George On 8/1/2017 10:36 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 fix for JDK-8180627. Test > gctests/Steal/steal001 was occasionally failing when an > OutOfMemoryError exception happened to get thrown while linking a > class. The exception caused the class's linking to fail, but the JVM > did not properly clean up the constant pool cache of the partially > linked class. This caused the verifier to assert when the test tried > again to link the class because the verifier did not expect the > unlinked class to have an existing constant pool cache. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8180627/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8180627 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, RBT tier2 - tier5 tests, and with JPRT. > > Additionally, the fix was tested by temporarily throwing an > OutOfMemoryError exception in > ConstantPool::initialize_resolved_references() and then checking that > the verifier stopped asserting once the fix was included in the JVM > build. > > (Thanks to Coleen for suggesting the fix.) 
>
> Thanks, Harold
>

From jiangli.zhou at Oracle.COM Wed Aug 2 00:29:55 2017
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Tue, 1 Aug 2017 17:29:55 -0700
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com>
Message-ID: <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com>

Hi Ioi,

Thank you so much for reviewing this. I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments.

> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote:
>
> Hi Jiangli,
>
> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-)
>
> stringTable.cpp: StringTable::archive_string
>
> add assert for DumpSharedSpaces only

Ok.

>
> filemap.cpp
>
> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions,
> 526 int first_region, int num_regions) {
>
> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions:
>
> 537 int len = regions->length();
> 538 if (len > 1) {
> 539 start = (char*)regions->at(1).start();
> 540 size = (char*)regions->at(len - 1).end() - start;
> 541 }
> 542 }
>
> The rest of filemap.cpp always perform identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions.
>
> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.
>
> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
>   if (first == MetaspaceShared::first_string) {
>     assert(num_regions <= MetaspaceShared::max_strings, "...");
>   } else {
>     assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
>     assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "...");
>   }
>   ....
>
>

I've reworked the function and simplified the code.

>
> 756 if (!string_data_mapped) {
> 757 StringTable::ignore_shared_strings(true);
> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
> 759 }
> 760
> 761 if (open_archive_heap_data_mapped) {
> 762 MetaspaceShared::set_open_archive_heap_region_mapped();
> 763 } else {
> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
> 765 }
>
> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?

Fixed.

>
> FileMapInfo::map_heap_data() --
>
> 818 char* addr = (char*)regions[i].start();
> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset,
> 820 addr, regions[i].byte_size(), si->_read_only,
> 821 si->_allow_exec);
>
> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base). If so, we need an assert here.

If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr.
if (base == NULL || base != addr) { // dealloc the regions from java heap dealloc_archive_heap_regions(regions, region_num); if (log_is_enabled(Info, cds)) { log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap."); } return false; } > > constantPool.cpp > > Handle refs_handle; > ... > refs_handle = Handle(THREAD, (oop)archived); > > This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles() > > I think it's more efficient if you merge these into a single statement > > Handle refs_handle(THREAD, (oop)archived); Fixed. > > Is this experimental code? Maybe it should be removed? > > 664 if (tag_at(index).is_unresolved_klass()) { > 665 #if 0 > 666 CPSlot entry = cp->slot_at(index); > 667 Symbol* name = entry.get_symbol(); > 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); > 669 if (k != NULL) { > 670 klass_at_put(index, k); > 671 } > 672 #endif > 673 } else Removed. > > cpCache.hpp: > > u8 _archived_references > > shouldn't this be declared as a narrowOop to avoid the type casts when it's used? Ok. > > cpCache.cpp: > > add assert so that one of these is used only at dump time and the other only at run time? > > 610 oop ConstantPoolCache::archived_references() { > 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); > 612 } > 613 > 614 void ConstantPoolCache::set_archived_references(oop o) { > 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); > 616 } Ok. Thanks! Jiangli > > Thanks! > - Ioi > > On 7/27/17 1:37 PM, Jiangli Zhou wrote: >> Sorry, the mail didn't handle the rich text well. I fixed the format below. >> >> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects.
This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. >> >> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself. >> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >> >> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. >> >> Types of Pinned G1 Heap Regions >> >> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. >> >> 00100 0 [ 8] Pinned Mask >> 01000 0 [16] Old Mask >> 10000 0 [32] Archive Mask >> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >> >> >> Pinned Regions >> >> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. >> >> Archive Regions >> >> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. >> >> An archive region is also an old region by design. >> >> Open Archive (GC-RW) Regions >> >> Open archive region is GC writable.
GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. >> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. >> >> Adjustable Outgoing Pointers >> >> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. >> >> Closed Archive (GC-RO) Regions >> >> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. >> In JDK 9 we support archive Strings with the archive regions. >> >> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. >> >> Dormant Objects >> >> Dormant objects are unreachable java objects within the open archive heap region. >> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. 
For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time of GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_references' array (A) in the open archive region is not reachable. Then A is a dormant object. >> >> Object State Transition >> >> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >> >> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >> >> Caching Java Objects at Archive Dump Time >> >> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations.
A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. >> >> Caching Constant Pool resolved_references Array >> >> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >> >> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles.
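The lookup-only resolution of JVM_CONSTANT_String entries described above can be illustrated with a minimal standalone sketch. This is not HotSpot code: InternedTable, StringEntry and pre_resolve are hypothetical names made up for illustration, and a std::unordered_set stands in for the string table. It only shows the idea that resolution finds existing interned strings and never inserts new ones, so entries without a match stay unresolved.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the interned string table; not the HotSpot StringTable.
using InternedTable = std::unordered_set<std::string>;

// Hypothetical constant pool string entry. 'resolved' is nullptr while the
// entry is unresolved, otherwise it points at the existing interned string.
struct StringEntry {
  std::string literal;
  const std::string* resolved;
};

// Resolve entries by lookup only. The table is taken by const reference, so
// this pass cannot insert new interned strings. Returns the number of
// entries that were resolved.
std::size_t pre_resolve(std::vector<StringEntry>& entries,
                        const InternedTable& table) {
  std::size_t resolved_count = 0;
  for (StringEntry& entry : entries) {
    auto it = table.find(entry.literal);
    if (it != table.end()) {
      entry.resolved = &*it;  // reuse the existing interned string
      ++resolved_count;
    }
    // No match: leave the entry unresolved rather than interning a new string.
  }
  return resolved_count;
}
```

With a table holding "hello" and "world", an entry for "hello" ends up pointing at the table's element, while an entry for a literal that was never interned keeps a null 'resolved' pointer and the table size is unchanged.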
>> >> Runtime Java Heap With Cached Java Objects >> >> >> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. >> >> Preliminary test execution and status: >> >> JPRT: passed >> Tier2-rt: passed >> Tier2-gc: passed >> Tier2-comp: passed >> Tier3-rt: passed >> Tier3-gc: passed >> Tier3-comp: passed >> Tier4-rt: passed >> Tier4-gc: passed >> Tier4-comp: 6 jobs timed out, all other tests passed >> Tier5-rt: one test failed but passed when running locally, all other tests passed >> Tier5-gc: passed >> Tier5-comp: running >> hotspot_gc: two jobs timed out, all other tests passed >> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >> vm.gc: passed >> vm.gc in CDS mode: passed >> Kitchensink: passed >> Kitchensink in CDS mode: passed >> >> Thanks, >> Jiangli > From david.holmes at oracle.com Wed Aug 2 01:06:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Aug 2017 11:06:22 +1000 Subject: RFR[XS] (10) 8181860 [TESTBUG] serviceability/tmtools/jstack/utils/DefaultFormat.java does not recognize "sleeping" state In-Reply-To: <8cb1043f-b823-4df6-9f71-87d37b3fdafa@oracle.com> References: <8cb1043f-b823-4df6-9f71-87d37b3fdafa@oracle.com> Message-ID: <0d501e62-0e96-4304-67bf-54f6d8a084c2@oracle.com> On 18/07/2017 11:47 AM, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8181860 > > This failure has shown up in the past few months with hotspot tier 2 > testing. Apparently "sleeping" has been a valid state for a long time > so I don't know why the failure showed up only recently. You should never be "sleeping" when waiting in Object.wait or for monitor entry! David > FYI, I ran some casual testing and the tests failed on linux/x64 but not > windows/x64.
> > ================================================== > > $ hg diff > diff -r ba869214a302 > test/serviceability/tmtools/jstack/utils/DefaultFormat.java > --- a/test/serviceability/tmtools/jstack/utils/DefaultFormat.java Mon > Jul 17 09:21:48 2017 -0700 > +++ b/test/serviceability/tmtools/jstack/utils/DefaultFormat.java Mon > Jul 17 18:30:48 2017 -0700 > @@ -55,7 +55,7 @@ > protected String threadInfoPattern() { > return > "^\"(.*)\"\\s(#\\d+\\s|)(daemon\\s|)prio=(.+)\\s(os_prio=(.+)\\s|)tid=(.+)\\snid=(.+)\\s(" > > + Consts.UNKNOWN > - + > "|runnable|waiting\\son\\scondition|in\\sObject\\.wait\\(\\)|waiting\\sfor\\smonitor\\sentry)((.*))$"; > > + + > "|runnable|sleeping|waiting\\son\\scondition|in\\sObject\\.wait\\(\\)|waiting\\sfor\\smonitor\\sentry)((.*))$"; > > } > > ================================================== > > Thanks > Ioi > From david.holmes at oracle.com Wed Aug 2 01:38:31 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Aug 2017 11:38:31 +1000 Subject: RFR: Parallelize safepoint cleanup In-Reply-To: References: <5cd676de-872d-6d4a-691b-da561173f7d0@oracle.com> <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com> <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com> <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com> <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com> <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com> <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com> <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com> <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com> <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com> <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com> <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com> <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com> Message-ID: Catching up after my vacation ... I have some comments around the sweeper/nmethod changes that were made as part of this. First a nit: the copyright year in nmethod.hpp was not updated. 
It isn't clear to me that the change of _stack_traversal_mark from long to jlong is suitable. Should this really be 64-bit on a 32-bit system? And given it is set from the traversal_count which is still a plain long, this change just seems wrong to me. Then the changes to define accessors for _stack_traversal_mark with load-acquire and release-store semantics seem somewhat misguided. The only store occurs in: void nmethod::mark_as_seen_on_stack() { ... set_stack_traversal_mark(NMethodSweeper::traversal_count()); } which is used here: if (state == not_entrant) { mark_as_seen_on_stack(); OrderAccess::storestore(); } and here: virtual void do_code_blob(CodeBlob* cb) { assert(cb->is_nmethod(), "CodeBlob should be nmethod"); nmethod* nm = (nmethod*)cb; nm->set_hotness_counter(NMethodSweeper::hotness_counter_reset_val()); // If we see an activation belonging to a non_entrant nmethod, we mark it. if (nm->is_not_entrant()) { nm->mark_as_seen_on_stack(); } } so what prior-stores are the release-semantics supposed to be ordering? The original concern was over the storestore at the end of NMethodSweeper::mark_active_nmethods. That has now been removed - which seems reasonable as I can not see what stores it was trying to order. The other storestore (above) orders the setting of the _stack_traversal_mark with the store to _state (as Igor explained in the review thread). That storestore is still needed and unaffected by release-semantics on the write to _stack_traversal_mark. So as far as I can see the changes to add acquire/release semantics to _stack_traversal_mark were unnecessary and did not provide the justification for removing the storestore that was removed. If anything acquire/release semantics should have been added to the _state variable though that would also not have had any bearing on the storestore that was removed - AFAICS. Cheers, David On 20/07/2017 8:53 PM, Roman Kennke wrote: > Hi all, > > Robbin found some more missing includes in jprt testing (thanks!!) 
> > Differential: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/ > > Full: > http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/ > > > Am I breaking the record for most webrev revisions? :-P > > According to Robbin, builds are now all clean. > > Can I get final reviews and then a sponsor? > > Thanks, > Roman > > Am 16.07.2017 um 10:25 schrieb Robbin Ehn: >> Hi Roman, >> >> On 2017-07-12 15:32, Roman Kennke wrote: >>> Hi Robbin and all, >>> >>> I fixed the 32bit failures by using jlong in all relevant places: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/ >>> >>> >>> then Robbin found another problem. SafepointCleanupTest started to fail, >>> because "mark nmethods" is no longer printed. This made me think that >>> we're not measuring the conflated (and possibly parallelized) >>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with >>> "safepoint cleanup tasks" which measures the total duration of safepoint >>> cleanup. We can't reasonably measure a possibly parallel and conflated >>> pass standalone, but we can measure all and by subtracting all the other >>> subphases, get an idea how long deflation and nmethod marking take up. >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/ >>> >>> >>> The full webrev is now: >>> >>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/ >>> >>> >>> Hope that's all ;-) >> >> With this changeset something always pops up. >> >> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
>> >> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >> -DINCLUDE_AOT >> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(num_cleanup_workers); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >> error: variable has incomplete type 'StrongRootsScope' >> StrongRootsScope srs(1); >> ^ >> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >> note: forward declaration of 'StrongRootsScope' >> class StrongRootsScope; >> ^ >> 2 errors generated. >> make[3]: *** >> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >> Error 1 >> make[3]: *** Waiting for unfinished jobs.... 
>> make[2]: *** [hotspot-server-libs] Error 2 >> >> Send me the new webrev and I'll test it before the 16th round of >> review :) >> >> /Robbin >> >>> >>> Roman >>> >>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>> Hi, unfortunately the push failed on 32-bit. >>>> >>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>> >>>> I do not have anytime to look at this, so here is the error. >>>> >>>> /Robbin >>>> >>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'long int nmethod::stack_traversal_mark()': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>> ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'const volatile jint* {aka const volatile int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 
'const volatile juint* {aka const volatile unsigned int*}' >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> error: call of overloaded 'release_store(volatile long int*, long >>>> int&)' is ambiguous >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>> >>>> note: candidates are: >>>> In file included from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>> >>>> from >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>> >>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile jint* {aka volatile int*}' >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>> >>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>> >>>> note: no known conversion for argument 1 from 'volatile long int*' >>>> to 'volatile juint* {aka volatile unsigned int*}' >>>> >>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>> I'll start a push now. >>>>> >>>>> /Robbin >>>>> >>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>> Ok, so I guess I need a sponsor for this now: >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>> >>>>>> >>>>>> Roman >>>>>> >>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>> >>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>> > wrote: >>>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>> Hi Robbin, >>>>>>>>>> >>>>>>>>>> Far down -> >>>>>>>>>> >>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>> >>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>> + // TODO: Is this really needed? >>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>> + } >>>>>>>>>>>> >>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>> consistent >>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>> documented >>>>>>>>>>>> which is only increasing the technical debt. 
>>>>>>>>>>>> >>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>> >>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>> that >>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>> sweeper) >>>>>>>>>>>>> is holding still. >>>>>>>>>>>> >>>>>>>>>>>> and: >>>>>>>>>>>> >>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>> >>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>> marking >>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>> (outside >>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>> storestore() >>>>>>>>>>> should be necessary. >>>>>>>>>>> >>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>> Apparently >>>>>>>>>>> there >>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>> with >>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>> required >>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have >>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>> also put >>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>> storestore() >>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>> 'for >>>>>>>>>>> consistency' or just conservatively. 
But it shouldn't be >>>>>>>>>>> necessary in >>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>> >>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>> Refactor the >>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>> same >>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>> call >>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>> >>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>> >>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>> skip >>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>> >>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>> >>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>> that >>>>>>>>>> it is concurrent accessed. >>>>>>>>>> And remove both storestore. >>>>>>>>>> >>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>> nmethod, so >>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>> >>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>> that's >>>>>>>>>> another story. >>>>>>>>> Like this? >>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> Yes, exactly, I like this! >>>>>>>> Dan? Igor ? Tobias? >>>>>>>> >>>>>>> >>>>>>> That seems correct. >>>>>>> >>>>>>> igor >>>>>>> >>>>>>>> Thanks Roman! 
>>>>>>>> >>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>> this >>>>>>>> thread/changeset to the end! >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>>> Roman >>>>>>> >>>>>> >>> > From david.holmes at oracle.com Wed Aug 2 03:35:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Aug 2017 13:35:46 +1000 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> Message-ID: Hi Kim, Good planning on Erik's part to go on vacation just as I have returned ;-) On 1/08/2017 4:18 AM, Kim Barrett wrote: >> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote: >> >> Hi Andrew, >> >> In that case, feel free to propose a revised solution while I am gone. > > Erik has asked me to try to make progress on this while he's on > vacation, rather than possibly letting it sit until he gets back. Okay, while Erik is gone perhaps you can clarify a few things. As Andrew and Roman have expressed, I too find this: + template + inline static U cmpxchg(T exchange_value, volatile U* dest, V compare_value, cmpxchg_memory_order order); totally unintuitive and unappealing. I do not understand the rationale for this at all. It does not make any sense to me to allow T, U and V to be different types (even if constrained).
It has been stated that if we force them to all be the same there is some problem with literals and the need for casts, but I don't understand what that problem would be. I'm also unclear about these assertions: STATIC_ASSERT(sizeof(T) <= size_t(BytesPerWord)); // Does the machine support atomic wide accesses? as that is not a guarantee of atomic access support. It isn't always even a necessary condition; but it certainly isn't sufficient. The conversion from use of the j* types also seems an unnecessary disruption to the APIs. Bear in mind in many cases these functions are for operating on Java-level fields and so have Java types. Those java types are well defined (jint is signed 32-bit, jshort is signed 16-bit etc etc). If compilers had had broad support for the int8_t, int16_t, int32_t etc typedefs back in Java 5 then perhaps we would have used those instead of the j* aliases/synonyms. But changing it now seems disruptive and driven by personal preference IMHO. I don't understand methods like this: 75 template 76 inline static T specialized_xchg(T exchange_value, volatile T* dest) { 77 STATIC_ASSERT(Never::value); 78 return exchange_value; 79 } 80 is this a new form of ShouldNotReachHere() ?? ie these functions should always be overridden by each platform? I admit I'm not very fluent in template-ese. For example how does one read this, and what does it mean ?? typedef Conditional::value, PlatformAtomic, GeneralizedAtomic>::type AtomicImpl; And I'm still trying to get my head around the introduced "trait" stuff. (I hate pseudo-languages that can be written within other languages :( ). Thanks, David ----- > (This is just one of a large stack of patches Erik has prepared toward > the GC access interface that he's been talking about for a while. > Providing a solid base for that work is the underlying goal for > changing atomics.) I'm getting caught up on the discussion, and > looking at Erik's most recent proposal. 
I should have something more > concrete soon, possibly just agreeing with Erik's recent ideas, > possibly some alternative. I'm also open to suggestions? > > From aph at redhat.com Wed Aug 2 08:31:48 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 2 Aug 2017 09:31:48 +0100 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> Message-ID: <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> On 02/08/17 04:35, David Holmes wrote: > Hi Kim, > > Good planning on Erik's part to go on vacation just as I have returned ;-) > > On 1/08/2017 4:18 AM, Kim Barrett wrote: >>> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote: >>> >>> In that case, feel free to propose a revised solution while I am gone. >> >> Erik has asked me to try to make progress on this while he's on >> vacation, rather than possibly letting it sit until he gets back. > > Okay, while Erik is gone perhaps you can clarify a few things. As Andrew > and Roman have expressed, I too find this: > > + template > + inline static U cmpxchg(T exchange_value, volatile U* dest, V > compare_value, cmpxchg_memory_order order); > > totally unintuitive and unappealing. I do not understand the rationale > for this this at all. It does not make any sense to me to allow T, U and > V to be different types (even if constrained). 
It has been stated that > if we force them to all be the same there is some problem with literals > and the need for casts, but I don't understand what that problem would be. A couple of examples would help: diff -r d207c56d5b5a src/share/vm/runtime/os.cpp --- a/src/share/vm/runtime/os.cpp Tue Jul 25 15:35:09 2017 +0100 +++ b/src/share/vm/runtime/os.cpp Wed Aug 02 09:24:30 2017 +0100 @@ -756,7 +756,7 @@ while (true) { unsigned int seed = _rand_seed; int rand = random_helper(seed); - if (Atomic::cmpxchg(rand, &_rand_seed, seed) == seed) { + if (Atomic::cmpxchg((unsigned)rand, &_rand_seed, seed) == seed) { return rand; } } diff -r d207c56d5b5a src/share/vm/runtime/thread.cpp --- a/src/share/vm/runtime/thread.cpp Tue Jul 25 15:35:09 2017 +0100 +++ b/src/share/vm/runtime/thread.cpp Wed Aug 02 09:24:30 2017 +0100 @@ -4741,7 +4741,7 @@ enum MuxBits { LOCKBIT = 1 }; void Thread::muxAcquire(volatile intptr_t * Lock, const char * LockName) { - intptr_t w = Atomic::cmpxchg(LOCKBIT, Lock, 0); + intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); if (w == 0) return; if ((w & LOCKBIT) == 0 && Atomic::cmpxchg (w|LOCKBIT, Lock, w) == w) { return; There are eight such changes required in the HotSpot codebase. I strongly believe that these changes make the code better. They show exactly where types do not match. In the case of the random generator, they show that there is a type mismatch which is IMO probably wrong, and it wouldn't hurt to fix it. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.stuefe at gmail.com Wed Aug 2 09:17:42 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 2 Aug 2017 11:17:42 +0200 Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64 Message-ID: Hi all, may I please have a review for this small fix.
Issue: https://bugs.openjdk.java.net/browse/JDK-8185706 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/ This can be seen as an add-on to https://bugs.openjdk.java.net/browse/JDK-8022335. Ioi Lam did a good job analyzing the original problem. On Windows x64, the native compiler generates code which does not use the frame pointer (regardless of whether we set -Oy-). Only in rare cases is a frame pointer used, e.g. for functions that use alloca(), and, as Ioi pointed out, there is no guarantee either that RBP is actually the frame pointer. So, in os::platform_print_native_stack() we walk the stack using StackWalk64(), extract the pc from each frame and print that, as is usual in Windows code. However, we still test for the frame pointer being NULL, and abort stack tracing if it is. This causes stack dumping to fail quite often, and unnecessarily. For example, test: java.exe -XX:ErrorHandlerTest=12 Sometimes it works, but more by accident, as Ioi pointed out in this mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html. If there are Java frames above the crashing native frame, we may still have RBP set to some value (it does not matter which) and os::platform_print_native_stack() does not abort frame printing.
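As a rough illustration of the problem described above, here is a simplified model. It is not the HotSpot code and does not call StackWalk64(); the WalkedFrame struct and collect_pcs() function are hypothetical names introduced only for this sketch. It shows how giving up on a NULL frame pointer truncates the trace even though every frame has a usable pc:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one frame as reported by a stack walker.
struct WalkedFrame {
  uintptr_t pc;  // program counter of the frame
  uintptr_t fp;  // RBP value seen for the frame; may be 0 or unrelated on Windows x64
};

// Returns the pcs that would be printed for the given frames.
// stop_on_null_fp == true models a walker that gives up on a NULL frame pointer.
std::vector<uintptr_t> collect_pcs(const std::vector<WalkedFrame>& frames,
                                   bool stop_on_null_fp) {
  std::vector<uintptr_t> pcs;
  for (const WalkedFrame& f : frames) {
    if (f.pc == 0) break;                     // no more frames to report
    if (stop_on_null_fp && f.fp == 0) break;  // the NULL frame pointer check
    pcs.push_back(f.pc);
  }
  return pcs;
}
```

In this model, a trace whose first frame has fp == 0 yields nothing when the check is enabled, while relying on the pc alone yields every frame.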
Kind Regards, Thomas From rkennke at redhat.com Wed Aug 2 09:21:07 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 2 Aug 2017 11:21:07 +0200 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> Message-ID: <1c6f6ddc-41c5-4450-35d9-2e52e6919689@redhat.com> > I admit I'm not very fluent in template-ese. For example how does one > read this, and what does it mean ?? > > > And I'm still trying to get my head around the introduced "trait" > stuff. (I hate pseudo-languages that can be written within other > languages :( ). And this is, IMO, the biggest problem with the patch. It makes the code unnecessarily hard to understand and buries the actual stuff (usually 1-liners) under layers of templates and redirections. 
Roman From david.holmes at oracle.com Wed Aug 2 09:51:35 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Aug 2017 19:51:35 +1000 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> Message-ID: On 2/08/2017 6:31 PM, Andrew Haley wrote: > On 02/08/17 04:35, David Holmes wrote: >> Hi Kim, >> >> Good planning on Erik's part to go on vacation just as I have returned ;-) >> >> On 1/08/2017 4:18 AM, Kim Barrett wrote: >>>> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote: >>>> >>>> In that case, feel free to propose a revised solution while I am gone. >>> >>> Erik has asked me to try to make progress on this while he's on >>> vacation, rather than possibly letting it sit until he gets back. >> >> Okay, while Erik is gone perhaps you can clarify a few things. As Andrew >> and Roman have expressed, I too find this: >> >> + template >> + inline static U cmpxchg(T exchange_value, volatile U* dest, V >> compare_value, cmpxchg_memory_order order); >> >> totally unintuitive and unappealing. I do not understand the rationale >> for this this at all. It does not make any sense to me to allow T, U and >> V to be different types (even if constrained). 
It has been stated that >> if we force them to all be the same there is some problem with literals >> and the need for casts, but I don't understand what that problem would be. > > A couple of examples would help: > > diff -r d207c56d5b5a src/share/vm/runtime/os.cpp > --- a/src/share/vm/runtime/os.cpp Tue Jul 25 15:35:09 2017 +0100 > +++ b/src/share/vm/runtime/os.cpp Wed Aug 02 09:24:30 2017 +0100 > @@ -756,7 +756,7 @@ > while (true) { > unsigned int seed = _rand_seed; > int rand = random_helper(seed); > - if (Atomic::cmpxchg(rand, &_rand_seed, seed) == seed) { > + if (Atomic::cmpxchg((unsigned)rand, &_rand_seed, seed) == seed) { > return rand; > } > } > diff -r d207c56d5b5a src/share/vm/runtime/thread.cpp > --- a/src/share/vm/runtime/thread.cpp Tue Jul 25 15:35:09 2017 +0100 > +++ b/src/share/vm/runtime/thread.cpp Wed Aug 02 09:24:30 2017 +0100 > @@ -4741,7 +4741,7 @@ > enum MuxBits { LOCKBIT = 1 }; > > void Thread::muxAcquire(volatile intptr_t * Lock, const char * LockName) { > - intptr_t w = Atomic::cmpxchg(LOCKBIT, Lock, 0); > + intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); > if (w == 0) return; > if ((w & LOCKBIT) == 0 && Atomic::cmpxchg (w|LOCKBIT, Lock, w) == w) { > return; > > There are eight such changes required in the HotSpot codebase. > > I strongly believe that these changes make the code better. They show > exactly where types do not match. In the case of the random > generator, they show that there is a type mismatch which is IMO > probably wrong, and it wouldn't hurt to fix it. > I agree. I don't think there is any problem having to coerce types to the correct form when they do not match directly. The casts make it explicit. Though I'm unsure how we manage to avoid the casts in the existing code in some of those cases. 
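To make the cast requirement concrete, here is a minimal sketch, assuming a hypothetical single-type wrapper named strict_cmpxchg built on std::atomic. It is not the HotSpot Atomic::cmpxchg and is not taken from the webrev. With a single template parameter, deduction fails when the three arguments have different types, so the call only compiles once the operands are cast to the destination type:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a cmpxchg that requires one type T for all operands.
// Returns the value found in *dest before the operation.
template <typename T>
T strict_cmpxchg(T exchange_value, std::atomic<T>* dest, T compare_value) {
  T expected = compare_value;
  // On failure, compare_exchange_strong stores the current value into 'expected';
  // on success, 'expected' still equals compare_value, which was the old value.
  dest->compare_exchange_strong(expected, exchange_value);
  return expected;
}

enum MuxBits { LOCKBIT = 1 };

// Tries to set LOCKBIT on an unlocked word; returns true if the lock was taken.
bool try_lock(std::atomic<intptr_t>* lock) {
  // strict_cmpxchg(LOCKBIT, lock, 0) would not compile here:
  // T would be deduced as MuxBits, intptr_t and int at the same time.
  intptr_t w = strict_cmpxchg((intptr_t)LOCKBIT, lock, (intptr_t)0);
  return w == 0;
}
```

For comparison, a non-template overload declared with intptr_t parameters would accept the uncast call through implicit conversions, which may be one reason such calls compile without casts against fixed-type overloads.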
Cheers, David From adinn at redhat.com Wed Aug 2 09:53:02 2017 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 2 Aug 2017 10:53:02 +0100 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> Message-ID: On 02/08/17 09:31, Andrew Haley wrote: > There are eight such changes required in the HotSpot codebase. > > I strongly believe that these changes make the code better. They show > exactly where types do not match. In the case of the random > generator, they show that there is a type mismatch which is IMO > probably wrong, and it wouldn't hurt to fix it. I am of the same opinion. If there were hundreds of such changes required then I would agree that this might i) indicate that there is some fundamental issue which implies the need to cater for mix and match argument types to cmpxchg or ii) the code base is in a terrible state. With a count of 8 I am very strongly led to conclude that neither of the above is an issue. That's not to say I can't/don't admire the nicer points of Erik's implementation. It's very clever and very interesting that you can do something like this with templates but ... it is clearly hard to follow. 
As a result, I think it is likely to cause more maintenance overhead and, hence, be more risky than the (demonstrably) rare alternative need to correctly cast the occasional argument to a cmpxchg call. Of course, that's merely another 2c in the balance. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From harold.seigel at oracle.com Wed Aug 2 11:49:10 2017 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 2 Aug 2017 07:49:10 -0400 Subject: RFR 8180627: gc/gctests/Steal/steal001: guarantee(cp->cache() == NULL) failed In-Reply-To: <63e69627-1908-e38e-1478-00dd16c95337@oracle.com> References: <63e69627-1908-e38e-1478-00dd16c95337@oracle.com> Message-ID: <663dbda0-1329-b7ca-0f67-9b0d277693fc@oracle.com> Hi Coleen, George, Thanks for the reviews! Harold On 8/1/2017 5:57 PM, coleen.phillimore at oracle.com wrote: > > It looks good to me. > Thanks! > Coleen > > On 8/1/17 10:36 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8180627. Test >> gctests/Steal/steal001 was occasionally failing when an >> OutOfMemoryError exception happened to get thrown while linking a >> class. The exception caused the class's linking to fail, but the JVM >> did not properly clean up the constant pool cache of the partially >> linked class. This caused the verifier to assert when the test tried >> again to link the class because the verifier did not expect the >> unlinked class to have an existing constant pool cache. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8180627/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8180627 >> >> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, RBT tier2 - tier5 tests, and with JPRT. 
>> >> Additionally, the fix was tested by temporarily throwing an >> OutOfMemoryError exception in >> ConstantPool::initialize_resolved_references() and then checking that >> the verifier stopped asserting once the fix was included in the JVM >> build. >> >> (Thanks to Coleen for suggesting the fix.) >> >> Thanks, Harold >> > From harold.seigel at oracle.com Wed Aug 2 13:55:57 2017 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 2 Aug 2017 09:55:57 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> Message-ID: Hi Coleen, Other than this erroneous comment in diagnosticCommand.hpp, the change looks good: 735 // VM.systemdictionary -verbose: for dumping the*string table* Go ahead and push it. Thanks, Harold On 8/1/2017 1:32 PM, coleen.phillimore at oracle.com wrote: > > > On 8/1/17 1:19 PM, Aleksey Shipilev wrote: >> On 08/01/2017 07:14 PM, coleen.phillimore at oracle.com wrote: >>>> *) systemDictionary.cpp: in SystemDictionary::dump, the if(verbose) >>>> condition seems inverted. Should >>>> dump tables when verbose? >>> I kept the name dump_table from the VM.stringtable and >>> VM.symboltable dcmd code, when the function >>> really prints hashtable statistics. I could rename that function >>> to be print_table_statistics() >>> instead. And change ClassLoaderDataGraph::dump_dictionary to >>> print_dictionary_statistics. If that >>> makes sense. >> Yeah, that would make sense. > > Thanks! Here's a webrev with the renaming: > > open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev > > and tested. 
> > Thanks, > Coleen > >> >> Thanks, >> -Aleksey >> > From coleen.phillimore at oracle.com Wed Aug 2 13:58:29 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 09:58:29 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> Message-ID: <123692f5-c79b-4e7d-9d4e-fe0b57282032@oracle.com> Thanks, Harold! I fixed the comment. Coleen On 8/2/17 9:55 AM, harold seigel wrote: > Hi Coleen, > > Other than this erroneous comment in diagnosticCommand.hpp, the change > looks good: > > 735 // VM.systemdictionary -verbose: for dumping the*string table* > > Go ahead and push it. > > Thanks, Harold > > On 8/1/2017 1:32 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/1/17 1:19 PM, Aleksey Shipilev wrote: >>> On 08/01/2017 07:14 PM, coleen.phillimore at oracle.com wrote: >>>>> *) systemDictionary.cpp: in SystemDictionary::dump, the >>>>> if(verbose) condition seems inverted. Should >>>>> dump tables when verbose? >>>> I kept the name dump_table from the VM.stringtable and >>>> VM.symboltable dcmd code, when the function >>>> really prints hashtable statistics. I could rename that function >>>> to be print_table_statistics() >>>> instead. And change ClassLoaderDataGraph::dump_dictionary to >>>> print_dictionary_statistics. If that >>>> makes sense. >>> Yeah, that would make sense. >> >> Thanks! Here's a webrev with the renaming: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev >> >> and tested. 
>> >> Thanks, >> Coleen >> >>> >>> Thanks, >>> -Aleksey >>> >> > From gerald.thornbrugh at oracle.com Wed Aug 2 15:16:11 2017 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Wed, 2 Aug 2017 09:16:11 -0600 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris Message-ID: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> Hi, Please review this JDK-10 fix for JDK-8182757. The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 If a socket is being set up without a fixed port using the SO_REUSEADDR flag, it can lead to other processes interfering with the poll/receive process of a debugger/debuggee configuring a socket for communication. When SO_REUSEADDR is used, other processes can attempt a listen() on the same port and receive a connect from the debuggee. This causes the debugger to stay in poll() waiting for a connect, while the debuggee stays in recv() waiting to receive data from the "rogue" process that will never send it. This can also lead to connections being terminated early on the debuggee side when the "rogue" process terminates the connection because it does not receive what it expected from the client process (i.e. the debuggee). The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" processes from reusing the port address and from stealing the connects sent by the debuggee. The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java address the case when JDI (the debugger side) is acting in "server mode". The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c address the case when JDWP (the debuggee side) is acting in "server mode". I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. We were able to reproduce the failures with a specific load on a test machine while running JDI related tests.
After we applied the fix to this system we were not able to reproduce the failures. Thanks, Gerald From george.triantafillou at oracle.com Wed Aug 2 15:26:41 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 2 Aug 2017 11:26:41 -0400 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris In-Reply-To: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> References: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> Message-ID: <72a3ded0-9f49-9009-f9eb-bfee5a6055b1@oracle.com> Hi Jerry, Looks good! Thanks for fixing this. -George On 8/2/2017 11:16 AM, Gerald Thornbrugh wrote: > Hi, > > Please review this JDK-10 fix for JDK-8182757. > > The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 > > The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 > > If a socket is being setup without a fixed port using the SO_REUSEADDR flag it can lead to other > processes interfering with the poll/receive process of a debugger/debuggee configuring a socket > for communication. When SO_REUSEADDR is used other processes can attempt a listen() on > the same port and receive a connect from the debuggee. This causes the debugger to stay in > poll() waiting for a connect and the debuggee stays in recv() waiting to receive data from the > "rogue" process that will never send it. > > This can also lead to connections being terminated early on the debuggee side when the "rogue" > process terminates the connection because it does not receive what it expected from the client > process (i.e. the debuggee). > > The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" > processes from reusing the port address and from stealing the connects sent by the debuggee. > > The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java addresses > when JDI (the debugger side) is acting in "server mode".
> > The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c addresses when > JDWP (the debuggee side) is acting in "server mode". > > I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. > We were able to reproduce the failures with a specific load on a test machine while running JDI > related tests. After we applied the fix to this system we were not able to reproduce the failures. > > Thanks, > > Gerald From daniel.daugherty at oracle.com Wed Aug 2 15:30:54 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 2 Aug 2017 09:30:54 -0600 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris In-Reply-To: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> References: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> Message-ID: <565ff167-f193-a25c-c35a-71641d4833b0@oracle.com> On 8/2/17 9:16 AM, Gerald Thornbrugh wrote: > Hi, > > Please review this JDK-10 fix for JDK-8182757. > > The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 > > The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java No comments. src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c No comments. Thumbs up! Dan > > If a socket is being setup without a fixed port using the SO_REUSEADDR flag it can lead to other > processes interfering with the poll/receive process of a debugger/debuggee configuring a socket > for communication. When SO_REUSEADDR is used other processes can attempt a listen() on > the same port and receive a connect from the debuggee. This causes the debugger to stay in > poll() waiting for a connect and the debuggee stays in recv() waiting to receive data from the > "rogue" process that will never send it. > > This can also lead to connections being terminated early on the debuggee side when the "rogue"
> process terminates the connection because it does not receive what it expected from the client > process (i.e. the debuggee). > > The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" > processes from reusing the port address and from stealing the connects sent by the debuggee. > > The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java addresses > when JDI (the debugger side) is acting in "server mode". > > The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c addresses when > JDWP (the debuggee side) is acting in "server mode". > > I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. > We were able to reproduce the failures with a specific load on a test machine while running JDI > related tests. After we applied the fix to this system we were not able to reproduce the failures. > > Thanks, > > Gerald From gerald.thornbrugh at oracle.com Wed Aug 2 15:36:21 2017 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Wed, 2 Aug 2017 09:36:21 -0600 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris In-Reply-To: <72a3ded0-9f49-9009-f9eb-bfee5a6055b1@oracle.com> References: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> <72a3ded0-9f49-9009-f9eb-bfee5a6055b1@oracle.com> Message-ID: <8B009FC1-BD33-4676-B8CC-5F17B8D5FB68@oracle.com> Hi George, Thanks! Jerry > On Aug 2, 2017, at 9:26 AM, George Triantafillou wrote: > > Hi Jerry, > > Looks good! Thanks for fixing this. > > -George > > On 8/2/2017 11:16 AM, Gerald Thornbrugh wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8182757.
>> >> The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 >> >> The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 >> >> If a socket is being setup without a fixed port using the SO_REUSEADDR flag it can lead to other >> processes interfering with the poll/receive process of a debugger/debuggee configuring a socket >> for communication. When SO_REUSEADDR is used other processes can attempt a listen() on >> the same port and receive a connect from the debuggee. This causes the debugger to stay in >> poll() waiting for a connect and the debuggee stays in recv() waiting to receive data from the >> "rogue" process that will never send it. >> >> This can also lead to connections being terminated early on the debuggee side when the "rogue" >> process terminates the connection because it does not receive what it expected from the client >> process (i.e. the debuggee). >> >> The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" >> processes from reusing the port address and from stealing the connects sent by the debuggee. >> >> The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java addresses >> when JDI (the debugger side) is acting in "server mode". >> >> The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c addresses when >> JDWP (the debuggee side) is acting in "server mode". >> >> I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. >> We were able to reproduce the failures with a specific load on a test machine while running JDI >> related tests. After we applied the fix to this system we were not able to reproduce the failures.
>> >> Thanks, >> >> Gerald > From gerald.thornbrugh at oracle.com Wed Aug 2 15:36:44 2017 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Wed, 2 Aug 2017 09:36:44 -0600 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris In-Reply-To: <565ff167-f193-a25c-c35a-71641d4833b0@oracle.com> References: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> <565ff167-f193-a25c-c35a-71641d4833b0@oracle.com> Message-ID: Hi Dan, Thanks! Jerry > On Aug 2, 2017, at 9:30 AM, Daniel D. Daugherty wrote: > > On 8/2/17 9:16 AM, Gerald Thornbrugh wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8182757. >> >> The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 >> >> The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 > > src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java > No comments. > > src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c > No comments. > > Thumbs up! > > Dan > > >> >> If a socket is being setup without a fixed port using the SO_REUSEADDR flag it can lead to other >> processes interfering with the poll/receive process of a debugger/debuggee configuring a socket >> for communication. When SO_REUSEADDR is used other processes can attempt a listen() on >> the same port and receive a connect from the debuggee. This causes the debugger to stay in >> poll() waiting for a connect and the debuggee stays in recv() waiting to receive data from the >> "rogue" process that will never send it. >> >> This can also lead to connections being terminated early on the debuggee side when the "rogue" >> process terminates the connection because it does not receive what it expected from the client >> process (i.e. the debuggee). >> >> The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" >> processes from reusing the port address and from stealing the connects sent by the debuggee.
>> >> The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java addresses >> when JDI (the debugger side) is acting in "server mode". >> >> The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c addresses when >> JDWP (the debuggee side) is acting in "server mode". >> >> I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. >> We were able to reproduce the failures with a specific load on a test machine while running JDI >> related tests. After we applied the fix to this system we were not able to reproduce the failures. >> >> Thanks, >> >> Gerald > From coleen.phillimore at oracle.com Wed Aug 2 15:55:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 11:55:56 -0400 Subject: RFR (not as big as it looks) 8184994: Add Dictionary size logging and jcmd In-Reply-To: <731e42aa-0e29-0085-f4dd-a75c995ae774@oracle.com> References: <380f495d-ad3e-9c1d-4940-431af499ecca@oracle.com> <238b2d00-c6f9-7213-9b21-f9cb0853cd34@redhat.com> <9bb95959-c328-8e29-6ddc-d2333259c808@oracle.com> <731e42aa-0e29-0085-f4dd-a75c995ae774@oracle.com> Message-ID: Hi Serguei, I just checked it in. Luckily one of the platforms complained about the double ;; so I fixed it. Thanks! Coleen On 8/2/17 11:51 AM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > It looks good. > > Minor: > > http://cr.openjdk.java.net/~coleenp/8184994.02/webrev/src/share/vm/classfile/placeholders.hpp.udiff.html > - void print() const PRODUCT_RETURN; > + void print_entry(outputStream* st) const;; > Extra ';' > > > Thanks, > Serguei > > > On 8/1/17 10:32, coleen.phillimore at oracle.com wrote: >> >> >> On 8/1/17 1:19 PM, Aleksey Shipilev wrote: >>> On 08/01/2017 07:14 PM, coleen.phillimore at oracle.com wrote: >>>>> *) systemDictionary.cpp: in SystemDictionary::dump, the >>>>> if(verbose) condition seems inverted. Should >>>>> dump tables when verbose?
>>>> I kept the name dump_table from the VM.stringtable and >>>> VM.symboltable dcmd code, when the function >>>> really prints hashtable statistics. I could rename that function >>>> to be print_table_statistics() >>>> instead. And change ClassLoaderDataGraph::dump_dictionary to >>>> print_dictionary_statistics. If that >>>> makes sense. >>> Yeah, that would make sense. >> >> Thanks! Here's a webrev with the renaming: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8184994.02/webrev >> >> and tested. >> >> Thanks, >> Coleen >> >>> >>> Thanks, >>> -Aleksey >>> >> > From gerald.thornbrugh at oracle.com Wed Aug 2 16:02:09 2017 From: gerald.thornbrugh at oracle.com (Gerald Thornbrugh) Date: Wed, 2 Aug 2017 10:02:09 -0600 Subject: RFR 8182757: JDWP: Socket Transport handshake hangs on Solaris In-Reply-To: <7fe81ddd-f009-21f8-38b1-9192e0b14a28@oracle.com> References: <6D4B9143-CF24-44EF-A78F-F2DA29C3F370@oracle.com> <7fe81ddd-f009-21f8-38b1-9192e0b14a28@oracle.com> Message-ID: <68503672-3A34-4C26-8E09-4666E6CFD966@oracle.com> Hi Serguei, Thanks! Jerry > On Aug 2, 2017, at 9:58 AM, serguei.spitsyn at oracle.com wrote: > > Hi Gerald, > > It looks good. > > Thanks, > Serguei > > > On 8/2/17 08:16, Gerald Thornbrugh wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8182757. >> >> The bug: https://bugs.openjdk.java.net/browse/JDK-8182757 >> >> The webrev: http://cr.openjdk.java.net/~gthornbr/8182757/webrev.00 >> >> If a socket is being setup without a fixed port using the SO_REUSEADDR flag it can lead to other >> processes interfering with the poll/receive process of a debugger/debuggee configuring a socket >> for communication. When SO_REUSEADDR is used other processes can attempt a listen() on >> the same port and receive a connect from the debuggee. This causes the debugger to stay in >> poll() waiting for a connect and the debuggee stays in recv() waiting to receive data from the >> "rogue" process that will never send it. 
>> >> This can also lead to connections being terminated early on the debuggee side when the "rogue" >> process terminates the connection because it does not receive what it expected from the client >> process (i.e. the debuggee). >> >> The fix is to not use the SO_REUSEADDR flag for non-fixed port sockets. This keeps "rogue" >> processes from reusing the port address and from stealing the connects sent by the debuggee. >> >> The changes to src/jdk.jdi/share/classes/com/sun/tools/jdi/SocketTransportService.java address >> when JDI (the debugger side) is acting in "server mode". >> >> The changes to src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c address when >> JDWP (the debuggee side) is acting in "server mode". >> >> I have run the JDK JDI tests and the internal Oracle VM/NSK JPDA tests on all platforms without errors. >> We were able to reproduce the failures with a specific load on a test machine while running JDI >> related tests. After we applied the fix to this system we were not able to reproduce the failures. >> >> Thanks, >> >> Gerald > From coleen.phillimore at oracle.com Wed Aug 2 19:42:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 15:42:42 -0400 Subject: RFR (S) 8130072: Add a flag to print out statistics for both system dictionary and shared dictionary Message-ID: Summary: Include Shared Dictionary printing when printing system dictionaries in jcmd Also already included in -XX:+PrintSystemDictionaryAtExit. -Xshare:dump also has already added for -XX:+PrintSystemDictionaryAtExit. Tested with runThese in background and running jcmds, also see added test.
open webrev at http://cr.openjdk.java.net/~coleenp/8130072.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8130072 Thanks, Coleen From shade at redhat.com Wed Aug 2 19:44:45 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 2 Aug 2017 21:44:45 +0200 Subject: RFR (S) 8130072: Add a flag to print out statistics for both system dictionary and shared dictionary In-Reply-To: References: Message-ID: <003aad3a-b4de-a188-dee3-9fecc5cd7dd7@redhat.com> On 08/02/2017 09:42 PM, coleen.phillimore at oracle.com wrote: > open webrev at http://cr.openjdk.java.net/~coleenp/8130072.01/webrev Looks good. -Aleksey From coleen.phillimore at oracle.com Wed Aug 2 20:12:25 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 16:12:25 -0400 Subject: RFR (S) 8130072: Add a flag to print out statistics for both system dictionary and shared dictionary In-Reply-To: <003aad3a-b4de-a188-dee3-9fecc5cd7dd7@redhat.com> References: <003aad3a-b4de-a188-dee3-9fecc5cd7dd7@redhat.com> Message-ID: <655d9421-0892-9201-a7f5-57f7996e367a@oracle.com> On 8/2/17 3:44 PM, Aleksey Shipilev wrote: > On 08/02/2017 09:42 PM, coleen.phillimore at oracle.com wrote: >> open webrev at http://cr.openjdk.java.net/~coleenp/8130072.01/webrev > Looks good. Thank you for the fast code review! Coleen > -Aleksey > > From george.triantafillou at oracle.com Wed Aug 2 20:25:12 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Wed, 2 Aug 2017 16:25:12 -0400 Subject: RFR (S) 8130072: Add a flag to print out statistics for both system dictionary and shared dictionary In-Reply-To: References: Message-ID: <8a4da587-7876-ffce-e87f-c9fc40026195@oracle.com> Hi Coleen, Looks good! -George On 8/2/2017 3:42 PM, coleen.phillimore at oracle.com wrote: > Summary: Include Shared Dictionary printing when printing system > dictionaries in jcmd > > Also already included in -XX:+PrintSystemDictionaryAtExit. 
> -Xshare:dump also has already added for -XX:+PrintSystemDictionaryAtExit. > > Tested with runThese in background and running jcmds, also see added > test. > > open webrev at http://cr.openjdk.java.net/~coleenp/8130072.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8130072 > > Thanks, > Coleen From coleen.phillimore at oracle.com Wed Aug 2 20:38:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 16:38:43 -0400 Subject: RFR (S) 8130072: Add a flag to print out statistics for both system dictionary and shared dictionary In-Reply-To: <8a4da587-7876-ffce-e87f-c9fc40026195@oracle.com> References: <8a4da587-7876-ffce-e87f-c9fc40026195@oracle.com> Message-ID: On 8/2/17 4:25 PM, George Triantafillou wrote: > Hi Coleen, > > Looks good! > Thanks, George! Coleen > -George > > On 8/2/2017 3:42 PM, coleen.phillimore at oracle.com wrote: >> Summary: Include Shared Dictionary printing when printing system >> dictionaries in jcmd >> >> Also already included in -XX:+PrintSystemDictionaryAtExit. >> -Xshare:dump also has already added for >> -XX:+PrintSystemDictionaryAtExit. >> >> Tested with runThese in background and running jcmds, also see added >> test. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8130072.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8130072 >> >> Thanks, >> Coleen > From kim.barrett at oracle.com Wed Aug 2 22:41:43 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 2 Aug 2017 18:41:43 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion Message-ID: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> Please review this small change to improve the debugging experience when a mutex is destroyed in a bad state. I've removed the assert in ~Mutex, which was making the same checks as in ~Monitor, but providing much less information. 
Also, the ~Monitor assert now (carefully) includes the name in the error message, as that may also be very helpful information. CR: https://bugs.openjdk.java.net/browse/JDK-8185746 Webrev: http://cr.openjdk.java.net/~kbarrett/8185746/hotspot.00/ Testing: Destroyed a locked mutex and looked at the assert message. From david.holmes at oracle.com Wed Aug 2 23:23:51 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Aug 2017 09:23:51 +1000 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> Message-ID: <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> Hi Kim, On 3/08/2017 8:41 AM, Kim Barrett wrote: > Please review this small change to improve the debugging experience > when a mutex is destroyed in a bad state. > > I've removed the assert in ~Mutex, which was making the same checks as > in ~Monitor, but providing much less information. Okay. Please can you add a comment to the ~Mutex noting that eg: // Defer any state checking to ~Monitor Mutex::~Mutex() { } > Also, the ~Monitor assert now (carefully) includes the name in the > error message, as that may also be very helpful information. Good strategy watching for a potentially corrupted name! Thanks, David ----- > CR: > https://bugs.openjdk.java.net/browse/JDK-8185746 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8185746/hotspot.00/ > > Testing: > Destroyed a locked mutex and looked at the assert message. 
> > From kim.barrett at oracle.com Wed Aug 2 23:29:37 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 2 Aug 2017 19:29:37 -0400 Subject: RFR: 8185757: QuickSort array size should be size_t Message-ID: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> Please review this change of the type of QuickSort::sort size parameter from int to size_t, and propagating this change throughout the QuickSort implementation (using size_t rather than int sizes and indices) and tests. Since I was touching the QuickSort code anyway, I made a couple of additional changes. - Re-ordered the internal template parameters, moving "idempotent" to the front and allowing the array element type and the comparator type to be deduced. - Changed the handling of the result of calling the comparator, only requiring it to return negative, zero, or positive, rather than exactly -1, 0, or +1. This makes it consistent with the standard library function qsort. Also updated a couple of callers: - In G1CollectionSet::finalize_old_part, removed no longer needed cast of (size_t) size to int. - In Method::sort_methods, removed unnecessary explicit template argument, allowing it to be deduced. I didn't change the length from int to size_t here, because that had more fanout, and there are other issues around its type. For example, it is being passed to other functions that expect a u2 value. CR: https://bugs.openjdk.java.net/browse/JDK-8185757 Webrev: http://cr.openjdk.java.net/~kbarrett/8185757/hotspot.00/ Testing: JPRT. In addition to unit testing, sorting gets exercised by G1 and by class file parsing (sort_methods).
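The comparator contract described above (any negative, zero, or positive result, as with qsort) and the use of size_t for lengths and indices can be sketched roughly as follows. This is not the HotSpot QuickSort code: sort_by_sign and compare_shorts are hypothetical names, and a plain insertion sort stands in for the real algorithm to keep the example short.

```cpp
#include <cstddef>

// Hypothetical comparator: returns the difference of its arguments, so
// the result can be any negative, zero, or positive int, not just
// -1, 0, or +1. The shorts are promoted to int, so it cannot overflow.
inline int compare_shorts(short a, short b) {
  return a - b;
}

// Hypothetical stand-in for a sort routine (insertion sort, not QuickSort).
// The element type T and comparator type C are both deduced from the
// arguments; the length and all indices are size_t.
template <typename T, typename C>
void sort_by_sign(T* array, std::size_t length, C comparator) {
  for (std::size_t i = 1; i < length; i++) {
    T value = array[i];
    std::size_t j = i;
    // Only the sign of the comparator result is inspected.
    while (j > 0 && comparator(array[j - 1], value) > 0) {
      array[j] = array[j - 1];
      j--;
    }
    array[j] = value;
  }
}
```

Because the loop tests `> 0` rather than `== 1`, a comparator written in the qsort style works unchanged, and callers need no explicit template arguments.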
From kim.barrett at oracle.com Wed Aug 2 23:31:54 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 2 Aug 2017 19:31:54 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> Message-ID: > On Aug 2, 2017, at 7:23 PM, David Holmes wrote: > > Hi Kim, > > On 3/08/2017 8:41 AM, Kim Barrett wrote: >> Please review this small change to improve the debugging experience >> when a mutex is destroyed in a bad state. >> I've removed the assert in ~Mutex, which was making the same checks as >> in ~Monitor, but providing much less information. > > Okay. Please can you add a comment to the ~Mutex noting that eg: > > // Defer any state checking to ~Monitor > Mutex::~Mutex() { } Well, I was waffling over just removing it completely. From kim.barrett at oracle.com Wed Aug 2 23:36:43 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 2 Aug 2017 19:36:43 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> Message-ID: <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> > On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: > >> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >> >> Hi Kim, >> >> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>> Please review this small change to improve the debugging experience >>> when a mutex is destroyed in a bad state. >>> I've removed the assert in ~Mutex, which was making the same checks as >>> in ~Monitor, but providing much less information. >> >> Okay. Please can you add a comment to the ~Mutex noting that eg: >> >> // Defer any state checking to ~Monitor >> Mutex::~Mutex() { } > > Well, I was waffling over just removing it completely. 
Note also that the structure here makes it pretty easy to accidentally invoke undefined behavior by destructor slicing. That's an entirely different problem, but it bugs me... From coleen.phillimore at oracle.com Wed Aug 2 23:49:29 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 19:49:29 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> Message-ID: Looks good. On 8/2/17 7:36 PM, Kim Barrett wrote: >> On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: >> >>> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >>> >>> Hi Kim, >>> >>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>> Please review this small change to improve the debugging experience >>>> when a mutex is destroyed in a bad state. >>>> I've removed the assert in ~Mutex, which was making the same checks as >>>> in ~Monitor, but providing much less information. >>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>> >>> // Defer any state checking to ~Monitor >>> Mutex::~Mutex() { } >> Well, I was waffling over just removing it completely. > Note also that the structure here makes it pretty easy to accidentally > invoke undefined behavior by destructor slicing. That's an entirely > different problem, but it bugs me... Can you describe what you mean by "destructor slicing"? Do you mean calling ~Monitor on an instance of Mutex? 
Thanks, Coleen From kim.barrett at oracle.com Wed Aug 2 23:55:07 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 2 Aug 2017 19:55:07 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> Message-ID: <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> > On Aug 2, 2017, at 7:49 PM, coleen.phillimore at oracle.com wrote: > > > Looks good. Thanks. > On 8/2/17 7:36 PM, Kim Barrett wrote: >>> On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: >>> >>>> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >>>> >>>> Hi Kim, >>>> >>>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>>> Please review this small change to improve the debugging experience >>>>> when a mutex is destroyed in a bad state. >>>>> I've removed the assert in ~Mutex, which was making the same checks as >>>>> in ~Monitor, but providing much less information. >>>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>>> >>>> // Defer any state checking to ~Monitor >>>> Mutex::~Mutex() { } >>> Well, I was waffling over just removing it completely. >> Note also that the structure here makes it pretty easy to accidentally >> invoke undefined behavior by destructor slicing. That's an entirely >> different problem, but it bugs me... > Can you describe what you mean by "destructor slicing"? Do you mean calling ~Monitor on an instance of Mutex? Yes, exactly that. Typically by something like Monitor* m = ; ? 
delete m; From david.holmes at oracle.com Thu Aug 3 00:08:03 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Aug 2017 10:08:03 +1000 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> Message-ID: <8a669d70-5c4d-f8f8-59a2-b551f44de9a1@oracle.com> On 3/08/2017 9:36 AM, Kim Barrett wrote: >> On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: >> >>> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >>> >>> Hi Kim, >>> >>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>> Please review this small change to improve the debugging experience >>>> when a mutex is destroyed in a bad state. >>>> I've removed the assert in ~Mutex, which was making the same checks as >>>> in ~Monitor, but providing much less information. >>> >>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>> >>> // Defer any state checking to ~Monitor >>> Mutex::~Mutex() { } >> >> Well, I was waffling over just removing it completely. That works too. > Note also that the structure here makes it pretty easy to accidentally > invoke undefined behavior by destructor slicing. That's an entirely > different problem, but it bugs me... Please explain. 
Thanks, David From david.holmes at oracle.com Thu Aug 3 00:09:48 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Aug 2017 10:09:48 +1000 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> Message-ID: On 3/08/2017 9:55 AM, Kim Barrett wrote: >> On Aug 2, 2017, at 7:49 PM, coleen.phillimore at oracle.com wrote: >> >> >> Looks good. > > Thanks. > >> On 8/2/17 7:36 PM, Kim Barrett wrote: >>>> On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: >>>> >>>>> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >>>>> >>>>> Hi Kim, >>>>> >>>>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>>>> Please review this small change to improve the debugging experience >>>>>> when a mutex is destroyed in a bad state. >>>>>> I've removed the assert in ~Mutex, which was making the same checks as >>>>>> in ~Monitor, but providing much less information. >>>>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>>>> >>>>> // Defer any state checking to ~Monitor >>>>> Mutex::~Mutex() { } >>>> Well, I was waffling over just removing it completely. >>> Note also that the structure here makes it pretty easy to accidentally >>> invoke undefined behavior by destructor slicing. That's an entirely >>> different problem, but it bugs me... >> Can you describe what you mean by "destructor slicing"? Do you mean calling ~Monitor on an instance of Mutex? > > Yes, exactly that. Typically by something like > > Monitor* m = ; > ? > delete m; virtual destructor needed? 
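The concern raised in this exchange can be sketched with two stand-in classes (MonitorLike and MutexLike are hypothetical and are not HotSpot's Monitor and Mutex). With a virtual destructor in the base class, deleting through a base pointer is well defined and runs both destructors; without the `virtual` keyword the same `delete` would be undefined behavior.

```cpp
// Counters so the effect of the delete can be observed.
int monitor_dtor_calls = 0;
int mutex_dtor_calls = 0;

// Hypothetical base class; the virtual destructor costs one vtable
// pointer per object but makes deletion through a base pointer safe.
struct MonitorLike {
  virtual ~MonitorLike() { ++monitor_dtor_calls; }
};

// Hypothetical derived class.
struct MutexLike : MonitorLike {
  ~MutexLike() override { ++mutex_dtor_calls; }
};

// Deletes through the base-class pointer. Because ~MonitorLike is
// virtual, ~MutexLike runs first and ~MonitorLike runs afterwards.
void destroy(MonitorLike* m) {
  delete m;
}
```

Calling `destroy(new MutexLike())` increments both counters once. The trade-off is the extra word per object for the vtable pointer.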
David From coleen.phillimore at oracle.com Thu Aug 3 01:25:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Aug 2017 21:25:31 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> Message-ID: On 8/2/17 8:09 PM, David Holmes wrote: > On 3/08/2017 9:55 AM, Kim Barrett wrote: >>> On Aug 2, 2017, at 7:49 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> Looks good. >> >> Thanks. >> >>> On 8/2/17 7:36 PM, Kim Barrett wrote: >>>>> On Aug 2, 2017, at 7:31 PM, Kim Barrett >>>>> wrote: >>>>> >>>>>> On Aug 2, 2017, at 7:23 PM, David Holmes >>>>>> wrote: >>>>>> >>>>>> Hi Kim, >>>>>> >>>>>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>>>>> Please review this small change to improve the debugging experience >>>>>>> when a mutex is destroyed in a bad state. >>>>>>> I've removed the assert in ~Mutex, which was making the same >>>>>>> checks as >>>>>>> in ~Monitor, but providing much less information. >>>>>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>>>>> >>>>>> // Defer any state checking to ~Monitor >>>>>> Mutex::~Mutex() { } >>>>> Well, I was waffling over just removing it completely. >>>> Note also that the structure here makes it pretty easy to accidentally >>>> invoke undefined behavior by destructor slicing. That's an entirely >>>> different problem, but it bugs me... >>> Can you describe what you mean by "destructor slicing"? Do you mean >>> calling ~Monitor on an instance of Mutex? >> >> Yes, exactly that. Typically by something like >> >> Monitor* m = ; >> ? >> delete m; > > virtual destructor needed? Yes, that was my question. I guess with Monitor, we don't want to pay the word for the vtable? But why not, it already has an embedded 64 character string. 
Coleen > > David > From thomas.schatzl at oracle.com Thu Aug 3 12:11:44 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 03 Aug 2017 14:11:44 +0200 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> Message-ID: <1501762304.2411.8.camel@oracle.com> Hi, On Wed, 2017-08-02 at 19:31 -0400, Kim Barrett wrote: > > > > On Aug 2, 2017, at 7:23 PM, David Holmes > > wrote: > > > > Hi Kim, > > > > On 3/08/2017 8:41 AM, Kim Barrett wrote: > > > > > > Please review this small change to improve the debugging > > > experience Looks good. > > > when a mutex is destroyed in a bad state. > > > I've removed the assert in ~Mutex, which was making the same > > > checks as > > > in ~Monitor, but providing much less information. > > Okay. Please can you add a comment to the ~Mutex noting that eg: > > > > // Defer any state checking to ~Monitor > > Mutex::~Mutex() { } > Well, I was waffling over just removing it completely. > do it :) I can't see, apart from it being a bug (to be fixed in another CR?), why the destructor of Monitor is not virtual. Thomas From thomas.schatzl at oracle.com Thu Aug 3 12:32:21 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 03 Aug 2017 14:32:21 +0200 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> Message-ID: <1501763541.2411.19.camel@oracle.com> Hi Kim, On Wed, 2017-08-02 at 19:29 -0400, Kim Barrett wrote: > Please review this change of the type of QuickSort::sort size > parameter from int to size_t, and propagating this change throughout > the QuickSort implementation (using size_t rather than int sizes and > indices) and tests. > > Since I was touching the QuickSort code anyway, I made a couple of > additional changes.
> > - Re-ordered the internal template parameters, moving "idempotent" to > the front and allowing the array element type and the comparator type > to be deduced. > > - Changed the handling of the result of calling the comparator, only > requiring it to return negative, zero, or positive, rather than > exactly -1, 0, or +1. This makes it consistent with the standard > library function qsort. Not sure if the change of the do-while to the for-loops improves readability that much. However, please put the closing brackets of these into extra lines (quicksort.hpp:76,77) so that the casual reader does not overlook them. > Also updated a couple of callers: > > - In G1CollectionSet::finalize_old_part, removed no longer needed > cast of (size_t) size to int. Yes :) > > - In Method::sort_methods, removed unnecessary explicit template > argument, allowing it to be deduced. I didn't change the length from > int to size_t here, because that had more fanout, and there are other > issues around its type. For example, it is being passed to other > functions that expect a u2 value. Fine with me. Please consider filing a CR for that. > CR: > https://bugs.openjdk.java.net/browse/JDK-8185757 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8185757/hotspot.00/ > JPRT. In addition to unit testing, sorting gets exercised by G1 and > by class file parsing (sort_methods). One (potential) pre-existing issue with the test: it uses random() to create tests - however this decreases reproducibility... Thanks,
Thomas From thomas.schatzl at oracle.com Thu Aug 3 12:35:14 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 03 Aug 2017 14:35:14 +0200 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <1501763541.2411.19.camel@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> Message-ID: <1501763714.2411.22.camel@oracle.com> Hi again, On Thu, 2017-08-03 at 14:32 +0200, Thomas Schatzl wrote: > Hi Kim, > > On Wed, 2017-08-02 at 19:29 -0400, Kim Barrett wrote: > > > > Please review this change of the type of QuickSort::sort size > > parameter from int to size_t, and propagating this change > > throughout the QuickSort implementation (using size_t rather than > > int sizes and indices) and tests. > > > > Since I was touching the QuickSort code anyway, I made a couple of > > additional changes. > > > > - Re-ordered the internal template parameters, moving "idempotent" > > to the front and allowing the array element type and the comparator > > type to be deduced. > > > > - Changed the handling of the result of calling the comparator, > > only > > requiring it to return negative, zero, or positive, rather than > > exactly -1, 0, or +1. This makes it consistent with the standard > > library function qsort. > Not sure if the change of the do-while to the for-loops improves > readability that much. > However, please put the closing brackets of these into extra lines > (quicksort.hpp:76,77) so that the casual reader does not overlook them. forgot to mention: looks good apart from that. Do not need a re-review for changing this. Thanks,
Thomas From harold.seigel at oracle.com Thu Aug 3 13:03:59 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 09:03:59 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work Message-ID: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> Hi, Please review this JDK-10 fix for JDK-8185103. The problem occurred because classes were being put on the fixup_module_field_list before their mirror field was set. If a (different) thread called method patch_javabase_entries() before the class's mirror field was set then this would cause a SIGSEGV because patch_javabase_entries() eventually calls obj_field_put() which tries to dereference the class's mirror field. This change fixes the problem by setting the class's mirror field before putting the class on the fixup_module_field_list. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util and other tests, the co-located NSK tests, and with JPRT. Additionally, the fix was tested by temporarily adding a naked_short_sleep(50) to method initialize_mirror_fields() shortly after it put a class on the fixup_module_field_list. The sleep was added in order to enhance the likelihood of patch_javabase_entries() being called before the class's mirror field got set. Without the fix, the TestThreadDumpMonitorContention.java test and the test reported in JDK-8183309 reliably got the reported SIGSEGVs. With the fix, the tests passed.
Thanks, Harold From coleen.phillimore at oracle.com Thu Aug 3 16:00:13 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Aug 2017 12:00:13 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <920b0165-92b2-0d22-45aa-2ccba3858d74@redhat.com> Message-ID: <63489d85-2ea5-7c30-8d86-d2976e7eecef@oracle.com> Hi, I agree with the plea for simplicity, that the types in the templates should be the same as well, and code in hotspot should be cleaned up. The cast in os::random wouldn't have been necessary if the compiler had given me an error when I changed the types to int. I would have made that unsigned, maybe... I find the IntegerTypes file reads like line noise (def: random characters you used to get from a misbehaving modem on your phone line). I would likely avoid using any features in it because they'd cost me time to figure out which things to use. The mental time cost of generalization is too high and I don't think we need to support something like atomic float operations (unless we do already?) I do agree with moving away from the java types to native types though. This is worth the disruption (which doesn't seem like that much from the first patch). 
Thanks, Coleen On 8/2/17 4:31 AM, Andrew Haley wrote: > On 02/08/17 04:35, David Holmes wrote: >> Hi Kim, >> >> Good planning on Erik's part to go on vacation just as I have returned ;-) >> >> On 1/08/2017 4:18 AM, Kim Barrett wrote: >>>> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote: >>>> >>>> In that case, feel free to propose a revised solution while I am gone. >>> Erik has asked me to try to make progress on this while he's on >>> vacation, rather than possibly letting it sit until he gets back. >> Okay, while Erik is gone perhaps you can clarify a few things. As Andrew >> and Roman have expressed, I too find this: >> >> + template >> + inline static U cmpxchg(T exchange_value, volatile U* dest, V >> compare_value, cmpxchg_memory_order order); >> >> totally unintuitive and unappealing. I do not understand the rationale >> for this at all. It does not make any sense to me to allow T, U and >> V to be different types (even if constrained). It has been stated that >> if we force them to all be the same there is some problem with literals >> and the need for casts, but I don't understand what that problem would be.
> A couple of examples would help: > > diff -r d207c56d5b5a src/share/vm/runtime/os.cpp > --- a/src/share/vm/runtime/os.cpp Tue Jul 25 15:35:09 2017 +0100 > +++ b/src/share/vm/runtime/os.cpp Wed Aug 02 09:24:30 2017 +0100 > @@ -756,7 +756,7 @@ > while (true) { > unsigned int seed = _rand_seed; > int rand = random_helper(seed); > - if (Atomic::cmpxchg(rand, &_rand_seed, seed) == seed) { > + if (Atomic::cmpxchg((unsigned)rand, &_rand_seed, seed) == seed) { > return rand; > } > } > diff -r d207c56d5b5a src/share/vm/runtime/thread.cpp > --- a/src/share/vm/runtime/thread.cpp Tue Jul 25 15:35:09 2017 +0100 > +++ b/src/share/vm/runtime/thread.cpp Wed Aug 02 09:24:30 2017 +0100 > @@ -4741,7 +4741,7 @@ > enum MuxBits { LOCKBIT = 1 }; > > void Thread::muxAcquire(volatile intptr_t * Lock, const char * LockName) { > - intptr_t w = Atomic::cmpxchg(LOCKBIT, Lock, 0); > + intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0); > if (w == 0) return; > if ((w & LOCKBIT) == 0 && Atomic::cmpxchg (w|LOCKBIT, Lock, w) == w) { > return; > > There are eight such changes required in the HotSpot codebase. > > I strongly believe that these changes make the code better. They show > exactly where types do not match. In the case of the random > generator, they show that there is a type mismatch which is IMO > probably wrong, and it wouldn't hurt to fix it. 
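To make the trade-off in the diff above concrete, here is a minimal sketch of a single-type compare-and-exchange. It is built on std::atomic rather than HotSpot's Atomic class, and cmpxchg_same_type and try_acquire are hypothetical names chosen for illustration. With one template parameter, every argument must have the same type, so literals and enum constants need the kind of explicit casts shown in the thread.cpp hunk.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical single-type wrapper (an assumption, not HotSpot's API).
// Returns the value observed in *dest before the operation.
template <typename T>
T cmpxchg_same_type(T exchange_value, std::atomic<T>* dest, T compare_value) {
  T observed = compare_value;
  // On success 'observed' keeps compare_value (the old contents);
  // on failure it is overwritten with the current contents of *dest.
  dest->compare_exchange_strong(observed, exchange_value);
  return observed;
}

enum MuxBits { LOCKBIT = 1 };

// Hypothetical lock attempt modeled loosely on the muxAcquire fast path.
bool try_acquire(std::atomic<std::intptr_t>* lock) {
  // cmpxchg_same_type(LOCKBIT, lock, 0) would not compile: T would be
  // deduced as MuxBits, intptr_t and int from the three arguments.
  std::intptr_t previous =
      cmpxchg_same_type(static_cast<std::intptr_t>(LOCKBIT), lock,
                        static_cast<std::intptr_t>(0));
  return previous == 0;
}
```

The casts are the cost of the stricter signature; in exchange, the compiler rejects any call site where the operand types do not match.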
> From kim.barrett at oracle.com Thu Aug 3 17:35:10 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Aug 2017 13:35:10 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> Message-ID: > On Aug 2, 2017, at 9:25 PM, coleen.phillimore at oracle.com wrote: > > > > On 8/2/17 8:09 PM, David Holmes wrote: >> On 3/08/2017 9:55 AM, Kim Barrett wrote: >>>> On Aug 2, 2017, at 7:49 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> Looks good. >>> >>> Thanks. >>> >>>> On 8/2/17 7:36 PM, Kim Barrett wrote: >>>>>> On Aug 2, 2017, at 7:31 PM, Kim Barrett wrote: >>>>>> >>>>>>> On Aug 2, 2017, at 7:23 PM, David Holmes wrote: >>>>>>> >>>>>>> Hi Kim, >>>>>>> >>>>>>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>>>>>> Please review this small change to improve the debugging experience >>>>>>>> when a mutex is destroyed in a bad state. >>>>>>>> I've removed the assert in ~Mutex, which was making the same checks as >>>>>>>> in ~Monitor, but providing much less information. >>>>>>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>>>>>> >>>>>>> // Defer any state checking to ~Monitor >>>>>>> Mutex::~Mutex() { } >>>>>> Well, I was waffling over just removing it completely. >>>>> Note also that the structure here makes it pretty easy to accidentally >>>>> invoke undefined behavior by destructor slicing. That's an entirely >>>>> different problem, but it bugs me... >>>> Can you describe what you mean by "destructor slicing"? Do you mean calling ~Monitor on an instance of Mutex? >>> >>> Yes, exactly that. Typically by something like >>> >>> Monitor* m = ; >>> ? >>> delete m; >> >> virtual destructor needed? > > Yes, that was my question. I guess with Monitor, we don't want to pay the word for the vtable? 
But why not, it already has an embedded 64 character string. Plus the cache line padding for most instances. There's quite a (known) mess in that hierarchy. See comments for MONITOR_NAME_LEN and for the Mutex class. From coleen.phillimore at oracle.com Thu Aug 3 17:40:57 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Aug 2017 13:40:57 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> Message-ID: <0272b5ef-b219-4e49-edf0-4c697e88676f@oracle.com> This change looks good. Very minor nit: + if(k->java_mirror() == NULL) k->set_java_mirror(mirror()); Needs a space between the if and (. Probably should also be: if (k->java_mirror() == NULL) { // Only set the mirror if not set above while setting the module k->set_java_mirror(mirror()); } i.e. with a comment. I do not need to see a new version. Thanks, Coleen On 8/3/17 9:03 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 fix for JDK-8185103. The problem occurred > because classes were being put on the fixup_module_field_list before > their mirror field was set. If a (different) thread called method > patch_javabase_entries() before the class's mirror field was set then > this would cause a SIGSEGV because patch_javabase_entries() eventually > calls obj_field_put() which tries to dereference the class's mirror > field. > > This change fixes the problem by setting the class's mirror field > before putting the class on the fixup_module_field_list. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, and with JPRT.
> > Additionally, the fix was tested by temporarily adding a > naked_short_sleep(50) to method initialize_mirror_fields() shortly > after it put a class on the fixup_module_field_list. The sleep was > added in order to enhance the likelihood of patch_javabase_entries() > being called before the class's mirror field got set. Without the > fix, the TestThreadDumpMonitorContention.java test and the test reported > in JDK-8183309 > reliably got the reported SIGSEGVs. With the fix, the tests passed. > > Thanks, Harold > From harold.seigel at oracle.com Thu Aug 3 17:42:19 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 13:42:19 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <0272b5ef-b219-4e49-edf0-4c697e88676f@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <0272b5ef-b219-4e49-edf0-4c697e88676f@oracle.com> Message-ID: Thanks Coleen for the review! Harold On 8/3/2017 1:40 PM, coleen.phillimore at oracle.com wrote: > > This change looks good. Very minor nit: > > + if(k->java_mirror() == NULL) k->set_java_mirror(mirror()); > > > Needs a space between the if and (. Probably should also be: > > if (k->java_mirror() == NULL) { > // Only set the mirror if not set above while setting the module > k->set_java_mirror(mirror()); > } > > i.e. with a comment. > > I do not need to see a new version. > > Thanks, > Coleen > > On 8/3/17 9:03 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8185103. The problem occurred >> because classes were being put on the fixup_module_field_list before >> their mirror field was set. If a (different) thread called method >> patch_javabase_entries() before the class's mirror field was set then >> this would cause a SIGSEGV because patch_javabase_entries() >> eventually calls obj_field_put() which tries to dereference the >> class's mirror field.
>> >> This change fixes the problem by setting the class's mirror field >> before putting the class on the fixup_module_field_list. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >> >> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, and with JPRT. >> >> Additionally, the fix was tested by temporarily adding a >> naked_short_sleep(50) to method initialize_mirror_fields() shortly >> after it put a class on the fixup_module_field_list. The sleep was >> added in order to enhance the likelihood of patch_javabase_entries() >> being called before the class's mirror field got set. Without the >> fix, the TestThreadDumpMonitorContention.java test and the test reported >> in JDK-8183309 >> reliably got the reported SIGSEGVs. With the fix, the tests passed. >> >> Thanks, Harold >> > From kim.barrett at oracle.com Thu Aug 3 17:44:07 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Aug 2017 13:44:07 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <4D9C4EAC-5F3A-4C96-B6D3-CC3A10346F99@oracle.com> <2B32C681-C924-4643-A639-B6FD2025A831@oracle.com> Message-ID: <292F1452-AF37-45B8-89ED-669092785D92@oracle.com> > On Aug 3, 2017, at 1:35 PM, Kim Barrett wrote: > There's quite a (known) mess in that hierarchy. Note that some of the bad aspects of the inheritance of Mutex from Monitor could be fixed by C++11 deleted functions, e.g. in Mutex, instead of private: bool notify () { ShouldNotReachHere(); return false; } with C++11 one could have public: bool notify() = delete; Still not perfect, since one can still cast a Mutex reference to a Monitor reference and then use notify. (That's true now too.)
Better would be to fix the inheritance structure. (And just flipping the order isn't, IMO, the best way to do that, but that's a potentially much longer discussion.) From kim.barrett at oracle.com Thu Aug 3 17:44:48 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Aug 2017 13:44:48 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <1501762304.2411.8.camel@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <1501762304.2411.8.camel@oracle.com> Message-ID: > On Aug 3, 2017, at 8:11 AM, Thomas Schatzl wrote: > > Hi, > > On Wed, 2017-08-02 at 19:31 -0400, Kim Barrett wrote: >>> >>> On Aug 2, 2017, at 7:23 PM, David Holmes >>> wrote: >>> >>> Hi Kim, >>> >>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>> >>>> Please review this small change to improve the debugging >>>> experience > > Looks good. Thanks. >>>> when a mutex is destroyed in a bad state. >>>> I've removed the assert in ~Mutex, which was making the same >>>> checks as >>>> in ~Monitor, but providing much less information. >>> Okay. Please can you add a comment to the ~Mutex noting that eg: >>> >>> // Defer any state checking to ~Monitor >>> Mutex::~Mutex() { } >> Well, I was waffling over just removing it completely. >> > > do it :) I can't see, apart from it being a bug (to be fixed in > another CR?), why the destructor of Monitor is not virtual. It's gone now. From george.triantafillou at oracle.com Thu Aug 3 18:00:15 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Thu, 3 Aug 2017 14:00:15 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> Message-ID: Hi Harold, Looks good.
-George On 8/3/2017 9:03 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 fix for JDK-8185103. The problem occurred > because classes were being put on the fixup_module_field_list before > their mirror field was set. If a (different) thread called method > patch_javabase_entries() before the class's mirror field was set then > this would cause a SIGSEGV because patch_javabase_entries() eventually > calls obj_field_put() which tries to dereference the class's mirror > field. > > This change fixes the problem by setting the class's mirror field > before putting the class on the fixup_module_field_list. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, and with JPRT. > > Additionally, the fix was tested by temporarily adding a > naked_short_sleep(50) to method initialize_mirror_fields() shortly > after it put a class on the fixup_module_field_list. The sleep was > added in order to enhance the likelihood of patch_javabase_entries() > being called before the class's mirror field got set. Without the > fix, the TestThreadDumpMonitorContent.java test and the test reported > in JDK-8183309 > reliably got the reported SIGSEGVs. With the fix, the tests passed. > > Thanks, Harold > From harold.seigel at oracle.com Thu Aug 3 18:01:02 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 14:01:02 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> Message-ID: Thanks George, for the review! Harold On 8/3/2017 2:00 PM, George Triantafillou wrote: > Hi Harold, > > Looks good. 
> > -George > > On 8/3/2017 9:03 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8185103. The problem occurred >> because classes were being put on the fixup_module_field_list before >> their mirror field was set. If a (different) thread called method >> patch_javabase_entries() before the class's mirror field was set then >> this would cause a SIGSEGV because patch_javabase_entries() >> eventually calls obj_field_put() which tries to dereference the >> class's mirror field. >> >> This change fixes the problem by setting the class's mirror field >> before putting the class on the fixup_module_field_list. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >> >> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, and with JPRT. >> >> Additionally, the fix was tested by temporarily adding a >> naked_short_sleep(50) to method initialize_mirror_fields() shortly >> after it put a class on the fixup_module_field_list. The sleep was >> added in order to enhance the likelihood of patch_javabase_entries() >> being called before the class's mirror field got set. Without the >> fix, the TestThreadDumpMonitorContent.java test and the test reported >> in JDK-8183309 >> reliably got the reported SIGSEGVs. With the fix, the tests passed. 
>> >> Thanks, Harold >> > From kim.barrett at oracle.com Thu Aug 3 18:06:35 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Aug 2017 14:06:35 -0400 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <1501763541.2411.19.camel@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> Message-ID: > On Aug 3, 2017, at 8:32 AM, Thomas Schatzl wrote: > On Wed, 2017-08-02 at 19:29 -0400, Kim Barrett wrote: >> - Changed the handling of the result of calling the comparator, only >> requiring it to return negative, zero, or positive, rather than >> exactly -1, 0, or +1. This makes it consistent with the standard >> library function qsort. > > Not sure if the change of the do-while to the for-loops improves > readability that much. I guess it wasn't obvious that the change away from do-while was necessitated by the change from int to size_t. The left/right_index variables were initialized to beyond the end, and always pre-inc/dec in the do-while loops. But in that scheme, left_index was initially -1. While it would have worked to change the types to size_t and still initialize left_index to (converted) -1, and left the do-while untouched so that it incremented left_index to (overflowed) zero in the first iteration, that seemed rather obscure to me, and also at some risk of compiler / static analyzer warnings. > However, please put the closing brackets of these into extra lines > (quicksort.hpp:76,77) to avoid the casual reader to overlook them. Sorry, but that just looks horrible. As a casual reader, I wouldn't even look for them, since if they aren't there then the code is badly mis-indented. >> - In Method::sort_methods, removed unnecessary explicit template >> argument, allowing it to be deduced. I didn't change the length from >> int to size_t here, because that had more fanout, and there are other >> issues around its type.
For example, it is being passed to other >> functions that expect a u2 value. > > Fine with me. Please consider filing a CR for that. I was thinking I should probably do that. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8185757 >> >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8185757/hotspot.00/ >> JPRT. In addition to unit testing, sorting gets exercised by G1 and >> by class file parsing (sort_methods). > > One (potential) pre-existing issue with the test: it uses random() to > create tests - however this decreases reproducibility? Which has both good and bad aspects. Printing the first random value might help, so long as this remains a non-TEST_VM test; once the VM is initialized there could be other concurrent callers of random() that would alter the test's sequence. From harold.seigel at oracle.com Thu Aug 3 19:03:21 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 15:03:21 -0400 Subject: RFR 8185806: Quarantine test JdbExprTest.sh on 32-bit Windows Message-ID: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> Hi, Please review this small change to quarantine test jdk/test/com/sun/jdi/JdbExprTest.sh on 32-bit Windows systems. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185806/webrev/index.html JBS Bug Sub-task: https://bugs.openjdk.java.net/browse/JDK-8185806 Thanks, Harold From coleen.phillimore at oracle.com Thu Aug 3 19:04:45 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Aug 2017 15:04:45 -0400 Subject: RFR 8185806: Quarantine test JdbExprTest.sh on 32-bit Windows In-Reply-To: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> References: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> Message-ID: Looks good. I think this should be a trivial change. Coleen On 8/3/17 3:03 PM, harold seigel wrote: > Hi, > > Please review this small change to quarantine test > jdk/test/com/sun/jdi/JdbExprTest.sh on 32-bit Windows systems.
> > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185806/webrev/index.html > > JBS Bug Sub-task: https://bugs.openjdk.java.net/browse/JDK-8185806 > > Thanks, Harold > From thomas.schatzl at oracle.com Thu Aug 3 19:05:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 03 Aug 2017 21:05:50 +0200 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> Message-ID: <1501787150.2411.71.camel@oracle.com> Hi, On Thu, 2017-08-03 at 14:06 -0400, Kim Barrett wrote: > > > > On Aug 3, 2017, at 8:32 AM, Thomas Schatzl wrote: > > On Wed, 2017-08-02 at 19:29 -0400, Kim Barrett wrote: > > > > > > - Changed the handling of the result of calling the comparator, > > > only > > > requiring it to return negative, zero, or positive, rather than > > > exactly -1, 0, or +1. This makes it consistent with the standard > > > library function qsort. > > Not sure if the change of the do-while to the for-loops improves > > readability that much. > I guess it wasn't obvious that the change away from do-while was > necessitated by the change from int to size_t. The left/right_index > variables were initialized to beyond the end, and always pre-inc/dec > in the do-while loops. But in that scheme, left_index was initially > -1. > > While it would have worked to change the types to size_t and still > initialize left_index to (converted) -1, and left the do-while > untouched > so that it incremented left_index to (overflowed) zero in the first > iteration, that seemed rather obscure to me, and also at some risk of > compiler / static analyzer warnings. Okay. > > However, please put the closing brackets of these into extra lines > > (quicksort.hpp:76,77) to avoid the casual reader to overlook them.
> Sorry, but that just looks horrible. As a casual reader, I wouldn't > even look for them, since if they aren't there then the code is > badly mis-indented. Actually I was already writing about an issue with indentation when I noticed the brackets :) Not insisting on changing this. > > > > > > CR: > > > https://bugs.openjdk.java.net/browse/JDK-8185757 > > > > > > Webrev: > > > http://cr.openjdk.java.net/~kbarrett/8185757/hotspot.00/ > > > JPRT. In addition to unit testing, sorting gets exercised by G1 > > > and > > > by class file parsing (sort_methods). > > One (potential) pre-existing issue with the test: it uses random() > > to > > create tests - however this decreases reproducibility? > Which has both good and bad aspects. Printing the first random value > might help, so long as this remains a non-TEST_VM test; once the > VM is initialized there could be other concurrent callers of random() > that would alter the test's sequence. Just commenting when I read over it. Yes, multiple tests running at the same time is another aspect. Thanks, Thomas From harold.seigel at oracle.com Thu Aug 3 19:06:51 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 15:06:51 -0400 Subject: RFR 8185806: Quarantine test JdbExprTest.sh on 32-bit Windows In-Reply-To: References: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> Message-ID: Thanks Coleen! Harold On 8/3/2017 3:04 PM, coleen.phillimore at oracle.com wrote: > Looks good. I think this should be a trivial change. > Coleen > > On 8/3/17 3:03 PM, harold seigel wrote: >> Hi, >> >> Please review this small change to quarantine test >> jdk/test/com/sun/jdi/JdbExprTest.sh on 32-bit Windows systems.
>> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185806/webrev/index.html >> >> JBS Bug Sub-task: https://bugs.openjdk.java.net/browse/JDK-8185806 >> >> Thanks, Harold >> > From kim.barrett at oracle.com Thu Aug 3 19:27:18 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Aug 2017 15:27:18 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> Message-ID: <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> > On Aug 1, 2017, at 11:35 PM, David Holmes wrote: > > Hi Kim, > > Good planning on Erik's part to go on vacation just as I have returned ;-) > > On 1/08/2017 4:18 AM, Kim Barrett wrote: >>> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote: >>> >>> Hi Andrew, >>> >>> In that case, feel free to propose a revised solution while I am gone. >> Erik has asked me to try to make progress on this while he's on >> vacation, rather than possibly letting it sit until he gets back. > > Okay, while Erik is gone perhaps you can clarify a few things. As Andrew and Roman have expressed, I too find this: > > + template <typename T, typename U, typename V> > + inline static U cmpxchg(T exchange_value, volatile U* dest, V compare_value, cmpxchg_memory_order order); > > totally unintuitive and unappealing. I do not understand the rationale for this at all.
It does not make any sense to me to allow T, U and V to be different types (even if constrained). It has been stated that if we force them to all be the same there is some problem with literals and the need for casts, but I don't understand what that problem would be. This response is about the types for a cmpxchg template. I'll respond to other comments separately. It is certainly simpler to implement if the types for all three arguments are required to be the same. I suggested Erik do that, but he reported running into "lots" of compilation failures. I guess Erik was less persistent than Andrew in working through them. Given Andrew's reported small number, I've also done that experiment and looked at the failing cases. I think (nearly?) all would be improved by making the argument types match. There's also one similar case for xchg. However, there are use-cases that I think are reasonable which don't immediately fit that restriction. (1) cmpxchg(v, p, NULL), to store a pointer if no pointer is already present. This can be used as an alternative to DCLP. One way to deal with this might be an overload on std::nullptr_t and use nullptr, but that requires C++11. We don't have any current uses of this that I could find, but it's a sufficiently interesting idiom that I'm reluctant to forbid it. But such idiomatic usage could be wrapped up in its own little package that can deal with the restriction. (2) The use of literals can make getting a type match more difficult, especially when the pointee type doesn't have portable syntax (like intx and uintx). But using properly typed named values solves this, and may be seen as an improvement over magic literal values. (3) Passing a derived pointer as the new value when updating an atomic pointer seems reasonable to me. A derived pointer compare value seems somewhat less so. Similarly, new and compare values that have pointee types that are less cv-qualified than the destination also seem reasonable.
(4) Several of the problematic cases involve enums, which aren't especially well handled by the original proposal. From Erik's explorations I knew there were some issues with enums. Having now looked at some of these cmpxchg problems, I think we should try harder to deal well with enums. One problem is that C++11 std::is_enum is at least hard (maybe impossible?) to portably emulate in C++98, so some other mechanism would be needed to recognize them. Probably the simplest is a registration mechanism for enum types that are used for atomic-access values (e.g. a type predicate defaulting to false, and specializations for relevant enum types, near their definitions). (5) One of our goals is to eliminate, as much as possible, the need for explicit casts and conversions in uses of this API. We think the existing widespread use of casts makes code difficult to understand and is a source of bugs. I think the small number of call sites affected by requiring the cmpxchg types all be the same is in no small measure a result of that widespread use of casts in calling code. Unfortunately, HotSpot is rife with boundaries across which there are type mismatches (and often inconsistencies even within a single logical chunk of code). We don't think that's a good thing, but it's the context in which we were developing this change. I think the proposed change may be more lax than it should / could be, but I think there are valid reasons to not require the types for cmpxchg arguments to all match.
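The registration mechanism mentioned in (4) could look roughly like the following. This is only a sketch of the general idea, written to be C++98-compatible; every name in it (IsRegisteredAtomicEnumSketch, LockStateSketch, is_registered_enum_sketch) is hypothetical and none of it is taken from the proposed change.

```cpp
// Hypothetical names throughout; a C++98-compatible sketch of a
// "type predicate defaulting to false", not the proposed HotSpot code.

// Primary template: a type is not a registered atomic enum unless it opts in.
template <typename T>
struct IsRegisteredAtomicEnumSketch {
  static const bool value = false;
};

// An enum whose values are meant to be used in atomic accesses...
enum LockStateSketch { UNLOCKED_SKETCH = 0, LOCKED_SKETCH = 1 };

// ...registers itself with a specialization placed near its definition.
template <>
struct IsRegisteredAtomicEnumSketch<LockStateSketch> {
  static const bool value = true;
};

// Small query helper that a generic atomic front end could branch on.
template <typename T>
bool is_registered_enum_sketch(const T&) {
  return IsRegisteredAtomicEnumSketch<T>::value;
}
```

A generic cmpxchg front end could consult such a predicate at compile time to decide whether an enum argument may be converted to its underlying integral representation, without needing std::is_enum.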
From harold.seigel at oracle.com Thu Aug 3 19:45:47 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 3 Aug 2017 15:45:47 -0400 Subject: RFR 8185806: Quarantine test JdbExprTest.sh on 32-bit Windows In-Reply-To: <9d9ed57f-5437-c969-23be-16281a41ce19@oracle.com> References: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> <9d9ed57f-5437-c969-23be-16281a41ce19@oracle.com> Message-ID: Hi Serguei, Dan pointed out that the test failed on both 64-bit and 32-bit Windows platforms so I'll be sending out a new webrev without a "(sun.arch.data.model != "32")" clause in the @requires statement. Thanks, Harold On 8/3/2017 3:15 PM, serguei.spitsyn at oracle.com wrote: > Hi Harold, > > +# @requires (sun.arch.data.model != "32") & (os.family != "windows") > Should the '&' be replaced with '|' ? > > Thanks, > Serguei > > > On 8/3/17 12:03, harold seigel wrote: >> Hi, >> >> Please review this small change to quarantine test >> jdk/test/com/sun/jdi/JdbExprTest.sh on 32-bit Windows systems. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185806/webrev/index.html >> >> JBS Bug Sub-task: https://bugs.openjdk.java.net/browse/JDK-8185806 >> >> Thanks, Harold >> > From david.holmes at oracle.com Thu Aug 3 21:24:37 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Aug 2017 07:24:37 +1000 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> Message-ID: <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> Hi Harold, On 3/08/2017 11:03 PM, harold seigel wrote: > Hi, > > Please review this JDK-10 fix for JDK-8185103. The problem occurred > because classes were being put on the fixup_module_field_list before > their mirror field was set. 
If a (different) thread called method > patch_javabase_entries() before the class's mirror field was set then The code that calls patch_javabase_entries has this: // Only the thread that actually defined the base module will get here, // so no locking is needed. // Patch any previously loaded class's module field with java.base's java.lang.Module. ModuleEntryTable::patch_javabase_entries(module_handle); so it seems that comment is wrong and that locking is indeed needed somewhere! At a minimum your setting of the mirror needs a following storestore barrier, or (better) the set/get of the mirror uses load-acquire/store-release. Thanks, David ----- > this would cause a SIGSEGV because patch_javabase_entries() eventually > calls obj_field_put() which tries to dereference the class's mirror field. > > This change fixes the problem by setting the class's mirror field before > putting the class on the fixup_module_field_list. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK tests, > and with JPRT. > > Additionally, the fix was tested by temporarily adding a > naked_short_sleep(50) to method initialize_mirror_fields() shortly after > it put a class on the fixup_module_field_list. The sleep was added in > order to enhance the likelihood of patch_javabase_entries() being called > before the class's mirror field got set. Without the fix, the > TestThreadDumpMonitorContent.java test and the test reported in > JDK-8183309 reliably > got the reported SIGSEGVs. With the fix, the tests passed. 
> > Thanks, Harold > From coleen.phillimore at oracle.com Thu Aug 3 21:52:18 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Aug 2017 17:52:18 -0400 Subject: RFR 8185806: Quarantine test JdbExprTest.sh on 32-bit Windows In-Reply-To: References: <9b115ae6-1abb-6128-6228-42d4202aba41@oracle.com> <9d9ed57f-5437-c969-23be-16281a41ce19@oracle.com> Message-ID: <83b25597-fcea-f37f-1403-36dbb3bb0bc0@oracle.com> On 8/3/17 3:45 PM, harold seigel wrote: > Hi Serguei, > > Dan pointed out that the test failed on both 64-bit and 32-bit Windows > platforms so I'll be sending out a new webrev without a > "(sun.arch.data.model != "32")" clause in the @requires statement. Even better. Sorry for the cursory review last time. Coleen > > Thanks, Harold > > > On 8/3/2017 3:15 PM, serguei.spitsyn at oracle.com wrote: >> Hi Harold, >> >> +# @requires (sun.arch.data.model != "32") & (os.family != "windows") >> Should the '&' be replaced with '|' ? >> >> Thanks, >> Serguei >> >> >> On 8/3/17 12:03, harold seigel wrote: >>> Hi, >>> >>> Please review this small change to quarantine test >>> jdk/test/com/sun/jdi/JdbExprTest.sh on 32-bit Windows systems. 
>>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8185806/webrev/index.html >>> >>> JBS Bug Sub-task: https://bugs.openjdk.java.net/browse/JDK-8185806 >>> >>> Thanks, Harold >>> >> > From david.holmes at oracle.com Thu Aug 3 23:03:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Aug 2017 09:03:46 +1000 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> Message-ID: Hi Harold, On 4/08/2017 7:24 AM, David Holmes wrote: > Hi Harold, > > On 3/08/2017 11:03 PM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 fix for JDK-8185103. The problem occurred >> because classes were being put on the fixup_module_field_list before >> their mirror field was set. If a (different) thread called method >> patch_javabase_entries() before the class's mirror field was set then > > The code that calls patch_javabase_entries has this: > > // Only the thread that actually defined the base module will get here, > // so no locking is needed. > > // Patch any previously loaded class's module field with java.base's > java.lang.Module. > ModuleEntryTable::patch_javabase_entries(module_handle); > > so it seems that comment is wrong and that locking is indeed needed > somewhere! At a minimum your setting of the mirror needs a following > storestore barrier, or (better) the set/get of the mirror uses > load-acquire/store-release. Sorry - looking in more detail the necessary locking is already in place. A class is only added to the fixup list, under the Module_lock, if the base module is not yet defined. The finalization of that definition also occurs under the Module_lock, which in turn occurs before the fixup list is processed (without the lock). 
So as long as the mirror is set before the class is added to the fixup list, the mirror will be visible to the main thread when it processes it.

Looking at the original code:

  881   // set the module field in the java_lang_Class instance
  882   set_mirror_module_field(k, mirror, module, THREAD);
  883
  884   // Setup indirection from klass->mirror
  885   // after any exceptions can happen during allocations.
  886   k->set_java_mirror(mirror());

it would seem simplest to just reorder the two actions - except for that comment about exceptions. Is the allocation exception issue less of an issue when doing VM initialization? What will happen?

Thanks,
David

> Thanks,
> David
> -----
>
>> this would cause a SIGSEGV because patch_javabase_entries() eventually
>> calls obj_field_put() which tries to dereference the class's mirror
>> field.
>>
>> This change fixes the problem by setting the class's mirror field
>> before putting the class on the fixup_module_field_list.
>>
>> Open Webrev:
>> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103
>>
>> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot,
>> java/io, java/lang, java/util and other tests, the co-located NSK
>> tests, and with JPRT.
>>
>> Additionally, the fix was tested by temporarily adding a
>> naked_short_sleep(50) to method initialize_mirror_fields() shortly
>> after it put a class on the fixup_module_field_list.  The sleep was
>> added in order to enhance the likelihood of patch_javabase_entries()
>> being called before the class's mirror field got set.  Without the
>> fix, the TestThreadDumpMonitorContention.java test and the test reported
>> in JDK-8183309
>> reliably got the reported SIGSEGVs.  With the fix, the tests passed.
>>
>> Thanks, Harold
>>

From david.holmes at oracle.com Thu Aug 3 23:42:44 2017
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 4 Aug 2017 09:42:44 +1000
Subject: RFR (M): 8184334: Generalizing Atomic with templates
In-Reply-To: <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com>
 <596DD713.6080806@oracle.com>
 <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com>
 <596E1142.6040708@oracle.com>
 <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com>
 <596E288B.90706@oracle.com>
 <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com>
 <5970B3FD.607@oracle.com>
 <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com>
 <597226E2.6090803@oracle.com>
 <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com>
 <5979FEB6.10506@oracle.com>
 <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com>
 <597B5FC6.5020702@oracle.com>
 <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com>
 <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
Message-ID:

Hi Kim,

On 4/08/2017 5:27 AM, Kim Barrett wrote:
>> On Aug 1, 2017, at 11:35 PM, David Holmes wrote:
>>
>> Hi Kim,
>>
>> Good planning on Erik's part to go on vacation just as I have returned ;-)
>>
>> On 1/08/2017 4:18 AM, Kim Barrett wrote:
>>>> On Jul 28, 2017, at 12:25 PM, Erik Osterlund wrote:
>>>>
>>>> Hi Andrew,
>>>>
>>>> In that case, feel free to propose a revised solution while I am gone.
>>> Erik has asked me to try to make progress on this while he's on
>>> vacation, rather than possibly letting it sit until he gets back.
>>
>> Okay, while Erik is gone perhaps you can clarify a few things. As Andrew and Roman have expressed, I too find this:
>>
>> + template <typename T, typename U, typename V>
>> + inline static U cmpxchg(T exchange_value, volatile U* dest, V compare_value, cmpxchg_memory_order order);
>>
>> totally unintuitive and unappealing. I do not understand the rationale for this at all. It does not make any sense to me to allow T, U and V to be different types (even if constrained).
>> It has been stated that if we force them to all be the same there is some problem with literals and the need for casts, but I don't understand what that problem would be.
>
> This response is about the types for a cmpxchg template.  I'll respond
> to other comments separately.
>
> It is certainly simpler to implement if the types for all three
> arguments are required to be the same.
>
> I suggested Erik do that, but he reported running into "lots" of
> compilation failures.  I guess Erik was less persistent than Andrew in
> working through them.  Given Andrew's reported small number, I've also
> done that experiment and looked at the failing cases.  I think
> (nearly?) all would be improved by making the argument types match.
> There's also one similar case for xchg.
>
> However, there are use-cases that I think are reasonable which don't
> immediately fit that restriction.
>
> (1) cmpxchg(v, p, NULL), to store a pointer if no pointer is already
> present.  This can be used as an alternative to DCLP.  One way to deal

I thought NULL (aka 0 in a pointer context) was assignable to any pointer type without any casts. ??

> with this might be an overload on std::nullptr_t and use nullptr, but
> that requires C++11.  We don't have any current uses of this that I
> could find, but it's a sufficiently interesting idiom that I'm
> reluctant to forbid it.  But such idiomatic usage could be wrapped up
> in its own little package that can deal with the restriction.
>
> (2) The use of literals can make getting a type match more difficult,
> especially when the pointee type doesn't have portable syntax (like
> intx and uintx).  But using properly typed named values solves this,
> and may be seen as an improvement over magic literal values.

I'd like to see specific examples. I would hope that normal conversions/promotions would handle the majority of cases.

> (3) Passing a derived pointer as the new value when updating an atomic
> pointer seems reasonable to me.
> A derived pointer compare value seems
> somewhat less so.  Similarly, new and compare values that have pointee
> types that are less cv-qualified than the destination also seem
> reasonable.

My C++ is rusty. Is a derived pointer not usable where a base pointer is declared?? Seems fundamental to polymorphism. The cv-qualifiers are a mess - shouldn't they really match? (I know we cast them on and off all over - but that's the mess part). Regardless, this suggests to me a relaxed form might be needed for pointer types.

> (4) Several of the problematic cases involve enums, which aren't
> especially well handled by the original proposal.  From Erik's
> explorations I knew there were some issues with enums.  Having now
> looked at some of these cmpxchg problems, I think we should try harder
> to deal well with enums.  One problem is that C++11 std::is_enum is at
> least hard (maybe impossible?) to portably emulate in C++98, so some
> other mechanism would be needed to recognize them.  Probably the
> simplest is a registration mechanism for enum types that are used for
> atomic-access values (e.g. a type predicate defaulting to false, and
> specializations for relevant enum types, near their definitions.)

Enums are a disaster in C/C++ - maybe better in C++11. But I thought there were nice design patterns for type-safe enums that allow easy conversion to integer types within range?

> (5) One of our goals is to eliminate, as much as possible, the need
> for explicit casts and conversions in uses of this API.  We think the
> existing widespread use of casts makes code difficult to understand
> and is a source of bugs.  I think the small number of call sites
> affected by requiring the cmpxchg types all be the same is in no small
> measure a result of that widespread use of casts in calling code.
> Unfortunately, Hotspot is rife with boundaries across which there are
> type mismatches (and often inconsistencies even within a single
> logical chunk of code).
> We don't think that's a good thing, but it's
> the context in which we were developing this change.

That's too hand-wavy for me. Many of the casts may not even be needed today but who is going to go through and try and figure that out! I don't even see that many casts in Atomic calls and the ones that are there seem to group into common sets in relation to pointers and intptr_t.

> I think the proposed change may be more lax than it should / could
> be, but I think there are valid reasons to not require the types for
> cmpxchg arguments to all match.

Sorry I'm not convinced. Points 1, 3 and 5 may suggest the need for relaxed pointer variants. Also floating-point atomic ops make little sense - hardware doesn't provide discrete atomic ops on FP values. So the only thing you can do is a cas-loop on the memory location. That's why j.u.c.atomic doesn't support them - they'd be grossly inefficient.

Cheers,
David

From jiangli.zhou at Oracle.COM Fri Aug 4 00:15:41 2017
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Thu, 3 Aug 2017 17:15:41 -0700
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com>
 <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com>
 <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com>
 <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com>
Message-ID: <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com>

Here are the updated webrevs.

http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/
http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/

Changes in the updated webrevs include:
- Merge with Ioi's recent shared space auto-sizing change (8072061)
- Addressed all feedback from Ioi and Coleen (Thanks for detailed review!)

Thanks,
Jiangli

> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote:
>
> Hi Ioi,
>
> Thank you so much for reviewing this.
> I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments.
>
>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote:
>>
>> Hi Jiangli,
>>
>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-)
>>
>> stringTable.cpp: StringTable::archive_string
>>
>> add assert for DumpSharedSpaces only
>
> Ok.
>
>>
>> filemap.cpp
>>
>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions,
>> 526                                               int first_region, int num_regions) {
>>
>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions:
>>
>> 537         int len = regions->length();
>> 538         if (len > 1) {
>> 539           start = (char*)regions->at(1).start();
>> 540           size = (char*)regions->at(len - 1).end() - start;
>> 541         }
>> 542       }
>>
>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions.
>>
>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.
>>
>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
>>   if (first == MetaspaceShared::first_string) {
>>     assert(num_regions <= MetaspaceShared::max_strings, "...");
>>   } else {
>>     assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
>>     assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "...");
>>   }
>>   ....
>>
>>
>
> I've reworked the function and simplified the code.
>
>>
>> 756   if (!string_data_mapped) {
>> 757     StringTable::ignore_shared_strings(true);
>> 758     assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
>> 759   }
>> 760
>> 761   if (open_archive_heap_data_mapped) {
>> 762     MetaspaceShared::set_open_archive_heap_region_mapped();
>> 763   } else {
>> 764     assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
>> 765   }
>>
>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?
>
> Fixed.
>
>>
>> FileMapInfo::map_heap_data() --
>>
>> 818     char* addr = (char*)regions[i].start();
>> 819     char* base = os::map_memory(_fd, _full_path, si->_file_offset,
>> 820                                 addr, regions[i].byte_size(), si->_read_only,
>> 821                                 si->_allow_exec);
>>
>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base). If so, we need an assert here.
>
> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If second region fails to map, all memory ranges specified by "regions" array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The "if" check below makes sure base == addr.
>
>     if (base == NULL || base != addr) {
>       // dealloc the regions from java heap
>       dealloc_archive_heap_regions(regions, region_num);
>       if (log_is_enabled(Info, cds)) {
>         log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap.");
>       }
>       return false;
>     }
>
>
>>
>> constantPool.cpp
>>
>> Handle refs_handle;
>> ...
>> refs_handle = Handle(THREAD, (oop)archived);
>>
>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles()
>>
>> I think it's more efficient if you merge these into a single statement
>>
>> Handle refs_handle(THREAD, (oop)archived);
>
> Fixed.
>
>>
>> Is this experimental code? Maybe it should be removed?
>>
>> 664       if (tag_at(index).is_unresolved_klass()) {
>> 665 #if 0
>> 666         CPSlot entry = cp->slot_at(index);
>> 667         Symbol* name = entry.get_symbol();
>> 668         Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
>> 669         if (k != NULL) {
>> 670           klass_at_put(index, k);
>> 671         }
>> 672 #endif
>> 673       } else
>
> Removed.
>
>>
>> cpCache.hpp:
>>
>> u8 _archived_references
>>
>> shouldn't this be declared as a narrowOop to avoid the type casts when it's used?
>
> Ok.
>
>>
>> cpCache.cpp:
>>
>> add assert so that one of these is used only at dump time and the other only at run time?
>>
>> 610 oop ConstantPoolCache::archived_references() {
>> 611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
>> 612 }
>> 613
>> 614 void ConstantPoolCache::set_archived_references(oop o) {
>> 615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
>> 616 }
>
> Ok.
>
> Thanks!
>
> Jiangli
>
>>
>> Thanks!
>> - Ioi
>>
>> On 7/27/17 1:37 PM, Jiangli Zhou wrote:
>>> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>>>
>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement.
>>> The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>>>
>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself.
>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>>>
>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants.
>>>
>>> Types of Pinned G1 Heap Regions
>>>
>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>>>
>>> 00100 0 [ 8] Pinned Mask
>>> 01000 0 [16] Old Mask
>>> 10000 0 [32] Archive Mask
>>> 11100 0 [56] Open Archive:   ArchiveMask | PinnedMask | OldMask
>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>>>
>>>
>>> Pinned Regions
>>>
>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>>>
>>> Archive Regions
>>>
>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>>>
>>> An archive region is also an old region by design.
>>>
>>> Open Archive (GC-RW) Regions
>>>
>>> Open archive region is GC writable.
>>> GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>>>
>>> Adjustable Outgoing Pointers
>>>
>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly.
>>>
>>> Closed Archive (GC-RO) Regions
>>>
>>> The closed archive region is a GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region.
>>> In JDK 9 we support archive Strings with the archive regions.
>>>
>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page.
>>>
>>> Dormant Objects
>>>
>>> Dormant objects are unreachable java objects within the open archive heap region.
>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead.
>>> For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_references' array (A) in the open archive region is not reachable. Then A is a dormant object.
>>>
>>> Object State Transition
>>>
>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object.
>>>
>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching.
>>>
>>> Caching Java Objects at Archive Dump Time
>>>
>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations.
>>> A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
>>>
>>> Caching Constant Pool resolved_references Array
>>>
>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime.
>>>
>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated.
>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles.
>>>
>>> Runtime Java Heap With Cached Java Objects
>>>
>>>
>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
>>>
>>> Preliminary test execution and status:
>>>
>>> JPRT: passed
>>> Tier2-rt: passed
>>> Tier2-gc: passed
>>> Tier2-comp: passed
>>> Tier3-rt: passed
>>> Tier3-gc: passed
>>> Tier3-comp: passed
>>> Tier4-rt: passed
>>> Tier4-gc: passed
>>> Tier4-comp: 6 jobs timed out, all other tests passed
>>> Tier5-rt: one test failed but passed when running locally, all other tests passed
>>> Tier5-gc: passed
>>> Tier5-comp: running
>>> hotspot_gc: two jobs timed out, all other tests passed
>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>> vm.gc: passed
>>> vm.gc in CDS mode: passed
>>> Kitchensink: passed
>>> Kitchensink in CDS mode: passed
>>>
>>> Thanks,
>>> Jiangli
>>
>

From goetz.lindenmaier at sap.com Fri Aug 4 05:58:36 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 4 Aug 2017 05:58:36 +0000
Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
In-Reply-To:
References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com>
 <597BB24D.5080903@oracle.com>
 <2bedb0a4c3524a749ef5d330c345d287@sap.com>
 <597F90A0.4030706@oracle.com>
Message-ID: <2fa1eb81b40f4df5a0ce039d03206de9@sap.com>

Hi Mikhailo,

I put in your version of vmCDS() into this new webrev. I also had to update the list of tests marked in hotspot, as tests were removed and added in between, and resolved it against the aot change:
http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/
http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/

I don't think it's a good idea to swallow the exception silently as you propose. In our test setup, the tests would just be switched off if something breaks, and no one will see that. If they fail though, it's an easy and quick fix.
I would at least switch them on, then one sees the failing tests in case switching them on was the wrong guess. Also, below, the method dump() throws an exception.

Best regards,
  Goetz

> -----Original Message-----
> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com]
> Sent: Tuesday, August 01, 2017 11:49 PM
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
>
> Hi Goetz,
>
> I have reviewed your updated changes, and they overall look good to me.
>
> However, I have some comments + concerns regarding VMProps.vmCDS():
>
>
> 1. Throwing exceptions from within the vmCDS() method.
>
> The VMProps properties are evaluated at the start of each run. If
> the exception is thrown here the whole test run will fail (not just the
> test that uses '@requires vm.cds'). The failure will occur shortly after
> the start of jtreg test run with a message:
> "java.lang.RuntimeException: Can not start VM to test to
> find out it's features. Switching off class data sharing (CDS)."
>
> Your method has 2 throw statements: "new RuntimeException("Can not
> start VM..." and "java.lang.RuntimeException: Can not start VM to test
> to...". I would recommend a more graceful way to fail, e.g. to print the
> message and to return "false" instead. This way the rest of the test run
> will continue, but the tests requiring vm.cds will be skipped with
> qualification of "not selected".
>
> 2. The check for "An error has occurred while processing the shared
> archive file." assumes that archive was not created prior to the
> execution of this evaluation code. However, there are test modes where
> archive is created prior to test run. We use such mode on regular basis.
> In such cases the code will fail.
> I recommend to run "-Xshare:on -version", and check the
> following match that would result in return of "true":
> "Java HotSpot.*sharing"
>
> 3.
> On occasion the mapping of shared archive region to a specified
> address will fail (due to system configuration, space already occupied,
> ASLR, etc.)
>
> Hence I recommend checking for such conditions as well:
>
>     if (output.firstMatch("Unable to map") != null) {
>         System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
>         return "true";
>     }
> I am returning true here because seeing this output means that CDS
> feature is supported, however in this particular instance archive failed
> to map.
>
>
> The rest of the changes looks good to me.
>
> See for my version of VMProps.vmCDS() below. Let me know what you think.
>
>
> Thank you,
>
> Mikhailo
>
>
> ================== my update of VMProps.vmCDS()
>
>     protected String vmCDS() {
>         System.setProperty("test.jdk", System.getProperty("java.home"));
>         ProcessBuilder pb =
>             ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version");
>         OutputAnalyzer output;
>
>         try {
>             output = new OutputAnalyzer(pb.start());
>         } catch (IOException e) {
>             System.err.println( "Can not start VM to test to find out it's features. " +
>                                 "Switching off class data sharing (CDS)."
>                                 + e);
>             return "false";
>         }
>         if (output.firstMatch("Shared spaces are not supported in this VM") != null) {
>             return "false";
>         }
>         if (output.firstMatch("An error has occurred while processing the shared archive file.") != null) {
>             return "true";
>         }
>         if (output.firstMatch("Java HotSpot.*sharing") != null) {
>             return "true";
>         }
>         if (output.firstMatch("Unable to map") != null) {
>             System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
>             return "true";
>         }
>
>         return "false";
>     }
> ==================
>
>
>
> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I made new webrevs implementing the change with @requires:
> > http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/
> > http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02-hs/
> >
> > I also changed the bug description and synopsis.
> >
> > For the jtreg runner I would propose to set the property test.jdk
> > so that it is available in VMProps. Igor also ran into this issue.
> >
> > Best regards,
> > Goetz.
> >
> >
> >> -----Original Message-----
> >> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com]
> >> Sent: Montag, 31. Juli 2017 22:19
> >> To: Lindenmaier, Goetz
> >> Cc: hotspot-runtime-dev at openjdk.java.net
> >> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
> >>
> >> Hi Goetz,
> >>
> >> I have an idea on how to address your second use case.
> >> The idea is to define a special test property (e.g.
> >> test.cds.disable.cds.support) which will override logic inside the
> >> VMProps.vmCDSSupported(). If this property is defined to "true" in test
> >> invocation command then vmCDSSupported() returns false (CDS is disabled,
> >> not supported), and all tests marked with "@requires vm.cds.supported"
> >> will be skipped.
> >> > >> How to use it: > >> jtreg -Dtest.cds.disable.cds.support=true > >> E.g.: jtreg -Dtest.cds.disable.cds.support=true > >> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java > >> > >> I prototyped this approach, it works for me. I have attached the diff. > >> Let me know whether this works for your use case, or if you have any > >> questions. > >> > >> > >> Thank you, > >> Mikhailo > >> > >> > >> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: > >>> Hi Mikhailo, > >>> > >>> Basically I'm fine with using the @requires property. > >>> But is there a way to overrule the outcome of the method > >>> implemented In VMProps.java computing the property? > >>> I have two use cases for the key I want to introduce. > >>> > >>> First, our internal VM (we are Oracle licensees) is compiled without > >>> CDS support. Thus we don't want to run the CDS tests. Currently > >>> we have them all listed in the ProblemList, but that's not nice, especially > >>> because we have to adapt it whenever a new test is added. > >>> As I understand, the @requires property works fine, here. > >>> > >>> Second, we also test the two ports we contributed (ppc and s390). These > >> contain > >>> rudimentary cds support and so far passed all tests. Unfortunately it > broke > >>> lately in jdk10. Instead of fixing it (our people are working on finishing > our > >>> internal Java 9 port) I would like to switch off all cds tests. > >>> As I can set the key on the command line of jtreg, I easily can do that. > >>> Is there a way to do similar with the @requires property? > >>> > >>> Best regards, > >>> Goetz. > >>> > >>> > >>> > >>>> -----Original Message----- > >>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >>>> Sent: Freitag, 28. 
Juli 2017 23:53 > >>>> To: Lindenmaier, Goetz > >>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds > >> tests > >>>> Hi Goetz, > >>>> > >>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your > proposed > >>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the > following > >>>> feedback on this change. > >>>> > >>>> 1. As part of streamlining and simplifying SQE process and the use of > >>>> test tools we have narrowed down the test selection mechanisms. > >>>> > >>>> 2. Our preferred test selection mechanism is use of "@requires" and a > >>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though > JTREG > >>>> supports use of "@key", we prefer the use of "@requires" as a first > >> choice. > >>>> 3. If it is not possible to use "@requires" for a given situation then > >>>> use "@key" mechanism. We would ask you if you could explore the > >>>> possibility of implementing this change via @requires first. > >>>> > >>>> > >>>> Here are several hints that may help: > >>>> > >>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The value > >>>> of a given "requires property" is evaluated inside this file and placed > >>>> into a map (see public call() method). Add your evaluation code here, > >>>> and then follow the pattern used for other properties. Create a > property > >>>> (e.g. vm.cds.supported, with values of true/false). Create a method > that > >>>> evaluates the property value (e.g. isCDSSupported() or similar). > >>>> > >>>> 2. The method could use several options to evaluate whether CDS is > >>>> supported. > >>>> A. WhiteBox API. Create a new WB test API method which can > return > >>>> true if CDS_ compiler flag is defined, otherwise false. > >>>> Call WB API from VMProps.java. See > >>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. 
Or create your > >> own > >>>> WB.isCDSSupported() > >>>> WhiteBox.java resides in test/lib/sun/hotspot/WhiteBox.java > >>>> > >>>> B. Another option is to evaluate by running VM with sharing on and > >>>> checking the return (may not be as reliable as option A) > >>>> C. Other ideas welcome. > >>>> > >>>> 3. Add "@requires vm.cds.supported == true" to the appropriate > tests. > >>>> > >>>> Let me know if you have any questions. > >>>> > >>>> > >>>> Best regards, > >>>> Mikhailo > >>>> > >>>> > >>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: > >>>>> Hi > >>>>> > >>>>> we compile the VM without CDS support. Thus the CDS tests > >>>>> fail. This change introduces a keyword 'cds' and marks > >>>>> the tests accordingly. > >>>>> This change also fixes the keywords specified in > >>>> gc/g1/TestSharedArchiveWithPreTouch.java. > >>>>> There may only be one @key keyword in the test specification. > >>>>> In runtime/CompressedOops/CompressedClassPointers.java only one > >> test > >>>>> case required CDS. I changed this sub case to succeed if CDS is not > >>>>> available. > >>>>> > >>>>> Please review this change. I need a sponsor, please. > >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.01/ > >>>>> > >>>>> Best regards, > >>>>> Goetz. 
From ioi.lam at oracle.com Fri Aug 4 06:39:32 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 3 Aug 2017 23:39:32 -0700
Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
In-Reply-To: <2fa1eb81b40f4df5a0ce039d03206de9@sap.com>
References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com>
Message-ID: <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com>

Hi Goetz,

Instead of testing -Xshare:on, I think you should test with
-Xshare:auto, which sets the flags

    UseSharedSpaces = true
    RequireSharedSpaces = false

and will reliably print "Shared spaces are not supported in this VM"
if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp):

    #if !INCLUDE_CDS
      if (DumpSharedSpaces || RequireSharedSpaces) {
        jio_fprintf(defaultStream::error_stream(),
          "Shared spaces are not supported in this VM\n");
        return JNI_ERR;
      }
      if ((UseSharedSpaces && FLAG_IS_CMDLINE(UseSharedSpaces)) ||
          log_is_enabled(Info, cds)) {
        warning("Shared spaces are not supported in this VM");
        FLAG_SET_DEFAULT(UseSharedSpaces, false);
        LogConfiguration::configure_stdout(LogLevel::Off, true, LOG_TAGS(cds));
      }
      no_shared_spaces("CDS Disabled");
    #endif // INCLUDE_CDS

That way, you don't need to test any other output message or exit
conditions (such as mapping error).
E.g.:

ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -Xshare:auto -version
java version "10-internal"
Java(TM) SE Runtime Environment (build 10-internal+0-2017-08-04-0614567.iklam.iter)
Java HotSpot(TM) 64-Bit Server VM (build 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode)

ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -XXaltjvm=minimal -Xshare:auto -version
Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not supported in this VM
java version "10-internal"
Java(TM) SE Runtime Environment (build 10-internal+0-2017-08-04-0614567.iklam.iter)
Java HotSpot(TM) 64-Bit Minimal VM (build 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode)

Thanks
- Ioi

On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote:
> Hi Mikhailo,
>
> I put in your version of vmCDS() into this new webrev.
> I also had to update the list of tests marked in hotspot,
> as tests were removed and added in between, and resolved
> it against the aot change:
> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/
> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/
>
> I don't think it's a good idea to swallow the exception silently
> as you propose.
> In our test setup, the tests would just be switched off if something
> breaks, and no one will see that. If they fail though, it's an easy
> and quick fix. I would at least switch them on, then one sees the
> failing tests in case switching them on was the wrong guess.
> Also, below, the method dump() throws an exception.
>
> Best regards,
> Goetz
>
>> -----Original Message-----
>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com]
>> Sent: Tuesday, August 01, 2017 11:49 PM
>> To: Lindenmaier, Goetz
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
>>
>> Hi Goetz,
>>
>> I have reviewed your updated changes, and they overall look good to me.
>>
>> However, I have some comments + concerns regarding VMProps.vmCDS():
>>
>>
>> 1. Throwing exceptions from within the vmCDS() method.
>>
>> The VMProps properties are evaluated at the start of each run. If
>> the exception is thrown here the whole test run will fail (not just the
>> test that uses '@requires vm.cds'). The failure will occur shortly after
>> the start of jtreg test run with a message:
>> "java.lang.RuntimeException: Can not start VM to test to
>> find out it's features. Switching off class data sharing (CDS)."
>>
>> Your method has 2 throw statements: "new RuntimeException("Can not
>> start VM..." and "java.lang.RuntimeException: Can not start VM to test
>> to...". I would recommend a more graceful way to fail, e.g. to print the
>> message and to return "false" instead. This way the rest of the test run
>> will continue, but the tests requiring vm.cds will be skipped with
>> qualification of "not selected".
>>
>> 2. The check for "An error has occurred while processing the shared
>> archive file." assumes that archive was not created prior to the
>> execution of this evaluation code. However, there are test modes where
>> archive is created prior to test run. We use such mode on regular basis.
>> In such cases the code will fail.
>> I recommend to run "-Xshare:on -version", and check the
>> following match that would result in return of "true":
>> "Java HotSpot.*sharing"
>>
>> 3. On occasion the mapping of shared archive region to a specified
>> address will fail (due to system configuration, space already occupied,
>> ASLR, etc.)
>>
>> Hence I recommend checking for such conditions as well:
>>
>>     if (output.firstMatch("Unable to map") != null) {
>>         System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
>>         return "true";
>>     }
>> I am returning true here because seeing this output means that CDS
>> feature is supported, however in this particular instance archive failed
>> to map.
>>
>>
>> The rest of the changes look good to me.
>>
>> See my version of VMProps.vmCDS() below. Let me know what you think.
>>
>>
>> Thank you,
>>
>> Mikhailo
>>
>>
>> ================== my update of VMProps.vmCDS()
>>
>> protected String vmCDS() {
>>     System.setProperty("test.jdk", System.getProperty("java.home"));
>>     ProcessBuilder pb =
>>         ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version");
>>     OutputAnalyzer output;
>>
>>     try {
>>         output = new OutputAnalyzer(pb.start());
>>     } catch (IOException e) {
>>         System.err.println("Can not start VM to test to find out it's features. " +
>>                            "Switching off class data sharing (CDS)." + e);
>>         return "false";
>>     }
>>     if (output.firstMatch("Shared spaces are not supported in this VM") != null) {
>>         return "false";
>>     }
>>     if (output.firstMatch("An error has occurred while processing the shared archive file.") != null) {
>>         return "true";
>>     }
>>     if (output.firstMatch("Java HotSpot.*sharing") != null) {
>>         return "true";
>>     }
>>     if (output.firstMatch("Unable to map") != null) {
>>         System.out.println("VMProps.vmCDS() encountered an archive mapping failure, still proceeding with vm.cds=true");
>>         return "true";
>>     }
>>
>>     return "false";
>> }
>> ==================
>>
>>
>>
>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> I made new webrevs implementing the change with @requires:
>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/
>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02-hs/
>>>
>>> I also changed the bug description and synopsis.
>>>
>>> For the jtreg runner I would propose to set the property test.jdk
>>> so that it is available in VMProps. Igor also ran into this issue.
>>>
>>> Best regards,
>>> Goetz.
>>>
>>>
>>>> -----Original Message-----
>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com]
>>>> Sent: Montag, 31.
Juli 2017 22:19 >>>> To: Lindenmaier, Goetz >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds >> tests >>>> Hi Goetz, >>>> >>>> I have an idea on how to address your second use case. >>>> The idea is to define a special test property (e.g. >>>> test.cds.disable.cds.support) which will override logic inside the >>>> VMProps.vmCDSSupported(). If this property is defined to "true" in test >>>> invocation command then vmCDSSupported() returns false (CDS is >> disabled, >>>> not supported), and all tests marked with "@requires vm.cds.supported" >>>> will be skipped. >>>> >>>> How to use it: >>>> jtreg -Dtest.cds.disable.cds.support=true >>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>> >>>> I prototyped this approach, it works for me. I have attached the diff. >>>> Let me know whether this works for your use case, or if you have any >>>> questions. >>>> >>>> >>>> Thank you, >>>> Mikhailo >>>> >>>> >>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>> Hi Mikhailo, >>>>> >>>>> Basically I'm fine with using the @requires property. >>>>> But is there a way to overrule the outcome of the method >>>>> implemented In VMProps.java computing the property? >>>>> I have two use cases for the key I want to introduce. >>>>> >>>>> First, our internal VM (we are Oracle licensees) is compiled without >>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>> we have them all listed in the ProblemList, but that's not nice, especially >>>>> because we have to adapt it whenever a new test is added. >>>>> As I understand, the @requires property works fine, here. >>>>> >>>>> Second, we also test the two ports we contributed (ppc and s390). These >>>> contain >>>>> rudimentary cds support and so far passed all tests. Unfortunately it >> broke >>>>> lately in jdk10. 
Instead of fixing it (our people are working on finishing >> our >>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>> As I can set the key on the command line of jtreg, I easily can do that. >>>>> Is there a way to do similar with the @requires property? >>>>> >>>>> Best regards, >>>>> Goetz. >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>> To: Lindenmaier, Goetz >>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds >>>> tests >>>>>> Hi Goetz, >>>>>> >>>>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your >> proposed >>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >> following >>>>>> feedback on this change. >>>>>> >>>>>> 1. As part of streamlining and simplifying SQE process and the use of >>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>> >>>>>> 2. Our preferred test selection mechanism is use of "@requires" and a >>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >> JTREG >>>>>> supports use of "@key", we prefer the use of "@requires" as a first >>>> choice. >>>>>> 3. If it is not possible to use "@requires" for a given situation then >>>>>> use "@key" mechanism. We would ask you if you could explore the >>>>>> possibility of implementing this change via @requires first. >>>>>> >>>>>> >>>>>> Here are several hints that may help: >>>>>> >>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The value >>>>>> of a given "requires property" is evaluated inside this file and placed >>>>>> into a map (see public call() method). Add your evaluation code here, >>>>>> and then follow the pattern used for other properties. Create a >> property >>>>>> (e.g. vm.cds.supported, with values of true/false). Create a method >> that >>>>>> evaluates the property value (e.g. 
isCDSSupported() or similar). >>>>>> >>>>>> 2. The method could use several options to evaluate whether CDS is >>>>>> supported. >>>>>> A. WhiteBox API. Create a new WB test API method which can >> return >>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>> Call WB API from VMProps.java. See >>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create your >>>> own >>>>>> WB.isCDSSupported() >>>>>> WhiteBox.java resides in test/lib/sun/hotspot/WhiteBox.java >>>>>> >>>>>> B. Another option is to evaluate by running VM with sharing on and >>>>>> checking the return (may not be as reliable as option A) >>>>>> C. Other ideas welcome. >>>>>> >>>>>> 3. Add "@requires vm.cds.supported == true" to the appropriate >> tests. >>>>>> Let me know if you have any questions. >>>>>> >>>>>> >>>>>> Best regards, >>>>>> Mikhailo >>>>>> >>>>>> >>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>> Hi >>>>>>> >>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>> the tests accordingly. >>>>>>> This change also fixes the keywords specified in >>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>> There may only be one @key keyword in the test specification. >>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only one >>>> test >>>>>>> case required CDS. I changed this sub case to succeed if CDS is not >>>>>>> available. >>>>>>> >>>>>>> Please review this change. I need a sponsor, please. >>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.01/ >>>>>>> >>>>>>> Best regards, >>>>>>> Goetz. 
From aph at redhat.com Fri Aug 4 08:42:02 2017
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Aug 2017 09:42:02 +0100
Subject: RFR (M): 8184334: Generalizing Atomic with templates
In-Reply-To: <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
Message-ID: <4cd2ba0f-394e-3e10-2800-c80a234a582f@redhat.com>

Hi,

On 03/08/17 20:27, Kim Barrett wrote:
>
> This response is about the types for a cmpxchg template. I'll respond
> to other comments separately.
>
> It is certainly simpler to implement if the types for all three
> arguments are required to be the same.
>
> I suggested Erik do that, but he reported running into "lots" of
> compilation failures. I guess Erik was less persistent than Andrew
> in working through them. Given Andrew's reported small number, I've
> also done that experiment and looked at the failing cases. I think
> (nearly?) all would be improved by making the argument types match.
> There's also one similar case for xchg.
>
> However, there are use-cases that I think are reasonable which don't
> immediately fit that restriction.
>
> (1) cmpxchg(v, p, NULL), to store a pointer if no pointer is already
> present. This can be used as an alternative to DCLP. One way to
> deal with this might be an overload on std::nullptr_t and use
> nullptr, but that requires C++11.
We don't have any current uses of
> this that I could find, but it's a sufficiently interesting idiom
> that I'm reluctant to forbid it. But such idiomatic usage could be
> wrapped up in its own little package that can deal with the
> restriction.
>
> (2) The use of literals can make getting a type match more
> difficult, especially when the pointee type doesn't have portable
> syntax (like intx and uintx). But using properly typed named values
> solves this, and may be seen as an improvement over magic literal
> values.
>
> (3) Passing a derived pointer as the new value when updating an
> atomic pointer seems reasonable to me. A derived pointer compare
> value seems somewhat less so. Similarly, new and compare values
> that have pointee types that are less cv-qualified than the
> destination also seem reasonable.

I wonder if our disagreement here is perhaps a philosophical one. C++
overload resolution rules can be obscure and sometimes surprising,
especially when using subtypes. A cast or an assignment to a local
variable of an exactly matching type serves as a notice to the reader
of what is intended. In my view that is beneficial: nothing is hidden.

Because lock-free code is inherently tricky to write, any use of
Atomic is going to require careful and sensitive analysis. Casts of
argument types will slow down the reader slightly, but most of the
difficulty understanding such code is essential complexity rather than
accidental due to the rules of the language.

> (4) Several of the problematic cases involve enums, which aren't
> especially well handled by the original proposal. From Erik's
> explorations I knew there were some issues with enums. Having now
> looked at some of these cmpxchg problems, I think we should try
> harder to deal well with enums. One problem is that C++11
> std::is_enum is at least hard (maybe impossible?) to portably
> emulate in C++98, so some other mechanism would be needed to
> recognize them.
Probably the simplest is a registration mechanism
> for enum types that are used for atomic-access values (e.g. a type
> predicate defaulting to false, and specializations for relevant enum
> types, near their definitions.)

I don't understand why this is a problem. Conversions between enums
and integer types are well-defined.

> (5) One of our goals is to eliminate, as much as possible, the need
> for explicit casts and conversions in uses of this API. We think the
> existing widespread use of casts makes code difficult to understand
> and is a source of bugs. I think the small number of call sites
> affected by requiring the cmpxchg types all be the same is in no small
> measure a result of that widespread use of casts in calling code.
> Unfortunately, Hotspot is rife with boundaries across which there are
> type mismatches (and often inconsistencies even within a single
> logical chunk of code). We don't think that's a good thing, but it's
> the context in which we were developing this change.

I believe that eliminating type mismatches is a worthwhile goal, but
this proposal doesn't quite do that. Instead, it hides type mismatches
by coercing arguments.

Despite that we're agreed that we will need -fno-strict-aliasing for
some time to come, I still believe that gratuitous undefined behaviour
should be eliminated when it is practical to do so. (And new UB
shouldn't be added now!) Visible casts between compatible integer
types may be ugly, but to my mind they are far less of a sin than
unnecessary undefined behaviour.

> I think the proposed change may be more lax than it should / could
> be, but I think there are valid reasons to not require the types for
> cmpxchg arguments to all match.

There are, but IMO this is a balance of concerns. On the one hand, a
strict requirement for some users of atomic cmpxchg. On the other
hand, hundreds (thousands?) of lines of hairy template code.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Fri Aug 4 08:54:02 2017
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Aug 2017 09:54:02 +0100
Subject: RFR (M): 8184334: Generalizing Atomic with templates
In-Reply-To:
References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
Message-ID: <73fae08f-f5ea-7e97-3419-d1d0ae63c101@redhat.com>

Hi,

On 04/08/17 00:42, David Holmes wrote:
> On 4/08/2017 5:27 AM, Kim Barrett wrote:
>>
>> However, there are use-cases that I think are reasonable which don't
>> immediately fit that restriction.
>>
>> (1) cmpxchg(v, p, NULL), to store a pointer if no pointer is already
>> present. This can be used as an alternative to DCLP. One way to deal
>
> I thought NULL (aka 0 in a pointer context) was assignable to any
> pointer type without any casts. ??

It is, but you have to distinguish between default conversions and
overload resolution. If you have two methods

    int a(foo *p);
    int a(bar *p);

and you have a call

    a(NULL);

the only way to resolve the overload is to do this:

    a((foo*)NULL);

or this:

    foo *tmp = NULL;
    a(tmp);

C++ is pretty strict about this rule, because it's safer in practice
to insist that programmers say exactly what they mean than apply
conversions that might be surprising. In several places C++ is fussier
than C. For example, you can say this in C but not C++:

    foo *p = malloc(n);

This is a deliberate design decision.
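
A minimal, self-contained sketch of the overload rule described in the message above. The struct bodies and the helper names (call_with_cast, call_with_typed_local, allocate_foo) are hypothetical, introduced only for illustration; they are not part of HotSpot or of the proposed Atomic API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Placeholder types standing in for the unrelated pointee types
// "foo" and "bar" used in the message.
struct foo { int x; };
struct bar { int y; };

// Two overloads that differ only in the pointer type they accept.
int a(foo*) { return 1; }
int a(bar*) { return 2; }

// a(NULL);  // would not compile: NULL converts equally well to foo*
//           // and bar*, so the call is ambiguous.

// Resolution 1: an explicit cast names the intended overload.
int call_with_cast() {
  return a(static_cast<foo*>(NULL));
}

// Resolution 2: a local variable of the exact type does the same job.
int call_with_typed_local() {
  bar* tmp = NULL;
  return a(tmp);
}

// C accepts "foo *p = malloc(n);" because void* converts implicitly
// to any object pointer type. C++ requires the cast to be written out.
foo* allocate_foo() {
  foo* p = static_cast<foo*>(std::malloc(sizeof(foo)));
  assert(p != NULL);
  p->x = 0;
  return p;
}
```

Both resolutions make the chosen overload visible at the call site, which is the property argued for in this thread; the caller of allocate_foo() is expected to release the result with std::free().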
-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From coleen.phillimore at oracle.com Fri Aug 4 17:47:38 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 4 Aug 2017 13:47:38 -0400
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com>
Message-ID: <1cb21dcb-eb60-aefa-a35f-73c43fdaae72@oracle.com>

Hi, I think this change looks good except in metaspaceShared.cpp:

1632     NoSafepointVerifier nsv;

Belongs above when you create the archive cache.

1600     MetaspaceShared::create_archive_object_cache();

The call:

1646     int hash = obj->identity_hash();

Can safepoint though. Not sure why NSV didn't have an assert. But
maybe it won't if you've eagerly installed the hashcode in these
objects, which I think you have.

Otherwise you could use NoGCVerifier. Or GCLocker, which the GC group
won't like.

Thanks,
Coleen

On 8/3/17 8:15 PM, Jiangli Zhou wrote:
> Here are the updated webrevs.
>
> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/
> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/
>
> Changes in the updated webrevs include:
> Merge with Ioi's recent shared space auto-sizing change (8072061)
> Addressed all feedback from Ioi and Coleen (Thanks for detailed review!)
>
> Thanks,
> Jiangli
>
>
>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote:
>>
>> Hi Ioi,
>>
>> Thank you so much for reviewing this. I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments.
>>
>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote:
>>>
>>> Hi Jiangli,
>>>
>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-)
>>>
>>> stringTable.cpp: StringTable::archive_string
>>>
>>> add assert for DumpSharedSpaces only
>> Ok.
>>
>>> filemap.cpp
>>>
>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions,
>>> 526                                              int first_region, int num_regions) {
>>>
>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions:
>>>
>>> 537       int len = regions->length();
>>> 538       if (len > 1) {
>>> 539         start = (char*)regions->at(1).start();
>>> 540         size = (char*)regions->at(len - 1).end() - start;
>>> 541       }
>>> 542     }
>>>
>>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions.
>>>
>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.
>>>
>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
>>>   if (first == MetaspaceShared::first_string) {
>>>     assert(num_regions <= MetaspaceShared::max_strings, "...");
>>>   } else {
>>>     assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
>>>     assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "...");
>>>   }
>>>   ....
>>>
>>>
>> I've reworked the function and simplified the code.
>>
>>> 756   if (!string_data_mapped) {
>>> 757     StringTable::ignore_shared_strings(true);
>>> 758     assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
>>> 759   }
>>> 760
>>> 761   if (open_archive_heap_data_mapped) {
>>> 762     MetaspaceShared::set_open_archive_heap_region_mapped();
>>> 763   } else {
>>> 764     assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
>>> 765   }
>>>
>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?
>> Fixed.
>>
>>> FileMapInfo::map_heap_data() --
>>>
>>> 818     char* addr = (char*)regions[i].start();
>>> 819     char* base = os::map_memory(_fd, _full_path, si->_file_offset,
>>> 820                                 addr, regions[i].byte_size(), si->_read_only,
>>> 821                                 si->_allow_exec);
>>>
>>> What happens when the first region maps successfully but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base)? If so, we need an assert here.
>> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr.
>>
>>     if (base == NULL || base != addr) {
>>       // dealloc the regions from java heap
>>       dealloc_archive_heap_regions(regions, region_num);
>>       if (log_is_enabled(Info, cds)) {
>>         log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap.");
>>       }
>>       return false;
>>     }
>>
>>
>>> constantPool.cpp
>>>
>>> Handle refs_handle;
>>> ...
>>> refs_handle = Handle(THREAD, (oop)archived);
>>>
>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles()
>>>
>>> I think it's more efficient if you merge these into a single statement
>>>
>>> Handle refs_handle(THREAD, (oop)archived);
>> Fixed.
>>
>>> Is this experimental code? Maybe it should be removed?
>>>
>>> 664     if (tag_at(index).is_unresolved_klass()) {
>>> 665 #if 0
>>> 666       CPSlot entry = cp->slot_at(index);
>>> 667       Symbol* name = entry.get_symbol();
>>> 668       Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
>>> 669       if (k != NULL) {
>>> 670         klass_at_put(index, k);
>>> 671       }
>>> 672 #endif
>>> 673     } else
>> Removed.
>>
>>> cpCache.hpp:
>>>
>>> u8 _archived_references
>>>
>>> shouldn't this be declared as a narrowOop to avoid the type casts when it's used?
>> Ok.
>>
>>> cpCache.cpp:
>>>
>>> add assert so that one of these is used only at dump time and the other only at run time?
>>>
>>> 610 oop ConstantPoolCache::archived_references() {
>>> 611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
>>> 612 }
>>> 613
>>> 614 void ConstantPoolCache::set_archived_references(oop o) {
>>> 615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
>>> 616 }
>> Ok.
>>
>> Thanks!
>>
>> Jiangli
>>
>>> Thanks!
>>> - Ioi
>>>
>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote:
>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>>>>
>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement.
The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>>>>
>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself.
>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>>>>
>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants.
>>>>
>>>> Types of Pinned G1 Heap Regions
>>>>
>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>>>>
>>>> 00100 0 [ 8] Pinned Mask
>>>> 01000 0 [16] Old Mask
>>>> 10000 0 [32] Archive Mask
>>>> 11100 0 [56] Open Archive:   ArchiveMask | PinnedMask | OldMask
>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>>>>
>>>>
>>>> Pinned Regions
>>>>
>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>>>>
>>>> Archive Regions
>>>>
>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>>>>
>>>> An archive region is also an old region by design.
>>>>
>>>> Open Archive (GC-RW) Regions
>>>>
>>>> Open archive region is GC writable.
GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>>>>
>>>> Adjustable Outgoing Pointers
>>>>
>>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly.
>>>>
>>>> Closed Archive (GC-RO) Regions
>>>>
>>>> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region.
>>>> In JDK 9 we support archive Strings with the archive regions.
>>>>
>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page.
>>>>
>>>> Dormant Objects
>>>>
>>>> Dormant objects are unreachable java objects within the open archive heap region.
>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead.
For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time of GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_references' array (A) in the open archive region is not reachable. Then A is a dormant object. >>>> >>>> Object State Transition >>>> >>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >>>> >>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >>>> >>>> Caching Java Objects at Archive Dump Time >>>> >>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc.) are set to the new heap locations.
A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. >>>> >>>> Caching Constant Pool resolved_references Array >>>> >>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >>>> >>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. 
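
[Editor's sketch, not part of the original mail.] The dump-time bookkeeping described above can be modelled in a few lines of Java. This is only a toy under stated assumptions: the class ArchiveDumpSketch, its methods, and the use of a list index as an "archive location" are hypothetical stand-ins and are not taken from the webrev. It shows the two properties the text names: an object reached from several roots is archived once and always gets the same location, and string resolution only looks up the string table without inserting into it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the dump-time copy step; all names are hypothetical, not HotSpot code.
class ArchiveDumpSketch {
    // Original object (compared by identity) -> index of its copy in the archive region.
    private final Map<Object, Integer> archived = new IdentityHashMap<>();
    // Stand-in for an archive heap region: archived objects in copy order.
    private final List<Object> region = new ArrayList<>();

    // Archives obj at most once and always returns the same location for it.
    int archive(Object obj) {
        Integer known = archived.get(obj);
        if (known != null) {
            return known;
        }
        int location = region.size();
        region.add(obj);
        archived.put(obj, location);
        return location;
    }

    // Lookup-only resolution: finds an existing interned String, never inserts one.
    // Returns -1 when the constant pool entry has to stay unresolved.
    int resolveStringEntry(Map<String, String> stringTable, String literal) {
        String interned = stringTable.get(literal);
        if (interned == null) {
            return -1;
        }
        return archive(interned);
    }

    // Number of distinct objects copied into the model region so far.
    int archivedCount() {
        return region.size();
    }

    public static void main(String[] args) {
        Map<String, String> stringTable = new HashMap<>();
        stringTable.put("hello", "hello");
        ArchiveDumpSketch dump = new ArchiveDumpSketch();
        int viaTable = dump.archive(stringTable.get("hello"));
        int viaPool = dump.resolveStringEntry(stringTable, "hello");
        System.out.println("same location: " + (viaTable == viaPool));
        System.out.println("unused literal: " + dump.resolveStringEntry(stringTable, "unused"));
    }
}
```

The identity-based map is a deliberate choice in this model: two distinct objects that happen to be equal must still be archived separately, which a plain equals-based map would not guarantee.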
>>>>
>>>> Runtime Java Heap With Cached Java Objects
>>>>
>>>>
>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
>>>>
>>>> Preliminary test execution and status:
>>>>
>>>> JPRT: passed
>>>> Tier2-rt: passed
>>>> Tier2-gc: passed
>>>> Tier2-comp: passed
>>>> Tier3-rt: passed
>>>> Tier3-gc: passed
>>>> Tier3-comp: passed
>>>> Tier4-rt: passed
>>>> Tier4-gc: passed
>>>> Tier4-comp: 6 jobs timed out, all other tests passed
>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed
>>>> Tier5-gc: passed
>>>> Tier5-comp: running
>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>>> vm.gc: passed
>>>> vm.gc in CDS mode: passed
>>>> Kitchensink: passed
>>>> Kitchensink in CDS mode: passed
>>>>
>>>> Thanks,
>>>> Jiangli

From mikhailo.seledtsov at oracle.com Fri Aug 4 19:35:12 2017
From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov)
Date: Fri, 04 Aug 2017 12:35:12 -0700
Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
In-Reply-To: <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com>
References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com>
Message-ID: <5984CC70.80209@oracle.com>

Hi,

I have an alternative solution that is IMO rather simple and reliable, and that will solve some of the issues we discussed (e.g. no need to throw exceptions, no need to handle failure to map an archive).
The proposed solution uses the WhiteBox test API to determine whether the VM is compiled with INCLUDE_CDS on or off.

I implemented and tested it today, and it works for me. The patch is attached.
Please let me know what you think.
Thank you, Mikhailo On 8/3/17, 11:39 PM, Ioi Lam wrote: > Hi Goetz, > > Instead of testing -Xshare:on, I think you should test with > -Xshare:auto, which sets the flags > > UseSharedSpaces = true > RequireSharedSpaces = false > > and will reliably print "Shared spaces are not supported in this VM" > if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): > > > #if !INCLUDE_CDS > if (DumpSharedSpaces || RequireSharedSpaces) { > jio_fprintf(defaultStream::error_stream(), > "Shared spaces are not supported in this VM\n"); > return JNI_ERR; > } > if ((UseSharedSpaces && FLAG_IS_CMDLINE(UseSharedSpaces)) || > log_is_enabled(Info, cds)) { > warning("Shared spaces are not supported in this VM"); > FLAG_SET_DEFAULT(UseSharedSpaces, false); > LogConfiguration::configure_stdout(LogLevel::Off, true, > LOG_TAGS(cds)); > } > no_shared_spaces("CDS Disabled"); > #endif // INCLUDE_CDS > > > That way, you don't need to test any other output message or exit > conditions(such as mapping error). > > > E.g.: > > ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -Xshare:auto > -version > java version "10-internal" > Java(TM) SE Runtime Environment (build > 10-internal+0-2017-08-04-0614567.iklam.iter) > Java HotSpot(TM) 64-Bit Server VM (build > 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > > > ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java > -XXaltjvm=minimal -Xshare:auto -version > Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not > supported in this VM > java version "10-internal" > Java(TM) SE Runtime Environment (build > 10-internal+0-2017-08-04-0614567.iklam.iter) > Java HotSpot(TM) 64-Bit Minimal VM (build > 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > > > > Thanks > - Ioi > > On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >> Hi Mikhailo, >> >> I put in your version of vmCDS() into this new webrev. 
>> I also had to update the list of tests marked in hotspot, >> as tests were removed and added in between, and resolved >> it against the aot change: >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >> >> I don't think it's a good idea to swallow the exception silently >> as you propose. >> In our test setup, the tests would just be switched off if something >> breaks, and no one will see that. If they fail though, it's an easy >> and quick fix. I would at least switch them on, then one sees the >> failing tests in case switching them on was the wrong guess. >> Also, below, the method dump() throws an exception. >> >> Best regards, >> Goetz >> >>> -----Original Message----- >>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>> Sent: Tuesday, August 01, 2017 11:49 PM >>> To: Lindenmaier, Goetz >>> Cc: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>> cds tests >>> >>> Hi Goetz, >>> >>> I have reviewed your updated changes, and they overall look good to me. >>> >>> However, I have some comments + concerns regarding VMProps.vmCDS(): >>> >>> >>> 1. Throwing exceptions from within the vmCDS() method. >>> >>> The VMProps properties are evaluated at the start of each >>> run. If >>> the exception is thrown here the whole test run will fail (not just the >>> test that uses '@requires vm.cds'). The failure will occur shortly >>> after >>> the start of jtreg test run with a message: >>> "java.lang.RuntimeException: Can not start VM to test to >>> find out it's features. Switching off class data sharing (CDS)." >>> >>> Your method has 2 throw statements: "new RuntimeException("Can >>> not >>> start VM..." and "java.lang.RuntimeException: Can not start VM to test >>> to...". I would recommend a more graceful way to fail, e.g. to print >>> the >>> message and to return "false" instead. 
This way the rest of the test >>> run >>> will continue, but the tests requiring vm.cds will be skipped with >>> qualification of "not selected". >>> >>> 2. The check for "An error has occurred while processing the shared >>> archive file." assumes that archive was not created prior to the >>> execution of this evaluation code. However, there are test modes where >>> archive is created prior to test run. We use such mode on regular >>> basis. >>> In such cases the code will fail. >>> I recommend to run "-Xshare:on -version", and check the >>> following match that would result in return of "true": >>> "Java HotSpot.*sharing" >>> >>> 3. On occasion the mapping of shared archive region to a specified >>> address will fail (due to system configuration, space already occupied, >>> ASLR, etc.) >>> >>> Hence I recommend checking for such conditions as well: >>> >>> if (output.firstMatch("Unable to map") != null) { >>> System.out.println("VMProps.vmCDS() encountered an >>> archive >>> mapping failure, still proceeding with vm.cds=true"); >>> return "true"; >>> } >>> I am returning true here because seeing this output means that >>> CDS >>> feature is supported, however in this particular instance archive >>> failed >>> to map. >>> >>> >>> The rest of the changes looks good to me. >>> >>> See for my version of VMProps.vmCDS() below. Let me know what you >>> think. >>> >>> >>> Thank you, >>> >>> Mikhailo >>> >>> >>> ================== my update of VMProps.vmCDS() >>> >>> protected String vmCDS() { >>> System.setProperty("test.jdk", >>> System.getProperty("java.home")); >>> ProcessBuilder pb = >>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>> OutputAnalyzer output; >>> >>> try { >>> output = new OutputAnalyzer(pb.start()); >>> } catch (IOException e) { >>> System.err.println( "Can not start VM to test to find out >>> it's features. " + >>> "Switching off class data >>> sharing (CDS)." 
+ e); >>> return "false"; >>> } >>> if (output.firstMatch("Shared spaces are not supported in >>> this >>> VM") != null) { >>> return "false"; >>> } >>> if (output.firstMatch("An error has occurred while processing >>> the shared archive file.") != null) { >>> return "true"; >>> } >>> if (output.firstMatch("Java HotSpot.*sharing") != null) { >>> return "true"; >>> } >>> if (output.firstMatch("Unable to map") != null) { >>> System.out.println("VMProps.vmCDS() encountered an >>> archive >>> mapping failure, still proceeding with vm.cds=true"); >>> return "true"; >>> } >>> >>> return "false"; >>> } >>> ================== >>> >>> >>> >>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> I made new webrevs implementing the change with @requires: >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02-hs/ >>>> >>>> I also changed the bug description and synopsis. >>>> >>>> For the jtreg runner I would propose to set the property test.jdk >>>> so that it is available in VMProps. Igor also ran into this issue. >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> >>>>> -----Original Message----- >>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>> Sent: Montag, 31. Juli 2017 22:19 >>>>> To: Lindenmaier, Goetz >>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds >>> tests >>>>> Hi Goetz, >>>>> >>>>> I have an idea on how to address your second use case. >>>>> The idea is to define a special test property (e.g. >>>>> test.cds.disable.cds.support) which will override logic inside the >>>>> VMProps.vmCDSSupported(). If this property is defined to "true" in >>>>> test >>>>> invocation command then vmCDSSupported() returns false (CDS is >>> disabled, >>>>> not supported), and all tests marked with "@requires >>>>> vm.cds.supported" >>>>> will be skipped. 
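
[Editor's sketch, not part of the original mail.] The override idea in the paragraph above can be written down as a small, self-contained Java method. It is a minimal sketch, assuming the detected CDS capability is passed in as a boolean: the class name CdsPropertySketch and the 'detected' parameter are hypothetical, while the property name test.cds.disable.cds.support and the method name vmCDSSupported() come from the mail. It is not the attached diff.

```java
import java.util.Properties;

// Hypothetical model of the proposed override; not the code from the attached diff.
class CdsPropertySketch {
    // Property name as proposed in the mail.
    static final String DISABLE_KEY = "test.cds.disable.cds.support";

    // Returns the "true"/"false" string a requires-property would carry.
    // 'detected' stands for whatever the real CDS capability check found (assumption).
    static String vmCDSSupported(Properties props, boolean detected) {
        // The override wins: if the tester disabled CDS tests, report "not supported".
        if (Boolean.parseBoolean(props.getProperty(DISABLE_KEY, "false"))) {
            return "false";
        }
        return Boolean.toString(detected);
    }

    public static void main(String[] args) {
        // Reads the JVM's own system properties, e.g. when started with -Dtest.cds.disable.cds.support=true.
        System.out.println(vmCDSSupported(System.getProperties(), true));
    }
}
```

With this shape, tests marked "@requires vm.cds.supported" would be deselected whenever the property is set to "true", regardless of what the capability check itself reports.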
>>>>> >>>>> How to use it: >>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>> >>>>> I prototyped this approach, it works for me. I have attached the >>>>> diff. >>>>> Let me know whether this works for your use case, or if you have any >>>>> questions. >>>>> >>>>> >>>>> Thank you, >>>>> Mikhailo >>>>> >>>>> >>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>> Hi Mikhailo, >>>>>> >>>>>> Basically I'm fine with using the @requires property. >>>>>> But is there a way to overrule the outcome of the method >>>>>> implemented In VMProps.java computing the property? >>>>>> I have two use cases for the key I want to introduce. >>>>>> >>>>>> First, our internal VM (we are Oracle licensees) is compiled without >>>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>> especially >>>>>> because we have to adapt it whenever a new test is added. >>>>>> As I understand, the @requires property works fine, here. >>>>>> >>>>>> Second, we also test the two ports we contributed (ppc and s390). >>>>>> These >>>>> contain >>>>>> rudimentary cds support and so far passed all tests. >>>>>> Unfortunately it >>> broke >>>>>> lately in jdk10. Instead of fixing it (our people are working on >>>>>> finishing >>> our >>>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>> do that. >>>>>> Is there a way to do similar with the @requires property? >>>>>> >>>>>> Best regards, >>>>>> Goetz. >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>> Sent: Freitag, 28. 
Juli 2017 23:53 >>>>>>> To: Lindenmaier, Goetz >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>> disable cds >>>>> tests >>>>>>> Hi Goetz, >>>>>>> >>>>>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your >>> proposed >>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>> following >>>>>>> feedback on this change. >>>>>>> >>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>> use of >>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>> >>>>>>> 2. Our preferred test selection mechanism is use of "@requires" >>>>>>> and a >>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>> JTREG >>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>> first >>>>> choice. >>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>> situation then >>>>>>> use "@key" mechanism. We would ask you if you could explore the >>>>>>> possibility of implementing this change via @requires first. >>>>>>> >>>>>>> >>>>>>> Here are several hints that may help: >>>>>>> >>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The >>>>>>> value >>>>>>> of a given "requires property" is evaluated inside this file and >>>>>>> placed >>>>>>> into a map (see public call() method). Add your evaluation code >>>>>>> here, >>>>>>> and then follow the pattern used for other properties. Create a >>> property >>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a method >>> that >>>>>>> evaluates the property value (e.g. isCDSSupported() or similar). >>>>>>> >>>>>>> 2. The method could use several options to evaluate whether CDS is >>>>>>> supported. >>>>>>> A. WhiteBox API. Create a new WB test API method which can >>> return >>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>> Call WB API from VMProps.java. 
See >>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create your >>>>> own >>>>>>> WB.isCDSSupported() >>>>>>> WhiteBox.java resides in >>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>> >>>>>>> B. Another options is to evaluate by running VM with >>>>>>> sharing on and >>>>>>> checking the return (may be not as reliable as option A) >>>>>>> C. Other ideas welcome. >>>>>>> >>>>>>> 3. Include "@requres vm.cds.supported == true" to the appropriate >>> tests. >>>>>>> Let me know if you have any questions. >>>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> Mikhailo >>>>>>> >>>>>>> >>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>> Hi >>>>>>>> >>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>>> the tests accordingly. >>>>>>>> This change also fixes the keywords specified in >>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only one >>>>> test >>>>>>>> case required CDS. I changed this sub case to succeed if CDS is >>>>>>>> not >>>>>>>> available. >>>>>>>> >>>>>>>> Please review this change. I please need a sponsor. >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.01/ >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Goetz. > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: misha-rev03-white-box-api.hotspot.diff URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: misha-rev03-white-box-api.top.diff URL: From ioi.lam at oracle.com Fri Aug 4 21:22:30 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 4 Aug 2017 14:22:30 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> Message-ID: <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> Hi Jiangli, The code looks good in general. I just have a few pet peeves for readability: (1) stringTable.cpp and metaspaceShared.cpp have the same asserts 704 assert(UseG1GC, "Only support G1 GC"); 705 assert(UseCompressedOops && UseCompressedClassPointers, 706 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); 1615 assert(UseG1GC, "Only support G1 GC"); 1616 assert(UseCompressedOops && UseCompressedClassPointers, 1617 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync? (2) FileMapInfo::write_archive_heap_regions() I still find this code very hard to read, especially due to the loop. First, the comments are not consistent with the code: 498 assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum"); but the comments says: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size).So the code is actually wrong. The word "region" is used in these parameters, but they don't mean the same thing. 

    GrowableArray *regions
    int first_region, int max_num_regions,

How about

    regions      -> g1_regions_list
    first_region -> first_region_in_archive

In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive".

To make it unambiguous, how about changing

 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions.

to

    // Write the given list of G1 memory regions into the archive, starting at
    // first_region_in_archive.

Also, for the explanation of how the G1 regions are written into the archive, how about:

    // The G1 regions in the list are sorted in ascending address order. When there are more objects
    // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the
    // remaining G1 region(s) are consecutively allocated and fully filled.
    //
    // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive.
    // The remaining G1 regions (if exist) are coalesced and written as a single block
    // into (first_region_in_archive + 1)

    // Here's the mapping from (g1 regions) -> (archive regions).

All this function needs to do is to decide the values for

    r0_start, r0_top
    r1_start, r1_top

I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway).

    *r0_start = *r0_top = NULL;
    *r1_start = *r1_top = NULL;

    if (arr_len >= 1) {
        *r0_start = regions->at(0).start();
        *r0_top = *r0_start + regions->at(0).byte_size();
    }
    if (arr_len >= 2) {
        int last = arr_len - 1;
        *r1_start = regions->at(1).start();
        *r1_top = regions->at(last).start() + regions->at(last).byte_size();
    }

What do you think?

(3) metaspace.cpp

3350 // Map the archived heap regions after compressed pointers
3351 // because it relies on compressed class pointers setting to work

Do you mean this?

    // Archived heap regions depend on the parameters of compressed class pointers, so
    // they must be mapped after such parameters have been decided in the above call.

(4) I found this name not strictly grammatical. How about this:

    allow_archive_heap_object -> is_heap_object_archiving_allowed

(5) In most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb.

archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string()

Thanks
- Ioi

On 8/3/17 5:15 PM, Jiangli Zhou wrote:
> Here are the updated webrevs.
>
> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/
>
> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/
>
> Changes in the updated webrevs include:
>
> * Merge with Ioi's recent shared space auto-sizing change (8072061)
> * Addressed all feedback from Ioi and Coleen (Thanks for the detailed
>   review!)
>
>
> Thanks,
> Jiangli
>
>
>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote:
>>
>> Hi Ioi,
>>
>> Thank you so much for reviewing this. I've addressed all your
>> feedback. Please see details below. I'll update the webrev
>> after addressing Coleen's comments.
>>
>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote:
>>>
>>> Hi Jiangli,
>>>
>>> Here are my comments. I've not reviewed the GC code and I'll leave
>>> that to the GC experts :-)
>>>
>>> stringTable.cpp: StringTable::archive_string
>>>
>>> add assert for DumpSharedSpaces only
>>
>> Ok.
>> >>> >>> filemap.cpp >>> >>> 525 void >>> FileMapInfo::write_archive_heap_regions(GrowableArray >>> *regions, >>> 526 int first_region, >>> int num_regions) { >>> >>> When I first read this function, I found it hard to follow, >>> especially this part that coalesces the trailing regions: >>> >>> 537 int len = regions->length(); >>> 538 if (len > 1) { >>> 539 start = (char*)regions->at(1).start(); >>> 540 size = (char*)regions->at(len - 1).end() - start; >>> 541 } >>> 542 } >>> >>> The rest of filemap.cpp always perform identical operations on >>> MemRegion arrays, which are either 1 or 2 in size. However, >>> this function doesn't follow that pattern; it also has a very >>> different notion of "region", and the confusing part is >>> regions->size() is not the same as num_regions. >>> >>> How about we change the API to something like the following? Before >>> calling this API, the caller needs to coalesce the trailing >>> G1 regions into a single MemRegion. >>> >>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int >>> first, int num_regions) { >>> if (first == MetaspaceShared::first_string) { >>> assert(num_regons <= MetaspaceShared::max_strings, "..."); >>> } else { >>> assert(first == >>> MetaspaceShared::first_open_archive_heap_region, "..."); >>> assert(num_regons <= >>> MetaspaceShared::max_open_archive_heap_region, "..."); >>> } >>> .... >>> >>> >> >> I?ve reworked the function and simplified the code. >> >>> >>> 756 if (!string_data_mapped) { >>> 757 StringTable::ignore_shared_strings(true); >>> 758 assert(string_ranges == NULL && num_string_ranges == 0, >>> "sanity"); >>> 759 } >>> 760 >>> 761 if (open_archive_heap_data_mapped) { >>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>> 763 } else { >>> 764 assert(open_archive_heap_ranges == NULL && >>> num_open_archive_heap_ranges == 0, "sanity"); >>> 765 } >>> >>> Maybe the two "if" statements should be more consistent? 
Instead of
>>> StringTable::ignore_shared_strings, how
>>> about StringTable::set_shared_strings_region_mapped()?
>>
>> Fixed.
>>
>>>
>>> FileMapInfo::map_heap_data() --
>>>
>>> 818 char* addr = (char*)regions[i].start();
>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset,
>>> 820                             addr, regions[i].byte_size(), si->_read_only,
>>> 821                             si->_allow_exec);
>>>
>>> What happens when the first region succeeds to map but the second
>>> region fails to map? Will both regions be unmapped? I don't see
>>> where you store the return value (base) from os::map_memory(). Does
>>> it mean the code assumes that (addr == base). If so, we need an
>>> assert here.
>>
>> If any of the region fails to map, we bail out and call
>> dealloc_archive_heap_regions(), which handles the deallocation of any
>> regions specified. If second region fails to map, all memory ranges
>> specified by 'regions' array are deallocated. We don't unmap the
>> memory here since it is part of the java heap. Unmapping of heap
>> memory are handled by GC code. The 'if' check below makes sure base
>> == addr.
>>
>>   if (base == NULL || base != addr) {
>>     // dealloc the regions from java heap
>>     dealloc_archive_heap_regions(regions, region_num);
>>     if (log_is_enabled(Info, cds)) {
>>       log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap.");
>>     }
>>     return false;
>>   }
>>
>>
>>>
>>> constantPool.cpp
>>>
>>> Handle refs_handle;
>>> ...
>>> refs_handle = Handle(THREAD, (oop)archived);
>>>
>>> This will first create a NULL handle, then construct a temporary
>>> handle, and then assign the temp handle back to the null
>>> handle. This means two handles will be pushed onto
>>> THREAD->metadata_handles()
>>>
>>> I think it's more efficient if you merge these into a single statement
>>>
>>> Handle refs_handle(THREAD, (oop)archived);
>>
>> Fixed.
>>
>>>
>>> Is this experimental code? Maybe it should be removed?
>>> >>> 664 if (tag_at(index).is_unresolved_klass()) { >>> 665 #if 0 >>> 666 CPSlot entry = cp->slot_at(index); >>> 667 Symbol* name = entry.get_symbol(); >>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>> 669 if (k != NULL) { >>> 670 klass_at_put(index, k); >>> 671 } >>> 672 #endif >>> 673 } else >> >> Removed. >> >>> >>> cpCache.hpp: >>> >>> u8 _archived_references >>> >>> shouldn't this be declared as an narrowOop to avoid the type casts >>> when it's used? >> >> Ok. >> >>> >>> cpCache.cpp: >>> >>> add assert so that one of these is used only at dump time and the >>> other only at run time? >>> >>> 610 oop ConstantPoolCache::archived_references() { >>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); >>> 612 } >>> 613 >>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>> 616 } >> >> Ok. >> >> Thanks! >> >> Jiangli >> >>> >>> Thanks! >>> - Ioi >>> >>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>> Sorry, the mail didn?t handle the rich text well. I fixed the >>>> format below. >>>> >>>> Please help review the changes for JDK-8179302 (Pre-resolve >>>> constant pool string entries and cache resolved_reference arrays >>>> in CDS archive). Currently, the CDS archive can contain cached >>>> class metadata, interned java.lang.String objects. This RFE adds >>>> the constant pool ?resolved_references? arrays (hotspot specific) >>>> to the archive for startup/runtime performance enhancement. >>>> The ?resolved_references' arrays are used to hold references of >>>> resolved constant pool entries including Strings, mirrors, etc. >>>> With the 'resolved_references? being cached, string constants in >>>> shared classes can now be resolved to existing interned >>>> java.lang.Strings at CDS dump time. G1 and 64-bit platforms are >>>> required. >>>> >>>> The GC changes in the RFE were discussed and guided by Thomas >>>> Schatzl and GC team. 
Part of the changes were contributed by Thomas >>>> himself. >>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>> hotspot: >>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>> >>>> Please see below for details of supporting cached >>>> ?resolved_references? and pre-resolving string constants. >>>> >>>> Types of Pinned G1 Heap Regions >>>> >>>> The pinned region type is a super type of all archive region types, >>>> which include the open archive type and the closed archive type. >>>> >>>> 00100 0 [ 8] Pinned Mask >>>> 01000 0 [16] Old Mask >>>> 10000 0 [32] Archive Mask >>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>> >>>> >>>> Pinned Regions >>>> >>>> Objects within the region are 'pinned', which means GC does not >>>> move any live objects. GC scans and marks objects in the >>>> pinned region as normal, but skips forwarding live objects. >>>> Pointers in live objects are updated. Dead objects (unreachable) >>>> can be collected and freed. >>>> >>>> Archive Regions >>>> >>>> The archive types are sub-types of 'pinned'. There are two types of >>>> archive region currently, open archive and closed archive. Both can >>>> support caching java heap objects via the CDS archive. >>>> >>>> An archive region is also an old region by design. >>>> >>>> Open Archive (GC-RW) Regions >>>> >>>> Open archive region is GC writable. GC scans & marks objects within >>>> the region and adjusts (updates) pointers in live objects the same >>>> way as a pinned region. Live objects (reachable) are pinned and not >>>> forwarded by GC. >>>> Open archive region does not have 'dead' objects. Unreachable >>>> objects are 'dormant' objects. Dormant objects are not >>>> collected and freed by GC. 
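
[Editor's sketch, not part of the original mail.] The region type tags quoted above form a small bit-mask scheme, which can be checked with a few lines of Java. The numeric values (8, 16, 32, 56, 57) are the ones listed in the mail; the class RegionTypeSketch, its constant names, and its predicate methods are hypothetical and do not claim to mirror the G1 sources.

```java
// Hypothetical model of the tag values listed in the mail; not the G1 implementation.
class RegionTypeSketch {
    static final int PINNED_MASK = 8;    // 00100 0
    static final int OLD_MASK = 16;      // 01000 0
    static final int ARCHIVE_MASK = 32;  // 10000 0
    // Open archive: ArchiveMask | PinnedMask | OldMask
    static final int OPEN_ARCHIVE = ARCHIVE_MASK | PINNED_MASK | OLD_MASK;
    // Closed archive: the same three masks, plus one
    static final int CLOSED_ARCHIVE = OPEN_ARCHIVE + 1;

    // Mask tests: true for every tag that carries the corresponding bit.
    static boolean isPinned(int tag) { return (tag & PINNED_MASK) != 0; }
    static boolean isOld(int tag) { return (tag & OLD_MASK) != 0; }
    static boolean isArchive(int tag) { return (tag & ARCHIVE_MASK) != 0; }

    // Exact-tag tests: distinguish the two archive sub-types.
    static boolean isOpenArchive(int tag) { return tag == OPEN_ARCHIVE; }
    static boolean isClosedArchive(int tag) { return tag == CLOSED_ARCHIVE; }

    public static void main(String[] args) {
        System.out.println("open archive tag:   " + OPEN_ARCHIVE);
        System.out.println("closed archive tag: " + CLOSED_ARCHIVE);
        System.out.println("closed is pinned:   " + isPinned(CLOSED_ARCHIVE));
    }
}
```

Because both archive tags include the pinned and old bits, the mask tests answer "is this a pinned region?" or "is this an old region?" for archive regions without special cases, which matches the statements that the pinned type is a super type of the archive types and that an archive region is also an old region by design.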
>>>> >>>> Adjustable Outgoing Pointers >>>> >>>> As GC can adjust pointers within the live objects in open archive >>>> heap region, objects can have outgoing pointers to another >>>> java heap region, including closed archive region, open archive >>>> region, pinned (or humongous) region, and normal generational >>>> region. When a referenced object is moved by GC, the pointer within >>>> the open archive region is updated accordingly. >>>> >>>> Closed Archive (GC-RO) Regions >>>> >>>> The closed archive region is GC read-only region. GC cannot write >>>> into the region. Objects are not scanned and marked by GC. Objects >>>> are pinned and not forwarded. Pointers are not updated by GC >>>> either. Hence, objects within the archive region cannot have any >>>> outgoing pointers to another java heap region. Objects however can >>>> still have pointers to other objects within the closed archive >>>> regions (we might allow pointers to open archive regions in the >>>> future). That restricts the type of java objects that can >>>> be supported by the archive region. >>>> In JDK 9 we support archive Strings with the archive regions. >>>> >>>> The GC-readonly archive region makes java heap memory sharable >>>> among different JVM processes. NOTE: synchronization on the objects >>>> within the archive heap region can still cause writes to the memory >>>> page. >>>> >>>> Dormant Objects >>>> >>>> Dormant objects are unreachable java objects within the open >>>> archive heap region. >>>> A java object in the open archive heap region is a live object if >>>> it can be reached during scanning. Some of the java objects in >>>> the region may not be reachable during scanning. Those objects are >>>> considered as dormant, but not dead. For example, a constant pool >>>> 'resolved_references' array is reachable via the klass root if its >>>> container klass (shared) is already loaded at the time during GC >>>> scanning. 
If a shared klass is not yet loaded, the klass root is
>>>> not scanned and its constant pool 'resolved_references' array
>>>> (A) in the open archive region is not reachable. Then A is a
>>>> dormant object.
>>>>
>>>> Object State Transition
>>>>
>>>> All java objects are initially dormant objects when open archive
>>>> heap regions are mapped to the runtime java heap. A dormant object
>>>> becomes a live object when the associated shared class is loaded at
>>>> runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs
>>>> to be made when a dormant object becomes live. That should be the
>>>> case for cached objects with strong roots as well, since strong
>>>> roots are only scanned at the start of GC marking (the initial
>>>> marking) but not during Remarking/Final marking. If a cached object
>>>> becomes live during the concurrent marking phase, G1 may not find it
>>>> and mark it live unless a call to
>>>> G1SATBCardTableModRefBS::enqueue() is made for the object.
>>>>
>>>> Currently, a live object in the open archive heap region cannot
>>>> become dormant again. This restriction simplifies the GC
>>>> requirements and guarantees all outgoing pointers are updated by GC
>>>> correctly. Only objects for shared classes from the builtin class
>>>> loaders (boot, PlatformClassLoaders, and AppClassLoaders) are
>>>> supported for caching.
>>>>
>>>> Caching Java Objects at Archive Dump Time
>>>>
>>>> The closed archive and open archive regions are allocated near the
>>>> top of the dump time java heap. Archived java objects are copied
>>>> into the designated archive heap regions. For example, String
>>>> objects and the underlying 'value' arrays are copied into
>>>> the closed archive regions. All references to the archived objects
>>>> (from shared class metadata, string table, etc) are set to the
>>>> new heap locations.
A hash table is used to keep track of all
>>>> archived java objects during the copying process to make sure a java
>>>> object is not archived more than once if reached from different
>>>> roots. It also makes sure references to the same archived object
>>>> are updated using the same new address location.
>>>>
>>>> Caching Constant Pool resolved_references Array
>>>>
>>>> The 'resolved_references' is an array that holds references of
>>>> resolved constant pool entries including Strings, mirrors
>>>> and methodTypes, etc. Each loaded class has one
>>>> 'resolved_references' array (in ConstantPoolCache). The
>>>> 'resolved_references' arrays are copied into the open archive
>>>> regions during the dump process. Prior to copying the
>>>> 'resolved_references' arrays, JVM iterates through constant pool
>>>> entries and resolves all JVM_CONSTANT_String entries to existing
>>>> interned Strings for all archived classes. When resolving, JVM only
>>>> looks up the string table and finds existing interned Strings
>>>> without inserting new ones. If a string entry cannot be resolved to
>>>> an existing interned String, the constant pool entry remains
>>>> unresolved. That prevents memory waste if a constant pool string
>>>> entry is never used at runtime.
>>>>
>>>> All String objects referenced by the string table are copied first
>>>> into the closed archive regions. The string table entry is
>>>> updated with the new location when each String object is archived.
>>>> The JVM updates the resolved constant pool string entries with the
>>>> new object locations when copying the 'resolved_references' arrays
>>>> to the open archive regions. References to
>>>> the 'resolved_references' arrays in the ConstantPoolCache are also
>>>> updated.
>>>> At runtime as part of ConstantPool::restore_unshareable_info()
>>>> work, call G1SATBCardTableModRefBS::enqueue() to let GC know the
>>>> 'resolved_references' is becoming live.
A handle is created for the
>>>> cached object and added to the loader_data's handles.
>>>>
>>>> Runtime Java Heap With Cached Java Objects
>>>>
>>>>
>>>> The closed archive regions (the string regions) and open archive
>>>> regions are mapped to the runtime java heap at the same offsets as
>>>> the dump time offsets from the runtime java heap base.
>>>>
>>>> Preliminary test execution and status:
>>>>
>>>> JPRT: passed
>>>> Tier2-rt: passed
>>>> Tier2-gc: passed
>>>> Tier2-comp: passed
>>>> Tier3-rt: passed
>>>> Tier3-gc: passed
>>>> Tier3-comp: passed
>>>> Tier4-rt: passed
>>>> Tier4-gc: passed
>>>> Tier4-comp: 6 jobs timed out, all other tests passed
>>>> Tier5-rt: one test failed but passed when running locally, all
>>>> other tests passed
>>>> Tier5-gc: passed
>>>> Tier5-comp: running
>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>>> vm.gc: passed
>>>> vm.gc in CDS mode: passed
>>>> Kitchensink: passed
>>>> Kitchensink in CDS mode: passed
>>>>
>>>> Thanks,
>>>> Jiangli
>>>
>>
>

From daniel.daugherty at oracle.com Fri Aug 4 21:24:58 2017
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 4 Aug 2017 15:24:58 -0600
Subject: RFR(XXS): quarantine tests named in JDK-8184042 on MacOS X (8185872)
Message-ID:

Greetings,

I'm quarantining the tests named in JDK-8184042 on MacOS X.

Webrev URL: http://cr.openjdk.java.net/~dcubed/8185872-webrev/0/

This fix is targeted to JDK10/hs.

Dan

From ioi.lam at oracle.com Fri Aug 4 21:29:15 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 4 Aug 2017 14:29:15 -0700
Subject: RFR(XXS): quarantine tests named in JDK-8184042 on MacOS X (8185872)
In-Reply-To:
References:
Message-ID: <5437699e-bd65-78eb-3b88-35f90232ee8d@oracle.com>

Looks good. Thanks Dan!

- Ioi

On 8/4/17 2:24 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm quarantining the tests named in JDK-8184042 on MacOS X.
> > Webrev URL: http://cr.openjdk.java.net/~dcubed/8185872-webrev/0/ > > This fix is targeted to JDK10/hs. > > Dan From daniel.daugherty at oracle.com Fri Aug 4 21:41:54 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 15:41:54 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithG1.java (8185874) Message-ID: Greetings, I'm quarantining gc/stress/gclocker/TestGCLockerWithG1.java as it continued to fail in the JDK10-hs nightly. 8185874 quarantine gc/stress/gclocker/TestGCLockerWithG1.java https://bugs.openjdk.java.net/browse/JDK-8185874 $ hg diff test/ProblemList.txt diff -r 2cbcc2fdc073 test/ProblemList.txt --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 +++ b/test/ProblemList.txt Fri Aug 04 14:39:10 2017 -0700 @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all +gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all ############################################################################# This fix is targeted to JDK10/hs. Dan From daniel.daugherty at oracle.com Fri Aug 4 21:43:44 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 15:43:44 -0600 Subject: RFR(XXS): quarantine tests named in JDK-8184042 on MacOS X (8185872) In-Reply-To: <5437699e-bd65-78eb-3b88-35f90232ee8d@oracle.com> References: <5437699e-bd65-78eb-3b88-35f90232ee8d@oracle.com> Message-ID: <9e4cf80c-52d6-3616-d999-95abf4db3d94@oracle.com> Thanks Ioi! I did forget the bug link: 8185872 quarantine tests named in JDK-8184042 on MacOS X https://bugs.openjdk.java.net/browse/JDK-8185872 I'm planning to use the "HotSpot Trivial Change" rule for this review. Dan On 8/4/17 3:29 PM, Ioi Lam wrote: > Looks good. Thanks Dan! > > - Ioi > > > On 8/4/17 2:24 PM, Daniel D. 
Daugherty wrote: >> Greetings, >> >> I'm quarantining the tests named in JDK-8184042 on MacOS X. >> >> Webrev URL: http://cr.openjdk.java.net/~dcubed/8185872-webrev/0/ >> >> This fix is targeted to JDK10/hs. >> >> Dan > From mikhailo.seledtsov at oracle.com Fri Aug 4 21:52:52 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Fri, 04 Aug 2017 14:52:52 -0700 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithG1.java (8185874) In-Reply-To: References: Message-ID: <5984ECB4.7000008@oracle.com> Looks good, Misha On 8/4/17, 2:41 PM, Daniel D. Daugherty wrote: > Greetings, > > I'm quarantining gc/stress/gclocker/TestGCLockerWithG1.java as it > continued to fail in the JDK10-hs nightly. > > 8185874 quarantine gc/stress/gclocker/TestGCLockerWithG1.java > https://bugs.openjdk.java.net/browse/JDK-8185874 > > $ hg diff test/ProblemList.txt > diff -r 2cbcc2fdc073 test/ProblemList.txt > --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 > +++ b/test/ProblemList.txt Fri Aug 04 14:39:10 2017 -0700 > @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv > gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all > gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all > gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all > +gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all > > ############################################################################# > > > This fix is targeted to JDK10/hs. > > Dan From daniel.daugherty at oracle.com Fri Aug 4 21:53:48 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 15:53:48 -0600 Subject: RFR(XXS): quarantine compiler/ciReplay/TestSA.sh (8185876) Message-ID: <9bdab5db-6708-d504-d9cc-1957575cd69a@oracle.com> Greetings, I'm quarantining compiler/ciReplay/TestSA.sh as it continues to fail in the JDK10-hs nightly. 
8185876 quarantine compiler/ciReplay/TestSA.sh https://bugs.openjdk.java.net/browse/JDK-8185876 $ hg diff test/ProblemList.txt diff -r 2cbcc2fdc073 test/ProblemList.txt --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 +++ b/test/ProblemList.txt Fri Aug 04 14:51:00 2017 -0700 @@ -40,6 +40,7 @@ # :hotspot_compiler +compiler/ciReplay/TestSA.sh 8029528 generic-all compiler/codecache/stress/OverloadCompileQueueTest.java 8166554 generic-all compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest.java 8140405 generic-all compiler/jvmci/compilerToVM/GetResolvedJavaTypeTest.java 8158860 generic-all This fix is targeted to JDK10/hs. Dan From daniel.daugherty at oracle.com Fri Aug 4 22:02:10 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 16:02:10 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithG1.java (8185874) In-Reply-To: <5984ECB4.7000008@oracle.com> References: <5984ECB4.7000008@oracle.com> Message-ID: <59b297f6-1830-8fb3-6b8e-c81d8ec0acc1@oracle.com> Misha, Thanks for the review! I still need a (R)eviewer to chime in on this thread so I can invoke the "HotSpot Trivial Change" rule. Dan On 8/4/17 3:52 PM, Mikhailo Seledtsov wrote: > Looks good, > > Misha > > On 8/4/17, 2:41 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I'm quarantining gc/stress/gclocker/TestGCLockerWithG1.java as it >> continued to fail in the JDK10-hs nightly. 
>>
>> 8185874 quarantine gc/stress/gclocker/TestGCLockerWithG1.java
>> https://bugs.openjdk.java.net/browse/JDK-8185874
>>
>> $ hg diff test/ProblemList.txt
>> diff -r 2cbcc2fdc073 test/ProblemList.txt
>> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700
>> +++ b/test/ProblemList.txt Fri Aug 04 14:39:10 2017 -0700
>> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv
>>  gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all
>>  gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>>  gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
>> +gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all
>>
>> #############################################################################
>>
>>
>> This fix is targeted to JDK10/hs.
>>
>> Dan

From vladimir.kozlov at oracle.com Fri Aug 4 22:07:31 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 4 Aug 2017 15:07:31 -0700
Subject: RFR(XXS): quarantine compiler/ciReplay/TestSA.sh (8185876)
In-Reply-To: <9bdab5db-6708-d504-d9cc-1957575cd69a@oracle.com>
References: <9bdab5db-6708-d504-d9cc-1957575cd69a@oracle.com>
Message-ID:

There is no compiler/ciReplay/TestSA.sh file after the 8155219 changes.
The script was replaced by Java files.

I agree with quarantining these tests, but the names should be correct.

Thanks,
Vladimir

On 8/4/17 2:53 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm quarantining compiler/ciReplay/TestSA.sh as it continues to fail in
> the JDK10-hs nightly.
> > 8185876 quarantine compiler/ciReplay/TestSA.sh > https://bugs.openjdk.java.net/browse/JDK-8185876 > > $ hg diff test/ProblemList.txt > diff -r 2cbcc2fdc073 test/ProblemList.txt > --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 > +++ b/test/ProblemList.txt Fri Aug 04 14:51:00 2017 -0700 > @@ -40,6 +40,7 @@ > > # :hotspot_compiler > > +compiler/ciReplay/TestSA.sh 8029528 generic-all > compiler/codecache/stress/OverloadCompileQueueTest.java 8166554 > generic-all > compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest.java > 8140405 generic-all > compiler/jvmci/compilerToVM/GetResolvedJavaTypeTest.java 8158860 > generic-all > > This fix is targeted to JDK10/hs. > > Dan From vladimir.kozlov at oracle.com Fri Aug 4 22:09:10 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 4 Aug 2017 15:09:10 -0700 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithG1.java (8185874) In-Reply-To: <59b297f6-1830-8fb3-6b8e-c81d8ec0acc1@oracle.com> References: <5984ECB4.7000008@oracle.com> <59b297f6-1830-8fb3-6b8e-c81d8ec0acc1@oracle.com> Message-ID: <3d3a7df3-e118-2e1e-6e0d-d262e72dbbe2@oracle.com> Count me in. Vladimir On 8/4/17 3:02 PM, Daniel D. Daugherty wrote: > Misha, > > Thanks for the review! I still need a (R)eviewer to chime in on this > thread so I can invoke the "HotSpot Trivial Change" rule. > > Dan > > > On 8/4/17 3:52 PM, Mikhailo Seledtsov wrote: >> Looks good, >> >> Misha >> >> On 8/4/17, 2:41 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I'm quarantining gc/stress/gclocker/TestGCLockerWithG1.java as it >>> continued to fail in the JDK10-hs nightly. 
>>> >>> 8185874 quarantine gc/stress/gclocker/TestGCLockerWithG1.java >>> https://bugs.openjdk.java.net/browse/JDK-8185874 >>> >>> $ hg diff test/ProblemList.txt >>> diff -r 2cbcc2fdc073 test/ProblemList.txt >>> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 >>> +++ b/test/ProblemList.txt Fri Aug 04 14:39:10 2017 -0700 >>> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv >>> gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all >>> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >>> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >>> +gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >>> >>> ############################################################################# >>> >>> >>> This fix is targeted to JDK10/hs. >>> >>> Dan > From daniel.daugherty at oracle.com Fri Aug 4 22:10:06 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 16:10:06 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithG1.java (8185874) In-Reply-To: <3d3a7df3-e118-2e1e-6e0d-d262e72dbbe2@oracle.com> References: <5984ECB4.7000008@oracle.com> <59b297f6-1830-8fb3-6b8e-c81d8ec0acc1@oracle.com> <3d3a7df3-e118-2e1e-6e0d-d262e72dbbe2@oracle.com> Message-ID: Thanks! Dan On 8/4/17 4:09 PM, Vladimir Kozlov wrote: > Count me in. > > Vladimir > > On 8/4/17 3:02 PM, Daniel D. Daugherty wrote: >> Misha, >> >> Thanks for the review! I still need a (R)eviewer to chime in on this >> thread so I can invoke the "HotSpot Trivial Change" rule. >> >> Dan >> >> >> On 8/4/17 3:52 PM, Mikhailo Seledtsov wrote: >>> Looks good, >>> >>> Misha >>> >>> On 8/4/17, 2:41 PM, Daniel D. Daugherty wrote: >>>> Greetings, >>>> >>>> I'm quarantining gc/stress/gclocker/TestGCLockerWithG1.java as it >>>> continued to fail in the JDK10-hs nightly. 
>>>> >>>> 8185874 quarantine gc/stress/gclocker/TestGCLockerWithG1.java >>>> https://bugs.openjdk.java.net/browse/JDK-8185874 >>>> >>>> $ hg diff test/ProblemList.txt >>>> diff -r 2cbcc2fdc073 test/ProblemList.txt >>>> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 >>>> +++ b/test/ProblemList.txt Fri Aug 04 14:39:10 2017 -0700 >>>> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv >>>> gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all >>>> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >>>> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >>>> +gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >>>> >>>> ############################################################################# >>>> >>>> >>>> This fix is targeted to JDK10/hs. >>>> >>>> Dan >> From daniel.daugherty at oracle.com Fri Aug 4 22:34:37 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 16:34:37 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithSerial.java (8185879) Message-ID: Greetings, I'm quarantining gc/stress/gclocker/TestGCLockerWithSerial.java as it continues to fail in the JDK10-hs nightly. 
8185879 quarantine gc/stress/gclocker/TestGCLockerWithSerial.java https://bugs.openjdk.java.net/browse/JDK-8185879 $ hg diff test/ProblemList.txt diff -r 2cbcc2fdc073 test/ProblemList.txt --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 +++ b/test/ProblemList.txt Fri Aug 04 15:31:57 2017 -0700 @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all +gc/stress/gclocker/TestGCLockerWithSerial.java 8185879 generic-all ############################################################################# Dan From daniel.daugherty at oracle.com Fri Aug 4 22:46:48 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Aug 2017 16:46:48 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithSerial.java (8185879) In-Reply-To: References: Message-ID: <272620d3-c7a7-4969-bcfe-5d2403e3a727@oracle.com> Going too fast with these quarantines... This one should be: $ hg diff test/ProblemList.txt diff -r 2cbcc2fdc073 test/ProblemList.txt --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 +++ b/test/ProblemList.txt Fri Aug 04 15:45:42 2017 -0700 @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all +gc/stress/gclocker/TestGCLockerWithSerial.java 8180311 generic-all ############################################################################# Changed the associated bug ID from 8185879 to 8180311. Dan On 8/4/17 4:34 PM, Daniel D. Daugherty wrote: > Greetings, > > I'm quarantining gc/stress/gclocker/TestGCLockerWithSerial.java as it > continues to fail in the JDK10-hs nightly. 
>
> 8185879 quarantine gc/stress/gclocker/TestGCLockerWithSerial.java
> https://bugs.openjdk.java.net/browse/JDK-8185879
>
> $ hg diff test/ProblemList.txt
> diff -r 2cbcc2fdc073 test/ProblemList.txt
> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700
> +++ b/test/ProblemList.txt Fri Aug 04 15:31:57 2017 -0700
> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv
>  gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all
>  gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>  gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
> +gc/stress/gclocker/TestGCLockerWithSerial.java 8185879 generic-all
>
> #############################################################################
>
>
> Dan

From kim.barrett at oracle.com Fri Aug 4 23:20:20 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 4 Aug 2017 19:20:20 -0400
Subject: RFR: 8185757: QuickSort array size should be size_t
In-Reply-To: <1501787150.2411.71.camel@oracle.com>
References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> <1501787150.2411.71.camel@oracle.com>
Message-ID: <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com>

> On Aug 3, 2017, at 3:05 PM, Thomas Schatzl wrote:
>
> Hi,
>
> On Thu, 2017-08-03 at 14:06 -0400, Kim Barrett wrote:
>>>
>>> On Aug 3, 2017, at 8:32 AM, Thomas Schatzl wrote:
>>> However, please put the closing brackets of these into extra lines
>>> (quicksort.hpp:76,77) so that the casual reader does not overlook them.
>> Sorry, but that just looks horrible. As a casual reader, I wouldn't
>> even look for them, since if they aren't there then the code is
>> badly mis-indented.
>
> Actually I was already at writing about an issue with indentation when
> I noticed the brackets :)
>
> Not insisting on changing this.
I added a couple of asserts to check for buffer overruns due to a bad comparator:

diff -r 6b62ed03a6a6 -r 7616ceb92653 src/share/vm/utilities/quickSort.hpp
--- a/src/share/vm/utilities/quickSort.hpp Fri Aug 04 18:02:51 2017 -0400
+++ b/src/share/vm/utilities/quickSort.hpp Fri Aug 04 19:17:36 2017 -0400
@@ -73,8 +73,12 @@
     T pivot_val = array[pivot];
 
     for ( ; true; ++left_index, --right_index) {
-      for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) {}
-      for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) {}
+      for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) {
+        assert(left_index < length, "reached end of partition");
+      }
+      for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) {
+        assert(right_index > 0, "reached start of partition");
+      }
 
       if (left_index < right_index) {
         if (!idempotent || comparator(array[left_index], array[right_index]) != 0) {

From ioi.lam at oracle.com Fri Aug 4 23:35:45 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 4 Aug 2017 16:35:45 -0700
Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithSerial.java (8180311)
In-Reply-To: <272620d3-c7a7-4969-bcfe-5d2403e3a727@oracle.com>
References: <272620d3-c7a7-4969-bcfe-5d2403e3a727@oracle.com>
Message-ID: <4b4841de-258d-cc5f-9e65-cd63c70a18b4@oracle.com>

Looks good. I changed the e-mail subject to 8180311 :-)

Thanks!

- Ioi

On 8/4/17 3:46 PM, Daniel D. Daugherty wrote:
> Going too fast with these quarantines...
This one should be: > > $ hg diff test/ProblemList.txt > diff -r 2cbcc2fdc073 test/ProblemList.txt > --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 > +++ b/test/ProblemList.txt Fri Aug 04 15:45:42 2017 -0700 > @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv > gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all > gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all > gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all > +gc/stress/gclocker/TestGCLockerWithSerial.java 8180311 generic-all > > ############################################################################# > > > Changed the associated bug ID from 8185879 to 8180311. > > Dan > > > > On 8/4/17 4:34 PM, Daniel D. Daugherty wrote: >> Greetings, >> >> I'm quarantining gc/stress/gclocker/TestGCLockerWithSerial.java as it >> continues to fail in the JDK10-hs nightly. >> >> 8185879 quarantine gc/stress/gclocker/TestGCLockerWithSerial.java >> https://bugs.openjdk.java.net/browse/JDK-8185879 >> >> $ hg diff test/ProblemList.txt >> diff -r 2cbcc2fdc073 test/ProblemList.txt >> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 >> +++ b/test/ProblemList.txt Fri Aug 04 15:31:57 2017 -0700 >> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv >> gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all >> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >> +gc/stress/gclocker/TestGCLockerWithSerial.java 8185879 generic-all >> >> ############################################################################# >> >> >> Dan > From daniel.daugherty at oracle.com Fri Aug 4 23:38:00 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 4 Aug 2017 17:38:00 -0600 Subject: RFR(XXS): quarantine gc/stress/gclocker/TestGCLockerWithSerial.java (8180311) In-Reply-To: <4b4841de-258d-cc5f-9e65-cd63c70a18b4@oracle.com> References: <272620d3-c7a7-4969-bcfe-5d2403e3a727@oracle.com> <4b4841de-258d-cc5f-9e65-cd63c70a18b4@oracle.com> Message-ID: On 8/4/17 5:35 PM, Ioi Lam wrote: > Looks good. Thanks. > I changed the e-mail subject to 8180311 :-) The bug ID in the subject is the bug fix under review which is the quarantine bug: 8185879 so I got that one right... :-) Dan > > Thanks! > > - Ioi > > > On 8/4/17 3:46 PM, Daniel D. Daugherty wrote: >> Going too fast with these quarantines... This one should be: >> >> $ hg diff test/ProblemList.txt >> diff -r 2cbcc2fdc073 test/ProblemList.txt >> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 >> +++ b/test/ProblemList.txt Fri Aug 04 15:45:42 2017 -0700 >> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv >> gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all >> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >> +gc/stress/gclocker/TestGCLockerWithSerial.java 8180311 generic-all >> >> ############################################################################# >> >> >> Changed the associated bug ID from 8185879 to 8180311. >> >> Dan >> >> >> >> On 8/4/17 4:34 PM, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> I'm quarantining gc/stress/gclocker/TestGCLockerWithSerial.java as it >>> continues to fail in the JDK10-hs nightly. 
>>> >>> 8185879 quarantine gc/stress/gclocker/TestGCLockerWithSerial.java >>> https://bugs.openjdk.java.net/browse/JDK-8185879 >>> >>> $ hg diff test/ProblemList.txt >>> diff -r 2cbcc2fdc073 test/ProblemList.txt >>> --- a/test/ProblemList.txt Fri Aug 04 12:24:33 2017 -0700 >>> +++ b/test/ProblemList.txt Fri Aug 04 15:31:57 2017 -0700 >>> @@ -123,6 +123,7 @@ gc/survivorAlignment/TestPromotionToSurv >>> gc/survivorAlignment/TestPromotionToSurvivor.java 8129886 generic-all >>> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >>> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >>> +gc/stress/gclocker/TestGCLockerWithSerial.java 8185879 generic-all >>> >>> ############################################################################# >>> >>> >>> Dan >> > From jiangli.zhou at Oracle.COM Sat Aug 5 05:19:28 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Fri, 4 Aug 2017 22:19:28 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> Message-ID: Hi Ioi, Thanks for looking again. > On Aug 4, 2017, at 2:22 PM, Ioi Lam wrote: > > Hi Jiangli, > > The code looks good in general. 
I just have a few pet peeves for readability:
>
>
> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts
>
> 704   assert(UseG1GC, "Only support G1 GC");
> 705   assert(UseCompressedOops && UseCompressedClassPointers,
> 706          "Only support UseCompressedOops and UseCompressedClassPointers enabled");
>
> 1615   assert(UseG1GC, "Only support G1 GC");
> 1616   assert(UseCompressedOops && UseCompressedClassPointers,
> 1617          "Only support UseCompressedOops and UseCompressedClassPointers enabled");
>
> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync?

There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem worth adding another separate API for asserting the required flags. I'll use that in the assert.

>
>
>
> (2) FileMapInfo::write_archive_heap_regions()
>
> I still find this code very hard to read, especially due to the loop.
>
> First, the comments are not consistent with the code:
>
> 498   assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum");
>
> but the comments say: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong.

The max_num_regions is the maximum number of regions for each archived heap space (the string space, or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin's pending change. As part of Calvin's change, he will change the assert into a check and bail out if the number of MemRegions is larger than max_num_regions due to heap fragmentation.

>
> The word "region" is used in these parameters, but they don't mean the same thing.
>
> GrowableArray<MemRegion> *regions
> int first_region, int max_num_regions,
>
>
> How about regions -> g1_regions_list
> first_region -> first_region_in_archive

The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. Probably that's the source of the confusion. I'll make the comment clearer.

Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing 'regions' to 'mem_regions' or 'archive_regions'?

>
>
> In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive"
>
> To make it unambiguous, how about changing
>
>
> 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions.
>
>
> to
>
> // Write the given list of G1 memory regions into the archive, starting at
> // first_region_in_archive.

Ok. How about the following:

// Write the given list of java heap memory regions into the archive, starting at
// first_region_in_archive.

>
>
> Also, for the explanation of how the G1 regions are written into the archive, how about:
>
> // The G1 regions in the list are sorted in ascending address order. When there are more objects
> // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the
> // remaining G1 region(s) are consecutively allocated and fully filled.
> //
> // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive.
> // The remaining G1 regions (if exist) are coalesced and written as a single block
> // into (first_region_in_archive + 1)
>
> // Here's the mapping from (g1 regions) -> (archive regions).
> > > All this function needs to do is to decide the values for > > r0_start, r0_top > r1_start, r1_top > > I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway). > > *r0_start = *r0_top = NULL; > *r1_start = *r1_top = NULL; > > if (arr_len >= 1) { > *r0_start = regions->at(0).start(); > *r0_end = *r0_start + regions->at(0).byte_size(); > } > if (arr_len >= 2) { > int last = arr_len - 1; > *r1_start = regions->at(1).start(); > *r1_end = regions->at(last).start() + regions->at(last).byte_size(); > } > > what do you think? We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. I?d like to remove the code that deals with r0_* and r1_ explicitly. Let me try that. > > > > (3) metaspace.cpp > > 3350 // Map the archived heap regions after compressed pointers > 3351 // because it relies on compressed class pointers setting to work > > do you mean this? > > // Archived heap regions depend on the parameters of compressed class pointers, so > // they must be mapped after such parameters have been decided in the above call. Hmmm, maybe use ?arguments? instead of ?parameters?? > > > (4) I found this name not strictly grammatical. How about this: > > allow_archive_heap_object -> is_heap_object_archiving_allowed Ok. > > (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb. > > archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string() Ok. Thanks, Jiangli > > > Thanks > - Ioi > > > On 8/3/17 5:15 PM, Jiangli Zhou wrote: >> Here are the updated webrevs. 
>> >> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >> >> Changes in the updated webrevs include: >> Merge with Ioi?s recent shared space auto-sizing change (8072061) >> Addressed all feedbacks from Ioi and Coleen (Thanks for detailed review!) >> >> Thanks, >> Jiangli >> >> >>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote: >>> >>> Hi Ioi, >>> >>> Thank you so much for reviewing this. I?ve addressed all your feedbacks. Please see details below. I?ll updated the webrev after addressing Coleen?s comments. >>> >>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>> >>>> Hi Jiangli, >>>> >>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) >>>> >>>> stringTable.cpp: StringTable::archive_string >>>> >>>> add assert for DumpSharedSpaces only >>> >>> Ok. >>> >>>> >>>> filemap.cpp >>>> >>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions, >>>> 526 int first_region, int num_regions) { >>>> >>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: >>>> >>>> 537 int len = regions->length(); >>>> 538 if (len > 1) { >>>> 539 start = (char*)regions->at(1).start(); >>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>> 541 } >>>> 542 } >>>> >>>> The rest of filemap.cpp always perform identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. >>>> >>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion. 
>>>> >>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) { >>>> if (first == MetaspaceShared::first_string) { >>>> assert(num_regons <= MetaspaceShared::max_strings, "..."); >>>> } else { >>>> assert(first == MetaspaceShared::first_open_archive_heap_region, "..."); >>>> assert(num_regons <= MetaspaceShared::max_open_archive_heap_region, "..."); >>>> } >>>> .... >>>> >>>> >>> >>> I?ve reworked the function and simplified the code. >>> >>>> >>>> 756 if (!string_data_mapped) { >>>> 757 StringTable::ignore_shared_strings(true); >>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity"); >>>> 759 } >>>> 760 >>>> 761 if (open_archive_heap_data_mapped) { >>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>> 763 } else { >>>> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity"); >>>> 765 } >>>> >>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()? >>> >>> Fixed. >>> >>>> >>>> FileMapInfo::map_heap_data() -- >>>> >>>> 818 char* addr = (char*)regions[i].start(); >>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>> 821 si->_allow_exec); >>>> >>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base). If so, we need an assert here. >>> >>> If any of the region fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If second region fails to map, all memory ranges specified by ?regions? array are deallocated. We don?t unmap the memory here since it is part of the java heap. Unmapping of heap memory are handled by GC code. 
>>> The 'if' check below makes sure base == addr.
>>>
>>> if (base == NULL || base != addr) {
>>>   // dealloc the regions from java heap
>>>   dealloc_archive_heap_regions(regions, region_num);
>>>   if (log_is_enabled(Info, cds)) {
>>>     log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap.");
>>>   }
>>>   return false;
>>> }
>>>
>>>>
>>>> constantPool.cpp
>>>>
>>>> Handle refs_handle;
>>>> ...
>>>> refs_handle = Handle(THREAD, (oop)archived);
>>>>
>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles()
>>>>
>>>> I think it's more efficient if you merge these into a single statement
>>>>
>>>> Handle refs_handle(THREAD, (oop)archived);
>>>
>>> Fixed.
>>>
>>>>
>>>> Is this experimental code? Maybe it should be removed?
>>>>
>>>> 664 if (tag_at(index).is_unresolved_klass()) {
>>>> 665 #if 0
>>>> 666 CPSlot entry = cp->slot_at(index);
>>>> 667 Symbol* name = entry.get_symbol();
>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
>>>> 669 if (k != NULL) {
>>>> 670 klass_at_put(index, k);
>>>> 671 }
>>>> 672 #endif
>>>> 673 } else
>>>
>>> Removed.
>>>
>>>>
>>>> cpCache.hpp:
>>>>
>>>> u8 _archived_references
>>>>
>>>> shouldn't this be declared as a narrowOop to avoid the type casts when it's used?
>>>
>>> Ok.
>>>
>>>>
>>>> cpCache.cpp:
>>>>
>>>> add assert so that one of these is used only at dump time and the other only at run time?
>>>>
>>>> 610 oop ConstantPoolCache::archived_references() {
>>>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references);
>>>> 612 }
>>>> 613
>>>> 614 void ConstantPoolCache::set_archived_references(oop o) {
>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o);
>>>> 616 }
>>>
>>> Ok.
>>>
>>> Thanks!
>>>
>>> Jiangli
>>>
>>>>
>>>> Thanks!
>>>> - Ioi
>>>>
>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote:
>>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>>>>>
>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>>>>>
>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself.
>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
>>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>>>>>
>>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants.
>>>>>
>>>>> Types of Pinned G1 Heap Regions
>>>>>
>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>>>>>
>>>>> 00100 0 [ 8] Pinned Mask
>>>>> 01000 0 [16] Old Mask
>>>>> 10000 0 [32] Archive Mask
>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask
>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>>>>>
>>>>> Pinned Regions
>>>>>
>>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects.
>>>>> Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>>>>>
>>>>> Archive Regions
>>>>>
>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>>>>>
>>>>> An archive region is also an old region by design.
>>>>>
>>>>> Open Archive (GC-RW) Regions
>>>>>
>>>>> An open archive region is GC writable. GC scans and marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
>>>>> An open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>>>>>
>>>>> Adjustable Outgoing Pointers
>>>>>
>>>>> As GC can adjust pointers within the live objects in an open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly.
>>>>>
>>>>> Closed Archive (GC-RO) Regions
>>>>>
>>>>> A closed archive region is a GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region.
>>>>> In JDK 9 we support archive Strings with the archive regions.
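As an illustration of the region type table quoted above, here is a minimal standalone sketch of how the tags compose. The numeric values come from the table; the enum and predicate names are assumptions made for this example and are not taken from the webrev.

```cpp
#include <cassert>

// Illustrative model only: the values mirror the quoted table, but the
// enum and function names are hypothetical, not the webrev's code.
enum RegionTagSketch {
  PinnedMask       = 8,                                    // 00100 0
  OldMask          = 16,                                   // 01000 0
  ArchiveMask      = 32,                                   // 10000 0
  OpenArchiveTag   = ArchiveMask | PinnedMask | OldMask,   // 11100 0 = 56
  ClosedArchiveTag = OpenArchiveTag + 1                    // 11100 1 = 57
};

// Mask tests: an archive tag carries the pinned and old bits as well.
inline bool is_pinned(int tag)         { return (tag & PinnedMask) != 0; }
inline bool is_old(int tag)            { return (tag & OldMask) != 0; }
inline bool is_archive(int tag)        { return (tag & ArchiveMask) != 0; }

// The two archive flavours are told apart by the exact tag value.
inline bool is_open_archive(int tag)   { return tag == OpenArchiveTag; }
inline bool is_closed_archive(int tag) { return tag == ClosedArchiveTag; }

// Small self-check showing the intended relationships.
inline void check_region_tags() {
  assert(is_pinned(OpenArchiveTag) && is_old(OpenArchiveTag));
  assert(is_pinned(ClosedArchiveTag) && is_old(ClosedArchiveTag));
  assert(is_open_archive(OpenArchiveTag) && !is_open_archive(ClosedArchiveTag));
}
```

In this sketch both archive tags satisfy the pinned, old and archive mask tests, which matches the statements that the archive types are sub-types of 'pinned' and that an archive region is also an old region.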
>>>>>
>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page.
>>>>>
>>>>> Dormant Objects
>>>>>
>>>>> Dormant objects are unreachable java objects within the open archive heap region.
>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time of GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_references' array (A) in the open archive region is not reachable. Then A is a dormant object.
>>>>>
>>>>> Object State Transition
>>>>>
>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object.
>>>>>
>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching.
>>>>>
>>>>> Caching Java Objects at Archive Dump Time
>>>>>
>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
>>>>>
>>>>> Caching Constant Pool resolved_references Array
>>>>>
>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime.
>>>>>
>>>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated.
>>>>> At runtime, as part of the ConstantPool::restore_unshareable_info() work, the JVM calls G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles.
>>>>>
>>>>> Runtime Java Heap With Cached Java Objects
>>>>>
>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
>>>>>
>>>>> Preliminary test execution and status:
>>>>>
>>>>> JPRT: passed
>>>>> Tier2-rt: passed
>>>>> Tier2-gc: passed
>>>>> Tier2-comp: passed
>>>>> Tier3-rt: passed
>>>>> Tier3-gc: passed
>>>>> Tier3-comp: passed
>>>>> Tier4-rt: passed
>>>>> Tier4-gc: passed
>>>>> Tier4-comp: 6 jobs timed out, all other tests passed
>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed
>>>>> Tier5-gc: passed
>>>>> Tier5-comp: running
>>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>>>> vm.gc: passed
>>>>> vm.gc in CDS mode: passed
>>>>> Kitchensink: passed
>>>>> Kitchensink in CDS mode: passed
>>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>
>>>
>>
>
From jiangli.zhou at oracle.com Sat Aug 5 05:24:16 2017
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Fri, 4 Aug 2017 22:24:16 -0700
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <1cb21dcb-eb60-aefa-a35f-73c43fdaae72@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com>
 <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1cb21dcb-eb60-aefa-a35f-73c43fdaae72@oracle.com>
Message-ID: <999638DB-FEB8-456B-9EFC-6399626C1713@oracle.com>

After discussing with Coleen in separate emails (thanks Coleen!), we think the NoSafepointVerifier guard should be added in VM_PopulateDumpSharedSpace::dump_java_heap_objects(). I'm rerunning tests.

Thanks,
Jiangli

> On Aug 4, 2017, at 10:47 AM, coleen.phillimore at oracle.com wrote:
>
> Hi, I think this change looks good except in metaspaceShared.cpp:
>
> 1632 NoSafepointVerifier nsv;
>
> Belongs above when you create the archive cache.
>
> 1600 MetaspaceShared::create_archive_object_cache();
>
> The call:
>
> 1646 int hash = obj->identity_hash();
>
> Can safepoint though. Not sure why NSV didn't have an assert. But maybe it won't if you've eagerly installed the hashcode in these objects, which I think you have. Otherwise you could use NoGCVerifier. Or GCLocker, which the GC group won't like.
>
> Thanks,
> Coleen
>
> On 8/3/17 8:15 PM, Jiangli Zhou wrote:
>> Here are the updated webrevs.
>>
>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/
>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/
>>
>> Changes in the updated webrevs include:
>> Merge with Ioi's recent shared space auto-sizing change (8072061)
>> Addressed all feedback from Ioi and Coleen (Thanks for detailed review!)
>>
>> Thanks,
>> Jiangli
>
From kim.barrett at oracle.com Sun Aug 6 23:32:38 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 6 Aug 2017 19:32:38 -0400
Subject: RFR (M): 8184334: Generalizing Atomic with templates
In-Reply-To: <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
Message-ID:

Here is a start at
addressing the various comments. I still owe email responses to some comments, but wanted to get this out first. http://cr.openjdk.java.net/~kbarrett/cmpxchg_template_20170806/webrev/ So far this only deals with cmpxchg, and then only for linux-x86/64. It builds and passes basic tests. I want to get feedback on this before spending time on other platforms, other operations, and more extensive testing. The metaprogramming support layer has been substantially reduced and simplified. IntegerTypes is still there, but is more focused, with a very small public API. It also badly needs a new name, but I'll worry about that later. Floating point types are still there; they can be dropped easily enough (and reinstated if later needed). I've kept them for now because they are easy (which is itself somewhat interesting), and because I don't recall whether later GC interface work needed it. I've included a mechanism for dealing with thin wrappers over primitive types. The driver for this is oop, which is normally just a typedef for oopDesc*, but is a class with an oopDesc* member when CHECK_UNHANDLED_OOPS is defined (such as during fastdebug builds). We (Erik and I) knew we would need something there, if only for the case of oop, but hadn't agreed on a solution. Well, I'm proposing one here. I've included a mechanism for dealing with enums. The previously proposed change didn't handle them, as recognizing enums requires a lot more metaprogramming, since we don't have C++11 std::is_enum. And as mentioned earlier, I now think we need to try harder in this area. The approach being taken is the registration mechanism I mentioned in earlier email. The reason for including this is to allow filtering and dispatching involving enum types, which are *not* integral types, even though conversions are supported. New files for metaprogramming: metaprogramming/integerTypes.hpp metaprogramming/registeredEnum.hpp Atomic::cmpxchg is still changed to be function template. 
And it still has three different template parameters. But there are more constraints on the parameters, encoded in the specializations for CmpxchgImpl. (Note that CmpxchgImpl isn't necessary; we could accomplish the same dispatch via SFINAEed specializations for cmpxchg. I just prefer the class vs the (IMO) syntactically horrid syntax for SFINAE of function templates. C++11 improves that a lot. Erik prefers SFINAE of function templates rather than introducing a helper class template like CmpxchgImpl.) That front-end uses private Atomic::PlatformCmpxchg to perform . Specializations are function objects with a template operator() with signature T(T, T volatile*, T, order). Only Linux x86/64 updated so far. That makes use of the existing inline assembly, but wrapped in templates rather than functions with fixed parameter types. Replacing the assembly code with calls to gcc's __sync_compare_and_swap would be syntactically trivial (and indeed builds without any problems). I've also added bool Atomic::conditional_store_ptr(T, D volatile*), for the idiom of storing a value if the old value is NULL. It turns out there are about 25 occurrences of this idiom in Hotspot, so a utility for it seems warranted. The current implementation is just a straightforward wrapper around cmpxchg, which means it can't take advantage of gcc's __sync_bool_compare_and_swap. That can be dealt with later if desired. I also had to modify a few uses of cmpxchg to get this to compile. These are presumably the same ones that Andrew encountered. Changed files for Atomic: runtime/atomic.hpp os_cpu/linux_x86/vm/atomic_linux_x86.hpp Changed files for uses, so atomic changes compile: aot/aotCodeHeap.cpp aot/aotCodeHeap.hpp gc/parallel/psParallelCompact.hpp gc/shared/workgroup.cpp runtime/os.cpp [Plus one additional closed change.] I then replaced a few uses of cmpxchg_ptr with cmpxchg, taking advantage of the new API. This eliminated a number of casts.
There are still about 110-120 uses of cmpxchg_ptr remaining. Changed files for cmpxchg_ptr removal: oops/oop.inline.hpp -- demonstrates oop translation utilities/bitMap.cpp utilities/bitMap.inline.hpp I'm looking for feedback on this before I try to carry it any further. From david.holmes at oracle.com Mon Aug 7 01:18:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Aug 2017 11:18:46 +1000 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> Message-ID: <2e85182c-8253-25ab-cd94-09a4e8bda2cb@oracle.com> Hi Kim, On 7/08/2017 9:32 AM, Kim Barrett wrote: > Here is a start at addressing the various comments. I still owe email > responses to some comments, but wanted to get this out first. > > http://cr.openjdk.java.net/~kbarrett/cmpxchg_template_20170806/webrev/ > > So far this only deals with cmpxchg, and then only for linux-x86/64. > It builds and passes basic tests. I want to get feedback on this > before spending time on other platforms, other operations, and more > extensive testing. Pardon my ignorance but I can't follow/read/understand the vast majority of this template code. If I was using this API and got a compile-time error I likely would not have a clue how to try and figure out what was going wrong where. I certainly can not maintain or debug this code, nor explain to someone else how it works. 
:( David ----- > The metaprogramming support layer has been substantially reduced and > simplified. IntegerTypes is still there, but is more focused, with a > very small public API. It also badly needs a new name, but I'll worry > about that later. > > Floating point types are still there; they can be dropped easily > enough (and reinstated if later needed). I've kept them for now > because they are easy (which is itself somewhat interesting), and > because I don't recall whether later GC interface work needed it. > > I've included a mechanism for dealing with thin wrappers over > primitive types. The driver for this is oop, which is normally just a > typedef for oopDesc*, but is a class with an oopDesc* member when > CHECK_UNHANDLED_OOPS is defined (such as during fastdebug builds). We > (Erik and I) knew we would need something there, if only for the case > of oop, but hadn't agreed on a solution. Well, I'm proposing one here. > > I've included a mechanism for dealing with enums. The previously > proposed change didn't handle them, as recognizing enums requires a > lot more metaprogramming, since we don't have C++11 std::is_enum. And > as mentioned earlier, I now think we need to try harder in this area. > The approach being taken is the registration mechanism I mentioned in > earlier email. The reason for including this is to allow filtering > and dispatching involving enum types, which are *not* integral types, > even though conversions are supported. > > New files for metaprogramming: > metaprogramming/integerTypes.hpp > metaprogramming/registeredEnum.hpp > > Atomic::cmpxchg is still changed to be function template. And it > still has three different template parameters. But there are more > constraints on the parameters, encoded in the specializations for > CmpxchgImpl. (Note that CmpxchgImpl isn't necessary; we could > accomplish the same dispatch via SFINAEed specializations for > cmpxchg. 
I just prefer the class vs the (IMO) syntactically horrid > syntax for SFINAE of function templates. C++11 improves that a lot. > Erik prefers SFINAE of function templates rather than introducing a > helper class template like CmpxchgImpl.) > > That front-end uses private Atomic::PlatformCmpxchg > to perform . Specializations are function objects > with a template operator() with signature T(T, T volatile*, T, order). > Only Linux x86/64 updated so far. That makes use of the existing > inline assembly, but wrapped in templates rather than functions with > fixed parameter types. Replacing the assembly code with calls to > gcc's __sync_compare_and_swap would be syntactically trivial (and > indeed builds without any problems). > > I've also added bool Atomic::conditional_store_ptr(T, D volatile*), > for the idiom of storing a value if the old value is NULL. It turns > out there are about 25 occurrences of this idiom in Hotspot, so a > utility for it seems warranted. The current implementation is just a > straightforward wrapper around cmpxchg, which means it can't take > advantage of gcc's __sync_bool_compare_and_swap. That can be dealt > with later if desired. > > I also had to modify a few uses of cmpxchg to get this to compile. > These are presumably the same ones that Andrew encountered. > > Changed files for Atomic: > runtime/atomic.hpp > os_cpu/linux_x86/vm/atomic_linux_x86.hpp > > Changed files for uses, so atomic changes compile: > aot/aotCodeHeap.cpp > aot/aotCodeHeap.hpp > gc/parallel/psParallelCompact.hpp > gc/shared/workgroup.cpp > runtime/os.cpp > [Plus one additional closed change.] > > I then replaced a few uses of cmpxchg_ptr with cmpxchg, taking > advantage of the new API. This eliminated a number of casts. There > are still about 110-120 uses of cmpxchg_ptr remaining.
> > Changed files for cmpxchg_ptr removal: > oops/oop.inline.hpp -- demonstrates oop translation > utilities/bitMap.cpp > utilities/bitMap.inline.hpp > > I'm looking for feedback on this before I try to carry it any further. > From kim.barrett at oracle.com Mon Aug 7 04:34:52 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Aug 2017 00:34:52 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> Message-ID: > On Aug 6, 2017, at 7:32 PM, Kim Barrett wrote: > > Here is a start at addressing the various comments. I still owe email > responses to some comments, but wanted to get this out first. > > http://cr.openjdk.java.net/~kbarrett/cmpxchg_template_20170806/webrev/ I made an "improvement" in the handling of translated types today, but it turned out to just make things more complicated. I've backed it out and updated the webrev in-place. This simplification is neither expected nor intended to address David's complaints about how templates are used in this code. 
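
[Editorial sketch, not taken from the webrev.] The "store a value if the old value is NULL" idiom described above for Atomic::conditional_store_ptr can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: it uses C++11 std::atomic as a stand-in for HotSpot's Atomic class (which the patch itself does not use), and the parameter types are simplified to a single pointer type T rather than the separate T and D of the proposed signature.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the conditional-store idiom: publish `value`
// into `*dest` only if `*dest` is currently null. Returns true if this call
// performed the store, false if some other value was already present.
// Assumption: std::atomic replaces HotSpot's Atomic::cmpxchg here.
template <typename T>
bool conditional_store_ptr(T* value, std::atomic<T*>* dest) {
  assert(dest != nullptr);  // the destination slot itself must exist
  T* expected = nullptr;    // the store succeeds only against a null slot
  // compare_exchange_strong writes `value` iff *dest == expected.
  return dest->compare_exchange_strong(expected, value);
}
```

A caller would typically race several threads on the same slot and let the single winner keep its freshly allocated object while the losers discard theirs; that usage pattern is an assumption about how the roughly 25 existing call sites behave, not something stated in the message.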
From goetz.lindenmaier at sap.com Mon Aug 7 08:02:59 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 7 Aug 2017 08:02:59 +0000 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: <5984CC70.80209@oracle.com> References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> Message-ID: <412edbf307a64909a010ce5f83deac5b@sap.com> Hi, webrev with Whitebox: http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ I don't see so much of a difference to throwing an exception, if Whitebox is not properly implemented you get one, anyways: Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) Maybe it's a bit less likely to break, though. I'm fine with this, too. Best regards, Goetz., > -----Original Message----- > From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > Sent: Freitag, 4. August 2017 21:35 > To: Ioi Lam ; Lindenmaier, Goetz > ; Igor Ignatyev > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests > > Hi, > > I have an alternative solution that is IMO rather simple, reliable > and will > solve some issues we discussed (e.g. no need to throw exceptions, no > need to handle failure to map an archive). > The proposed solution uses White Box test API to determine whether VM > is compiled with INCLUDE_CDS on or off. > I implemented and tested it today, it works for me. > > The patch is attached. Please let me know what you think. 
> > Thank you, > Mikhailo > > On 8/3/17, 11:39 PM, Ioi Lam wrote: > > Hi Goetz, > > > > Instead of testing -Xshare:on, I think you should test with > > -Xshare:auto, which sets the flags > > > > UseSharedSpaces = true > > RequireSharedSpaces = false > > > > and will reliably print "Shared spaces are not supported in this VM" > > if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): > > > > > > #if !INCLUDE_CDS > > if (DumpSharedSpaces || RequireSharedSpaces) { > > jio_fprintf(defaultStream::error_stream(), > > "Shared spaces are not supported in this VM\n"); > > return JNI_ERR; > > } > > if ((UseSharedSpaces && FLAG_IS_CMDLINE(UseSharedSpaces)) || > > log_is_enabled(Info, cds)) { > > warning("Shared spaces are not supported in this VM"); > > FLAG_SET_DEFAULT(UseSharedSpaces, false); > > LogConfiguration::configure_stdout(LogLevel::Off, true, > > LOG_TAGS(cds)); > > } > > no_shared_spaces("CDS Disabled"); > > #endif // INCLUDE_CDS > > > > > > That way, you don't need to test any other output message or exit > > conditions(such as mapping error). 
> > > > > > E.g.: > > > > ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -Xshare:auto > > -version > > java version "10-internal" > > Java(TM) SE Runtime Environment (build > > 10-internal+0-2017-08-04-0614567.iklam.iter) > > Java HotSpot(TM) 64-Bit Server VM (build > > 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > > > > > > ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java > > -XXaltjvm=minimal -Xshare:auto -version > > Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not > > supported in this VM > > java version "10-internal" > > Java(TM) SE Runtime Environment (build > > 10-internal+0-2017-08-04-0614567.iklam.iter) > > Java HotSpot(TM) 64-Bit Minimal VM (build > > 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > > > > > > > > Thanks > > - Ioi > > > > On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: > >> Hi Mikhailo, > >> > >> I put in your version of vmCDS() into this new webrev. > >> I also had to update the list of tests marked in hotspot, > >> as tests were removed and added in between, and resolved > >> it against the aot change: > >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ > >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ > >> > >> I don't think it's a good idea to swallow the exception silently > >> as you propose. > >> In our test setup, the tests would just be switched off if something > >> breaks, and no one will see that. If they fail though, it's an easy > >> and quick fix. I would at least switch them on, then one sees the > >> failing tests in case switching them on was the wrong guess. > >> Also, below, the method dump() throws an exception. 
> >> > >> Best regards, > >> Goetz > >> > >>> -----Original Message----- > >>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] > >>> Sent: Tuesday, August 01, 2017 11:49 PM > >>> To: Lindenmaier, Goetz > >>> Cc: hotspot-runtime-dev at openjdk.java.net > >>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable > >>> cds tests > >>> > >>> Hi Goetz, > >>> > >>> I have reviewed your updated changes, and they overall look good to > me. > >>> > >>> However, I have some comments + concerns regarding > VMProps.vmCDS(): > >>> > >>> > >>> 1. Throwing exceptions from within the vmCDS() method. > >>> > >>> The VMProps properties are evaluated at the start of each > >>> run. If > >>> the exception is thrown here the whole test run will fail (not just the > >>> test that uses '@requires vm.cds'). The failure will occur shortly > >>> after > >>> the start of jtreg test run with a message: > >>> "java.lang.RuntimeException: Can not start VM to test to > >>> find out it's features. Switching off class data sharing (CDS)." > >>> > >>> Your method has 2 throw statements: "new RuntimeException("Can > >>> not > >>> start VM..." and "java.lang.RuntimeException: Can not start VM to test > >>> to...". I would recommend a more graceful way to fail, e.g. to print > >>> the > >>> message and to return "false" instead. This way the rest of the test > >>> run > >>> will continue, but the tests requiring vm.cds will be skipped with > >>> qualification of "not selected". > >>> > >>> 2. The check for "An error has occurred while processing the shared > >>> archive file." assumes that archive was not created prior to the > >>> execution of this evaluation code. However, there are test modes > where > >>> archive is created prior to test run. We use such mode on regular > >>> basis. > >>> In such cases the code will fail. 
> >>> I recommend to run "-Xshare:on -version", and check the > >>> following match that would result in return of "true": > >>> "Java HotSpot.*sharing" > >>> > >>> 3. On occasion the mapping of shared archive region to a specified > >>> address will fail (due to system configuration, space already occupied, > >>> ASLR, etc.) > >>> > >>> Hence I recommend checking for such conditions as well: > >>> > >>> if (output.firstMatch("Unable to map") != null) { > >>> System.out.println("VMProps.vmCDS() encountered an > >>> archive > >>> mapping failure, still proceeding with vm.cds=true"); > >>> return "true"; > >>> } > >>> I am returning true here because seeing this output means that > >>> CDS > >>> feature is supported, however in this particular instance archive > >>> failed > >>> to map. > >>> > >>> > >>> The rest of the changes looks good to me. > >>> > >>> See for my version of VMProps.vmCDS() below. Let me know what you > >>> think. > >>> > >>> > >>> Thank you, > >>> > >>> Mikhailo > >>> > >>> > >>> ================== my update of VMProps.vmCDS() > >>> > >>> protected String vmCDS() { > >>> System.setProperty("test.jdk", > >>> System.getProperty("java.home")); > >>> ProcessBuilder pb = > >>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); > >>> OutputAnalyzer output; > >>> > >>> try { > >>> output = new OutputAnalyzer(pb.start()); > >>> } catch (IOException e) { > >>> System.err.println( "Can not start VM to test to find out > >>> it's features. " + > >>> "Switching off class data > >>> sharing (CDS)." 
+ e); > >>> return "false"; > >>> } > >>> if (output.firstMatch("Shared spaces are not supported in > >>> this > >>> VM") != null) { > >>> return "false"; > >>> } > >>> if (output.firstMatch("An error has occurred while processing > >>> the shared archive file.") != null) { > >>> return "true"; > >>> } > >>> if (output.firstMatch("Java HotSpot.*sharing") != null) { > >>> return "true"; > >>> } > >>> if (output.firstMatch("Unable to map") != null) { > >>> System.out.println("VMProps.vmCDS() encountered an > >>> archive > >>> mapping failure, still proceeding with vm.cds=true"); > >>> return "true"; > >>> } > >>> > >>> return "false"; > >>> } > >>> ================== > >>> > >>> > >>> > >>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> I made new webrevs implementing the change with @requires: > >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ > >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- > hs/ > >>>> > >>>> I also changed the bug description and synopsis. > >>>> > >>>> For the jtreg runner I would propose to set the property test.jdk > >>>> so that it is available in VMProps. Igor also ran into this issue. > >>>> > >>>> Best regards, > >>>> Goetz. > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >>>>> Sent: Montag, 31. Juli 2017 22:19 > >>>>> To: Lindenmaier, Goetz > >>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds > >>> tests > >>>>> Hi Goetz, > >>>>> > >>>>> I have an idea on how to address your second use case. > >>>>> The idea is to define a special test property (e.g. > >>>>> test.cds.disable.cds.support) which will override logic inside the > >>>>> VMProps.vmCDSSupported(). 
If this property is defined to "true" in > >>>>> test > >>>>> invocation command then vmCDSSupported() returns false (CDS is > >>> disabled, > >>>>> not supported), and all tests marked with "@requires > >>>>> vm.cds.supported" > >>>>> will be skipped. > >>>>> > >>>>> How to use it: > >>>>> jtreg -Dtest.cds.disable.cds.support=true > >>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true > >>>>> > hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java > >>>>> > >>>>> I prototyped this approach, it works for me. I have attached the > >>>>> diff. > >>>>> Let me know whether this works for your use case, or if you have any > >>>>> questions. > >>>>> > >>>>> > >>>>> Thank you, > >>>>> Mikhailo > >>>>> > >>>>> > >>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: > >>>>>> Hi Mikhailo, > >>>>>> > >>>>>> Basically I'm fine with using the @requires property. > >>>>>> But is there a way to overrule the outcome of the method > >>>>>> implemented In VMProps.java computing the property? > >>>>>> I have two use cases for the key I want to introduce. > >>>>>> > >>>>>> First, our internal VM (we are Oracle licensees) is compiled without > >>>>>> CDS support. Thus we don't want to run the CDS tests. Currently > >>>>>> we have them all listed in the ProblemList, but that's not nice, > >>>>>> especially > >>>>>> because we have to adapt it whenever a new test is added. > >>>>>> As I understand, the @requires property works fine, here. > >>>>>> > >>>>>> Second, we also test the two ports we contributed (ppc and s390). > >>>>>> These > >>>>> contain > >>>>>> rudimentary cds support and so far passed all tests. > >>>>>> Unfortunately it > >>> broke > >>>>>> lately in jdk10. Instead of fixing it (our people are working on > >>>>>> finishing > >>> our > >>>>>> internal Java 9 port) I would like to switch off all cds tests. > >>>>>> As I can set the key on the command line of jtreg, I easily can > >>>>>> do that. 
> >>>>>> Is there a way to do similar with the @requires property? > >>>>>> > >>>>>> Best regards, > >>>>>> Goetz. > >>>>>> > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >>>>>>> Sent: Freitag, 28. Juli 2017 23:53 > >>>>>>> To: Lindenmaier, Goetz > >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to > >>>>>>> disable cds > >>>>> tests > >>>>>>> Hi Goetz, > >>>>>>> > >>>>>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your > >>> proposed > >>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the > >>> following > >>>>>>> feedback on this change. > >>>>>>> > >>>>>>> 1. As part of streamlining and simplifying SQE process and the > >>>>>>> use of > >>>>>>> test tools we have narrowed down the test selection mechanisms. > >>>>>>> > >>>>>>> 2. Our preferred test selection mechanism is use of "@requires" > >>>>>>> and a > >>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though > >>> JTREG > >>>>>>> supports use of "@key", we prefer the use of "@requires" as a > >>>>>>> first > >>>>> choice. > >>>>>>> 3. If it is not possible to use "@requires" for a given > >>>>>>> situation then > >>>>>>> use "@key" mechanism. We would ask you if you could explore the > >>>>>>> possibility of implementing this change via @requires first. > >>>>>>> > >>>>>>> > >>>>>>> Here are several hints that may help: > >>>>>>> > >>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The > >>>>>>> value > >>>>>>> of a given "requires property" is evaluated inside this file and > >>>>>>> placed > >>>>>>> into a map (see public call() method). Add your evaluation code > >>>>>>> here, > >>>>>>> and then follow the pattern used for other properties. Create a > >>> property > >>>>>>> (e.g. vm.cds.supported, with values of true/false). 
Create a > method > >>> that > >>>>>>> evaluates the property value (e.g. isCDSSupported() or similar). > >>>>>>> > >>>>>>> 2. The method could use several options to evaluate whether CDS > is > >>>>>>> supported. > >>>>>>> A. WhiteBox API. Create a new WB test API method which can > >>> return > >>>>>>> true if CDS_ compiler flag is defined, otherwise false. > >>>>>>> Call WB API from VMProps.java. See > >>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create > your > >>>>> own > >>>>>>> WB.isCDSSupported() > >>>>>>> WhiteBox.java resides in > >>>>>>> test/lib/sun/hotspot/WhiteBox.java > >>>>>>> > >>>>>>> B. Another option is to evaluate by running VM with > >>>>>>> sharing on and > >>>>>>> checking the return (may be not as reliable as option A) > >>>>>>> C. Other ideas welcome. > >>>>>>> > >>>>>>> 3. Include "@requires vm.cds.supported == true" to the > appropriate > >>> tests. > >>>>>>> Let me know if you have any questions. > >>>>>>> > >>>>>>> > >>>>>>> Best regards, > >>>>>>> Mikhailo > >>>>>>> > >>>>>>> > >>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: > >>>>>>>> Hi > >>>>>>>> > >>>>>>>> we compile the VM without CDS support. Thus the CDS tests > >>>>>>>> fail. This change introduces a keyword 'cds' and marks > >>>>>>>> the tests accordingly. > >>>>>>>> This change also fixes the keywords specified in > >>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. > >>>>>>>> There may only be one @key keyword in the test specification. > >>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only > one > >>>>> test > >>>>>>>> case required CDS. I changed this sub case to succeed if CDS is > >>>>>>>> not > >>>>>>>> available. > >>>>>>>> > >>>>>>>> Please review this change. I please need a sponsor. > >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- > cdsKey/webrev.01/ > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Goetz.
> > From thomas.schatzl at oracle.com Mon Aug 7 09:42:34 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Aug 2017 11:42:34 +0200 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> <1501787150.2411.71.camel@oracle.com> <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> Message-ID: <1502098954.2950.0.camel@oracle.com> Hi, On Fri, 2017-08-04 at 19:20 -0400, Kim Barrett wrote: > > > > > > > > [...] > > > > However, please put the closing brackets of these into extra > > > > lines > > > > (quicksort.hpp:76,77) to avoid the casual reader to overlook > > > > them. > > > Sorry, but that just looks horrible.  As a casual reader, I > > > wouldn't even look for them, since if they aren't there then the > > > code is badly mis-indented. > > Actually I was already at writing about an issue with indentation > > when I noticed the brackets :) > > > > Not insisting on changing this. > I added a couple of asserts to check for buffer overruns due to a bad > comparator: > > diff -r 6b62ed03a6a6 -r 7616ceb92653 > src/share/vm/utilities/quickSort.hpp > --- a/src/share/vm/utilities/quickSort.hpp Fri Aug 04 18:02:51 > 2017 -0400 > +++ b/src/share/vm/utilities/quickSort.hpp Fri Aug 04 19:17:36 > 2017 -0400 > @@ -73,8 +73,12 @@ >      T pivot_val = array[pivot]; >
>      for ( ; true; ++left_index, --right_index) { > -      for ( ; comparator(array[left_index], pivot_val) < 0; > ++left_index) {} > -      for ( ; comparator(array[right_index], pivot_val) > 0; -- > right_index) {} > +      for ( ; comparator(array[left_index], pivot_val) < 0; > ++left_index) { > +        assert(left_index < length, "reached end of partition"); > +      } > +      for ( ; comparator(array[right_index], pivot_val) > 0; -- > right_index) { > +        assert(right_index > 0, "reached start of partition"); > +      } > >        if (left_index < right_index) { >          if (!idempotent || comparator(array[left_index], > array[right_index]) != 0) { > looks good. Thanks a lot. Thomas From thomas.schatzl at oracle.com Mon Aug 7 09:55:05 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Aug 2017 11:55:05 +0200 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> <751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <1501762304.2411.8.camel@oracle.com> Message-ID: <1502099705.2950.2.camel@oracle.com> Hi, On Thu, 2017-08-03 at 13:44 -0400, Kim Barrett wrote: > > > > On Aug 3, 2017, at 8:11 AM, Thomas Schatzl > wrote: > > > > On Wed, 2017-08-02 at 19:31 -0400, Kim Barrett wrote: > > > > > > > > > > > On Aug 2, 2017, at 7:23 PM, David Holmes > > > > > > > wrote: > > > > > > > > Hi Kim, > > > > > > > > On 3/08/2017 8:41 AM, Kim Barrett wrote: > > > > > > > > > > > > > > > Please review this small change to improve the debugging > > > > > experience > > Looks good. > Thanks. > > > > > [...] > > > > // Defer any state checking to ~Monitor > > > > Mutex::~Mutex() { } > > > Well, I was waffling over just removing it completely. > > > > > do it :) I can't see, apart from it being a bug (to be fixed in > > another CR?), why the constructor of Monitor is not virtual. > It's gone now. > still looks good.
Thomas From harold.seigel at oracle.com Mon Aug 7 13:13:56 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 7 Aug 2017 09:13:56 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> Message-ID: <9a934403-76d0-b296-269a-7f40b3f81208@oracle.com> Hi David, Thanks for your comments! Please review this updated webrev. It contains the change that you suggested. It also simplifies the implementation by statically allocating the fixup lists before their first use. http://cr.openjdk.java.net/~hseigel/bug_8185103.2/webrev/index.html Thanks, Harold On 8/3/2017 7:03 PM, David Holmes wrote: > Hi Harold, > > On 4/08/2017 7:24 AM, David Holmes wrote: >> Hi Harold, >> >> On 3/08/2017 11:03 PM, harold seigel wrote: >>> Hi, >>> >>> Please review this JDK-10 fix for JDK-8185103. The problem occurred >>> because classes were being put on the fixup_module_field_list before >>> their mirror field was set. If a (different) thread called method >>> patch_javabase_entries() before the class's mirror field was set then >> >> The code that calls patch_javabase_entries has this: >> >> // Only the thread that actually defined the base module will get >> here, >> // so no locking is needed. >> >> // Patch any previously loaded class's module field with >> java.base's java.lang.Module. >> ModuleEntryTable::patch_javabase_entries(module_handle); >> >> so it seems that comment is wrong and that locking is indeed needed >> somewhere! At a minimum your setting of the mirror needs a following >> storestore barrier, or (better) the set/get of the mirror uses >> load-acquire/store-release. > > Sorry - looking in more detail the necessary locking is already in > place. A class is only added to the fixup list, under the Module_lock, > if the base module is not yet defined. 
The finalization of that > definition also occurs under the Module_lock, which in turn occurs > before the fixup list is processed (without the lock). So as long as > the mirror is set before the class is added to the fixup list, the > mirror will be visible to the main thread when it processes it. > > Looking at the original code: > > 881 // set the module field in the java_lang_Class instance > 882 set_mirror_module_field(k, mirror, module, THREAD); > 883 > 884 // Setup indirection from klass->mirror > 885 // after any exceptions can happen during allocations. > 886 k->set_java_mirror(mirror()); > > it would seem simplest to just reorder the two actions - except for > that comment about exceptions. Is the allocation exception issue less > of an issue when doing VM initialization? What will happen? > > Thanks, > David > >> Thanks, >> David >> ----- >> >>> this would cause a SIGSEGV because patch_javabase_entries() >>> eventually calls obj_field_put() which tries to dereference the >>> class's mirror field. >>> >>> This change fixes the problem by setting the class's mirror field >>> before putting the class on the fixup_module_field_list. >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >>> >>> The fix was tested with the JCK Lang and VM tests, the JTreg >>> hotspot, java/io, java/lang, java/util and other tests, the >>> co-located NSK tests, and with JPRT. >>> >>> Additionally, the fix was tested by temporarily adding a >>> naked_short_sleep(50) to method initialize_mirror_fields() shortly >>> after it put a class on the fixup_module_field_list. The sleep was >>> added in order to enhance the likelihood of patch_javabase_entries() >>> being called before the class's mirror field got set. Without the >>> fix, the TestThreadDumpMonitorContent.java test and the test >>> reported in JDK-8183309 >>> reliably got the >>> reported SIGSEGVs. 
With the fix, the tests passed. >>> >>> Thanks, Harold >>> From harold.seigel at oracle.com Mon Aug 7 13:26:55 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 7 Aug 2017 09:26:55 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject Message-ID: Hi, Please review this JDK-10 change to have ModuleEntry::module() return an oop instead of a jobject. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185717 The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util and other tests, the co-located NSK tests, JPRT, and with the RBT tier2 - tier5 tests on Linux x64. Thanks, Harold From shade at redhat.com Mon Aug 7 13:34:04 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 7 Aug 2017 15:34:04 +0200 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: References: Message-ID: On 08/07/2017 03:26 PM, harold seigel wrote: > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html So this just moves JNIHandles::resolve calls into ModuleEntry::module(), and exposes ModuleEntry::module_handle() in case we can do with the original (global?) handle. This looks correct. Thanks, -Aleksey From coleen.phillimore at oracle.com Mon Aug 7 14:49:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 7 Aug 2017 10:49:56 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: References: Message-ID: <91a39159-c6c8-7034-73be-527fe421d09d@oracle.com> Harold, This looks good. Thank you for this cleanup! Coleen On 8/7/17 9:26 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to have ModuleEntry::module() return > an oop instead of a jobject. 
> > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185717 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, JPRT, and with the RBT tier2 - tier5 tests on Linux x64. > > Thanks, Harold > From thomas.schatzl at oracle.com Mon Aug 7 14:52:50 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Aug 2017 16:52:50 +0200 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> Message-ID: <1502117570.2950.49.camel@oracle.com> Hi, On Thu, 2017-08-03 at 17:15 -0700, Jiangli Zhou wrote: > Here are the updated webrevs. > > http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ > http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ > > Changes in the updated webrevs include: > Merge with Ioi's recent shared space auto-sizing change (8072061) > Addressed all feedbacks from Ioi and Coleen (Thanks for detailed > review!) - the comment in g1Allocator.hpp:326 needs to be updated. I would merge the information from the _open member here. I.e. define what "open" and "closed" archive are. - g1Allocator.inline.hpp:66-72: formatting - parameters should be aligned below each other - g1CollectedHeap.cpp:665/6: formatting - parameters should be aligned below each other - g1CollectedHeap.cpp:750 + 756: maybe make a static method to avoid repetition here. - G1NoteEndOfConcMarkClosure::doHeapRegion(): would it be too much work to make an extra CR out of this change?
It is a change that fixes an existing bug unrelated to this change after all (not doing remembered set cleanup work for archive regions). -?g1HeapVerifier.cpp:63: I think this change is obsolete since is_obj_dead_cond() will always return false. (I.e. some leftover of some internal discussion) -?g1HeapVerifier.cpp: there is a verbose flag passed around. Not sure if it should be kept, as it seems to be some code that has been used for debugging this feature, but can't be activated anyway without code changes. -?heapRegion.inline.hpp:120: please fix the comment of the assert. Something like "Closed archive regions should not have references into other regions" - heapRegion.inline.hpp:125: I think the existing code of faking open archive regions as all-live does not work as implemented. Consider the case when a new object in there is made live, and references in there set to refer to some object outside this region, and is the only reference (and it has not been marked live yet): if there is a remembered set entry to that, and it is about to be scanned. The current implementation of HeapRegion::is_obj_dead() will consider it dead, so we will enter the code path at line 125. Block_size_using bitmap() will jump over that object, but the return values of is_obj_dead_with_size() method will indicate the caller to not iterate over this object anyway, potentially missing that reference. HeapRegion::is_obj_dead() needs to return that the object is not dead for open archive regions. I think for now the safest way is to add !is_open_archive() to the condition calculated there. That will obviate the need for that existing hack to the assert too. It may have some perf impact though - actually recently there has been some effort to remove that is_archive() check from that code (the one that is now the is_closed_archive() assert). I do not see an easy way to fix this. :( (i.e. there is likely no perf impact vs. 
jdk9 so it's not that bad) This suggestion also only works with the assumption laid out in the CR that there is no way that a live object can become dormant again, and the objects in the open archive regions are always parsable (never contain junk data). - HeapRegionType::relabel_as_old(): the code in line 175/176 is unreachable and should be removed. I did not look through runtime code too closely. Thanks, Thomas From coleen.phillimore at oracle.com Mon Aug 7 15:06:22 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 7 Aug 2017 11:06:22 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: References: Message-ID: <82c57fa9-4fb9-77d6-55b8-4a9338ad6ce5@oracle.com> On 8/7/17 9:34 AM, Aleksey Shipilev wrote: > On 08/07/2017 03:26 PM, harold seigel wrote: >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html > So this just moves JNIHandles::resolve calls into ModuleEntry::module(), and exposes > ModuleEntry::module_handle() in case we can do with the original (global?) handle. Yes, we can change the representation without having to fix all these places. If you look at ClassLoaderData::_handles, it's not a JNIHandleBlock anymore, so these really aren't JNIHandles or jobject. This was because JNIHandleBlock couldn't be concurrently walked. The global handle and weak handle lists have the same problem. We're (Kim is) currently working on a replacement for this, which needs an RFE. thanks, Coleen > > This looks correct. > > Thanks, > -Aleksey > > From harold.seigel at oracle.com Mon Aug 7 15:07:44 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 7 Aug 2017 11:07:44 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: References: Message-ID: <8572c83f-ec2e-0432-3559-ee57e1270430@oracle.com> Aleksey, Thanks for the review!
Harold On 8/7/2017 9:34 AM, Aleksey Shipilev wrote: > On 08/07/2017 03:26 PM, harold seigel wrote: >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html > So this just moves JNIHandles::resolve calls into ModuleEntry::module(), and exposes > ModuleEntry::module_handle() in case we can do with the original (global?) handle. > > This looks correct. > > Thanks, > -Aleksey > > From harold.seigel at oracle.com Mon Aug 7 15:08:07 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 7 Aug 2017 11:08:07 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: <91a39159-c6c8-7034-73be-527fe421d09d@oracle.com> References: <91a39159-c6c8-7034-73be-527fe421d09d@oracle.com> Message-ID: <6a63f2fa-8e37-4b75-4358-98bfc89b7049@oracle.com> Thanks Coleen! Harold On 8/7/2017 10:49 AM, coleen.phillimore at oracle.com wrote: > > Harold, This looks good. Thank you for this cleanup! > Coleen > > On 8/7/17 9:26 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to have ModuleEntry::module() return >> an oop instead of a jobject. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185717 >> >> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, JPRT, and with the RBT tier2 - tier5 tests on Linux x64. >> >> Thanks, Harold >> > From lois.foltan at oracle.com Mon Aug 7 15:10:08 2017 From: lois.foltan at oracle.com (Lois Foltan) Date: Mon, 7 Aug 2017 11:10:08 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: References: Message-ID: <83aea639-aca8-2498-1757-3013a3ee811b@oracle.com> Looks good. Lois On 8/7/2017 9:26 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to have ModuleEntry::module() return > an oop instead of a jobject. 
> > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185717 > > The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, JPRT, and with the RBT tier2 - tier5 tests on Linux x64. > > Thanks, Harold > From harold.seigel at oracle.com Mon Aug 7 15:10:51 2017 From: harold.seigel at oracle.com (harold seigel) Date: Mon, 7 Aug 2017 11:10:51 -0400 Subject: RFR 8185717: Make ModuleEntry->module() return an oop not a jobject In-Reply-To: <83aea639-aca8-2498-1757-3013a3ee811b@oracle.com> References: <83aea639-aca8-2498-1757-3013a3ee811b@oracle.com> Message-ID: <982c8652-9340-6186-0a12-55efcad902aa@oracle.com> Thanks Lois! Harold On 8/7/2017 11:10 AM, Lois Foltan wrote: > Looks good. > Lois > > On 8/7/2017 9:26 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to have ModuleEntry::module() return >> an oop instead of a jobject. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8185717/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185717 >> >> The fix was tested with the JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, JPRT, and with the RBT tier2 - tier5 tests on Linux x64. 
>> >> Thanks, Harold >> > From aph at redhat.com Mon Aug 7 15:23:46 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 7 Aug 2017 16:23:46 +0100 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> Message-ID: <97a3c455-c3d1-7df4-a20f-90f1d1102e6d@redhat.com> On 07/08/17 00:32, Kim Barrett wrote: > I'm looking for feedback on this before I try to carry it any further. I don't like it because it converts pointers to operand types before calling the back end. For example, in here: intptr_t v = CASPTR(&_LockWord, 0, _LBIT); // agro ... the type of the operand LockWord is SplitWord. But the SplitWord * argument gets converted to void* volatile* when we call this: inline static void* cmpxchg_ptr(void* exchange_value, volatile void* dest, void* compare_value, cmpxchg_memory_order order = memory_order_conservative) { return cmpxchg(exchange_value, reinterpret_cast(dest), compare_value, order); Here's what I first wrote: I don't see the point of such a type conversion. We could call cmpxchg with the actual types of the operands, could we not? Why is cmpxchg_ptr even a thing? We're casting away type information for no reason that I can see. Couldn't cmpxchg_ptr() be defined as a template function in such a way that only the back ends that actually need to cast away the types have to do so? 
That is, if the back ends can define cmpxchg_ptr() themselves without resorting to pointer type conversion, we should let them do so. But rather than sending that message straight away, I tried it. And now I see: the compiler can't get the types right in those cases where we have mismatched operand types in the call. Argh. The only way we can get method resolution to work is to throw away the pointer type information and use void* for everything. At the risk of being boring, I repeat what I said before: IMO this is not what we should be doing in 2017. We should be looking to the future, and get the types to match now, at the call site. Also, and this is a relatively slight objection, I find myself defining template<> struct Atomic::PlatformCmpxchg<1> VALUE_OBJ_CLASS_SPEC { template<typename T> T operator()(T exchange_value, T volatile* dest, T compare_value, cmpxchg_memory_order order) const { return ::cmpxchg(exchange_value, dest, compare_value, order); } }; for 1, 4, and 8. I guess that can't be avoided, and in any case it would be easy enough to do it with preprocessor macros. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From mikhailo.seledtsov at oracle.com Mon Aug 7 16:46:40 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 07 Aug 2017 09:46:40 -0700 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: <412edbf307a64909a010ce5f83deac5b@sap.com> References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> Message-ID: <59889970.1080301@oracle.com> The change looks good to me, Thank you, Mikhailo On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: > Hi, > > webrev with Whitebox: > http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ > http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ > > I don't see so much of a difference to throwing an exception, if > Whitebox is not properly implemented you get one, anyways: > Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z > at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) > Maybe it's a bit less likely to break, though. > > I'm fine with this, too. > > Best regards, > Goetz., > > > >> -----Original Message----- >> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >> Sent: Freitag, 4. August 2017 21:35 >> To: Ioi Lam; Lindenmaier, Goetz >> ; Igor Ignatyev >> Cc: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests >> >> Hi, >> >> I have an alternative solution that is IMO rather simple, reliable >> and will >> solve some issues we discussed (e.g. no need to throw exceptions, no >> need to handle failure to map an archive). 
>> The proposed solution uses White Box test API to determine whether VM >> is compiled with INCLUDE_CDS on or off. >> I implemented and tested it today, it works for me. >> >> The patch is attached. Please let me know what you think. >> >> Thank you, >> Mikhailo >> >> On 8/3/17, 11:39 PM, Ioi Lam wrote: >>> Hi Goetz, >>> >>> Instead of testing -Xshare:on, I think you should test with >>> -Xshare:auto, which sets the flags >>> >>> UseSharedSpaces = true >>> RequireSharedSpaces = false >>> >>> and will reliably print "Shared spaces are not supported in this VM" >>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): >>> >>> >>> #if !INCLUDE_CDS >>> if (DumpSharedSpaces || RequireSharedSpaces) { >>> jio_fprintf(defaultStream::error_stream(), >>> "Shared spaces are not supported in this VM\n"); >>> return JNI_ERR; >>> } >>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || >>> log_is_enabled(Info, cds)) { >>> warning("Shared spaces are not supported in this VM"); >>> FLAG_SET_DEFAULT(UseSharedSpaces, false); >>> LogConfiguration::configure_stdout(LogLevel::Off, true, >>> LOG_TAGS(cds)); >>> } >>> no_shared_spaces("CDS Disabled"); >>> #endif // INCLUDE_CDS >>> >>> >>> That way, you don't need to test any other output message or exit >>> conditions(such as mapping error). 
>>> >>> >>> E.g.: >>> >>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -Xshare:auto >>> -version >>> java version "10-internal" >>> Java(TM) SE Runtime Environment (build >>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>> Java HotSpot(TM) 64-Bit Server VM (build >>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>> >>> >>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>> -XXaltjvm=minimal -Xshare:auto -version >>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not >>> supported in this VM >>> java version "10-internal" >>> Java(TM) SE Runtime Environment (build >>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>> Java HotSpot(TM) 64-Bit Minimal VM (build >>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>> >>> >>> >>> Thanks >>> - Ioi >>> >>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >>>> Hi Mikhailo, >>>> >>>> I put in your version of vmCDS() into this new webrev. >>>> I also had to update the list of tests marked in hotspot, >>>> as tests were removed and added in between, and resolved >>>> it against the aot change: >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >>>> >>>> I don't think it's a good idea to swallow the exception silently >>>> as you propose. >>>> In our test setup, the tests would just be switched off if something >>>> breaks, and no one will see that. If they fail though, it's an easy >>>> and quick fix. I would at least switch them on, then one sees the >>>> failing tests in case switching them on was the wrong guess. >>>> Also, below, the method dump() throws an exception. 
>>>> >>>> Best regards, >>>> Goetz >>>> >>>>> -----Original Message----- >>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>>>> Sent: Tuesday, August 01, 2017 11:49 PM >>>>> To: Lindenmaier, Goetz >>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>> cds tests >>>>> >>>>> Hi Goetz, >>>>> >>>>> I have reviewed your updated changes, and they overall look good to >> me. >>>>> However, I have some comments + concerns regarding >> VMProps.vmCDS(): >>>>> >>>>> 1. Throwing exceptions from within the vmCDS() method. >>>>> >>>>> The VMProps properties are evaluated at the start of each >>>>> run. If >>>>> the exception is thrown here the whole test run will fail (not just the >>>>> test that uses '@requires vm.cds'). The failure will occur shortly >>>>> after >>>>> the start of jtreg test run with a message: >>>>> "java.lang.RuntimeException: Can not start VM to test to >>>>> find out it's features. Switching off class data sharing (CDS)." >>>>> >>>>> Your method has 2 throw statements: "new RuntimeException("Can >>>>> not >>>>> start VM..." and "java.lang.RuntimeException: Can not start VM to test >>>>> to...". I would recommend a more graceful way to fail, e.g. to print >>>>> the >>>>> message and to return "false" instead. This way the rest of the test >>>>> run >>>>> will continue, but the tests requiring vm.cds will be skipped with >>>>> qualification of "not selected". >>>>> >>>>> 2. The check for "An error has occurred while processing the shared >>>>> archive file." assumes that archive was not created prior to the >>>>> execution of this evaluation code. However, there are test modes >> where >>>>> archive is created prior to test run. We use such mode on regular >>>>> basis. >>>>> In such cases the code will fail. 
>>>>> I recommend to run "-Xshare:on -version", and check the >>>>> following match that would result in return of "true": >>>>> "Java HotSpot.*sharing" >>>>> >>>>> 3. On occasion the mapping of shared archive region to a specified >>>>> address will fail (due to system configuration, space already occupied, >>>>> ASLR, etc.) >>>>> >>>>> Hence I recommend checking for such conditions as well: >>>>> >>>>> if (output.firstMatch("Unable to map") != null) { >>>>> System.out.println("VMProps.vmCDS() encountered an >>>>> archive >>>>> mapping failure, still proceeding with vm.cds=true"); >>>>> return "true"; >>>>> } >>>>> I am returning true here because seeing this output means that >>>>> CDS >>>>> feature is supported, however in this particular instance archive >>>>> failed >>>>> to map. >>>>> >>>>> >>>>> The rest of the changes looks good to me. >>>>> >>>>> See for my version of VMProps.vmCDS() below. Let me know what you >>>>> think. >>>>> >>>>> >>>>> Thank you, >>>>> >>>>> Mikhailo >>>>> >>>>> >>>>> ================== my update of VMProps.vmCDS() >>>>> >>>>> protected String vmCDS() { >>>>> System.setProperty("test.jdk", >>>>> System.getProperty("java.home")); >>>>> ProcessBuilder pb = >>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>>>> OutputAnalyzer output; >>>>> >>>>> try { >>>>> output = new OutputAnalyzer(pb.start()); >>>>> } catch (IOException e) { >>>>> System.err.println( "Can not start VM to test to find out >>>>> it's features. " + >>>>> "Switching off class data >>>>> sharing (CDS)." 
+ e); >>>>> return "false"; >>>>> } >>>>> if (output.firstMatch("Shared spaces are not supported in >>>>> this >>>>> VM") != null) { >>>>> return "false"; >>>>> } >>>>> if (output.firstMatch("An error has occurred while processing >>>>> the shared archive file.") != null) { >>>>> return "true"; >>>>> } >>>>> if (output.firstMatch("Java HotSpot.*sharing") != null) { >>>>> return "true"; >>>>> } >>>>> if (output.firstMatch("Unable to map") != null) { >>>>> System.out.println("VMProps.vmCDS() encountered an >>>>> archive >>>>> mapping failure, still proceeding with vm.cds=true"); >>>>> return "true"; >>>>> } >>>>> >>>>> return "false"; >>>>> } >>>>> ================== >>>>> >>>>> >>>>> >>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> I made new webrevs implementing the change with @requires: >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- >> hs/ >>>>>> I also changed the bug description and synopsis. >>>>>> >>>>>> For the jtreg runner I would propose to set the property test.jdk >>>>>> so that it is available in VMProps. Igor also ran into this issue. >>>>>> >>>>>> Best regards, >>>>>> Goetz. >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>> Sent: Montag, 31. Juli 2017 22:19 >>>>>>> To: Lindenmaier, Goetz >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds >>>>> tests >>>>>>> Hi Goetz, >>>>>>> >>>>>>> I have an idea on how to address your second use case. >>>>>>> The idea is to define a special test property (e.g. >>>>>>> test.cds.disable.cds.support) which will override logic inside the >>>>>>> VMProps.vmCDSSupported(). 
If this property is defined to "true" in >>>>>>> test >>>>>>> invocation command then vmCDSSupported() returns false (CDS is >>>>> disabled, >>>>>>> not supported), and all tests marked with "@requires >>>>>>> vm.cds.supported" >>>>>>> will be skipped. >>>>>>> >>>>>>> How to use it: >>>>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>>>> >> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>>>> I prototyped this approach, it works for me. I have attached the >>>>>>> diff. >>>>>>> Let me know whether this works for your use case, or if you have any >>>>>>> questions. >>>>>>> >>>>>>> >>>>>>> Thank you, >>>>>>> Mikhailo >>>>>>> >>>>>>> >>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>>>> Hi Mikhailo, >>>>>>>> >>>>>>>> Basically I'm fine with using the @requires property. >>>>>>>> But is there a way to overrule the outcome of the method >>>>>>>> implemented In VMProps.java computing the property? >>>>>>>> I have two use cases for the key I want to introduce. >>>>>>>> >>>>>>>> First, our internal VM (we are Oracle licensees) is compiled without >>>>>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>>>> especially >>>>>>>> because we have to adapt it whenever a new test is added. >>>>>>>> As I understand, the @requires property works fine, here. >>>>>>>> >>>>>>>> Second, we also test the two ports we contributed (ppc and s390). >>>>>>>> These >>>>>>> contain >>>>>>>> rudimentary cds support and so far passed all tests. >>>>>>>> Unfortunately it >>>>> broke >>>>>>>> lately in jdk10. Instead of fixing it (our people are working on >>>>>>>> finishing >>>>> our >>>>>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>>>> do that. >>>>>>>> Is there a way to do similar with the @requires property? 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Goetz. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>> disable cds >>>>>>> tests >>>>>>>>> Hi Goetz, >>>>>>>>> >>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have discussed your >>>>> proposed >>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>>>> following >>>>>>>>> feedback on this change. >>>>>>>>> >>>>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>>>> use of >>>>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>>>> >>>>>>>>> 2. Our preferred test selection mechanism is use of "@requires" >>>>>>>>> and a >>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>>>> JTREG >>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>>>> first >>>>>>> choice. >>>>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>>>> situation then >>>>>>>>> use "@key" mechanism. We would ask you if you could explore the >>>>>>>>> possibility of implementing this change via @requires first. >>>>>>>>> >>>>>>>>> >>>>>>>>> Here are several hints that may help: >>>>>>>>> >>>>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The >>>>>>>>> value >>>>>>>>> of a given "requires property" is evaluated inside this file and >>>>>>>>> placed >>>>>>>>> into a map (see public call() method). Add your evaluation code >>>>>>>>> here, >>>>>>>>> and then follow the pattern used for other properties. Create a >>>>> property >>>>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a >> method >>>>> that >>>>>>>>> evaluates the property value (e.g. isCDSSupported() or similar). >>>>>>>>> >>>>>>>>> 2. 
The method could use several options to evaluate whether CDS >> is >>>>>>>>> supported. >>>>>>>>> A. WhiteBox API. Create a new WB test API method which can >>>>> return >>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>>>> Call WB API from VMProps.java. See >>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create >> your >>>>>>> own >>>>>>>>> WB.isCDSSupported() >>>>>>>>> WhiteBox.java resides in >>>>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>>>> >>>>>>>>> B. Another options is to evaluate by running VM with >>>>>>>>> sharing on and >>>>>>>>> checking the return (may be not as reliable as option A) >>>>>>>>> C. Other ideas welcome. >>>>>>>>> >>>>>>>>> 3. Include "@requres vm.cds.supported == true" to the >> appropriate >>>>> tests. >>>>>>>>> Let me know if you have any questions. >>>>>>>>> >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Mikhailo >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>>>> Hi >>>>>>>>>> >>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>>>>> the tests accordingly. >>>>>>>>>> This change also fixes the keywords specified in >>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only >> one >>>>>>> test >>>>>>>>>> case required CDS. I changed this sub case to succeed if CDS is >>>>>>>>>> not >>>>>>>>>> available. >>>>>>>>>> >>>>>>>>>> Please review this change. I please need a sponsor. >>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- >> cdsKey/webrev.01/ >>>>>>>>>> Best regards, >>>>>>>>>> Goetz. 
From ioi.lam at oracle.com Mon Aug 7 17:39:39 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 7 Aug 2017 10:39:39 -0700
Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64
In-Reply-To:
References:
Message-ID: <9155fd17-2ce5-fe36-e114-928fc8fe1c32@oracle.com>

Hi Thomas,

Thanks for the patch! Skipping the test for SP != NULL and FP != NULL seems generally OK to me. I think StackWalk64 should be robust enough that when given NULL or bogus values for stk.AddrStack.Offset and stk.AddrFrame.Offset, it will still somehow recover gracefully.

I forgot exactly why I put in these checks, though. I either was overly cautious, or I might have seen some problems without such checks, which might have caused crashes inside the debug printing routine. I really should have put in a comment there :-(

By being generous to myself :-), I guess I would have put in a comment had I seen a crash, so the lack of comments probably meant I was just overcautious ....

How much testing have you done with your patch? Have you seen any crash inside the printing routine?

Also, by "Native callstacks unreliable", do you mean "Native callstacks printing terminates prematurely", and not "sometimes they fail and print erroneous information or behave unexpectedly"? I think it's better to update the bug title.

If you need a sponsor, I'll be happy to do it.

Thanks
- Ioi

On 8/2/17 2:17 AM, Thomas Stüfe wrote:
> Hi all,
>
> may I please have a review for this small fix.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706
> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/
>
> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335. Ioi Lam did a good job analyzing the original
> problem. On windows x64, the native compiler generates code which does not
> use the frame pointer (regardless whether we set -Oy-). Only in rare cases
> a frame pointer is used - e.g.
for alloca()-functions - and, as Ioi pointed
> out, there is no guarantee either that RBP is actually the frame pointer.
>
> So, in os::platform_print_native_stack()
> we walk the stack using StackWalk64(), extract the pc from each frame and
> print that, like normal windows coding. However, we still test for the
> frame pointer being NULL, and abort stack tracing if it is. This causes
> stack dumping to fail quite often, and unnecessarily.
>
> For example, test: java.exe -XX:ErrorHandlerTest=12
>
> Sometimes it works, but more by accident - as Ioi pointed out in this
> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html.
> If there are java frames above the crashing native
> frame, we still may have RBP set to some value (does not matter which) and
> os::platform_print_native_stack()
> does not abort frame printing.
>
> Kind Regards, Thomas

From ioi.lam at oracle.com Mon Aug 7 17:44:14 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 7 Aug 2017 10:44:14 -0700
Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
In-Reply-To: <59889970.1080301@oracle.com>
References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> <59889970.1080301@oracle.com>
Message-ID:

Looks good to me, too. Reviewed.
Thanks - Ioi On 8/7/17 9:46 AM, Mikhailo Seledtsov wrote: > The change looks good to me, > > Thank you, > Mikhailo > > On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: >> Hi, >> >> webrev with Whitebox: >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ >> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ >> >> I don't see so much of a difference to throwing an exception, if >> Whitebox is not properly implemented you get one, anyways: >> Exception in thread "main" java.lang.UnsatisfiedLinkError: >> sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z >> at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) >> Maybe it's a bit less likely to break, though. >> >> I'm fine with this, too. >> >> Best regards, >> Goetz., >> >> >> >>> -----Original Message----- >>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>> Sent: Freitag, 4. August 2017 21:35 >>> To: Ioi Lam; Lindenmaier, Goetz >>> ; Igor Ignatyev >>> Cc: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>> cds tests >>> >>> Hi, >>> >>> I have an alternative solution that is IMO rather simple, reliable >>> and will >>> solve some issues we discussed (e.g. no need to throw >>> exceptions, no >>> need to handle failure to map an archive). >>> The proposed solution uses White Box test API to determine >>> whether VM >>> is compiled with INCLUDE_CDS on or off. >>> I implemented and tested it today, it works for me. >>> >>> The patch is attached. Please let me know what you think. 
>>> >>> Thank you, >>> Mikhailo >>> >>> On 8/3/17, 11:39 PM, Ioi Lam wrote: >>>> Hi Goetz, >>>> >>>> Instead of testing -Xshare:on, I think you should test with >>>> -Xshare:auto, which sets the flags >>>> >>>> UseSharedSpaces = true >>>> RequireSharedSpaces = false >>>> >>>> and will reliably print "Shared spaces are not supported in this VM" >>>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): >>>> >>>> >>>> #if !INCLUDE_CDS >>>> if (DumpSharedSpaces || RequireSharedSpaces) { >>>> jio_fprintf(defaultStream::error_stream(), >>>> "Shared spaces are not supported in this VM\n"); >>>> return JNI_ERR; >>>> } >>>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || >>>> log_is_enabled(Info, cds)) { >>>> warning("Shared spaces are not supported in this VM"); >>>> FLAG_SET_DEFAULT(UseSharedSpaces, false); >>>> LogConfiguration::configure_stdout(LogLevel::Off, true, >>>> LOG_TAGS(cds)); >>>> } >>>> no_shared_spaces("CDS Disabled"); >>>> #endif // INCLUDE_CDS >>>> >>>> >>>> That way, you don't need to test any other output message or exit >>>> conditions(such as mapping error). 
>>>> >>>> >>>> E.g.: >>>> >>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java -Xshare:auto >>>> -version >>>> java version "10-internal" >>>> Java(TM) SE Runtime Environment (build >>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>> Java HotSpot(TM) 64-Bit Server VM (build >>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>> >>>> >>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>> -XXaltjvm=minimal -Xshare:auto -version >>>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not >>>> supported in this VM >>>> java version "10-internal" >>>> Java(TM) SE Runtime Environment (build >>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>> Java HotSpot(TM) 64-Bit Minimal VM (build >>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>> >>>> >>>> >>>> Thanks >>>> - Ioi >>>> >>>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >>>>> Hi Mikhailo, >>>>> >>>>> I put in your version of vmCDS() into this new webrev. >>>>> I also had to update the list of tests marked in hotspot, >>>>> as tests were removed and added in between, and resolved >>>>> it against the aot change: >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >>>>> >>>>> I don't think it's a good idea to swallow the exception silently >>>>> as you propose. >>>>> In our test setup, the tests would just be switched off if something >>>>> breaks, and no one will see that. If they fail though, it's an easy >>>>> and quick fix. I would at least switch them on, then one sees the >>>>> failing tests in case switching them on was the wrong guess. >>>>> Also, below, the method dump() throws an exception. 
>>>>> >>>>> Best regards, >>>>> Goetz >>>>> >>>>>> -----Original Message----- >>>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>>>>> Sent: Tuesday, August 01, 2017 11:49 PM >>>>>> To: Lindenmaier, Goetz >>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>> cds tests >>>>>> >>>>>> Hi Goetz, >>>>>> >>>>>> I have reviewed your updated changes, and they overall look good to >>> me. >>>>>> However, I have some comments + concerns regarding >>> VMProps.vmCDS(): >>>>>> >>>>>> 1. Throwing exceptions from within the vmCDS() method. >>>>>> >>>>>> The VMProps properties are evaluated at the start of each >>>>>> run. If >>>>>> the exception is thrown here the whole test run will fail (not >>>>>> just the >>>>>> test that uses '@requires vm.cds'). The failure will occur shortly >>>>>> after >>>>>> the start of jtreg test run with a message: >>>>>> "java.lang.RuntimeException: Can not start VM to >>>>>> test to >>>>>> find out it's features. Switching off class data sharing (CDS)." >>>>>> >>>>>> Your method has 2 throw statements: "new >>>>>> RuntimeException("Can >>>>>> not >>>>>> start VM..." and "java.lang.RuntimeException: Can not start VM to >>>>>> test >>>>>> to...". I would recommend a more graceful way to fail, e.g. to print >>>>>> the >>>>>> message and to return "false" instead. This way the rest of the test >>>>>> run >>>>>> will continue, but the tests requiring vm.cds will be skipped with >>>>>> qualification of "not selected". >>>>>> >>>>>> 2. The check for "An error has occurred while processing the shared >>>>>> archive file." assumes that archive was not created prior to the >>>>>> execution of this evaluation code. However, there are test modes >>> where >>>>>> archive is created prior to test run. We use such mode on regular >>>>>> basis. >>>>>> In such cases the code will fail. 
>>>>>> I recommend to run "-Xshare:on -version", and check the >>>>>> following match that would result in return of "true": >>>>>> "Java HotSpot.*sharing" >>>>>> >>>>>> 3. On occasion the mapping of shared archive region to a specified >>>>>> address will fail (due to system configuration, space already >>>>>> occupied, >>>>>> ASLR, etc.) >>>>>> >>>>>> Hence I recommend checking for such conditions as well: >>>>>> >>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>> archive >>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>> return "true"; >>>>>> } >>>>>> I am returning true here because seeing this output means >>>>>> that >>>>>> CDS >>>>>> feature is supported, however in this particular instance archive >>>>>> failed >>>>>> to map. >>>>>> >>>>>> >>>>>> The rest of the changes looks good to me. >>>>>> >>>>>> See for my version of VMProps.vmCDS() below. Let me know what you >>>>>> think. >>>>>> >>>>>> >>>>>> Thank you, >>>>>> >>>>>> Mikhailo >>>>>> >>>>>> >>>>>> ================== my update of VMProps.vmCDS() >>>>>> >>>>>> protected String vmCDS() { >>>>>> System.setProperty("test.jdk", >>>>>> System.getProperty("java.home")); >>>>>> ProcessBuilder pb = >>>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>>>>> OutputAnalyzer output; >>>>>> >>>>>> try { >>>>>> output = new OutputAnalyzer(pb.start()); >>>>>> } catch (IOException e) { >>>>>> System.err.println( "Can not start VM to test to >>>>>> find out >>>>>> it's features. " + >>>>>> "Switching off class data >>>>>> sharing (CDS)." 
+ e); >>>>>> return "false"; >>>>>> } >>>>>> if (output.firstMatch("Shared spaces are not supported in >>>>>> this >>>>>> VM") != null) { >>>>>> return "false"; >>>>>> } >>>>>> if (output.firstMatch("An error has occurred while >>>>>> processing >>>>>> the shared archive file.") != null) { >>>>>> return "true"; >>>>>> } >>>>>> if (output.firstMatch("Java HotSpot.*sharing") != null) { >>>>>> return "true"; >>>>>> } >>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>> archive >>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>> return "true"; >>>>>> } >>>>>> >>>>>> return "false"; >>>>>> } >>>>>> ================== >>>>>> >>>>>> >>>>>> >>>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I made new webrevs implementing the change with @requires: >>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- >>> hs/ >>>>>>> I also changed the bug description and synopsis. >>>>>>> >>>>>>> For the jtreg runner I would propose to set the property test.jdk >>>>>>> so that it is available in VMProps. Igor also ran into this issue. >>>>>>> >>>>>>> Best regards, >>>>>>> Goetz. >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>> Sent: Montag, 31. Juli 2017 22:19 >>>>>>>> To: Lindenmaier, Goetz >>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>> disable cds >>>>>> tests >>>>>>>> Hi Goetz, >>>>>>>> >>>>>>>> I have an idea on how to address your second use case. >>>>>>>> The idea is to define a special test property (e.g. >>>>>>>> test.cds.disable.cds.support) which will override logic inside the >>>>>>>> VMProps.vmCDSSupported(). 
If this property is defined to "true" in >>>>>>>> test >>>>>>>> invocation command then vmCDSSupported() returns false (CDS is >>>>>> disabled, >>>>>>>> not supported), and all tests marked with "@requires >>>>>>>> vm.cds.supported" >>>>>>>> will be skipped. >>>>>>>> >>>>>>>> How to use it: >>>>>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>>>>> >>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>>>>> I prototyped this approach, it works for me. I have attached the >>>>>>>> diff. >>>>>>>> Let me know whether this works for your use case, or if you >>>>>>>> have any >>>>>>>> questions. >>>>>>>> >>>>>>>> >>>>>>>> Thank you, >>>>>>>> Mikhailo >>>>>>>> >>>>>>>> >>>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>>>>> Hi Mikhailo, >>>>>>>>> >>>>>>>>> Basically I'm fine with using the @requires property. >>>>>>>>> But is there a way to overrule the outcome of the method >>>>>>>>> implemented In VMProps.java computing the property? >>>>>>>>> I have two use cases for the key I want to introduce. >>>>>>>>> >>>>>>>>> First, our internal VM (we are Oracle licensees) is compiled >>>>>>>>> without >>>>>>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>>>>> especially >>>>>>>>> because we have to adapt it whenever a new test is added. >>>>>>>>> As I understand, the @requires property works fine, here. >>>>>>>>> >>>>>>>>> Second, we also test the two ports we contributed (ppc and s390). >>>>>>>>> These >>>>>>>> contain >>>>>>>>> rudimentary cds support and so far passed all tests. >>>>>>>>> Unfortunately it >>>>>> broke >>>>>>>>> lately in jdk10. Instead of fixing it (our people are working on >>>>>>>>> finishing >>>>>> our >>>>>>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>>>>> do that. 
>>>>>>>>> Is there a way to do similar with the @requires property? >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Goetz. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>> disable cds >>>>>>>> tests >>>>>>>>>> Hi Goetz, >>>>>>>>>> >>>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have >>>>>>>>>> discussed your >>>>>> proposed >>>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>>>>> following >>>>>>>>>> feedback on this change. >>>>>>>>>> >>>>>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>>>>> use of >>>>>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>>>>> >>>>>>>>>> 2. Our preferred test selection mechanism is use of "@requires" >>>>>>>>>> and a >>>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>>>>> JTREG >>>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>>>>> first >>>>>>>> choice. >>>>>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>>>>> situation then >>>>>>>>>> use "@key" mechanism. We would ask you if you could explore the >>>>>>>>>> possibility of implementing this change via @requires first. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Here are several hints that may help: >>>>>>>>>> >>>>>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. The >>>>>>>>>> value >>>>>>>>>> of a given "requires property" is evaluated inside this file and >>>>>>>>>> placed >>>>>>>>>> into a map (see public call() method). Add your evaluation code >>>>>>>>>> here, >>>>>>>>>> and then follow the pattern used for other properties. Create a >>>>>> property >>>>>>>>>> (e.g. vm.cds.supported, with values of true/false). 
Create a >>> method >>>>>> that >>>>>>>>>> evaluates the property value (e.g. isCDSSupported() or similar). >>>>>>>>>> >>>>>>>>>> 2. The method could use several options to evaluate whether CDS >>> is >>>>>>>>>> supported. >>>>>>>>>> A. WhiteBox API. Create a new WB test API method >>>>>>>>>> which can >>>>>> return >>>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>>>>> Call WB API from VMProps.java. See >>>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create >>> your >>>>>>>> own >>>>>>>>>> WB.isCDSSupported() >>>>>>>>>> WhiteBox.java resides in >>>>>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>>>>> >>>>>>>>>> B. Another options is to evaluate by running VM with >>>>>>>>>> sharing on and >>>>>>>>>> checking the return (may be not as reliable as option A) >>>>>>>>>> C. Other ideas welcome. >>>>>>>>>> >>>>>>>>>> 3. Include "@requres vm.cds.supported == true" to the >>> appropriate >>>>>> tests. >>>>>>>>>> Let me know if you have any questions. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Mikhailo >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>> Hi >>>>>>>>>>> >>>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>>>>>> the tests accordingly. >>>>>>>>>>> This change also fixes the keywords specified in >>>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only >>> one >>>>>>>> test >>>>>>>>>>> case required CDS. I changed this sub case to succeed if CDS is >>>>>>>>>>> not >>>>>>>>>>> available. >>>>>>>>>>> >>>>>>>>>>> Please review this change. I please need a sponsor. >>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- >>> cdsKey/webrev.01/ >>>>>>>>>>> Best regards, >>>>>>>>>>> Goetz. 
From kim.barrett at oracle.com Mon Aug 7 17:49:52 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 7 Aug 2017 13:49:52 -0400
Subject: RFR (M): 8184334: Generalizing Atomic with templates
In-Reply-To: <97a3c455-c3d1-7df4-a20f-90f1d1102e6d@redhat.com>
References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com>
Message-ID:

> On Aug 7, 2017, at 11:23 AM, Andrew Haley wrote:
>
> On 07/08/17 00:32, Kim Barrett wrote:
>> I'm looking for feedback on this before I try to carry it any further.
>
> I don't like it because it converts pointers to operand types before
> calling the back end.
>
> For example, in here:
>
>   intptr_t v = CASPTR(&_LockWord, 0, _LBIT); // agro ...
>
> the type of the operand LockWord is SplitWord. But the SplitWord *
> argument gets converted to void* volatile* when we call this:
>
>   inline static void* cmpxchg_ptr(void* exchange_value, volatile void* dest, void* compare_value, cmpxchg_memory_order order = memory_order_conservative) {
>     return cmpxchg(exchange_value,
>                    reinterpret_cast<void* volatile*>(dest),
>                    compare_value,
>                    order);
>   }
>
> Here's what I first wrote:
>
> I don't see the point of such a type conversion. We could call
> cmpxchg with the actual types of the operands, could we not? Why is
> cmpxchg_ptr even a thing? We're casting away type information for
> no reason that I can see.
>
> Couldn't cmpxchg_ptr() be defined as a template function in such a
> way that only the back ends that actually need to cast away the
> types have to do so? That is, if the back ends can define
> cmpxchg_ptr() themselves without resorting to pointer type
> conversion, we should let them do so.
>
> But rather than sending that message straight away, I tried it. And
> now I see: the compiler can't get the types right in those cases where
> we have mismatched operand types in the call. Argh. The only way we
> can get method resolution to work is to throw away the pointer type
> information and use void* for everything. At the risk of being
> boring, I repeat what I said before: IMO this is not what we should be
> doing in 2017. We should be looking to the future, and get the types
> to match now, at the call site.

Maybe you've forgotten this, from Erik's original RFR email?

"The X_ptr member functions have been deprecated, but are still there and can be used with identical behaviour as they had before. But new code should just use the non-ptr member functions instead."

I converted a few examples in cmpxchg_template_20170806, but there are over 100 left to go, and many of them require caller source changes because they are discarding the relevant type information at the call site in order to conform to the existing API.

For folks other than Andrew who haven't done the experiment, here's just one example:

../src/share/vm/oops/method.hpp: In member function 'bool Method::init_method_counters(MethodCounters*)':
../src/share/vm/oops/method.hpp:353:88: error: NULL used in arithmetic [-Werror=pointer-arith]
     return Atomic::cmpxchg_ptr(counters, (volatile void*)&_method_counters, NULL) == NULL;

Note the cast of &_method_counters. There are lots like that. I expect most to be easy to fix, and then the information-losing cmpxchg_ptr can be removed. But that's a lot more work than I want to do before getting some feedback on the cmpxchg changes.
And I'd actually prefer to separate the two tasks, with cmpxchg_ptr cleanup being a followup incremental set of tasks. I expect there to be *some* finicky cases that need careful review, and I'd rather not inflict (or have inflicted upon) reviewers with a large and mostly boringly similar set of changes with the occasional lurking tricky bit.

So I think I'm entirely in agreement with Andrew about the target, just not necessarily in the timing of reaching it.

> Also, and this is a relatively slight objection, I find myself
> defining
>
> template<>
> struct Atomic::PlatformCmpxchg<1> VALUE_OBJ_CLASS_SPEC {
>   template<typename T>
>   T operator()(T exchange_value,
>                T volatile* dest,
>                T compare_value,
>                cmpxchg_memory_order order) const {
>     return ::cmpxchg(exchange_value, dest, compare_value, order);
>   }
> };
>
> for 1, 4, and 8. I guess that can't be avoided, and in any case it
> would be easy enough to do it with preprocessor macros.

What's wrong with

template<size_t byte_size>
struct Atomic::PlatformCmpxchg VALUE_OBJ_CLASS_SPEC {
  template<typename T>
  T operator()(T nv, T volatile* d, T ov, cmpxchg_memory_order order) const {
    return ::cmpxchg(nv, d, ov, order);
  }
};

and maybe an explicit specialization on 2 that errors rather than calling ::cmpxchg if that's needed?

I intentionally left the unspecialized case undefined, in part to permit platform backends to do that sort of thing. Though that ought to be a written part of the contract?
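
A minimal, self-contained sketch of the size-dispatched back end discussed above may help readers who have not looked at the webrev. It is not the HotSpot code: std::atomic stands in for the platform primitive, the cmpxchg_memory_order argument is left out, and the names Counters and install_once are hypothetical. It shows one generic PlatformCmpxchg definition serving every operand size, a size-2 specialization declared but left undefined so that any use is rejected at compile time, and a front end that keeps the operand type instead of casting to void*.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One generic back end for every operand size (hypothetical stand-in for
// Atomic::PlatformCmpxchg; std::atomic replaces the platform primitive).
template<std::size_t byte_size>
struct PlatformCmpxchg {
  template<typename T>
  T operator()(T exchange_value, std::atomic<T>* dest, T compare_value) const {
    static_assert(sizeof(T) == byte_size, "operand size must match the back end");
    T expected = compare_value;
    // On failure compare_exchange_strong stores the current value in
    // 'expected'; on success 'expected' already equals the old value.
    dest->compare_exchange_strong(expected, exchange_value);
    return expected;
  }
};

// A back end that cannot handle a size may declare the specialization and
// leave it undefined, so any use of it fails to compile.
template<>
struct PlatformCmpxchg<2>;

// Front end: dispatch on sizeof(T) and keep the operand type intact.
template<typename T>
inline T cmpxchg(T exchange_value, std::atomic<T>* dest, T compare_value) {
  return PlatformCmpxchg<sizeof(T)>()(exchange_value, dest, compare_value);
}

struct Counters { int hits; };

// Returns true if 'fresh' was installed, false if a value was already there.
inline bool install_once(std::atomic<Counters*>* slot, Counters* fresh) {
  // The compare value is passed as Counters*, not a bare NULL, so T is
  // deduced consistently from all three arguments.
  Counters* old = cmpxchg(fresh, slot, static_cast<Counters*>(nullptr));
  return old == nullptr;
}
```

The cast on the null compare value is the call-site change the thread is talking about: with a bare NULL or nullptr the three arguments would not agree on T and deduction would fail, which is why the old cmpxchg_ptr fell back to void*.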
From mikhailo.seledtsov at oracle.com Mon Aug 7 22:00:49 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 07 Aug 2017 15:00:49 -0700 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> <59889970.1080301@oracle.com> Message-ID: <5988E311.1080605@oracle.com> Hi Goetz, Please let me know if you need a sponsor for this change. Mikhailo On 8/7/17, 10:44 AM, Ioi Lam wrote: > Looks good to me, too. Reviewed. > > Thanks > > - Ioi > > > > On 8/7/17 9:46 AM, Mikhailo Seledtsov wrote: >> The change looks good to me, >> >> Thank you, >> Mikhailo >> >> On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> webrev with Whitebox: >>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ >>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ >>> >>> I don't see so much of a difference to throwing an exception, if >>> Whitebox is not properly implemented you get one, anyways: >>> Exception in thread "main" java.lang.UnsatisfiedLinkError: >>> sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z >>> at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) >>> Maybe it's a bit less likely to break, though. >>> >>> I'm fine with this, too. >>> >>> Best regards, >>> Goetz., >>> >>> >>> >>>> -----Original Message----- >>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>> Sent: Freitag, 4. 
August 2017 21:35 >>>> To: Ioi Lam; Lindenmaier, Goetz >>>> ; Igor Ignatyev >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>> cds tests >>>> >>>> Hi, >>>> >>>> I have an alternative solution that is IMO rather simple, >>>> reliable >>>> and will >>>> solve some issues we discussed (e.g. no need to throw >>>> exceptions, no >>>> need to handle failure to map an archive). >>>> The proposed solution uses White Box test API to determine >>>> whether VM >>>> is compiled with INCLUDE_CDS on or off. >>>> I implemented and tested it today, it works for me. >>>> >>>> The patch is attached. Please let me know what you think. >>>> >>>> Thank you, >>>> Mikhailo >>>> >>>> On 8/3/17, 11:39 PM, Ioi Lam wrote: >>>>> Hi Goetz, >>>>> >>>>> Instead of testing -Xshare:on, I think you should test with >>>>> -Xshare:auto, which sets the flags >>>>> >>>>> UseSharedSpaces = true >>>>> RequireSharedSpaces = false >>>>> >>>>> and will reliably print "Shared spaces are not supported in this VM" >>>>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): >>>>> >>>>> >>>>> #if !INCLUDE_CDS >>>>> if (DumpSharedSpaces || RequireSharedSpaces) { >>>>> jio_fprintf(defaultStream::error_stream(), >>>>> "Shared spaces are not supported in this VM\n"); >>>>> return JNI_ERR; >>>>> } >>>>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || >>>>> log_is_enabled(Info, cds)) { >>>>> warning("Shared spaces are not supported in this VM"); >>>>> FLAG_SET_DEFAULT(UseSharedSpaces, false); >>>>> LogConfiguration::configure_stdout(LogLevel::Off, true, >>>>> LOG_TAGS(cds)); >>>>> } >>>>> no_shared_spaces("CDS Disabled"); >>>>> #endif // INCLUDE_CDS >>>>> >>>>> >>>>> That way, you don't need to test any other output message or exit >>>>> conditions(such as mapping error). 
>>>>> >>>>> >>>>> E.g.: >>>>> >>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>> -Xshare:auto >>>>> -version >>>>> java version "10-internal" >>>>> Java(TM) SE Runtime Environment (build >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>> Java HotSpot(TM) 64-Bit Server VM (build >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>> >>>>> >>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>> -XXaltjvm=minimal -Xshare:auto -version >>>>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not >>>>> supported in this VM >>>>> java version "10-internal" >>>>> Java(TM) SE Runtime Environment (build >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>> Java HotSpot(TM) 64-Bit Minimal VM (build >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>> >>>>> >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >>>>>> Hi Mikhailo, >>>>>> >>>>>> I put in your version of vmCDS() into this new webrev. >>>>>> I also had to update the list of tests marked in hotspot, >>>>>> as tests were removed and added in between, and resolved >>>>>> it against the aot change: >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >>>>>> >>>>>> I don't think it's a good idea to swallow the exception silently >>>>>> as you propose. >>>>>> In our test setup, the tests would just be switched off if something >>>>>> breaks, and no one will see that. If they fail though, it's an easy >>>>>> and quick fix. I would at least switch them on, then one sees the >>>>>> failing tests in case switching them on was the wrong guess. >>>>>> Also, below, the method dump() throws an exception. 
>>>>>> >>>>>> Best regards, >>>>>> Goetz >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>>>>>> Sent: Tuesday, August 01, 2017 11:49 PM >>>>>>> To: Lindenmaier, Goetz >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>>> cds tests >>>>>>> >>>>>>> Hi Goetz, >>>>>>> >>>>>>> I have reviewed your updated changes, and they overall look good to >>>> me. >>>>>>> However, I have some comments + concerns regarding >>>> VMProps.vmCDS(): >>>>>>> >>>>>>> 1. Throwing exceptions from within the vmCDS() method. >>>>>>> >>>>>>> The VMProps properties are evaluated at the start of each >>>>>>> run. If >>>>>>> the exception is thrown here the whole test run will fail (not >>>>>>> just the >>>>>>> test that uses '@requires vm.cds'). The failure will occur shortly >>>>>>> after >>>>>>> the start of jtreg test run with a message: >>>>>>> "java.lang.RuntimeException: Can not start VM to >>>>>>> test to >>>>>>> find out it's features. Switching off class data sharing (CDS)." >>>>>>> >>>>>>> Your method has 2 throw statements: "new >>>>>>> RuntimeException("Can >>>>>>> not >>>>>>> start VM..." and "java.lang.RuntimeException: Can not start VM >>>>>>> to test >>>>>>> to...". I would recommend a more graceful way to fail, e.g. to >>>>>>> print >>>>>>> the >>>>>>> message and to return "false" instead. This way the rest of the >>>>>>> test >>>>>>> run >>>>>>> will continue, but the tests requiring vm.cds will be skipped with >>>>>>> qualification of "not selected". >>>>>>> >>>>>>> 2. The check for "An error has occurred while processing the shared >>>>>>> archive file." assumes that archive was not created prior to the >>>>>>> execution of this evaluation code. However, there are test modes >>>> where >>>>>>> archive is created prior to test run. We use such mode on regular >>>>>>> basis. >>>>>>> In such cases the code will fail. 
>>>>>>> I recommend to run "-Xshare:on -version", and check the >>>>>>> following match that would result in return of "true": >>>>>>> "Java HotSpot.*sharing" >>>>>>> >>>>>>> 3. On occasion the mapping of shared archive region to a specified >>>>>>> address will fail (due to system configuration, space already >>>>>>> occupied, >>>>>>> ASLR, etc.) >>>>>>> >>>>>>> Hence I recommend checking for such conditions as well: >>>>>>> >>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>> archive >>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>> return "true"; >>>>>>> } >>>>>>> I am returning true here because seeing this output means >>>>>>> that >>>>>>> CDS >>>>>>> feature is supported, however in this particular instance archive >>>>>>> failed >>>>>>> to map. >>>>>>> >>>>>>> >>>>>>> The rest of the changes looks good to me. >>>>>>> >>>>>>> See for my version of VMProps.vmCDS() below. Let me know what you >>>>>>> think. >>>>>>> >>>>>>> >>>>>>> Thank you, >>>>>>> >>>>>>> Mikhailo >>>>>>> >>>>>>> >>>>>>> ================== my update of VMProps.vmCDS() >>>>>>> >>>>>>> protected String vmCDS() { >>>>>>> System.setProperty("test.jdk", >>>>>>> System.getProperty("java.home")); >>>>>>> ProcessBuilder pb = >>>>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>>>>>> OutputAnalyzer output; >>>>>>> >>>>>>> try { >>>>>>> output = new OutputAnalyzer(pb.start()); >>>>>>> } catch (IOException e) { >>>>>>> System.err.println( "Can not start VM to test to >>>>>>> find out >>>>>>> it's features. " + >>>>>>> "Switching off class data >>>>>>> sharing (CDS)." 
+ e); >>>>>>> return "false"; >>>>>>> } >>>>>>> if (output.firstMatch("Shared spaces are not >>>>>>> supported in >>>>>>> this >>>>>>> VM") != null) { >>>>>>> return "false"; >>>>>>> } >>>>>>> if (output.firstMatch("An error has occurred while >>>>>>> processing >>>>>>> the shared archive file.") != null) { >>>>>>> return "true"; >>>>>>> } >>>>>>> if (output.firstMatch("Java HotSpot.*sharing") != >>>>>>> null) { >>>>>>> return "true"; >>>>>>> } >>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>> archive >>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>> return "true"; >>>>>>> } >>>>>>> >>>>>>> return "false"; >>>>>>> } >>>>>>> ================== >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> I made new webrevs implementing the change with @requires: >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- >>>> hs/ >>>>>>>> I also changed the bug description and synopsis. >>>>>>>> >>>>>>>> For the jtreg runner I would propose to set the property test.jdk >>>>>>>> so that it is available in VMProps. Igor also ran into this >>>>>>>> issue. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Goetz. >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>> Sent: Montag, 31. Juli 2017 22:19 >>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>> disable cds >>>>>>> tests >>>>>>>>> Hi Goetz, >>>>>>>>> >>>>>>>>> I have an idea on how to address your second use case. >>>>>>>>> The idea is to define a special test property (e.g. >>>>>>>>> test.cds.disable.cds.support) which will override logic inside >>>>>>>>> the >>>>>>>>> VMProps.vmCDSSupported(). 
If this property is defined to >>>>>>>>> "true" in >>>>>>>>> test >>>>>>>>> invocation command then vmCDSSupported() returns false (CDS is >>>>>>> disabled, >>>>>>>>> not supported), and all tests marked with "@requires >>>>>>>>> vm.cds.supported" >>>>>>>>> will be skipped. >>>>>>>>> >>>>>>>>> How to use it: >>>>>>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>> >>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>>>>>> I prototyped this approach, it works for me. I have attached the >>>>>>>>> diff. >>>>>>>>> Let me know whether this works for your use case, or if you >>>>>>>>> have any >>>>>>>>> questions. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thank you, >>>>>>>>> Mikhailo >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>>>>>> Hi Mikhailo, >>>>>>>>>> >>>>>>>>>> Basically I'm fine with using the @requires property. >>>>>>>>>> But is there a way to overrule the outcome of the method >>>>>>>>>> implemented In VMProps.java computing the property? >>>>>>>>>> I have two use cases for the key I want to introduce. >>>>>>>>>> >>>>>>>>>> First, our internal VM (we are Oracle licensees) is compiled >>>>>>>>>> without >>>>>>>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>>>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>>>>>> especially >>>>>>>>>> because we have to adapt it whenever a new test is added. >>>>>>>>>> As I understand, the @requires property works fine, here. >>>>>>>>>> >>>>>>>>>> Second, we also test the two ports we contributed (ppc and >>>>>>>>>> s390). >>>>>>>>>> These >>>>>>>>> contain >>>>>>>>>> rudimentary cds support and so far passed all tests. >>>>>>>>>> Unfortunately it >>>>>>> broke >>>>>>>>>> lately in jdk10. Instead of fixing it (our people are >>>>>>>>>> working on >>>>>>>>>> finishing >>>>>>> our >>>>>>>>>> internal Java 9 port) I would like to switch off all cds tests. 
>>>>>>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>>>>>> do that. >>>>>>>>>> Is there a way to do similar with the @requires property? >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Goetz. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>>> disable cds >>>>>>>>> tests >>>>>>>>>>> Hi Goetz, >>>>>>>>>>> >>>>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have >>>>>>>>>>> discussed your >>>>>>> proposed >>>>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>>>>>> following >>>>>>>>>>> feedback on this change. >>>>>>>>>>> >>>>>>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>>>>>> use of >>>>>>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>>>>>> >>>>>>>>>>> 2. Our preferred test selection mechanism is use of "@requires" >>>>>>>>>>> and a >>>>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>>>>>> JTREG >>>>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>>>>>> first >>>>>>>>> choice. >>>>>>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>>>>>> situation then >>>>>>>>>>> use "@key" mechanism. We would ask you if you could explore the >>>>>>>>>>> possibility of implementing this change via @requires first. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Here are several hints that may help: >>>>>>>>>>> >>>>>>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. >>>>>>>>>>> The >>>>>>>>>>> value >>>>>>>>>>> of a given "requires property" is evaluated inside this file >>>>>>>>>>> and >>>>>>>>>>> placed >>>>>>>>>>> into a map (see public call() method). 
Add your evaluation code >>>>>>>>>>> here, >>>>>>>>>>> and then follow the pattern used for other properties. Create a >>>>>>> property >>>>>>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a >>>> method >>>>>>> that >>>>>>>>>>> evaluates the property value (e.g. isCDSSupported() or >>>>>>>>>>> similar). >>>>>>>>>>> >>>>>>>>>>> 2. The method could use several options to evaluate whether CDS >>>> is >>>>>>>>>>> supported. >>>>>>>>>>> A. WhiteBox API. Create a new WB test API method >>>>>>>>>>> which can >>>>>>> return >>>>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>>>>>> Call WB API from VMProps.java. See >>>>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create >>>> your >>>>>>>>> own >>>>>>>>>>> WB.isCDSSupported() >>>>>>>>>>> WhiteBox.java resides in >>>>>>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>>>>>> >>>>>>>>>>> B. Another options is to evaluate by running VM with >>>>>>>>>>> sharing on and >>>>>>>>>>> checking the return (may be not as reliable as option A) >>>>>>>>>>> C. Other ideas welcome. >>>>>>>>>>> >>>>>>>>>>> 3. Include "@requres vm.cds.supported == true" to the >>>> appropriate >>>>>>> tests. >>>>>>>>>>> Let me know if you have any questions. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Mikhailo >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>>> Hi >>>>>>>>>>>> >>>>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>>>>>>> the tests accordingly. >>>>>>>>>>>> This change also fixes the keywords specified in >>>>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java only >>>> one >>>>>>>>> test >>>>>>>>>>>> case required CDS. 
I changed this sub case to succeed if >>>>>>>>>>>> CDS is >>>>>>>>>>>> not >>>>>>>>>>>> available. >>>>>>>>>>>> >>>>>>>>>>>> Please review this change. I please need a sponsor. >>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- >>>> cdsKey/webrev.01/ >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Goetz. > From jiangli.zhou at Oracle.COM Mon Aug 7 23:39:14 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Mon, 7 Aug 2017 16:39:14 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <1502117570.2950.49.camel@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1502117570.2950.49.camel@oracle.com> Message-ID: <638FE232-DCA9-4B46-83FA-F4A81A80949B@oracle.com> Hi Thomas, Thanks a lot for the review! > On Aug 7, 2017, at 7:52 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2017-08-03 at 17:15 -0700, Jiangli Zhou wrote: >> Here are the updated webrevs. >> >> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >> >> Changes in the updated webrevs include: >> Merge with Ioi?s recent shared space auto-sizing change (8072061) >> Addressed all feedbacks from Ioi and Coleen (Thanks for detailed >> review!) > > - the comment in g1Allocator.hpp:326 needs to be updated. I would merge > the information from the _open member here. I.e. define what "open" and > "closed" archive are. Good catch. I updated the comments as the following: // G1ArchiveAllocator is used to allocate memory in archive // regions. Such regions are not scavenged nor compacted by GC. 
// There are two types of archive regions, which are // differ in the kind of references allowed for the contained objects: // // - 'Closed' archive region contain no references outside of archive // regions. The region is immutable by GC. GC does not mark object // header in 'closed' archive region. // - An 'open' archive region may contain references pointing to // non-archive heap region. GC can adjust pointers and mark object // header in 'open' archive region. > > - g1Allocator.inline.hpp:66-72: formatting - parameters should be > aligned below each other Fixed. > > - g1CollectedHeap.cpp:665/6: formatting - parameters should be aligned > below each other Fixed. > > - g1CollectedHeap.cpp:750 + 756: maybe make a static method to avoid > repetition here. I changed the code to be following. New static function is a little overkill since the usage is very limited. :-) HeapWord* top; HeapRegion* next_region; if (curr_region != last_region) { top = curr_region->end(); next_region = _hrm.next_region_in_heap(curr_region); } else { top = last_address + 1; next_region = NULL; } curr_region->set_top(top); curr_region->set_first_dead(top); curr_region->set_end_of_live(top); curr_region = next_region; } > > - G1NoteEndOfConcMarkClosure::doHeapRegion(): would it be too much work > to make an extra CR out of this change? It is a change that fixes an > existing bug unrelated to this change after all (not doing remembered > set cleanup work for archive regions). Using separate CR to track this sounds good. I just created JDK-8185924 . Since we have been testing the fix with other changes together, I'll integrate them together with both CRs. > > - g1HeapVerifier.cpp:63: I think this change is obsolete since > is_obj_dead_cond() will always return false. (I.e. some leftover of > some internal discussion) You are right. I reverted the change. > > - g1HeapVerifier.cpp: there is a verbose flag passed around. 
Not sure > if it should be kept, as it seems to be some code that has been used > for debugging this feature, but can't be activated anyway without code > changes. Removed. > > - heapRegion.inline.hpp:120: please fix the comment of the assert. > Something like "Closed archive regions should not have references into > other regions? Done. > > - heapRegion.inline.hpp:125: I think the existing code of faking open > archive regions as all-live does not work as implemented. Consider the > case when a new object in there is made live, and references in there > set to refer to some object outside this region, and is the only > reference (and it has not been marked live yet): if there is a > remembered set entry to that, and it is about to be scanned. > > The current implementation of HeapRegion::is_obj_dead() will consider > it dead, so we will enter the code path at line 125. Block_size_using > bitmap() will jump over that object, but the return values of > is_obj_dead_with_size() method will indicate the caller to not iterate > over this object anyway, potentially missing that reference. > > HeapRegion::is_obj_dead() needs to return that the object is not dead > for open archive regions. I think for now the safest way is to add > !is_open_archive() to the condition calculated there. That will obviate > the need for that existing hack to the assert too. > > It may have some perf impact though - actually recently there has been > some effort to remove that is_archive() check from that code (the one > that is now the is_closed_archive() assert). I do not see an easy way > to fix this. :( (i.e. there is likely no perf impact vs. jdk9 so it's > not that bad) > > This suggestion also only works with the assumption laid out in the CR > that there is no way that a live object can not become dormant again, > and the objects in the open archive regions are always parsable (never > contain junk data). Thank you for the analysis. 
I changed HeapRegion::is_obj_dead() by adding the !is_open_archive() condition as you suggested. I'm glad to get rid of the is_open_archive() change from is_obj_dead_with_size() assert. Thinking more about the case you described above, when an object (A) in the open archive just becomes live, there would be no reference to any other non-archive region at the moment. The object A only contains references to the archive (open or closed) regions initially. Scanning has no issue at the moment. When a new object (B) is allocated in the Java heap and the reference is set in A, B is considered live and scanning would update the A->B reference accordingly. Is that correct? > > - HeapRegionType::relabel_as_old(): the code in line 175/176 is > unreachable and should be removed. Agreed. Removed. I'll send out an updated webrev later. Thank you so much for all the help and contributions!! Jiangli > > I did not look through runtime code too closely. > > Thanks, > Thomas > From coleen.phillimore at oracle.com Tue Aug 8 00:22:42 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 7 Aug 2017 20:22:42 -0400 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> <1501787150.2411.71.camel@oracle.com> <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> Message-ID: <2811796b-4aff-3274-9bdc-6f45bc5bc394@oracle.com> This looks good. I don't think we'll change the methods->length() because it's limited to u2 and used to set method_idnum() which is a u2. Odd that the compiler doesn't complain about narrowing an int into a u2 though.
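[Editorial sketch] The remark about int-to-u2 narrowing can be illustrated as follows. This is a minimal sketch, assuming u2 is a 16-bit unsigned type; the typedef and the function names narrow_to_u2, fits_in_u2 and checked_u2 are hypothetical and are not HotSpot code. An implicit int-to-unsigned-short conversion is legal C++ (the value is reduced modulo 65536), and GCC and Clang accept it silently by default; they warn only when -Wconversion is enabled.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint16_t u2;  // assumption: stand-in for a 16-bit unsigned u2

// Implicit int -> u2 conversion. This is well-defined: the result is the
// input reduced modulo 65536. No diagnostic is required by the language.
u2 narrow_to_u2(int length) {
  u2 id = length;
  return id;
}

// Range check a caller could apply before narrowing.
bool fits_in_u2(int length) {
  return length >= 0 && length <= UINT16_MAX;
}

// Narrowing with an explicit cast and a debug-build range assertion.
u2 checked_u2(int length) {
  assert(fits_in_u2(length) && "value does not fit in u2");
  return static_cast<u2>(length);
}
```

With brace initialization (u2 id{length};) the same conversion would be rejected as a narrowing conversion, which is one way to make the compiler complain.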
Coleen On 8/4/17 7:20 PM, Kim Barrett wrote: >> On Aug 3, 2017, at 3:05 PM, Thomas Schatzl wrote: >> >> Hi, >> >> On Thu, 2017-08-03 at 14:06 -0400, Kim Barrett wrote: >>>> On Aug 3, 2017, at 8:32 AM, Thomas Schatzl >>> om> wrote: >>>> However, please put the closing brackets of these into extra lines >>>> (quicksort.hpp:76,77) to avoid the casual reader to overlook them. >>> Sorry, but that just looks horrible. As a casual reader, I wouldn?t >>> even look for them, since if they aren?t there then the code is >>> badly mis-indented. >> Actually I was already at writing about an issue with indentation when >> I noticed the brackets :) >> >> Not insisting on changing this. > I added a couple of asserts to check for buffer overruns due to a bad comparator: > > diff -r 6b62ed03a6a6 -r 7616ceb92653 src/share/vm/utilities/quickSort.hpp > --- a/src/share/vm/utilities/quickSort.hpp Fri Aug 04 18:02:51 2017 -0400 > +++ b/src/share/vm/utilities/quickSort.hpp Fri Aug 04 19:17:36 2017 -0400 > @@ -73,8 +73,12 @@ > T pivot_val = array[pivot]; > > for ( ; true; ++left_index, --right_index) { > - for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) {} > - for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) {} > + for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) { > + assert(left_index < length, "reached end of partition"); > + } > + for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) { > + assert(right_index > 0, "reached start of partition"); > + } > > if (left_index < right_index) { > if (!idempotent || comparator(array[left_index], array[right_index]) != 0) { > > From kim.barrett at oracle.com Tue Aug 8 00:26:45 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Aug 2017 20:26:45 -0400 Subject: RFR: 8185746: Remove Mutex destructor assertion In-Reply-To: <1502099705.2950.2.camel@oracle.com> References: <9114A6CE-826F-4AF2-9A0D-C239DF9FD5FA@oracle.com> 
<751d85d4-3076-b478-4bbf-7509ac29a724@oracle.com> <1501762304.2411.8.camel@oracle.com> <1502099705.2950.2.camel@oracle.com> Message-ID: > On Aug 7, 2017, at 5:55 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2017-08-03 at 13:44 -0400, Kim Barrett wrote: >>> >>> On Aug 3, 2017, at 8:11 AM, Thomas Schatzl >> om> wrote: >>> >>> On Wed, 2017-08-02 at 19:31 -0400, Kim Barrett wrote: >>>> >>>>> >>>>> On Aug 2, 2017, at 7:23 PM, David Holmes >>>> om> >>>>> wrote: >>>>> >>>>> Hi Kim, >>>>> >>>>> On 3/08/2017 8:41 AM, Kim Barrett wrote: >>>>>> >>>>>> >>>>>> Please review this small change to improve the debugging >>>>>> experience >>> Looks good. >> Thanks. >> >>>>> [...] >>>>> // Defer any state checking to ~Monitor >>>>> Mutex::~Mutex() { } >>>> Well, I was waffling over just removing it completely. >>>> >>> do it :) I can't see, apart from it being a bug (to be fixed in >>> another CR?), why the constructor of Monitor is not virtual. >> It?s gone now. >> > > still looks good. > > Thomas Thanks Thomas, Coleen, and David. From kim.barrett at oracle.com Tue Aug 8 00:27:16 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Aug 2017 20:27:16 -0400 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <1502098954.2950.0.camel@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> <1501787150.2411.71.camel@oracle.com> <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> <1502098954.2950.0.camel@oracle.com> Message-ID: <4A6A4A36-982E-41FD-8316-2E16A832A694@oracle.com> > On Aug 7, 2017, at 5:42 AM, Thomas Schatzl wrote: > > Hi, > > On Fri, 2017-08-04 at 19:20 -0400, Kim Barrett wrote: >>> >>> >>>>> [...] > >>>>> However, please put the closing brackets of these into extra >>>>> lines >>>>> (quicksort.hpp:76,77) to avoid the casual reader to overlook >>>>> them. >>>> Sorry, but that just looks horrible. 
As a casual reader, I >>>> wouldn?t even look for them, since if they aren?t there then the >>>> code is badly mis-indented. >>> Actually I was already at writing about an issue with indentation >>> when I noticed the brackets :) >>> >>> Not insisting on changing this. >> I added a couple of asserts to check for buffer overruns due to a bad >> comparator: >> >> diff -r 6b62ed03a6a6 -r 7616ceb92653 >> src/share/vm/utilities/quickSort.hpp >> --- a/src/share/vm/utilities/quickSort.hpp Fri Aug 04 18:02:51 >> 2017 -0400 >> +++ b/src/share/vm/utilities/quickSort.hpp Fri Aug 04 19:17:36 >> 2017 -0400 >> @@ -73,8 +73,12 @@ >> T pivot_val = array[pivot]; >> >> for ( ; true; ++left_index, --right_index) { >> - for ( ; comparator(array[left_index], pivot_val) < 0; >> ++left_index) {} >> - for ( ; comparator(array[right_index], pivot_val) > 0; -- >> right_index) {} >> + for ( ; comparator(array[left_index], pivot_val) < 0; >> ++left_index) { >> + assert(left_index < length, "reached end of partition"); >> + } >> + for ( ; comparator(array[right_index], pivot_val) > 0; -- >> right_index) { >> + assert(right_index > 0, "reached start of partition"); >> + } >> >> if (left_index < right_index) { >> if (!idempotent || comparator(array[left_index], >> array[right_index]) != 0) { >> > > looks good. > > Thanks a lot. > > Thomas Thanks. From kim.barrett at oracle.com Tue Aug 8 00:27:51 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Aug 2017 20:27:51 -0400 Subject: RFR: 8185757: QuickSort array size should be size_t In-Reply-To: <2811796b-4aff-3274-9bdc-6f45bc5bc394@oracle.com> References: <7B3F4BA1-0D1A-4550-B35D-1093A2227AFA@oracle.com> <1501763541.2411.19.camel@oracle.com> <1501787150.2411.71.camel@oracle.com> <4029C0A4-1F4D-4C43-9022-FF9F33BD321A@oracle.com> <2811796b-4aff-3274-9bdc-6f45bc5bc394@oracle.com> Message-ID: <453A6E00-C4A0-425F-AE44-7D890CD76125@oracle.com> > On Aug 7, 2017, at 8:22 PM, coleen.phillimore at oracle.com wrote: > > > > This looks good. 
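[Editorial sketch] The asserts in the diff quoted above guard the two scanning loops of a Hoare-style partition against running off either end when the comparator misbehaves. A minimal standalone sketch, assuming a three-way comparator that returns an int; partition_sketch and compare_ints are hypothetical names, the idempotent handling from the diff is omitted, and this is not the HotSpot QuickSort code itself:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Three-way comparator: negative, zero or positive, like the one in the diff.
int compare_ints(int a, int b) {
  return (a > b) - (a < b);
}

// Hoare-style partition around array[pivot]. Assumes length >= 1.
// Returns an index 'split' such that every element in [0, split] compares
// <= the pivot value and every element in (split, length) compares >= it.
template <typename T, typename Compare>
std::size_t partition_sketch(T* array, std::size_t pivot,
                             std::size_t length, Compare comparator) {
  std::size_t left = 0;
  std::size_t right = length - 1;
  T pivot_val = array[pivot];

  for (;; ++left, --right) {
    // Skip elements that are already on the correct (left) side.
    for (; comparator(array[left], pivot_val) < 0; ++left) {
      assert(left < length && "reached end of partition");
    }
    // Skip elements that are already on the correct (right) side.
    for (; comparator(array[right], pivot_val) > 0; --right) {
      assert(right > 0 && "reached start of partition");
    }
    if (left < right) {
      std::swap(array[left], array[right]);
    } else {
      return right;
    }
  }
}
```

With a consistent comparator the pivot value itself stops both scans, so the asserts never fire; they only trip when the comparator gives contradictory answers.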
Thanks. > I don't think we'll change the methods->length() because it's limited to u2 and used to set method_idnum() which is a u2. Odd that the compiler doesn't complain about narrowing an int into a u2 though. > > Coleen > > On 8/4/17 7:20 PM, Kim Barrett wrote: >>> On Aug 3, 2017, at 3:05 PM, Thomas Schatzl wrote: >>> >>> Hi, >>> >>> On Thu, 2017-08-03 at 14:06 -0400, Kim Barrett wrote: >>>>> On Aug 3, 2017, at 8:32 AM, Thomas Schatzl >>>> om> wrote: >>>>> However, please put the closing brackets of these into extra lines >>>>> (quicksort.hpp:76,77) to avoid the casual reader to overlook them. >>>> Sorry, but that just looks horrible. As a casual reader, I wouldn?t >>>> even look for them, since if they aren?t there then the code is >>>> badly mis-indented. >>> Actually I was already at writing about an issue with indentation when >>> I noticed the brackets :) >>> >>> Not insisting on changing this. >> I added a couple of asserts to check for buffer overruns due to a bad comparator: >> >> diff -r 6b62ed03a6a6 -r 7616ceb92653 src/share/vm/utilities/quickSort.hpp >> --- a/src/share/vm/utilities/quickSort.hpp Fri Aug 04 18:02:51 2017 -0400 >> +++ b/src/share/vm/utilities/quickSort.hpp Fri Aug 04 19:17:36 2017 -0400 >> @@ -73,8 +73,12 @@ >> T pivot_val = array[pivot]; >> for ( ; true; ++left_index, --right_index) { >> - for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) {} >> - for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) {} >> + for ( ; comparator(array[left_index], pivot_val) < 0; ++left_index) { >> + assert(left_index < length, "reached end of partition"); >> + } >> + for ( ; comparator(array[right_index], pivot_val) > 0; --right_index) { >> + assert(right_index > 0, "reached start of partition"); >> + } >> if (left_index < right_index) { >> if (!idempotent || comparator(array[left_index], array[right_index]) != 0) { From ioi.lam at oracle.com Tue Aug 8 00:45:09 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 7 Aug 
2017 17:45:09 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> Message-ID: On 8/4/17 10:19 PM, Jiangli Zhou wrote: > Hi Ioi, > > Thanks for looking again. > >> On Aug 4, 2017, at 2:22 PM, Ioi Lam > > wrote: >> >> Hi Jiangli, >> >> The code looks good in general. I just have a few pet peeves for >> readability: >> >> >> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >> >> 704 assert(UseG1GC, "Only support G1 GC"); >> 705 assert(UseCompressedOops && UseCompressedClassPointers, >> 706 "Only support UseCompressedOops and >> UseCompressedClassPointers enabled"); >> >> 1615 assert(UseG1GC, "Only support G1 GC"); >> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >> 1617 "Only support UseCompressedOops and >> UseCompressedClassPointers enabled"); >> >> Maybe it's better to combine them into a single function like >> MetaspaceShared::assert_vm_flags() so they don't get out of sync? > > There is a MetaspaceShared::allow_archive_heap_object(), which checks > for UseG1GC, UseCompressedOops and UseCompressedClassPointers > combined. It does not seem to worth add another separate API for > asserting the required flags. I?ll use that in the assert. > >> >> >> >> (2) FileMapInfo::write_archive_heap_regions() >> >> I still find this code very hard to read, especially due to the loop. 
>> >> First, the comments are not consistent with the code: >> >> 498 assert(arr_len <= max_num_regions, "number of memory >> regions exceeds maximum"); >> >> but the comments says: "The rest are consecutive full GC regions" >> which means there's a chance for max_num_regions to be more than 2 >> (which will be the case with Calvin's java-loader dumping changes >> using very small heap size).So the code is actually wrong. > > The max_num_regions is the maximum number of region for each archived > heap space (the string space, or open archive space). We only run into > the case where the MemRegion array size is larger than max_num_regions > with Calvin?s pending change. As part of Calvin?s change, he will > change the assert into a check and bail out if the number of > MemRegions are larger than max_num_regions due to heap fragmentation. > > Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible). Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2. Also, if the string region is disabled, you should also disable the open_archive_heap_region I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch. >> >> The word "region" is used in these parameters, but they don't mean >> the same thing. >> >> GrowableArray *regions >> int first_region, int max_num_regions, >> >> >> How about regions -> g1_regions_list >> first_region -> first_region_in_archive > > The GrowableArray above is the MemRegions that GC code gives back to > us. The GC code combines multiple G1 regions. 
The comments probably > are over-explaining the details, which are hidden in the GC code. > Probably that?s the confusing source. I?ll make the comment more clear. > > Using g1_regions_list would also be confusing, since > write_archive_heap_regions does not handle G1 regions directly. It > processes the MemRegion array that GC code returns. How about changing > ?regions? to ?mem_regions? or ?archive_regions'? > How about heap_regions? These are regions in the active Java heap, which current has not mapped anything from the CDS archive. >> >> >> In the comments, I find the phrase 'the current archive heap region' >> ambiguous. It could be (erroneously) interpreted as "a region from >> the currently mapped archive? >> >> To make it unambiguous, how about changing >> >> >> 464 // Write the current archive heap region, which contains one or >> multiple GC(G1) regions. >> >> >> to >> >> // Write the given list of G1 memory regions into the archive, >> starting at >> // first_region_in_archive. > > > Ok. How about the following: > > // Write the given list of java heap memory regions into the archive, > starting at > // first_region_in_archive. > Sounds good. Thanks - Ioi >> >> >> Also, for the explanation of how the G1 regions are written into the >> archive, how about: >> >> // The G1 regions in the list are sorted in ascending address >> order. When there are more objects >> // than the capacity of a single G1 region, the bottom-most G1 >> region may be partially filled, and the >> // remaining G1 region(s) are consecutively allocated and fully >> filled. >> // >> // Thus, the bottom-most G1 region (if not empty) is written into >> first_region_in_archive. >> // The remaining G1 regions (if exist) are coalesced and written >> as a single block >> // into (first_region_in_archive + 1) >> >> // Here's the mapping from (g1 regions) -> (archive regions). 
>> >> >> All this function needs to do is to decide the values for >> >> r0_start, r0_top >> r1_start, r1_top >> >> I think it would be much better to not use the loop, and not use the >> max_num_regions parameter (it's always 2 anyway). >> >> *r0_start = *r0_top = NULL; >> *r1_start = *r1_top = NULL; >> >> if (arr_len >= 1) { >> *r0_start = regions->at(0).start(); >> *r0_end = *r0_start + regions->at(0).byte_size(); >> } >> if (arr_len >= 2) { >> int last = arr_len - 1; >> *r1_start = regions->at(1).start(); >> *r1_end = regions->at(last).start() + >> regions->at(last).byte_size(); >> } >> >> what do you think? > > We need to write out all archive regions including the empty ones. The > loop using max_num_regions is the easiest way. I?d like to remove the > code that deals with r0_* and r1_ explicitly. Let me try that. > >> >> >> >> (3) metaspace.cpp >> >> 3350 // Map the archived heap regions after compressed pointers >> 3351 // because it relies on compressed class pointers >> setting to work >> >> do you mean this? >> >> // Archived heap regions depend on the parameters of compressed >> class pointers, so >> // they must be mapped after such parameters have been decided in >> the above call. > > Hmmm, maybe use ?arguments? instead of ?parameters?? > >> >> >> (4) I found this name not strictly grammatical. How about this: >> >> allow_archive_heap_object -> is_heap_object_archiving_allowed > > Ok. > >> >> (5) in most of your code, 'archive' is used as a noun, except in >> StringTable::archive_string() where it's used as a verb. >> >> archive_string could also be interpreted erroneously as "return a >> string that's already in the archive". So to be consistent and >> unambiguous, I think it's better to rename it to >> StringTable::create_archived_string() > > Ok. > > Thanks, > Jiangli > >> >> >> Thanks >> - Ioi >> >> >> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>> Here are the updated webrevs. 
>>> >>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>> >>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>> >>> Changes in the updated webrevs include: >>> >>> * Merge with Ioi's recent shared space auto-sizing change (8072061) >>> * Addressed all feedback from Ioi and Coleen (Thanks for detailed >>> review!) >>> >>> >>> Thanks, >>> Jiangli >>> >>> >>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou >>>> wrote: >>>> >>>> Hi Ioi, >>>> >>>> Thank you so much for reviewing this. I've addressed all your >>>> feedback. Please see details below. I'll update the webrev >>>> after addressing Coleen's comments. >>>> >>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>> >>>>> Hi Jiangli, >>>>> >>>>> Here are my comments. I've not reviewed the GC code and I'll leave >>>>> that to the GC experts :-) >>>>> >>>>> stringTable.cpp: StringTable::archive_string >>>>> >>>>> add assert for DumpSharedSpaces only >>>> >>>> Ok. >>>> >>>>> >>>>> filemap.cpp >>>>> >>>>> 525 void >>>>> FileMapInfo::write_archive_heap_regions(GrowableArray >>>>> *regions, >>>>> 526 int first_region, int num_regions) { >>>>> >>>>> When I first read this function, I found it hard to follow, >>>>> especially this part that coalesces the trailing regions: >>>>> >>>>> 537 int len = regions->length(); >>>>> 538 if (len > 1) { >>>>> 539 start = (char*)regions->at(1).start(); >>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>> 541 } >>>>> 542 } >>>>> >>>>> The rest of filemap.cpp always performs identical operations on >>>>> MemRegion arrays, which are either 1 or 2 in size. However, >>>>> this function doesn't follow that pattern; it also has a very >>>>> different notion of "region", and the confusing part is >>>>> regions->size() is not the same as num_regions. >>>>> >>>>> How about we change the API to something like the following? >>>>> Before calling this API, the caller needs to coalesce the trailing >>>>> G1 regions into a single MemRegion.
>>>>> >>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int >>>>> first, int num_regions) { >>>>> if (first == MetaspaceShared::first_string) { >>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>> } else { >>>>> assert(first == >>>>> MetaspaceShared::first_open_archive_heap_region, "..."); >>>>> assert(num_regions <= >>>>> MetaspaceShared::max_open_archive_heap_region, "..."); >>>>> } >>>>> .... >>>>> >>>>> >>>> >>>> I've reworked the function and simplified the code. >>>> >>>>> >>>>> 756 if (!string_data_mapped) { >>>>> 757 StringTable::ignore_shared_strings(true); >>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, >>>>> "sanity"); >>>>> 759 } >>>>> 760 >>>>> 761 if (open_archive_heap_data_mapped) { >>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>> 763 } else { >>>>> 764 assert(open_archive_heap_ranges == NULL && >>>>> num_open_archive_heap_ranges == 0, "sanity"); >>>>> 765 } >>>>> >>>>> Maybe the two "if" statements should be more consistent? Instead >>>>> of StringTable::ignore_shared_strings, how >>>>> about StringTable::set_shared_strings_region_mapped()? >>>> >>>> Fixed. >>>> >>>>> >>>>> FileMapInfo::map_heap_data() -- >>>>> >>>>> 818 char* addr = (char*)regions[i].start(); >>>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>>> 820 addr, regions[i].byte_size(), >>>>> si->_read_only, >>>>> 821 si->_allow_exec); >>>>> >>>>> What happens when the first region succeeds to map but the second >>>>> region fails to map? Will both regions be unmapped? I don't see >>>>> where you store the return value (base) from os::map_memory(). >>>>> Does it mean the code assumes that (addr == base)? If so, we need >>>>> an assert here. >>>> >>>> If any of the regions fails to map, we bail out and call >>>> dealloc_archive_heap_regions(), which handles the deallocation of >>>> any regions specified. If the second region fails to map, all memory >>>> ranges specified by 'regions'
array are deallocated. We don't unmap >>>> the memory here since it is part of the java heap. Unmapping of >>>> heap memory is handled by GC code. The 'if' check below makes sure >>>> base == addr. >>>> >>>> if (base == NULL || base != addr) { >>>> // dealloc the regions from java heap >>>> dealloc_archive_heap_regions(regions, region_num); >>>> if (log_is_enabled(Info, cds)) { >>>> log_info(cds)("UseSharedSpaces: Unable to map at required >>>> address in java heap."); >>>> } >>>> return false; >>>> } >>>> >>>> >>>>> >>>>> constantPool.cpp >>>>> >>>>> Handle refs_handle; >>>>> ... >>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>> >>>>> This will first create a NULL handle, then construct a temporary >>>>> handle, and then assign the temp handle back to the null >>>>> handle. This means two handles will be pushed onto >>>>> THREAD->metadata_handles() >>>>> >>>>> I think it's more efficient if you merge these into a single statement >>>>> >>>>> Handle refs_handle(THREAD, (oop)archived); >>>> >>>> Fixed. >>>> >>>>> >>>>> Is this experimental code? Maybe it should be removed? >>>>> >>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>> 665 #if 0 >>>>> 666 CPSlot entry = cp->slot_at(index); >>>>> 667 Symbol* name = entry.get_symbol(); >>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>> 669 if (k != NULL) { >>>>> 670 klass_at_put(index, k); >>>>> 671 } >>>>> 672 #endif >>>>> 673 } else >>>> >>>> Removed. >>>> >>>>> >>>>> cpCache.hpp: >>>>> >>>>> u8 _archived_references >>>>> >>>>> shouldn't this be declared as a narrowOop to avoid the type casts >>>>> when it's used? >>>> >>>> Ok. >>>> >>>>> >>>>> cpCache.cpp: >>>>> >>>>> add assert so that one of these is used only at dump time and >>>>> the other only at run time?
>>>>> >>>>> 610 oop ConstantPoolCache::archived_references() { >>>>> 611 return >>>>> oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>> 612 } >>>>> 613 >>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>> 616 } >>>> >>>> Ok. >>>> >>>> Thanks! >>>> >>>> Jiangli >>>> >>>>> >>>>> Thanks! >>>>> - Ioi >>>>> >>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>> Sorry, the mail didn't handle the rich text well. I fixed the >>>>>> format below. >>>>>> >>>>>> Please help review the changes for JDK-8179302 (Pre-resolve >>>>>> constant pool string entries and cache resolved_reference arrays >>>>>> in CDS archive). Currently, the CDS archive can contain cached >>>>>> class metadata, interned java.lang.String objects. This RFE adds >>>>>> the constant pool 'resolved_references' arrays (hotspot specific) >>>>>> to the archive for startup/runtime performance enhancement. >>>>>> The 'resolved_references' arrays are used to hold references of >>>>>> resolved constant pool entries including Strings, mirrors, etc. >>>>>> With the 'resolved_references' being cached, string constants in >>>>>> shared classes can now be resolved to existing interned >>>>>> java.lang.Strings at CDS dump time. G1 and 64-bit platforms are >>>>>> required. >>>>>> >>>>>> The GC changes in the RFE were discussed and guided by Thomas >>>>>> Schatzl and GC team. Part of the changes were contributed by >>>>>> Thomas himself. >>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>> hotspot: >>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>> whitebox: >>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>> >>>>>> Please see below for details of supporting cached >>>>>> 'resolved_references' and pre-resolving string constants.
>>>>>> >>>>>> Types of Pinned G1 Heap Regions >>>>>> >>>>>> The pinned region type is a super type of all archive region >>>>>> types, which include the open archive type and the closed archive >>>>>> type. >>>>>> >>>>>> 00100 0 [ 8] Pinned Mask >>>>>> 01000 0 [16] Old Mask >>>>>> 10000 0 [32] Archive Mask >>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>>>> >>>>>> >>>>>> Pinned Regions >>>>>> >>>>>> Objects within the region are 'pinned', which means GC does not >>>>>> move any live objects. GC scans and marks objects in the >>>>>> pinned region as normal, but skips forwarding live objects. >>>>>> Pointers in live objects are updated. Dead objects (unreachable) >>>>>> can be collected and freed. >>>>>> >>>>>> Archive Regions >>>>>> >>>>>> The archive types are sub-types of 'pinned'. There are two types >>>>>> of archive region currently, open archive and closed archive. >>>>>> Both can support caching java heap objects via the CDS archive. >>>>>> >>>>>> An archive region is also an old region by design. >>>>>> >>>>>> Open Archive (GC-RW) Regions >>>>>> >>>>>> Open archive region is GC writable. GC scans & marks objects >>>>>> within the region and adjusts (updates) pointers in live objects >>>>>> the same way as a pinned region. Live objects (reachable) are >>>>>> pinned and not forwarded by GC. >>>>>> Open archive region does not have 'dead' objects. Unreachable >>>>>> objects are 'dormant' objects. Dormant objects are not >>>>>> collected and freed by GC. >>>>>> >>>>>> Adjustable Outgoing Pointers >>>>>> >>>>>> As GC can adjust pointers within the live objects in open archive >>>>>> heap region, objects can have outgoing pointers to another >>>>>> java heap region, including closed archive region, open archive >>>>>> region, pinned (or humongous) region, and normal generational >>>>>> region. 
When a referenced object is moved by GC, the pointer >>>>>> within the open archive region is updated accordingly. >>>>>> >>>>>> Closed Archive (GC-RO) Regions >>>>>> >>>>>> The closed archive region is a GC read-only region. GC cannot write >>>>>> into the region. Objects are not scanned and marked by >>>>>> GC. Objects are pinned and not forwarded. Pointers are not >>>>>> updated by GC either. Hence, objects within the archive region >>>>>> cannot have any outgoing pointers to another java heap region. >>>>>> Objects however can still have pointers to other objects within >>>>>> the closed archive regions (we might allow pointers to open >>>>>> archive regions in the future). That restricts the type of java >>>>>> objects that can be supported by the archive region. >>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>> >>>>>> The GC-readonly archive region makes java heap memory sharable >>>>>> among different JVM processes. NOTE: synchronization on the >>>>>> objects within the archive heap region can still cause writes to >>>>>> the memory page. >>>>>> >>>>>> Dormant Objects >>>>>> >>>>>> Dormant objects are unreachable java objects within the open >>>>>> archive heap region. >>>>>> A java object in the open archive heap region is a live object if >>>>>> it can be reached during scanning. Some of the java objects in >>>>>> the region may not be reachable during scanning. Those objects >>>>>> are considered as dormant, but not dead. For example, a >>>>>> constant pool 'resolved_references' array is reachable via the >>>>>> klass root if its container klass (shared) is already loaded at >>>>>> the time of GC scanning. If a shared klass is not yet loaded, >>>>>> the klass root is not scanned and its constant pool >>>>>> 'resolved_references' array (A) in the open archive region is not >>>>>> reachable. Then A is a dormant object.
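
The region type tags listed under "Types of Pinned G1 Heap Regions" can be sketched as plain bit masks. This is only an illustration of the numbering given in that table; the namespace and the predicate names are hypothetical and are not taken from the webrev.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names. Only the numeric values come from the tag table
// in the description (Pinned = 8, Old = 16, Archive = 32).
namespace region_tag {

constexpr uint32_t PinnedMask  = 8;   // 00100 0
constexpr uint32_t OldMask     = 16;  // 01000 0
constexpr uint32_t ArchiveMask = 32;  // 10000 0

// Both archive types are pinned and old; the closed type differs by +1.
constexpr uint32_t OpenArchive   = ArchiveMask | PinnedMask | OldMask;        // 56
constexpr uint32_t ClosedArchive = (ArchiveMask | PinnedMask | OldMask) + 1;  // 57

constexpr bool is_pinned(uint32_t tag)         { return (tag & PinnedMask) != 0; }
constexpr bool is_old(uint32_t tag)            { return (tag & OldMask) != 0; }
constexpr bool is_archive(uint32_t tag)        { return (tag & ArchiveMask) != 0; }
constexpr bool is_open_archive(uint32_t tag)   { return tag == OpenArchive; }
constexpr bool is_closed_archive(uint32_t tag) { return tag == ClosedArchive; }

}  // namespace region_tag
```

With this encoding a single mask test answers "is this region pinned?" for both archive types, which matches the statement that pinned is a super type of the archive types and that an archive region is also an old region.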
>>>>>> >>>>>> Object State Transition >>>>>> >>>>>> All java objects are initially dormant objects when open archive >>>>>> heap regions are mapped to the runtime java heap. A >>>>>> dormant object becomes live object when the associated shared >>>>>> class is loaded at runtime. Explicit call >>>>>> to G1SATBCardTableModRefBS::enqueue() needs to be made when a >>>>>> dormant object becomes live. That should be the case for cached >>>>>> objects with strong roots as well, since strong roots are only >>>>>> scanned at the start of GC marking (the initial marking) but >>>>>> not during Remarking/Final marking. If a cached object becomes >>>>>> live during concurrent marking phase, G1 may not find it and mark >>>>>> it live unless a call to G1SATBCardTableModRefBS::enqueue() is >>>>>> made for the object. >>>>>> >>>>>> Currently, a live object in the open archive heap region cannot >>>>>> become dormant again. This restriction simplifies GC >>>>>> requirement and guarantees all outgoing pointers are updated by >>>>>> GC correctly. Only objects for shared classes from the builtin >>>>>> class loaders (boot, PlatformClassLoaders, and AppClassLoaders) >>>>>> are supported for caching. >>>>>> >>>>>> Caching Java Objects at Archive Dump Time >>>>>> >>>>>> The closed archive and open archive regions are allocated near >>>>>> the top of the dump time java heap. Archived java objects >>>>>> are copied into the designated archive heap regions. For example, >>>>>> String objects and the underlying 'value' arrays are copied into >>>>>> the closed archive regions. All references to the archived >>>>>> objects (from shared class metadata, string table, etc) are set >>>>>> to the new heap locations. A hash table is used to keep track of >>>>>> all archived java objects during the copying process to make sure >>>>>> java object is not archived more than once if reached from >>>>>> different roots. 
It also makes sure references to the same >>>>>> archived object are updated using the same new address location. >>>>>> >>>>>> Caching Constant Pool resolved_references Array >>>>>> >>>>>> The 'resolved_references' is an array that holds references of >>>>>> resolved constant pool entries including Strings, mirrors >>>>>> and methodTypes, etc. Each loaded class has one >>>>>> 'resolved_references' array (in ConstantPoolCache). The >>>>>> 'resolved_references' arrays are copied into the open archive >>>>>> regions during the dump process. Prior to copying the >>>>>> 'resolved_references' arrays, JVM iterates through constant pool >>>>>> entries and resolves all JVM_CONSTANT_String entries to existing >>>>>> interned Strings for all archived classes. When resolving, JVM >>>>>> only looks up the string table and finds existing interned >>>>>> Strings without inserting new ones. If a string entry cannot be >>>>>> resolved to an existing interned String, the constant pool entry >>>>>> remains unresolved. That prevents memory waste if a constant >>>>>> pool string entry is never used at runtime. >>>>>> >>>>>> All String objects referenced by the string table are copied >>>>>> first into the closed archive regions. The string table entry is >>>>>> updated with the new location when each String object is >>>>>> archived. The JVM updates the resolved constant pool string >>>>>> entries with the new object locations when copying the >>>>>> 'resolved_references' arrays to the open archive regions. >>>>>> References to the 'resolved_references' arrays in the >>>>>> ConstantPoolCache are also updated. >>>>>> At runtime as part of ConstantPool::restore_unshareable_info() >>>>>> work, call G1SATBCardTableModRefBS::enqueue() to let GC know the >>>>>> 'resolved_references' is becoming live. A handle is created for >>>>>> the cached object and added to the loader_data's handles.
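
The "look up only, never insert" rule for pre-resolving string entries at dump time can be illustrated with a small stand-alone sketch. All types and names here (InternTable, StringEntry, resolve_existing_only) are toy stand-ins invented for the example; they are not HotSpot's StringTable or ConstantPool APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for the string table: literal -> id of an already
// interned String. Hypothetical, not the HotSpot data structure.
using InternTable = std::unordered_map<std::string, int>;

constexpr int kUnresolved = -1;

// Toy stand-in for a JVM_CONSTANT_String constant pool entry.
struct StringEntry {
  std::string literal;
  int resolved;  // id of the interned String, or kUnresolved
};

// Resolve entries against the table without ever inserting into it.
// Entries with no interned counterpart are left unresolved, so nothing
// is created for literals that may never be used at runtime.
std::size_t resolve_existing_only(std::vector<StringEntry>& entries,
                                  const InternTable& table) {
  std::size_t resolved_count = 0;
  for (StringEntry& e : entries) {
    InternTable::const_iterator it = table.find(e.literal);
    if (it != table.end()) {
      e.resolved = it->second;
      ++resolved_count;
    }
  }
  return resolved_count;
}
```

Taking the table by const reference makes the "no new interned Strings" property visible in the signature: the function cannot grow the table even by accident.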
>>>>>> >>>>>> Runtime Java Heap With Cached Java Objects >>>>>> >>>>>> >>>>>> The closed archive regions (the string regions) and open archive >>>>>> regions are mapped to the runtime java heap at the same >>>>>> offsets as the dump time offsets from the runtime java heap base. >>>>>> >>>>>> Preliminary test execution and status: >>>>>> >>>>>> JPRT: passed >>>>>> Tier2-rt: passed >>>>>> Tier2-gc: passed >>>>>> Tier2-comp: passed >>>>>> Tier3-rt: passed >>>>>> Tier3-gc: passed >>>>>> Tier3-comp: passed >>>>>> Tier4-rt: passed >>>>>> Tier4-gc: passed >>>>>> Tier4-comp: 6 jobs timed out, all other tests passed >>>>>> Tier5-rt: one test failed but passed when running locally, all >>>>>> other tests passed >>>>>> Tier5-gc: passed >>>>>> Tier5-comp: running >>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >>>>>> vm.gc: passed >>>>>> vm.gc in CDS mode: passed >>>>>> Kitchensink: passed >>>>>> Kitchensink in CDS mode: passed >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>> >>>> >>> >> > From david.holmes at oracle.com Tue Aug 8 01:37:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Aug 2017 11:37:09 +1000 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: <97a3c455-c3d1-7df4-a20f-90f1d1102e6d@redhat.com> References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> <97a3c455-c3d1-7df4-a20f-90f1d1102e6d@redhat.com> Message-ID: On 8/08/2017
1:23 AM, Andrew Haley wrote: > On 07/08/17 00:32, Kim Barrett wrote: >> I'm looking for feedback on this before I try to carry it any further. > > I don't like it because it converts pointers to operand types before > calling the back end. > > For example, in here: > > intptr_t v = CASPTR(&_LockWord, 0, _LBIT); // agro ... > > the type of the operand LockWord is SplitWord. But the SplitWord * > argument gets converted to void* volatile* when we call this: > > inline static void* cmpxchg_ptr(void* exchange_value, volatile void* dest, void* compare_value, cmpxchg_memory_order order = memory_order_conservative) { > return cmpxchg(exchange_value, > reinterpret_cast<void* volatile*>(dest), > compare_value, > order); > > Here's what I first wrote: > > I don't see the point of such a type conversion. We could call > cmpxchg with the actual types of the operands, could we not? Why is > cmpxchg_ptr even a thing? We're casting away type information for > no reason that I can see. While I'd be happy to see cmpxchg_ptr go away, my understanding was that it was needed in the API precisely to avoid (where possible) casts at the call site. I find it very frustrating that we have to jump through so many hoops to get the compiler to get out of the way. The atomic ops work on bits not semantic types - we only need to know how many bits we're dealing with, we don't care about pointers or ints or longs etc. The only semantics we need are that we check for a value that is valid for the destination to hold, and we provide a new value that is valid for the destination to hold. We know what those values are. The compiler doesn't and continually gets in our way because of that - and so we have to jump through hoops to persuade the compiler that what we are doing is okay. End of rant. :) Cheers, David ----- > Couldn't cmpxchg_ptr() be defined as a template function in such a > way that only the back ends that actually need to cast away the > types have to do so?
That is, if the back ends can define > cmpxchg_ptr() themselves without resorting to pointer type > conversion, we should let them do so. > > But rather than sending that message straight away, I tried it. And > now I see: the compiler can't get the types right in those cases where > we have mismatched operand types in the call. Argh. The only way we > can get method resolution to work is to throw away the pointer type > information and use void* for everything. At the risk of being > boring, I repeat what I said before: IMO this is not what we should be > doing in 2017. We should be looking to the future, and get the types > to match now, at the call site. > > Also, and this is a relatively slight objection, I find myself > defining > > template<> > struct Atomic::PlatformCmpxchg<1> VALUE_OBJ_CLASS_SPEC { > template<typename T> > T operator()(T exchange_value, > T volatile* dest, > T compare_value, > cmpxchg_memory_order order) const { > return ::cmpxchg(exchange_value, dest, compare_value, order); > } > }; > > for 1, 4, and 8. I guess that can't be avoided, and in any case it > would be easy enough to do it with preprocessor macros. > From david.holmes at oracle.com Tue Aug 8 01:41:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Aug 2017 11:41:22 +1000 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: <9a934403-76d0-b296-269a-7f40b3f81208@oracle.com> References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> <9a934403-76d0-b296-269a-7f40b3f81208@oracle.com> Message-ID: Hi Harold, On 7/08/2017 11:13 PM, harold seigel wrote: > Hi David, > > Thanks for your comments! > > Please review this updated webrev. It contains the change that you > suggested. It also simplifies the implementation by statically > allocating the fixup lists before their first use.
> > http://cr.openjdk.java.net/~hseigel/bug_8185103.2/webrev/index.html I like it! Not much point in lazy initialization if things will always be initialized anyway. And presumably this has now removed the potential for allocation failure that was significant in setting the klass mirror second, so now we can set it first. Thanks, David > Thanks, Harold > > On 8/3/2017 7:03 PM, David Holmes wrote: >> Hi Harold, >> >> On 4/08/2017 7:24 AM, David Holmes wrote: >>> Hi Harold, >>> >>> On 3/08/2017 11:03 PM, harold seigel wrote: >>>> Hi, >>>> >>>> Please review this JDK-10 fix for JDK-8185103. The problem occurred >>>> because classes were being put on the fixup_module_field_list before >>>> their mirror field was set. If a (different) thread called method >>>> patch_javabase_entries() before the class's mirror field was set then >>> >>> The code that calls patch_javabase_entries has this: >>> >>> // Only the thread that actually defined the base module will get >>> here, >>> // so no locking is needed. >>> >>> // Patch any previously loaded class's module field with >>> java.base's java.lang.Module. >>> ModuleEntryTable::patch_javabase_entries(module_handle); >>> >>> so it seems that comment is wrong and that locking is indeed needed >>> somewhere! At a minimum your setting of the mirror needs a following >>> storestore barrier, or (better) the set/get of the mirror uses >>> load-acquire/store-release. >> >> Sorry - looking in more detail the necessary locking is already in >> place. A class is only added to the fixup list, under the Module_lock, >> if the base module is not yet defined. The finalization of that >> definition also occurs under the Module_lock, which in turn occurs >> before the fixup list is processed (without the lock). So as long as >> the mirror is set before the class is added to the fixup list, the >> mirror will be visible to the main thread when it processes it. 
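
The ordering argument above (set the mirror first, then add the class to the fixup list under the lock, and let the defining thread take over the list only after it has marked the base module defined under the same lock) can be sketched with standard C++ primitives. The types and names below (Klass, FixupList, add_if_base_undefined, define_base) are hypothetical stand-ins, and std::mutex merely plays the role of Module_lock; this is not the HotSpot code.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a class whose mirror must be set before
// the class is published on the fixup list.
struct Klass {
  const void* mirror = nullptr;
};

class FixupList {
 public:
  // Queue the class only while the base module is still undefined.
  // Returns false if the base is already defined, in which case the
  // caller would patch the module field directly instead.
  bool add_if_base_undefined(Klass* k) {
    std::lock_guard<std::mutex> guard(lock_);
    if (base_defined_) return false;
    pending_.push_back(k);
    return true;
  }

  // Mark the base module as defined and hand the queued classes to the
  // caller. Everything a thread wrote before it queued a class under
  // the lock (including the mirror) is visible after this returns.
  std::vector<Klass*> define_base() {
    std::lock_guard<std::mutex> guard(lock_);
    base_defined_ = true;
    std::vector<Klass*> out;
    out.swap(pending_);
    return out;
  }

 private:
  std::mutex lock_;
  bool base_defined_ = false;
  std::vector<Klass*> pending_;
};
```

The sketch relies on the caller assigning `mirror` before calling add_if_base_undefined(); if the order is reversed, the thread that drains the list may observe a null mirror, which is the failure mode described in the bug report.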
>> >> Looking at the original code: >> >> 881 // set the module field in the java_lang_Class instance >> 882 set_mirror_module_field(k, mirror, module, THREAD); >> 883 >> 884 // Setup indirection from klass->mirror >> 885 // after any exceptions can happen during allocations. >> 886 k->set_java_mirror(mirror()); >> >> it would seem simplest to just reorder the two actions - except for >> that comment about exceptions. Is the allocation exception issue less >> of an issue when doing VM initialization? What will happen? >> >> Thanks, >> David >> >>> Thanks, >>> David >>> ----- >>> >>>> this would cause a SIGSEGV because patch_javabase_entries() >>>> eventually calls obj_field_put() which tries to dereference the >>>> class's mirror field. >>>> >>>> This change fixes the problem by setting the class's mirror field >>>> before putting the class on the fixup_module_field_list. >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >>>> >>>> The fix was tested with the JCK Lang and VM tests, the JTreg >>>> hotspot, java/io, java/lang, java/util and other tests, the >>>> co-located NSK tests, and with JPRT. >>>> >>>> Additionally, the fix was tested by temporarily adding a >>>> naked_short_sleep(50) to method initialize_mirror_fields() shortly >>>> after it put a class on the fixup_module_field_list. The sleep was >>>> added in order to enhance the likelihood of patch_javabase_entries() >>>> being called before the class's mirror field got set. Without the >>>> fix, the TestThreadDumpMonitorContention.java test and the test >>>> reported in JDK-8183309 >>>> reliably got the >>>> reported SIGSEGVs. With the fix, the tests passed.
>>>> >>>> Thanks, Harold >>>> > From kim.barrett at oracle.com Tue Aug 8 02:26:54 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Aug 2017 22:26:54 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596CD126.6080100@oracle.com> <596DD713.6080806@oracle.com> <77d5a9e7-b2b0-79e4-effd-903dfe226512@redhat.com> <596E1142.6040708@oracle.com> <3432d11c-5904-a446-1cb5-419627f1794f@redhat.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> Message-ID: <3435B896-5770-45F0-B698-DC765CD08DEA@oracle.com> > On Aug 6, 2017, at 7:32 PM, Kim Barrett wrote: > I've included a mechanism for dealing with thin wrappers over > primitive types. The driver for this is oop, which is normally just a > typedef for oopDesc*, but is a class with an oopDesc* member when > CHECK_UNHANDLED_OOPS is defined (such as during fastdebug builds). We > (Erik and I) knew we would need something there, if only for the case > of oop, but hadn't agreed on a solution. Well, I'm proposing one here. If this really is only needed for oop with CHECK_UNHANDLED_OOPS, an alternative to this general IntegerTypes::Translate mechanism (and making the Atomic/OrderedAccess layer pay attention to it) would be to make that special oop class definition deal with the problem. 
In oopsHierarchy.hpp, #include atomic.hpp (and eventually orderAccess.hpp) and add definitions like the following after the class definition for oop (replacing the proposed Translate specialization) template<> inline oop Atomic::cmpxchg(oop exchange_value, oop volatile* dest, oop compare_value, cmpxchg_memory_order order) { return oop(Atomic::cmpxchg(exchange_value.obj(), reinterpret_cast<oopDesc* volatile*>(dest), compare_value.obj(), order)); } and similarly for all the other Atomic and OrderAccess operations that we want to accept oop arguments. This is kind of ugly, in that it puts all these definitions far away from where one might expect to find them. But then, this whole "oop is a typedef except when it's a class" thing is kind of ugly too. This also runs into include order problems; I'm not sure how hard those would be to deal with. Right now I'm inclined to stick with the Translate mechanism, especially because of the unknowns involved in the include order problem. Consider the Translate mechanism to be the usual additional level of indirection needed to solve a problem. From jiangli.zhou at oracle.com Tue Aug 8 02:57:42 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Mon, 7 Aug 2017 19:57:42 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> Message-ID: <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> Hi Ioi, Thanks for getting back to me. > On Aug 7, 2017, at 5:45 PM, Ioi Lam wrote: > > On 8/4/17 10:19 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Thanks for looking again. >> >>> On Aug 4, 2017, at 2:22 PM, Ioi Lam > wrote: >>> >>> Hi Jiangli, >>> >>> The code looks good in general.
I just have a few pet peeves for readability: >>> >>> >>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>> >>> 704 assert(UseG1GC, "Only support G1 GC"); >>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>> 706 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>> >>> 1615 assert(UseG1GC, "Only support G1 GC"); >>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>> 1617 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>> >>> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync? >> >> There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem worth adding another separate API for asserting the required flags. I'll use that in the assert. >> >>> >>> >>> >>> (2) FileMapInfo::write_archive_heap_regions() >>> >>> I still find this code very hard to read, especially due to the loop. >>> >>> First, the comments are not consistent with the code: >>> >>> 498 assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum"); >>> >>> but the comments say: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong. >> >> The max_num_regions is the maximum number of regions for each archived heap space (the string space, or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin's pending change. As part of Calvin's change, he will change the assert into a check and bail out if the number of MemRegions is larger than max_num_regions due to heap fragmentation.
>> >> > Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to be returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible). > > Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2. Also, if the string region is disabled, you should also disable the open_archive_heap_region > > I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch. Ok. I'll change the assert to be a check. > > >>> >>> The word "region" is used in these parameters, but they don't mean the same thing. >>> >>> GrowableArray *regions >>> int first_region, int max_num_regions, >>> >>> >>> How about regions -> g1_regions_list >>> first_region -> first_region_in_archive >> >> The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. Probably that's the confusing source. I'll make the comment more clear. >> >> Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing 'regions' to 'mem_regions' or 'archive_regions'? >> > How about heap_regions? These are regions in the active Java heap, which currently has not mapped anything from the CDS archive. Ok. I'm updating my changes and will send out a consolidated webrev. Thanks! Jiangli > > >>> >>> >>> In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive"
>> >>> >>> To make it unambiguous, how about changing >>> >>> >>> 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions. >>> >>> >>> to >>> >>> // Write the given list of G1 memory regions into the archive, starting at >>> // first_region_in_archive. >> >> >> Ok. How about the following: >> >> // Write the given list of java heap memory regions into the archive, starting at >> // first_region_in_archive. >> > Sounds good. > > Thanks > - Ioi > >>> >>> >>> Also, for the explanation of how the G1 regions are written into the archive, how about: >>> >>> // The G1 regions in the list are sorted in ascending address order. When there are more objects >>> // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the >>> // remaining G1 region(s) are consecutively allocated and fully filled. >>> // >>> // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive. >>> // The remaining G1 regions (if exist) are coalesced and written as a single block >>> // into (first_region_in_archive + 1) >>> >>> // Here's the mapping from (g1 regions) -> (archive regions). >>> >>> >>> All this function needs to do is to decide the values for >>> >>> r0_start, r0_top >>> r1_start, r1_top >>> >>> I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway). >>> >>> *r0_start = *r0_top = NULL; >>> *r1_start = *r1_top = NULL; >>> >>> if (arr_len >= 1) { >>> *r0_start = regions->at(0).start(); >>> *r0_end = *r0_start + regions->at(0).byte_size(); >>> } >>> if (arr_len >= 2) { >>> int last = arr_len - 1; >>> *r1_start = regions->at(1).start(); >>> *r1_end = regions->at(last).start() + regions->at(last).byte_size(); >>> } >>> >>> what do you think? >> >> We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. 
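The mapping discussed above, from the sorted list of heap memory regions to two archive regions, can be sketched roughly as follows. This is only an illustration in Java, not HotSpot code: `ArchiveRegionSketch`, `Range` and `coalesce` are hypothetical names, addresses are modeled as plain `long` values, and the sketch assumes (as the proposed comment states) that all regions after the first are consecutively allocated.

```java
import java.util.List;

// Hypothetical illustration only; the real code is C++ and operates on MemRegion.
final class ArchiveRegionSketch {
    // A half-open address range [start, end); start == end means "empty".
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long byteSize() {
            return end - start;
        }
    }

    static final Range EMPTY = new Range(0, 0);

    // Maps regions sorted by ascending address onto exactly two archive slots.
    // Slot 0 receives the bottom-most region. Slot 1 receives all remaining
    // regions coalesced into one block (assumes they are contiguous).
    // Unused slots stay empty, so both slots can always be written out.
    static Range[] coalesce(List<Range> sorted) {
        Range[] slots = { EMPTY, EMPTY };
        if (sorted.size() >= 1) {
            slots[0] = sorted.get(0);
        }
        if (sorted.size() >= 2) {
            Range last = sorted.get(sorted.size() - 1);
            slots[1] = new Range(sorted.get(1).start, last.end);
        }
        return slots;
    }
}
```

Returning both slots even when they are empty reflects the point made in the reply that all archive regions, including empty ones, need to be written.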
I'd like to remove the code that deals with r0_* and r1_* explicitly. Let me try that. >> >>> >>> >>> >>> (3) metaspace.cpp >>> >>> 3350 // Map the archived heap regions after compressed pointers >>> 3351 // because it relies on compressed class pointers setting to work >>> >>> do you mean this? >>> >>> // Archived heap regions depend on the parameters of compressed class pointers, so >>> // they must be mapped after such parameters have been decided in the above call. >> >> Hmmm, maybe use 'arguments' instead of 'parameters'? >> >>> >>> >>> (4) I found this name not strictly grammatical. How about this: >>> >>> allow_archive_heap_object -> is_heap_object_archiving_allowed >> >> Ok. >> >>> >>> (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb. >>> >>> archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string() >> >> Ok. >> >> Thanks, >> Jiangli >> >>> >>> >>> Thanks >>> - Ioi >>> >>> >>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>> Here are the updated webrevs. >>>> >>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>> >>>> Changes in the updated webrevs include: >>>> Merge with Ioi's recent shared space auto-sizing change (8072061) >>>> Addressed all feedback from Ioi and Coleen (Thanks for detailed review!) >>>> >>>> Thanks, >>>> Jiangli >>>> >>>> >>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote: >>>>> >>>>> Hi Ioi, >>>>> >>>>> Thank you so much for reviewing this. I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments. >>>>> >>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>> >>>>>> Hi Jiangli, >>>>>> >>>>>> Here are my comments.
I've not reviewed the GC code and I'll leave that to the GC experts :-) >>>>>> >>>>>> stringTable.cpp: StringTable::archive_string >>>>>> >>>>>> add assert for DumpSharedSpaces only >>>>> >>>>> Ok. >>>>> >>>>>> >>>>>> filemap.cpp >>>>>> >>>>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions, >>>>>> 526 int first_region, int num_regions) { >>>>>> >>>>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: >>>>>> >>>>>> 537 int len = regions->length(); >>>>>> 538 if (len > 1) { >>>>>> 539 start = (char*)regions->at(1).start(); >>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>> 541 } >>>>>> 542 } >>>>>> >>>>>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is that regions->length() is not the same as num_regions. >>>>>> >>>>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion. >>>>>> >>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) { >>>>>> if (first == MetaspaceShared::first_string) { >>>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>>> } else { >>>>>> assert(first == MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>> assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>> } >>>>>> .... >>>>>> >>>>>> >>>>> >>>>> I've reworked the function and simplified the code.
>>>>> >>>>>> >>>>>> 756 if (!string_data_mapped) { >>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity"); >>>>>> 759 } >>>>>> 760 >>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>> 763 } else { >>>>>> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity"); >>>>>> 765 } >>>>>> >>>>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()? >>>>> >>>>> Fixed. >>>>> >>>>>> >>>>>> FileMapInfo::map_heap_data() -- >>>>>> >>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>> 821 si->_allow_exec); >>>>>> >>>>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base)? If so, we need an assert here. >>>>> >>>>> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr.
>>>>> >>>>> if (base == NULL || base != addr) { >>>>> // dealloc the regions from java heap >>>>> dealloc_archive_heap_regions(regions, region_num); >>>>> if (log_is_enabled(Info, cds)) { >>>>> log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap."); >>>>> } >>>>> return false; >>>>> } >>>>> >>>>> >>>>>> >>>>>> constantPool.cpp >>>>>> >>>>>> Handle refs_handle; >>>>>> ... >>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>> >>>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles() >>>>>> >>>>>> I think it's more efficient if you merge these into a single statement >>>>>> >>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>> >>>>> Fixed. >>>>> >>>>>> >>>>>> Is this experimental code? Maybe it should be removed? >>>>>> >>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>> 665 #if 0 >>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>>> 669 if (k != NULL) { >>>>>> 670 klass_at_put(index, k); >>>>>> 671 } >>>>>> 672 #endif >>>>>> 673 } else >>>>> >>>>> Removed. >>>>> >>>>>> >>>>>> cpCache.hpp: >>>>>> >>>>>> u8 _archived_references >>>>>> >>>>>> shouldn't this be declared as an narrowOop to avoid the type casts when it's used? >>>>> >>>>> Ok. >>>>> >>>>>> >>>>>> cpCache.cpp: >>>>>> >>>>>> add assert so that one of these is used only at dump time and the other only at run time? >>>>>> >>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>> 612 } >>>>>> 613 >>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>> 616 } >>>>> >>>>> Ok. >>>>> >>>>> Thanks! >>>>> >>>>> Jiangli >>>>> >>>>>> >>>>>> Thanks! 
>>>>>> - Ioi >>>>>> >>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below. >>>>>>> >>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. >>>>>>> >>>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself. >>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>> >>>>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. >>>>>>> >>>>>>> Types of Pinned G1 Heap Regions >>>>>>> >>>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. >>>>>>> >>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>> 01000 0 [16] Old Mask >>>>>>> 10000 0 [32] Archive Mask >>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>>>>> >>>>>>> >>>>>>> Pinned Regions >>>>>>> >>>>>>> Objects within the region are 'pinned', which means GC does not move any live objects.
GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. >>>>>>> >>>>>>> Archive Regions >>>>>>> >>>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. >>>>>>> >>>>>>> An archive region is also an old region by design. >>>>>>> >>>>>>> Open Archive (GC-RW) Regions >>>>>>> >>>>>>> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. >>>>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. >>>>>>> >>>>>>> Adjustable Outgoing Pointers >>>>>>> >>>>>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. >>>>>>> >>>>>>> Closed Archive (GC-RO) Regions >>>>>>> >>>>>>> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. 
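The region type encoding listed in the table above can be sketched as a small set of bit masks. This is an illustration in Java under stated assumptions, not the actual G1 C++ code: the class name `HeapRegionTagSketch` and the predicate names are hypothetical, and only the numeric values (8, 16, 32, 56, 57) come from the table.

```java
// Hypothetical illustration of the tag values from the table above.
final class HeapRegionTagSketch {
    static final int PINNED_MASK = 8;    // 00100 0
    static final int OLD_MASK = 16;      // 01000 0
    static final int ARCHIVE_MASK = 32;  // 10000 0

    // Open archive: ArchiveMask | PinnedMask | OldMask = 56
    static final int OPEN_ARCHIVE = ARCHIVE_MASK | PINNED_MASK | OLD_MASK;
    // Closed archive: the same three masks plus 1 = 57
    static final int CLOSED_ARCHIVE = (ARCHIVE_MASK | PINNED_MASK | OLD_MASK) + 1;

    // Every archive tag carries the pinned bit, so "pinned" acts as the super type.
    static boolean isPinned(int tag) {
        return (tag & PINNED_MASK) != 0;
    }

    // Every archive tag also carries the old bit ("an archive region is also an old region").
    static boolean isOld(int tag) {
        return (tag & OLD_MASK) != 0;
    }

    static boolean isArchive(int tag) {
        return (tag & ARCHIVE_MASK) != 0;
    }

    static boolean isOpenArchive(int tag) {
        return tag == OPEN_ARCHIVE;
    }

    static boolean isClosedArchive(int tag) {
        return tag == CLOSED_ARCHIVE;
    }
}
```

With this layout, a single mask test answers "is this region pinned?" for both archive sub-types, while the low bit distinguishes open from closed archive regions.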
>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>> >>>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. >>>>>>> >>>>>>> Dormant Objects >>>>>>> >>>>>>> Dormant objects are unreachable java objects within the open archive heap region. >>>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. >>>>>>> >>>>>>> Object State Transition >>>>>>> >>>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>> >>>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. 
Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >>>>>>> >>>>>>> Caching Java Objects at Archive Dump Time >>>>>>> >>>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. >>>>>>> >>>>>>> Caching Constant Pool resolved_references Array >>>>>>> >>>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >>>>>>> >>>>>>> All String objects referenced by the string table are copied first into the closed archive regions. 
The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >>>>>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. >>>>>>> >>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>> >>>>>>> >>>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. >>>>>>> >>>>>>> Preliminary test execution and status: >>>>>>> >>>>>>> JPRT: passed >>>>>>> Tier2-rt: passed >>>>>>> Tier2-gc: passed >>>>>>> Tier2-comp: passed >>>>>>> Tier3-rt: passed >>>>>>> Tier3-gc: passed >>>>>>> Tier3-comp: passed >>>>>>> Tier4-rt: passed >>>>>>> Tier4-gc: passed >>>>>>> Tier4-comp:6 jobs timed out, all other tests passed >>>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed >>>>>>> Tier5-gc: passed >>>>>>> Tier5-comp: running >>>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >>>>>>> vm.gc: passed >>>>>>> vm.gc in CDS mode: passed >>>>>>> Kichensink: passed >>>>>>> Kichensink in CDS mode: passed >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>> >>>>> >>>> >>> >> > From thomas.stuefe at gmail.com Tue Aug 8 05:50:16 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 8 Aug 2017 07:50:16 +0200 Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64 In-Reply-To: <9155fd17-2ce5-fe36-e114-928fc8fe1c32@oracle.com> References: 
<9155fd17-2ce5-fe36-e114-928fc8fe1c32@oracle.com> Message-ID: Hi Ioi, On Mon, Aug 7, 2017 at 7:39 PM, Ioi Lam wrote: > Hi Thomas, > > Thanks for the patch! > > Skipping the test for SP != NULL and FP != NULL seems generally OK for me. > I think StackWalk64 should be robust enough that when given NULL or bogus > values for stk.AddrStack.Offset and stk.AddrFrame.Offset, it will still > somehow recover gracefully. I forgot exactly why I put in these checks, > though. I either was overly cautious, or I might have seen some problems > without such checks, which might have caused crashes inside the debug > printing routine. I really should have put in a comment there :-( > > By being generous to myself :-), I guess I would have put in an comment > had I saw crash, so the lack of comments probably meant I was just over > cautious .... > > How much testing have you done with your patch. Pretty much only the error scenario (java -XX:+ErrorHandlingTest=xx) and the gtests, both on Win x64. > Have you seen any crash inside the printing routine? > None I would attribute to my change. I know there is a very slight risk of crashing more often now, just based on the fact that we now continue stack dumping where we skipped before, and because StackWalk64 is a black box. But this is error handling, we deal with secondary crashes anyway and I think I rather have more complete callstacks in the hs-err file and risk a secondary crash instead of useless error reports. Note that callstack dumping and symbol resolution is pretty unreliable and unstable on windows anyway. See https://bugs.openjdk.java.net/browse/JDK-8185712, I am currently working on bringing improvements upstream we have in our fork. Our error handling is more reliable than stock openjdk. > > Also, by "Native callstacks unreliable", do you mean "Native callstacks > printing terminates prematurely", and not "sometimes they fail and print > erroneous information or behave unexpectedly"? 
I think it's better to > update the bug title. > > Sure, that's a better name :) I changed it. > If you need a sponsor, I'll be happy to do it. > > Thanks! Now for a second reviewer? Anyone? > Thanks > - Ioi > > ..Thomas > > > On 8/2/17 2:17 AM, Thomas Stüfe wrote: > >> Hi all, >> >> may I please have a review for this small fix. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706 >> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/ >> >> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335. Ioi Lam did a good job analyzing the original >> problem. On windows x64, the native compiler generates code which does not >> use the frame pointer (regardless whether we set -Oy-). Only in rare cases >> a frame pointer is used - e.g. for alloca()-functions - and, as Ioi >> pointed >> out, no guarantee either that RBP is actually the frame pointer. >> >> So, in os::platform_print_native_stack() >> we walk the stack using StackWalk64(), extract the pc from each frame and >> print that, like normal windows coding. However, we still test for the >> frame pointer being NULL, and abort stack tracing if it is. This causes >> stack dumping to fail quite often, and unnecessarily. >> >> For example, test: java.exe -XX:ErrorHandlerTest=12 >> >> Sometimes it works, but more out of accident - as Ioi pointed out in this >> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html. If there are java frames above the crashing >> native >> frame, we still may have RBP set to some value (does not matter which) and >> os::platform_print_native_stack() >> does not abort frame printing.
>> >> Kind Regards, Thomas >> > > From goetz.lindenmaier at sap.com Tue Aug 8 09:00:20 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 8 Aug 2017 09:00:20 +0000 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: <5988E311.1080605@oracle.com> References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> <59889970.1080301@oracle.com> <5988E311.1080605@oracle.com> Message-ID: <70dc302307ad46c98bc65e45b98682b6@sap.com> Hi Mikhailo, yes, I please need a sponsor. Thanks for the help with working on this change! I added Ioi as reviewer in the webrev, so the patches can be pushed as is. Thanks, Goetz. > -----Original Message----- > From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > Sent: Dienstag, 8. August 2017 00:01 > To: Ioi Lam > Cc: Lindenmaier, Goetz ; Igor Ignatyev > ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests > > Hi Goetz, > > Please let me know if you need a sponsor for this change. > > Mikhailo > > On 8/7/17, 10:44 AM, Ioi Lam wrote: > > Looks good to me, too. Reviewed. 
> > > > Thanks > > > > - Ioi > > > > > > > > On 8/7/17 9:46 AM, Mikhailo Seledtsov wrote: > >> The change looks good to me, > >> > >> Thank you, > >> Mikhailo > >> > >> On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: > >>> Hi, > >>> > >>> webrev with Whitebox: > >>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ > >>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ > >>> > >>> I don't see so much of a difference to throwing an exception, if > >>> Whitebox is not properly implemented you get one, anyways: > >>> Exception in thread "main" java.lang.UnsatisfiedLinkError: > >>> sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z > >>> at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) > >>> Maybe it's a bit less likely to break, though. > >>> > >>> I'm fine with this, too. > >>> > >>> Best regards, > >>> Goetz., > >>> > >>> > >>> > >>>> -----Original Message----- > >>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >>>> Sent: Freitag, 4. August 2017 21:35 > >>>> To: Ioi Lam; Lindenmaier, Goetz > >>>> ; Igor Ignatyev > >>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable > >>>> cds tests > >>>> > >>>> Hi, > >>>> > >>>> I have an alternative solution that is IMO rather simple, > >>>> reliable > >>>> and will > >>>> solve some issues we discussed (e.g. no need to throw > >>>> exceptions, no > >>>> need to handle failure to map an archive). > >>>> The proposed solution uses White Box test API to determine > >>>> whether VM > >>>> is compiled with INCLUDE_CDS on or off. > >>>> I implemented and tested it today, it works for me. > >>>> > >>>> The patch is attached. Please let me know what you think. 
> >>>> > >>>> Thank you, > >>>> Mikhailo > >>>> > >>>> On 8/3/17, 11:39 PM, Ioi Lam wrote: > >>>>> Hi Goetz, > >>>>> > >>>>> Instead of testing -Xshare:on, I think you should test with > >>>>> -Xshare:auto, which sets the flags > >>>>> > >>>>> UseSharedSpaces = true > >>>>> RequireSharedSpaces = false > >>>>> > >>>>> and will reliably print "Shared spaces are not supported in this VM" > >>>>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): > >>>>> > >>>>> > >>>>> #if !INCLUDE_CDS > >>>>> if (DumpSharedSpaces || RequireSharedSpaces) { > >>>>> jio_fprintf(defaultStream::error_stream(), > >>>>> "Shared spaces are not supported in this VM\n"); > >>>>> return JNI_ERR; > >>>>> } > >>>>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || > >>>>> log_is_enabled(Info, cds)) { > >>>>> warning("Shared spaces are not supported in this VM"); > >>>>> FLAG_SET_DEFAULT(UseSharedSpaces, false); > >>>>> LogConfiguration::configure_stdout(LogLevel::Off, true, > >>>>> LOG_TAGS(cds)); > >>>>> } > >>>>> no_shared_spaces("CDS Disabled"); > >>>>> #endif // INCLUDE_CDS > >>>>> > >>>>> > >>>>> That way, you don't need to test any other output message or exit > >>>>> conditions(such as mapping error). 
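A minimal sketch of the check suggested here, assuming the combined stdout/stderr of `java -Xshare:auto -version` has already been captured as a string. The names `CdsProbeSketch` and `cdsIncludedInBuild` are hypothetical and are not part of jtreg or VMProps; only the warning text is taken from the arguments.cpp excerpt quoted above.

```java
// Hypothetical helper: decides from captured VM output whether CDS was compiled in.
final class CdsProbeSketch {
    // Warning printed when the VM is built with INCLUDE_CDS disabled.
    static final String NOT_SUPPORTED = "Shared spaces are not supported in this VM";

    // Returns true unless the VM reported that CDS is not supported.
    // No other message (for example a mapping failure) is inspected.
    static boolean cdsIncludedInBuild(String vmOutput) {
        return !vmOutput.contains(NOT_SUPPORTED);
    }
}
```

Because only one message is examined, the result does not depend on whether an archive exists or whether it could be mapped at the requested address.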
> >>>>> > >>>>> > >>>>> E.g.: > >>>>> > >>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java > >>>>> -Xshare:auto > >>>>> -version > >>>>> java version "10-internal" > >>>>> Java(TM) SE Runtime Environment (build > >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) > >>>>> Java HotSpot(TM) 64-Bit Server VM (build > >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > >>>>> > >>>>> > >>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java > >>>>> -XXaltjvm=minimal -Xshare:auto -version > >>>>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not > >>>>> supported in this VM > >>>>> java version "10-internal" > >>>>> Java(TM) SE Runtime Environment (build > >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) > >>>>> Java HotSpot(TM) 64-Bit Minimal VM (build > >>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) > >>>>> > >>>>> > >>>>> > >>>>> Thanks > >>>>> - Ioi > >>>>> > >>>>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: > >>>>>> Hi Mikhailo, > >>>>>> > >>>>>> I put in your version of vmCDS() into this new webrev. > >>>>>> I also had to update the list of tests marked in hotspot, > >>>>>> as tests were removed and added in between, and resolved > >>>>>> it against the aot change: > >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ > >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ > >>>>>> > >>>>>> I don't think it's a good idea to swallow the exception silently > >>>>>> as you propose. > >>>>>> In our test setup, the tests would just be switched off if something > >>>>>> breaks, and no one will see that. If they fail though, it's an easy > >>>>>> and quick fix. I would at least switch them on, then one sees the > >>>>>> failing tests in case switching them on was the wrong guess. > >>>>>> Also, below, the method dump() throws an exception. 
> >>>>>> > >>>>>> Best regards, > >>>>>> Goetz > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] > >>>>>>> Sent: Tuesday, August 01, 2017 11:49 PM > >>>>>>> To: Lindenmaier, Goetz > >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable > >>>>>>> cds tests > >>>>>>> > >>>>>>> Hi Goetz, > >>>>>>> > >>>>>>> I have reviewed your updated changes, and they overall look good to > >>>> me. > >>>>>>> However, I have some comments + concerns regarding > >>>> VMProps.vmCDS(): > >>>>>>> > >>>>>>> 1. Throwing exceptions from within the vmCDS() method. > >>>>>>> > >>>>>>> The VMProps properties are evaluated at the start of each > >>>>>>> run. If > >>>>>>> the exception is thrown here the whole test run will fail (not > >>>>>>> just the > >>>>>>> test that uses '@requires vm.cds'). The failure will occur shortly > >>>>>>> after > >>>>>>> the start of jtreg test run with a message: > >>>>>>> "java.lang.RuntimeException: Can not start VM to > >>>>>>> test to > >>>>>>> find out it's features. Switching off class data sharing (CDS)." > >>>>>>> > >>>>>>> Your method has 2 throw statements: "new > >>>>>>> RuntimeException("Can > >>>>>>> not > >>>>>>> start VM..." and "java.lang.RuntimeException: Can not start VM > >>>>>>> to test > >>>>>>> to...". I would recommend a more graceful way to fail, e.g. to > >>>>>>> print > >>>>>>> the > >>>>>>> message and to return "false" instead. This way the rest of the > >>>>>>> test > >>>>>>> run > >>>>>>> will continue, but the tests requiring vm.cds will be skipped with > >>>>>>> qualification of "not selected". > >>>>>>> > >>>>>>> 2. The check for "An error has occurred while processing the shared > >>>>>>> archive file." assumes that archive was not created prior to the > >>>>>>> execution of this evaluation code. However, there are test modes > >>>> where > >>>>>>> archive is created prior to test run. 
We use such mode on regular > >>>>>>> basis. > >>>>>>> In such cases the code will fail. > >>>>>>> I recommend to run "-Xshare:on -version", and check the > >>>>>>> following match that would result in return of "true": > >>>>>>> "Java HotSpot.*sharing" > >>>>>>> > >>>>>>> 3. On occasion the mapping of shared archive region to a specified > >>>>>>> address will fail (due to system configuration, space already > >>>>>>> occupied, > >>>>>>> ASLR, etc.) > >>>>>>> > >>>>>>> Hence I recommend checking for such conditions as well: > >>>>>>> > >>>>>>> if (output.firstMatch("Unable to map") != null) { > >>>>>>> System.out.println("VMProps.vmCDS() encountered an > >>>>>>> archive > >>>>>>> mapping failure, still proceeding with vm.cds=true"); > >>>>>>> return "true"; > >>>>>>> } > >>>>>>> I am returning true here because seeing this output means > >>>>>>> that > >>>>>>> CDS > >>>>>>> feature is supported, however in this particular instance archive > >>>>>>> failed > >>>>>>> to map. > >>>>>>> > >>>>>>> > >>>>>>> The rest of the changes looks good to me. > >>>>>>> > >>>>>>> See for my version of VMProps.vmCDS() below. Let me know what you > >>>>>>> think. > >>>>>>> > >>>>>>> > >>>>>>> Thank you, > >>>>>>> > >>>>>>> Mikhailo > >>>>>>> > >>>>>>> > >>>>>>> ================== my update of VMProps.vmCDS() > >>>>>>> > >>>>>>> protected String vmCDS() { > >>>>>>> System.setProperty("test.jdk", > >>>>>>> System.getProperty("java.home")); > >>>>>>> ProcessBuilder pb = > >>>>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); > >>>>>>> OutputAnalyzer output; > >>>>>>> > >>>>>>> try { > >>>>>>> output = new OutputAnalyzer(pb.start()); > >>>>>>> } catch (IOException e) { > >>>>>>> System.err.println( "Can not start VM to test to > >>>>>>> find out > >>>>>>> it's features. " + > >>>>>>> "Switching off class data > >>>>>>> sharing (CDS)." 
+ e); > >>>>>>> return "false"; > >>>>>>> } > >>>>>>> if (output.firstMatch("Shared spaces are not > >>>>>>> supported in > >>>>>>> this > >>>>>>> VM") != null) { > >>>>>>> return "false"; > >>>>>>> } > >>>>>>> if (output.firstMatch("An error has occurred while > >>>>>>> processing > >>>>>>> the shared archive file.") != null) { > >>>>>>> return "true"; > >>>>>>> } > >>>>>>> if (output.firstMatch("Java HotSpot.*sharing") != > >>>>>>> null) { > >>>>>>> return "true"; > >>>>>>> } > >>>>>>> if (output.firstMatch("Unable to map") != null) { > >>>>>>> System.out.println("VMProps.vmCDS() encountered an > >>>>>>> archive > >>>>>>> mapping failure, still proceeding with vm.cds=true"); > >>>>>>> return "true"; > >>>>>>> } > >>>>>>> > >>>>>>> return "false"; > >>>>>>> } > >>>>>>> ================== > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I made new webrevs implementing the change with @requires: > >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ > >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- > >>>> hs/ > >>>>>>>> I also changed the bug description and synopsis. > >>>>>>>> > >>>>>>>> For the jtreg runner I would propose to set the property test.jdk > >>>>>>>> so that it is available in VMProps. Igor also ran into this > >>>>>>>> issue. > >>>>>>>> > >>>>>>>> Best regards, > >>>>>>>> Goetz. > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] > >>>>>>>>> Sent: Montag, 31. Juli 2017 22:19 > >>>>>>>>> To: Lindenmaier, Goetz > >>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to > >>>>>>>>> disable cds > >>>>>>> tests > >>>>>>>>> Hi Goetz, > >>>>>>>>> > >>>>>>>>> I have an idea on how to address your second use case. > >>>>>>>>> The idea is to define a special test property (e.g. 
> >>>>>>>>> test.cds.disable.cds.support) which will override logic inside > >>>>>>>>> the > >>>>>>>>> VMProps.vmCDSSupported(). If this property is defined to > >>>>>>>>> "true" in > >>>>>>>>> test > >>>>>>>>> invocation command then vmCDSSupported() returns false (CDS is > >>>>>>> disabled, > >>>>>>>>> not supported), and all tests marked with "@requires > >>>>>>>>> vm.cds.supported" > >>>>>>>>> will be skipped. > >>>>>>>>> > >>>>>>>>> How to use it: > >>>>>>>>> jtreg -Dtest.cds.disable.cds.support=true > >>>>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true > >>>>>>>>> > >>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java > >>>>>>>>> I prototyped this approach, it works for me. I have attached the > >>>>>>>>> diff. > >>>>>>>>> Let me know whether this works for your use case, or if you > >>>>>>>>> have any > >>>>>>>>> questions. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Thank you, > >>>>>>>>> Mikhailo > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: > >>>>>>>>>> Hi Mikhailo, > >>>>>>>>>> > >>>>>>>>>> Basically I'm fine with using the @requires property. > >>>>>>>>>> But is there a way to overrule the outcome of the method > >>>>>>>>>> implemented In VMProps.java computing the property? > >>>>>>>>>> I have two use cases for the key I want to introduce. > >>>>>>>>>> > >>>>>>>>>> First, our internal VM (we are Oracle licensees) is compiled > >>>>>>>>>> without > >>>>>>>>>> CDS support. Thus we don't want to run the CDS tests. Currently > >>>>>>>>>> we have them all listed in the ProblemList, but that's not nice, > >>>>>>>>>> especially > >>>>>>>>>> because we have to adapt it whenever a new test is added. > >>>>>>>>>> As I understand, the @requires property works fine, here. > >>>>>>>>>> > >>>>>>>>>> Second, we also test the two ports we contributed (ppc and > >>>>>>>>>> s390). > >>>>>>>>>> These > >>>>>>>>> contain > >>>>>>>>>> rudimentary cds support and so far passed all tests. 
> >>>>>>>>>> Unfortunately it > >>>>>>> broke > >>>>>>>>>> lately in jdk10. Instead of fixing it (our people are > >>>>>>>>>> working on > >>>>>>>>>> finishing > >>>>>>> our > >>>>>>>>>> internal Java 9 port) I would like to switch off all cds tests. > >>>>>>>>>> As I can set the key on the command line of jtreg, I easily can > >>>>>>>>>> do that. > >>>>>>>>>> Is there a way to do similar with the @requires property? > >>>>>>>>>> > >>>>>>>>>> Best regards, > >>>>>>>>>> Goetz. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> -----Original Message----- > >>>>>>>>>>> From: Mikhailo Seledtsov > [mailto:mikhailo.seledtsov at oracle.com] > >>>>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 > >>>>>>>>>>> To: Lindenmaier, Goetz > >>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to > >>>>>>>>>>> disable cds > >>>>>>>>> tests > >>>>>>>>>>> Hi Goetz, > >>>>>>>>>>> > >>>>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have > >>>>>>>>>>> discussed your > >>>>>>> proposed > >>>>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the > >>>>>>> following > >>>>>>>>>>> feedback on this change. > >>>>>>>>>>> > >>>>>>>>>>> 1. As part of streamlining and simplifying SQE process and the > >>>>>>>>>>> use of > >>>>>>>>>>> test tools we have narrowed down the test selection mechanisms. > >>>>>>>>>>> > >>>>>>>>>>> 2. Our preferred test selection mechanism is use of "@requires" > >>>>>>>>>>> and a > >>>>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though > >>>>>>> JTREG > >>>>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a > >>>>>>>>>>> first > >>>>>>>>> choice. > >>>>>>>>>>> 3. If it is not possible to use "@requires" for a given > >>>>>>>>>>> situation then > >>>>>>>>>>> use "@key" mechanism. We would ask you if you could explore > the > >>>>>>>>>>> possibility of implementing this change via @requires first. 
> >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Here are several hints that may help: > >>>>>>>>>>> > >>>>>>>>>>> 1. Take a look at /test/jtreg-ext/requires/VMProps.java. > >>>>>>>>>>> The > >>>>>>>>>>> value > >>>>>>>>>>> of a given "requires property" is evaluated inside this file > >>>>>>>>>>> and > >>>>>>>>>>> placed > >>>>>>>>>>> into a map (see public call() method). Add your evaluation code > >>>>>>>>>>> here, > >>>>>>>>>>> and then follow the pattern used for other properties. Create a > >>>>>>> property > >>>>>>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a > >>>> method > >>>>>>> that > >>>>>>>>>>> evaluates the property value (e.g. isCDSSupported() or > >>>>>>>>>>> similar). > >>>>>>>>>>> > >>>>>>>>>>> 2. The method could use several options to evaluate whether CDS > >>>> is > >>>>>>>>>>> supported. > >>>>>>>>>>> A. WhiteBox API. Create a new WB test API method > >>>>>>>>>>> which can > >>>>>>> return > >>>>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. > >>>>>>>>>>> Call WB API from VMProps.java. See > >>>>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create > >>>> your > >>>>>>>>> own > >>>>>>>>>>> WB.isCDSSupported() > >>>>>>>>>>> WhiteBox.java resides in > >>>>>>>>>>> test/lib/sun/hotspot/WhiteBox.java > >>>>>>>>>>> > >>>>>>>>>>> B. Another option is to evaluate by running VM with > >>>>>>>>>>> sharing on and > >>>>>>>>>>> checking the return (may not be as reliable as option A) > >>>>>>>>>>> C. Other ideas welcome. > >>>>>>>>>>> > >>>>>>>>>>> 3. Include "@requires vm.cds.supported == true" to the > >>>> appropriate > >>>>>>> tests. > >>>>>>>>>>> Let me know if you have any questions. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Best regards, > >>>>>>>>>>> Mikhailo > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: > >>>>>>>>>>>> Hi > >>>>>>>>>>>> > >>>>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests > >>>>>>>>>>>> fail.
This change introduces a keyword 'cds' and marks > >>>>>>>>>>>> the tests accordingly. > >>>>>>>>>>>> This change also fixes the keywords specified in > >>>>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. > >>>>>>>>>>>> There may only be one @key keyword in the test specification. > >>>>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java > only > >>>> one > >>>>>>>>> test > >>>>>>>>>>>> case required CDS. I changed this sub case to succeed if > >>>>>>>>>>>> CDS is > >>>>>>>>>>>> not > >>>>>>>>>>>> available. > >>>>>>>>>>>> > >>>>>>>>>>>> Please review this change. I please need a sponsor. > >>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- > >>>> cdsKey/webrev.01/ > >>>>>>>>>>>> Best regards, > >>>>>>>>>>>> Goetz. > > From harold.seigel at oracle.com Tue Aug 8 12:36:09 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 8 Aug 2017 08:36:09 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> <9a934403-76d0-b296-269a-7f40b3f81208@oracle.com> Message-ID: <6c5f7cb3-dc35-89e2-cb38-b750bab2faf9@oracle.com> Thanks David! Harold On 8/7/2017 9:41 PM, David Holmes wrote: > Hi Harold, > > On 7/08/2017 11:13 PM, harold seigel wrote: >> Hi David, >> >> Thanks for your comments! >> >> Please review this updated webrev. It contains the change that you >> suggested. It also simplifies the implementation by statically >> allocating the fixup lists before their first use. >> >> http://cr.openjdk.java.net/~hseigel/bug_8185103.2/webrev/index.html > > I like it! Not much point in lazy initialization if things will always > be initialized anyway. And presumably this has now removed the > potential for allocation failure that was significant in setting the > klass mirror second, so now we can set it first. 
> > Thanks, > David > >> Thanks, Harold >> >> On 8/3/2017 7:03 PM, David Holmes wrote: >>> Hi Harold, >>> >>> On 4/08/2017 7:24 AM, David Holmes wrote: >>>> Hi Harold, >>>> >>>> On 3/08/2017 11:03 PM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this JDK-10 fix for JDK-8185103. The problem >>>>> occurred because classes were being put on the >>>>> fixup_module_field_list before their mirror field was set. If a >>>>> (different) thread called method patch_javabase_entries() before >>>>> the class's mirror field was set then >>>> >>>> The code that calls patch_javabase_entries has this: >>>> >>>> // Only the thread that actually defined the base module will get >>>> here, >>>> // so no locking is needed. >>>> >>>> // Patch any previously loaded class's module field with >>>> java.base's java.lang.Module. >>>> ModuleEntryTable::patch_javabase_entries(module_handle); >>>> >>>> so it seems that comment is wrong and that locking is indeed needed >>>> somewhere! At a minimum your setting of the mirror needs a >>>> following storestore barrier, or (better) the set/get of the mirror >>>> uses load-acquire/store-release. >>> >>> Sorry - looking in more detail the necessary locking is already in >>> place. A class is only added to the fixup list, under the >>> Module_lock, if the base module is not yet defined. The finalization >>> of that definition also occurs under the Module_lock, which in turn >>> occurs before the fixup list is processed (without the lock). So as >>> long as the mirror is set before the class is added to the fixup >>> list, the mirror will be visible to the main thread when it >>> processes it. >>> >>> Looking at the original code: >>> >>> 881 // set the module field in the java_lang_Class instance >>> 882 set_mirror_module_field(k, mirror, module, THREAD); >>> 883 >>> 884 // Setup indirection from klass->mirror >>> 885 // after any exceptions can happen during allocations. 
>>> 886 k->set_java_mirror(mirror()); >>> >>> it would seem simplest to just reorder the two actions - except for >>> that comment about exceptions. Is the allocation exception issue >>> less of an issue when doing VM initialization? What will happen? >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> this would cause a SIGSEGV because patch_javabase_entries() >>>>> eventually calls obj_field_put() which tries to dereference the >>>>> class's mirror field. >>>>> >>>>> This change fixes the problem by setting the class's mirror field >>>>> before putting the class on the fixup_module_field_list. >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >>>>> >>>>> The fix was tested with the JCK Lang and VM tests, the JTreg >>>>> hotspot, java/io, java/lang, java/util and other tests, the >>>>> co-located NSK tests, and with JPRT. >>>>> >>>>> Additionally, the fix was tested by temporarily adding a >>>>> naked_short_sleep(50) to method initialize_mirror_fields() shortly >>>>> after it put a class on the fixup_module_field_list. The sleep was >>>>> added in order to enhance the likelihood of >>>>> patch_javabase_entries() being called before the class's mirror >>>>> field got set. Without the fix, the >>>>> TestThreadDumpMonitorContent.java test and the test reported in >>>>> JDK-8183309 >>>>> reliably got the reported SIGSEGVs. With the fix, the tests passed. 
>>>>> >>>>> Thanks, Harold >>>>> >> From aph at redhat.com Tue Aug 8 13:14:00 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 8 Aug 2017 14:14:00 +0100 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> Message-ID: <5c6b3711-ab1b-1ea1-50e9-1b916eb8a5dc@redhat.com> On 07/08/17 18:49, Kim Barrett wrote: >> On Aug 7, 2017, at 11:23 AM, Andrew Haley wrote: >> >> On 07/08/17 00:32, Kim Barrett wrote: >>> I'm looking for feedback on this before I try to carry it any further. >> >> I don't like it because it converts pointers to operand types before >> calling the back end. >> >> For example, in here: >> >> intptr_t v = CASPTR(&_LockWord, 0, _LBIT); // agro ... >> >> the type of the operand LockWord is SplitWord. But the SplitWord * >> argument gets converted to void* volatile* when we call this: >> >> inline static void* cmpxchg_ptr(void* exchange_value, volatile void* dest, void* compare_value, cmpxchg_memory_order order = memory_order_conservative) { >> return cmpxchg(exchange_value, >> reinterpret_cast(dest), >> compare_value, >> order); >> Here's what I first wrote: >> >> I don't see the point of such a type conversion. We could call >> cmpxchg with the actual types of the operands, could we not? Why is >> cmpxchg_ptr even a thing? We're casting away type information for >> no reason that I can see. 
>> >> Couldn't cmpxchg_ptr() be defined as a template function in such a >> way that only the back ends that actually need to cast away the >> types have to do so? That is, if the back ends can define >> cmpxchg_ptr() themselves without resorting to pointer type >> conversion, we should let them do so. >> >> But rather than sending that message straight away, I tried it. And >> now I see: the compiler can't get the types right in those cases where >> we have mismatched operand types in the call. Argh. The only way we >> can get method resolution to work is to throw away the pointer type >> information and use void* for everything. At the risk of being >> boring, I repeat what I said before: IMO this is not what we should be >> doing in 2017. We should be looking to the future, and get the types >> to match now, at the call site. > > Maybe you've forgotten this, from Erik's original RFR email? > > "The X_ptr member functions have been deprecated, but are still > there and can be used with identical behaviour as they had > before. But new code should just use the non-ptr member functions > instead." No, I hadn't forgotten, it's because I wrote a version of this patch which made the problem go away. But that did result in a few changes at call sites, as discussed. > So I think I'm entirely in agreement with Andrew about the target, > just not necessarily in the timing of reaching it. OK. > What's wrong with > > template > struct Atomic::PlatformCmpxchg VALUE_OBJ_CLASS_SPEC { > template > T operator()(T nv, T volatile* d, T ov, cmpxchg_memory_order order) const { > return ::cmpxchg(nv, d, ov, order); > } > }; Thanks. That seems to work, but I have no idea why. :-) > and maybe an explicit specialization on 2 that errors rather than > calling ::cmpxchg if that's needed? No, there's no need for that: if anyone uses cmpxchg(short) that'll just work.
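For readers following the thread, the shape of the functor quoted above can be illustrated with a minimal sketch. This is not the HotSpot code: it assumes GCC or Clang (for the __atomic_compare_exchange_n builtin) and integral or pointer operands, it drops the memory-order parameter, and the names platform_cmpxchg, PlatformCmpxchgSketch, cmpxchg_sketch and try_lock_word are hypothetical. The point it shows is that the struct is selected by operand size while operator() is templated on the operand type, so the types are deduced at the call site and no pointer cast is needed.

```cpp
#include <cstddef>

// Hypothetical stand-in for a platform primitive. Assumes GCC or Clang:
// __atomic_compare_exchange_n is a compiler builtin, not a HotSpot API.
// Returns the value that was stored in *dest before the operation.
template <typename T>
inline T platform_cmpxchg(T new_value, T volatile* dest, T compare_value) {
  // On failure the builtin writes the current value of *dest into
  // compare_value; on success compare_value already equals the old value.
  __atomic_compare_exchange_n(dest, &compare_value, new_value,
                              /*weak=*/false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return compare_value;
}

// Hypothetical size-dispatched functor: the struct is chosen by operand
// size, while operator() keeps the full operand type T.
template <std::size_t byte_size>
struct PlatformCmpxchgSketch {
  template <typename T>
  T operator()(T new_value, T volatile* dest, T compare_value) const {
    static_assert(sizeof(T) == byte_size,
                  "operand size must match the dispatch size");
    return platform_cmpxchg(new_value, dest, compare_value);
  }
};

// Hypothetical front end: T is deduced from the arguments, so a mismatch
// between the operand types is a compile-time error at the call site.
template <typename T>
inline T cmpxchg_sketch(T new_value, T volatile* dest, T compare_value) {
  return PlatformCmpxchgSketch<sizeof(T)>()(new_value, dest, compare_value);
}

// Usage example: take a lock word from 0 to 1, reporting whether we won.
inline bool try_lock_word(volatile long* word) {
  return cmpxchg_sketch(1L, word, 0L) == 0L;
}
```

In this sketch a 2-byte operand goes through the same path (for example cmpxchg_sketch<short>(7, &s, 5)), which mirrors the remark that no explicit specialization on 2 is needed.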
I guess I will drop my objection to cmpxchg_ptr() staying as it is, because it looks like we have a general improvement on the status quo. It certainly seems to work, and everything inlines beautifully. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleen.phillimore at oracle.com Tue Aug 8 16:17:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 8 Aug 2017 12:17:31 -0400 Subject: RFR 8185103: TestThreadDumpMonitorContention.java crashed due to SIGSEGV in G1SATBCardTableModRefBS::write_ref_field_pre_work In-Reply-To: References: <9b2cdc6e-fba2-bd4a-6c4a-aa81b10376f5@oracle.com> <7973a8e4-232b-2618-ee24-ab455bab5abc@oracle.com> <9a934403-76d0-b296-269a-7f40b3f81208@oracle.com> Message-ID: <9b8761ae-c079-bba0-18ab-d29bef6f95b8@oracle.com> Yes, I like it too. Looks good. Coleen On 8/7/17 9:41 PM, David Holmes wrote: > Hi Harold, > > On 7/08/2017 11:13 PM, harold seigel wrote: >> Hi David, >> >> Thanks for your comments! >> >> Please review this updated webrev. It contains the change that you >> suggested. It also simplifies the implementation by statically >> allocating the fixup lists before their first use. >> >> http://cr.openjdk.java.net/~hseigel/bug_8185103.2/webrev/index.html > > I like it! Not much point in lazy initialization if things will always > be initialized anyway. And presumably this has now removed the > potential for allocation failure that was significant in setting the > klass mirror second, so now we can set it first. > > Thanks, > David > >> Thanks, Harold >> >> On 8/3/2017 7:03 PM, David Holmes wrote: >>> Hi Harold, >>> >>> On 4/08/2017 7:24 AM, David Holmes wrote: >>>> Hi Harold, >>>> >>>> On 3/08/2017 11:03 PM, harold seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this JDK-10 fix for JDK-8185103. The problem >>>>> occurred because classes were being put on the >>>>> fixup_module_field_list before their mirror field was set. 
If a >>>>> (different) thread called method patch_javabase_entries() before >>>>> the class's mirror field was set then >>>> >>>> The code that calls patch_javabase_entries has this: >>>> >>>> // Only the thread that actually defined the base module will get >>>> here, >>>> // so no locking is needed. >>>> >>>> // Patch any previously loaded class's module field with >>>> java.base's java.lang.Module. >>>> ModuleEntryTable::patch_javabase_entries(module_handle); >>>> >>>> so it seems that comment is wrong and that locking is indeed needed >>>> somewhere! At a minimum your setting of the mirror needs a >>>> following storestore barrier, or (better) the set/get of the mirror >>>> uses load-acquire/store-release. >>> >>> Sorry - looking in more detail the necessary locking is already in >>> place. A class is only added to the fixup list, under the >>> Module_lock, if the base module is not yet defined. The finalization >>> of that definition also occurs under the Module_lock, which in turn >>> occurs before the fixup list is processed (without the lock). So as >>> long as the mirror is set before the class is added to the fixup >>> list, the mirror will be visible to the main thread when it >>> processes it. >>> >>> Looking at the original code: >>> >>> 881 // set the module field in the java_lang_Class instance >>> 882 set_mirror_module_field(k, mirror, module, THREAD); >>> 883 >>> 884 // Setup indirection from klass->mirror >>> 885 // after any exceptions can happen during allocations. >>> 886 k->set_java_mirror(mirror()); >>> >>> it would seem simplest to just reorder the two actions - except for >>> that comment about exceptions. Is the allocation exception issue >>> less of an issue when doing VM initialization? What will happen? >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> this would cause a SIGSEGV because patch_javabase_entries() >>>>> eventually calls obj_field_put() which tries to dereference the >>>>> class's mirror field. 
>>>>> >>>>> This change fixes the problem by setting the class's mirror field >>>>> before putting the class on the fixup_module_field_list. >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8185103/webrev/index.html >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8185103 >>>>> >>>>> The fix was tested with the JCK Lang and VM tests, the JTreg >>>>> hotspot, java/io, java/lang, java/util and other tests, the >>>>> co-located NSK tests, and with JPRT. >>>>> >>>>> Additionally, the fix was tested by temporarily adding a >>>>> naked_short_sleep(50) to method initialize_mirror_fields() shortly >>>>> after it put a class on the fixup_module_field_list. The sleep was >>>>> added in order to enhance the likelihood of >>>>> patch_javabase_entries() being called before the class's mirror >>>>> field got set. Without the fix, the >>>>> TestThreadDumpMonitorContent.java test and the test reported in >>>>> JDK-8183309 >>>>> reliably got the reported SIGSEGVs. With the fix, the tests passed. >>>>> >>>>> Thanks, Harold >>>>> >> From mikhailo.seledtsov at oracle.com Tue Aug 8 18:06:47 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Tue, 8 Aug 2017 11:06:47 -0700 Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests In-Reply-To: <70dc302307ad46c98bc65e45b98682b6@sap.com> References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> <59889970.1080301@oracle.com> <5988E311.1080605@oracle.com> <70dc302307ad46c98bc65e45b98682b6@sap.com> Message-ID: Hi Goetz, I will test your patch and then will do a sponsor push. Thank you, Mikhailo On 08/08/2017 02:00 AM, Lindenmaier, Goetz wrote: > Hi Mikhailo, > > yes, I please need a sponsor. 
> Thanks for the help with working on this change! > I added Ioi as reviewer in the webrev, so the patches > can be pushed as is. > > Thanks, > Goetz. > >> -----Original Message----- >> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >> Sent: Dienstag, 8. August 2017 00:01 >> To: Ioi Lam >> Cc: Lindenmaier, Goetz ; Igor Ignatyev >> ; hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests >> >> Hi Goetz, >> >> Please let me know if you need a sponsor for this change. >> >> Mikhailo >> >> On 8/7/17, 10:44 AM, Ioi Lam wrote: >>> Looks good to me, too. Reviewed. >>> >>> Thanks >>> >>> - Ioi >>> >>> >>> >>> On 8/7/17 9:46 AM, Mikhailo Seledtsov wrote: >>>> The change looks good to me, >>>> >>>> Thank you, >>>> Mikhailo >>>> >>>> On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: >>>>> Hi, >>>>> >>>>> webrev with Whitebox: >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ >>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ >>>>> >>>>> I don't see so much of a difference to throwing an exception, if >>>>> Whitebox is not properly implemented you get one, anyways: >>>>> Exception in thread "main" java.lang.UnsatisfiedLinkError: >>>>> sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z >>>>> at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) >>>>> Maybe it's a bit less likely to break, though. >>>>> >>>>> I'm fine with this, too. >>>>> >>>>> Best regards, >>>>> Goetz., >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>> Sent: Freitag, 4. 
August 2017 21:35 >>>>>> To: Ioi Lam; Lindenmaier, Goetz >>>>>> ; Igor Ignatyev >>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>> cds tests >>>>>> >>>>>> Hi, >>>>>> >>>>>> I have an alternative solution that is IMO rather simple, >>>>>> reliable >>>>>> and will >>>>>> solve some issues we discussed (e.g. no need to throw >>>>>> exceptions, no >>>>>> need to handle failure to map an archive). >>>>>> The proposed solution uses White Box test API to determine >>>>>> whether VM >>>>>> is compiled with INCLUDE_CDS on or off. >>>>>> I implemented and tested it today, it works for me. >>>>>> >>>>>> The patch is attached. Please let me know what you think. >>>>>> >>>>>> Thank you, >>>>>> Mikhailo >>>>>> >>>>>> On 8/3/17, 11:39 PM, Ioi Lam wrote: >>>>>>> Hi Goetz, >>>>>>> >>>>>>> Instead of testing -Xshare:on, I think you should test with >>>>>>> -Xshare:auto, which sets the flags >>>>>>> >>>>>>> UseSharedSpaces = true >>>>>>> RequireSharedSpaces = false >>>>>>> >>>>>>> and will reliably print "Shared spaces are not supported in this VM" >>>>>>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): >>>>>>> >>>>>>> >>>>>>> #if !INCLUDE_CDS >>>>>>> if (DumpSharedSpaces || RequireSharedSpaces) { >>>>>>> jio_fprintf(defaultStream::error_stream(), >>>>>>> "Shared spaces are not supported in this VM\n"); >>>>>>> return JNI_ERR; >>>>>>> } >>>>>>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || >>>>>>> log_is_enabled(Info, cds)) { >>>>>>> warning("Shared spaces are not supported in this VM"); >>>>>>> FLAG_SET_DEFAULT(UseSharedSpaces, false); >>>>>>> LogConfiguration::configure_stdout(LogLevel::Off, true, >>>>>>> LOG_TAGS(cds)); >>>>>>> } >>>>>>> no_shared_spaces("CDS Disabled"); >>>>>>> #endif // INCLUDE_CDS >>>>>>> >>>>>>> >>>>>>> That way, you don't need to test any other output message or exit >>>>>>> conditions(such as mapping error). 
>>>>>>> >>>>>>> >>>>>>> E.g.: >>>>>>> >>>>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>>>> -Xshare:auto >>>>>>> -version >>>>>>> java version "10-internal" >>>>>>> Java(TM) SE Runtime Environment (build >>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>>>> Java HotSpot(TM) 64-Bit Server VM (build >>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>>>> >>>>>>> >>>>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>>>> -XXaltjvm=minimal -Xshare:auto -version >>>>>>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not >>>>>>> supported in this VM >>>>>>> java version "10-internal" >>>>>>> Java(TM) SE Runtime Environment (build >>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>>>> Java HotSpot(TM) 64-Bit Minimal VM (build >>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >>>>>>>> Hi Mikhailo, >>>>>>>> >>>>>>>> I put in your version of vmCDS() into this new webrev. >>>>>>>> I also had to update the list of tests marked in hotspot, >>>>>>>> as tests were removed and added in between, and resolved >>>>>>>> it against the aot change: >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >>>>>>>> >>>>>>>> I don't think it's a good idea to swallow the exception silently >>>>>>>> as you propose. >>>>>>>> In our test setup, the tests would just be switched off if something >>>>>>>> breaks, and no one will see that. If they fail though, it's an easy >>>>>>>> and quick fix. I would at least switch them on, then one sees the >>>>>>>> failing tests in case switching them on was the wrong guess. >>>>>>>> Also, below, the method dump() throws an exception. 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Goetz >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>> Sent: Tuesday, August 01, 2017 11:49 PM >>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>>>>> cds tests >>>>>>>>> >>>>>>>>> Hi Goetz, >>>>>>>>> >>>>>>>>> I have reviewed your updated changes, and they overall look good to >>>>>> me. >>>>>>>>> However, I have some comments + concerns regarding >>>>>> VMProps.vmCDS(): >>>>>>>>> 1. Throwing exceptions from within the vmCDS() method. >>>>>>>>> >>>>>>>>> The VMProps properties are evaluated at the start of each >>>>>>>>> run. If >>>>>>>>> the exception is thrown here the whole test run will fail (not >>>>>>>>> just the >>>>>>>>> test that uses '@requires vm.cds'). The failure will occur shortly >>>>>>>>> after >>>>>>>>> the start of jtreg test run with a message: >>>>>>>>> "java.lang.RuntimeException: Can not start VM to >>>>>>>>> test to >>>>>>>>> find out it's features. Switching off class data sharing (CDS)." >>>>>>>>> >>>>>>>>> Your method has 2 throw statements: "new >>>>>>>>> RuntimeException("Can >>>>>>>>> not >>>>>>>>> start VM..." and "java.lang.RuntimeException: Can not start VM >>>>>>>>> to test >>>>>>>>> to...". I would recommend a more graceful way to fail, e.g. to >>>>>>>>> print >>>>>>>>> the >>>>>>>>> message and to return "false" instead. This way the rest of the >>>>>>>>> test >>>>>>>>> run >>>>>>>>> will continue, but the tests requiring vm.cds will be skipped with >>>>>>>>> qualification of "not selected". >>>>>>>>> >>>>>>>>> 2. The check for "An error has occurred while processing the shared >>>>>>>>> archive file." assumes that archive was not created prior to the >>>>>>>>> execution of this evaluation code. However, there are test modes >>>>>> where >>>>>>>>> archive is created prior to test run. 
We use such mode on regular >>>>>>>>> basis. >>>>>>>>> In such cases the code will fail. >>>>>>>>> I recommend to run "-Xshare:on -version", and check the >>>>>>>>> following match that would result in return of "true": >>>>>>>>> "Java HotSpot.*sharing" >>>>>>>>> >>>>>>>>> 3. On occasion the mapping of shared archive region to a specified >>>>>>>>> address will fail (due to system configuration, space already >>>>>>>>> occupied, >>>>>>>>> ASLR, etc.) >>>>>>>>> >>>>>>>>> Hence I recommend checking for such conditions as well: >>>>>>>>> >>>>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>>>> archive >>>>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>>>> return "true"; >>>>>>>>> } >>>>>>>>> I am returning true here because seeing this output means >>>>>>>>> that >>>>>>>>> CDS >>>>>>>>> feature is supported, however in this particular instance archive >>>>>>>>> failed >>>>>>>>> to map. >>>>>>>>> >>>>>>>>> >>>>>>>>> The rest of the changes looks good to me. >>>>>>>>> >>>>>>>>> See for my version of VMProps.vmCDS() below. Let me know what you >>>>>>>>> think. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thank you, >>>>>>>>> >>>>>>>>> Mikhailo >>>>>>>>> >>>>>>>>> >>>>>>>>> ================== my update of VMProps.vmCDS() >>>>>>>>> >>>>>>>>> protected String vmCDS() { >>>>>>>>> System.setProperty("test.jdk", >>>>>>>>> System.getProperty("java.home")); >>>>>>>>> ProcessBuilder pb = >>>>>>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>>>>>>>> OutputAnalyzer output; >>>>>>>>> >>>>>>>>> try { >>>>>>>>> output = new OutputAnalyzer(pb.start()); >>>>>>>>> } catch (IOException e) { >>>>>>>>> System.err.println( "Can not start VM to test to >>>>>>>>> find out >>>>>>>>> it's features. " + >>>>>>>>> "Switching off class data >>>>>>>>> sharing (CDS)." 
+ e); >>>>>>>>> return "false"; >>>>>>>>> } >>>>>>>>> if (output.firstMatch("Shared spaces are not >>>>>>>>> supported in >>>>>>>>> this >>>>>>>>> VM") != null) { >>>>>>>>> return "false"; >>>>>>>>> } >>>>>>>>> if (output.firstMatch("An error has occurred while >>>>>>>>> processing >>>>>>>>> the shared archive file.") != null) { >>>>>>>>> return "true"; >>>>>>>>> } >>>>>>>>> if (output.firstMatch("Java HotSpot.*sharing") != >>>>>>>>> null) { >>>>>>>>> return "true"; >>>>>>>>> } >>>>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>>>> archive >>>>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>>>> return "true"; >>>>>>>>> } >>>>>>>>> >>>>>>>>> return "false"; >>>>>>>>> } >>>>>>>>> ================== >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I made new webrevs implementing the change with @requires: >>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- >>>>>> hs/ >>>>>>>>>> I also changed the bug description and synopsis. >>>>>>>>>> >>>>>>>>>> For the jtreg runner I would propose to set the property test.jdk >>>>>>>>>> so that it is available in VMProps. Igor also ran into this >>>>>>>>>> issue. >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Goetz. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>>> Sent: Montag, 31. Juli 2017 22:19 >>>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>>> disable cds >>>>>>>>> tests >>>>>>>>>>> Hi Goetz, >>>>>>>>>>> >>>>>>>>>>> I have an idea on how to address your second use case. >>>>>>>>>>> The idea is to define a special test property (e.g. 
>>>>>>>>>>> test.cds.disable.cds.support) which will override logic inside >>>>>>>>>>> the >>>>>>>>>>> VMProps.vmCDSSupported(). If this property is defined to >>>>>>>>>>> "true" in >>>>>>>>>>> test >>>>>>>>>>> invocation command then vmCDSSupported() returns false (CDS is >>>>>>>>> disabled, >>>>>>>>>>> not supported), and all tests marked with "@requires >>>>>>>>>>> vm.cds.supported" >>>>>>>>>>> will be skipped. >>>>>>>>>>> >>>>>>>>>>> How to use it: >>>>>>>>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>>>> >>>>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>>>>>>>> I prototyped this approach, it works for me. I have attached the >>>>>>>>>>> diff. >>>>>>>>>>> Let me know whether this works for your use case, or if you >>>>>>>>>>> have any >>>>>>>>>>> questions. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thank you, >>>>>>>>>>> Mikhailo >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>>> Hi Mikhailo, >>>>>>>>>>>> >>>>>>>>>>>> Basically I'm fine with using the @requires property. >>>>>>>>>>>> But is there a way to overrule the outcome of the method >>>>>>>>>>>> implemented In VMProps.java computing the property? >>>>>>>>>>>> I have two use cases for the key I want to introduce. >>>>>>>>>>>> >>>>>>>>>>>> First, our internal VM (we are Oracle licensees) is compiled >>>>>>>>>>>> without >>>>>>>>>>>> CDS support. Thus we don't want to run the CDS tests. Currently >>>>>>>>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>>>>>>>> especially >>>>>>>>>>>> because we have to adapt it whenever a new test is added. >>>>>>>>>>>> As I understand, the @requires property works fine, here. >>>>>>>>>>>> >>>>>>>>>>>> Second, we also test the two ports we contributed (ppc and >>>>>>>>>>>> s390). >>>>>>>>>>>> These >>>>>>>>>>> contain >>>>>>>>>>>> rudimentary cds support and so far passed all tests. 
>>>>>>>>>>>> Unfortunately it >>>>>>>>> broke >>>>>>>>>>>> lately in jdk10. Instead of fixing it (our people are >>>>>>>>>>>> working on >>>>>>>>>>>> finishing >>>>>>>>> our >>>>>>>>>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>>>>>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>>>>>>>> do that. >>>>>>>>>>>> Is there a way to do similar with the @requires property? >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Goetz. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>> From: Mikhailo Seledtsov >> [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>>>>> disable cds >>>>>>>>>>> tests >>>>>>>>>>>>> Hi Goetz, >>>>>>>>>>>>> >>>>>>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have >>>>>>>>>>>>> discussed your >>>>>>>>> proposed >>>>>>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>>>>>>>> following >>>>>>>>>>>>> feedback on this change. >>>>>>>>>>>>> >>>>>>>>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>>>>>>>> use of >>>>>>>>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>>>>>>>> >>>>>>>>>>>>> 2. Our preferred test selection mechanism is use of "@requires" >>>>>>>>>>>>> and a >>>>>>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>>>>>>>> JTREG >>>>>>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>>>>>>>> first >>>>>>>>>>> choice. >>>>>>>>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>>>>>>>> situation then >>>>>>>>>>>>> use "@key" mechanism. We would ask you if you could explore >> the >>>>>>>>>>>>> possibility of implementing this change via @requires first. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Here are several hints that may help: >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. >>>>>>>>>>>>> The >>>>>>>>>>>>> value >>>>>>>>>>>>> of a given "requires property" is evaluated inside this file >>>>>>>>>>>>> and >>>>>>>>>>>>> placed >>>>>>>>>>>>> into a map (see public call() method). Add your evaluation code >>>>>>>>>>>>> here, >>>>>>>>>>>>> and then follow the pattern used for other properties. Create a >>>>>>>>> property >>>>>>>>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a >>>>>> method >>>>>>>>> that >>>>>>>>>>>>> evaluates the property value (e.g. isCDSSupported() or >>>>>>>>>>>>> similar). >>>>>>>>>>>>> >>>>>>>>>>>>> 2. The method could use several options to evaluate whether CDS >>>>>> is >>>>>>>>>>>>> supported. >>>>>>>>>>>>> A. WhiteBox API. Create a new WB test API method >>>>>>>>>>>>> which can >>>>>>>>> return >>>>>>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>>>>>>>> Call WB API from VMProps.java. See >>>>>>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create >>>>>> your >>>>>>>>>>> own >>>>>>>>>>>>> WB.isCDSSupported() >>>>>>>>>>>>> WhiteBox.java resides in >>>>>>>>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>>>>>>>> >>>>>>>>>>>>> B. Another options is to evaluate by running VM with >>>>>>>>>>>>> sharing on and >>>>>>>>>>>>> checking the return (may be not as reliable as option A) >>>>>>>>>>>>> C. Other ideas welcome. >>>>>>>>>>>>> >>>>>>>>>>>>> 3. Include "@requres vm.cds.supported == true" to the >>>>>> appropriate >>>>>>>>> tests. >>>>>>>>>>>>> Let me know if you have any questions. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Mikhailo >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>>>>> Hi >>>>>>>>>>>>>> >>>>>>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>>>>>>>> fail. 
This change introduces a keyword 'cds' and marks >>>>>>>>>>>>>> the tests accordingly. >>>>>>>>>>>>>> This change also fixes the keywords specified in >>>>>>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java >> only >>>>>> one >>>>>>>>>>> test >>>>>>>>>>>>>> case required CDS. I changed this sub case to succeed if >>>>>>>>>>>>>> CDS is >>>>>>>>>>>>>> not >>>>>>>>>>>>>> available. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please review this change. I please need a sponsor. >>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- >>>>>> cdsKey/webrev.01/ >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Goetz. From kim.barrett at oracle.com Tue Aug 8 20:25:52 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 8 Aug 2017 16:25:52 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: <5c6b3711-ab1b-1ea1-50e9-1b916eb8a5dc@redhat.com> References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> <5c6b3711-ab1b-1ea1-50e9-1b91! 6eb8a5dc@redhat.com> Message-ID: > On Aug 8, 2017, at 9:14 AM, Andrew Haley wrote: > > On 07/08/17 18:49, Kim Barrett wrote: >>> On Aug 7, 2017, at 11:23 AM, Andrew Haley wrote: >>> >>> On 07/08/17 00:32, Kim Barrett wrote: >>>> I'm looking for feedback on this before I try to carry it any further. >>> >>> I don't like it because it converts pointers to operand types before >>> calling the back end. 
>>> >>> For example, in here: >>> >>> intptr_t v = CASPTR(&_LockWord, 0, _LBIT); // agro ... >>> >>> the type of the operand LockWord is SplitWord. But the SplitWord * >>> argument gets converted to void* volatile* when we call this: >>> >>> inline static void* cmpxchg_ptr(void* exchange_value, volatile void* dest, void* compare_value, cmpxchg_memory_order order = memory_order_conservative) { >>> return cmpxchg(exchange_value, >>> reinterpret_cast<void* volatile*>(dest), >>> compare_value, >>> order); >>> Here's what I first wrote: >>> >>> I don't see the point of such a type conversion. We could call >>> cmpxchg with the actual types of the operands, could we not? Why is >>> cmpxchg_ptr even a thing? We're casting away type information for >>> no reason that I can see. >>> >>> Couldn't cmpxchg_ptr() be defined as a template function in such a >>> way that only the back ends that actually need to cast away the >>> types have to do so? That is, if the back ends can define >>> cmpxchg_ptr() themselves without resorting to pointer type >>> conversion, we should let them do so. >>> >>> But rather than sending that message straight away, I tried it. And >>> now I see: the compiler can't get the types right in those cases where >>> we have mismatched operand types in the call. Argh. The only way we >>> can get method resolution to work is to throw away the pointer type >>> information and use void* for everything. At the risk of being >>> boring, I repeat what I said before: IMO this is not what we should be >>> doing in 2017. We should be looking to the future, and get the types >>> to match now, at the call site. >> >> Maybe you've forgotten this, from Erik's original RFR email? >> >> "The X_ptr member functions have been deprecated, but are still >> there and can be used with identical behaviour as they had >> before. But new code should just use the non-ptr member functions >> instead."
> > No, I hadn't forgotten, it's because I wrote a version of this patch > which made the problem go away. But that did result in a few changes > at call sites, as discussed. I recall a discussion about fixing (8-ish) call sites to work with a requirement that all the types match. Those were ordinary cmpxchg, not cmpxchg_ptr. There are *lots* of cmpxchg_ptr call sites that are problematic. Just for starters, there are the 20-25 that pass NULL as the compare_value. >> So I think I'm entirely in agreement with Andrew about the target, >> just not necessarily in the timing of reaching it. > > OK. > >> What's wrong with >> >> template >> struct Atomic::PlatformCmpxchg VALUE_OBJ_CLASS_SPEC { >> template >> T operator()(T nv, T volatile* d, T ov, cmpxchg_memory_order order) const { >> return ::cmpxchg(nv, d, ov, order); >> } >> }; > > Thanks. That seems to work, but I have no idea why. :-) Atomic declares PlatformCmpxchg (and fails to document its requirements; I did say this was a prototype...), but does not provide any definition. The above definition is unspecialized on the size, so is applicable for any size value. Since the body code doesn't seem to care about the size (or rather, figures out what it needs on its own, inside ::cmpxchg)... >> and maybe an explicit specialization on 2 that errors rather than >> calling ::cmpxchg if that's needed? > > No, there's no need for that: if anyone uses cmpxchg(short) that'll > just work. > > I guess I will drop my objection to cmpxchg_ptr() staying as it is, > because it looks like we have a general improvement on the status quo. > It certainly seems to work, and everything inlines beautifully. Great! Thank you. My plan at this point is to focus on finishing cmpxchg, and put just that (with the associated infrastructure) out for review. Then circle back to deal with the other operations, using the new infrastructure, approach, and any additional lessons learned from cmpxchg.
That should also make the handoff of remaining work back to Erik go more smoothly when he comes back from vacation and I start mine. > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From jiangli.zhou at Oracle.COM Wed Aug 9 00:33:45 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Tue, 8 Aug 2017 17:33:45 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> Message-ID: <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> Here is the incremental webrev that has all the changes incorporated with suggestions from Coleen, Ioi and Thomas: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ Thanks again for Coleen's, Ioi's and Thomas' review! Jiangli > On Aug 7, 2017, at 7:57 PM, Jiangli Zhou wrote: > > Hi Ioi, > > Thanks for getting back to me. > >> On Aug 7, 2017, at 5:45 PM, Ioi Lam wrote: >> >> On 8/4/17 10:19 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>> >>> Thanks for looking again. >>> >>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam wrote: >>>> >>>> Hi Jiangli, >>>> >>>> The code looks good in general.
I just have a few pet peeves for readability: >>>> >>>> >>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>>> >>>> 704 assert(UseG1GC, "Only support G1 GC"); >>>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>>> 706 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>> >>>> 1615 assert(UseG1GC, "Only support G1 GC"); >>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>>> 1617 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>> >>>> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync? >>> >>> There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem worth adding another separate API for asserting the required flags. I'll use that in the assert. >>> >>>> >>>> >>>> >>>> (2) FileMapInfo::write_archive_heap_regions() >>>> >>>> I still find this code very hard to read, especially due to the loop. >>>> >>>> First, the comments are not consistent with the code: >>>> >>>> 498 assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum"); >>>> >>>> but the comment says: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong. >>> >>> The max_num_regions is the maximum number of regions for each archived heap space (the string space, or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin's pending change. As part of Calvin's change, he will change the assert into a check and bail out if the number of MemRegions is larger than max_num_regions due to heap fragmentation.
>>> >>> >>> >> Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to be returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible). >> >> Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2. Also, if the string region is disabled, you should also disable the open_archive_heap_region >> >> I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch. > > Ok. I'll change the assert to be a check. > >> >> >>>> >>>> The word "region" is used in these parameters, but they don't mean the same thing. >>>> >>>> GrowableArray<MemRegion> *regions >>>> int first_region, int max_num_regions, >>>> >>>> >>>> How about regions -> g1_regions_list >>>> first_region -> first_region_in_archive >>> >>> The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. That is probably the source of the confusion. I'll make the comment clearer. >>> >>> Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing 'regions' to 'mem_regions' or 'archive_regions'? >>> >> How about heap_regions? These are regions in the active Java heap, which currently has not mapped anything from the CDS archive. > > Ok. > > I'm updating my changes and will send out a consolidated webrev. > > Thanks! > Jiangli > >> >> >>>> >>>> >>>> In the comments, I find the phrase 'the current archive heap region' ambiguous.
It could be (erroneously) interpreted as "a region from the currently mapped archive". >>> >>>> >>>> To make it unambiguous, how about changing >>>> >>>> >>>> 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions. >>>> >>>> >>>> to >>>> >>>> // Write the given list of G1 memory regions into the archive, starting at >>>> // first_region_in_archive. >>> >>> >>> Ok. How about the following: >>> >>> // Write the given list of java heap memory regions into the archive, starting at >>> // first_region_in_archive. >>> >> Sounds good. >> >> Thanks >> - Ioi >> >>>> >>>> >>>> Also, for the explanation of how the G1 regions are written into the archive, how about: >>>> >>>> // The G1 regions in the list are sorted in ascending address order. When there are more objects >>>> // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the >>>> // remaining G1 region(s) are consecutively allocated and fully filled. >>>> // >>>> // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive. >>>> // The remaining G1 regions (if any) are coalesced and written as a single block >>>> // into (first_region_in_archive + 1) >>>> >>>> // Here's the mapping from (g1 regions) -> (archive regions). >>>> >>>> >>>> All this function needs to do is to decide the values for >>>> >>>> r0_start, r0_top >>>> r1_start, r1_top >>>> >>>> I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway). >>>> >>>> *r0_start = *r0_top = NULL; >>>> *r1_start = *r1_top = NULL; >>>> >>>> if (arr_len >= 1) { >>>> *r0_start = regions->at(0).start(); >>>> *r0_top = *r0_start + regions->at(0).byte_size(); >>>> } >>>> if (arr_len >= 2) { >>>> int last = arr_len - 1; >>>> *r1_start = regions->at(1).start(); >>>> *r1_top = regions->at(last).start() + regions->at(last).byte_size(); >>>> } >>>> >>>> What do you think?
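The mapping proposed in the quoted comment above (first heap range kept as is, all trailing ranges coalesced into one block) can be sketched roughly as follows. This is only a model written in Java for illustration; the real code is C++ inside HotSpot, and HeapRange and ArchiveRegionMapper are hypothetical names that do not exist in the JDK sources. It assumes the input is sorted by address and that the trailing ranges are contiguous, as the comment describes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a HotSpot MemRegion: the address range [start, start + byteSize). */
final class HeapRange {
    final long start;
    final long byteSize;

    HeapRange(long start, long byteSize) {
        this.start = start;
        this.byteSize = byteSize;
    }

    long end() {
        return start + byteSize;
    }
}

/** Hypothetical helper that models the (heap ranges) -> (archive regions) mapping. */
final class ArchiveRegionMapper {
    /**
     * Maps address-sorted heap ranges onto at most two archive regions.
     * The first range is written unchanged; every remaining range is folded
     * into a single block that spans from the second range's start to the
     * last range's end.
     */
    static List<HeapRange> toArchiveRegions(List<HeapRange> sorted) {
        List<HeapRange> out = new ArrayList<>();
        if (sorted.isEmpty()) {
            return out; // nothing to archive
        }
        out.add(sorted.get(0));
        if (sorted.size() >= 2) {
            HeapRange second = sorted.get(1);
            HeapRange last = sorted.get(sorted.size() - 1);
            out.add(new HeapRange(second.start, last.end() - second.start));
        }
        return out;
    }
}
```

With three input ranges the result has two entries, so the number of archive regions stays bounded no matter how many ranges the GC hands back, provided the contiguity assumption holds.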
>>> >>> We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. I'd like to remove the code that deals with r0_* and r1_* explicitly. Let me try that. >>> >>>> >>>> >>>> >>>> (3) metaspace.cpp >>>> >>>> 3350 // Map the archived heap regions after compressed pointers >>>> 3351 // because it relies on compressed class pointers setting to work >>>> >>>> do you mean this? >>>> >>>> // Archived heap regions depend on the parameters of compressed class pointers, so >>>> // they must be mapped after such parameters have been decided in the above call. >>> >>> Hmmm, maybe use 'arguments' instead of 'parameters'? >>> >>>> >>>> >>>> (4) I found this name not strictly grammatical. How about this: >>>> >>>> allow_archive_heap_object -> is_heap_object_archiving_allowed >>> >>> Ok. >>> >>>> >>>> (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb. >>>> >>>> archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string() >>> >>> Ok. >>> >>> Thanks, >>> Jiangli >>> >>>> >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>>> Here are the updated webrevs. >>>>> >>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>>> >>>>> Changes in the updated webrevs include: >>>>> Merge with Ioi's recent shared space auto-sizing change (8072061) >>>>> Addressed all feedback from Ioi and Coleen (thanks for the detailed review!) >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>> >>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote: >>>>>> >>>>>> Hi Ioi, >>>>>> >>>>>> Thank you so much for reviewing this. I've addressed all your feedback. Please see details below.
I'll update the webrev after addressing Coleen's comments. >>>>>> >>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>>> >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) >>>>>>> >>>>>>> stringTable.cpp: StringTable::archive_string >>>>>>> >>>>>>> add assert for DumpSharedSpaces only >>>>>> >>>>>> Ok. >>>>>> >>>>>>> >>>>>>> filemap.cpp >>>>>>> >>>>>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions, >>>>>>> 526 int first_region, int num_regions) { >>>>>>> >>>>>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: >>>>>>> >>>>>>> 537 int len = regions->length(); >>>>>>> 538 if (len > 1) { >>>>>>> 539 start = (char*)regions->at(1).start(); >>>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>>> 541 } >>>>>>> 542 } >>>>>>> >>>>>>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. >>>>>>> >>>>>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion. >>>>>>> >>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) { >>>>>>> if (first == MetaspaceShared::first_string) { >>>>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>>>> } else { >>>>>>> assert(first == MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>>> assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>>> } >>>>>>> .... >>>>>>> >>>>>>> >>>>>> >>>>>> I've reworked the function and simplified the code.
>>>>>> >>>>>>> >>>>>>> 756 if (!string_data_mapped) { >>>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity"); >>>>>>> 759 } >>>>>>> 760 >>>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>>> 763 } else { >>>>>>> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity"); >>>>>>> 765 } >>>>>>> >>>>>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()? >>>>>> >>>>>> Fixed. >>>>>> >>>>>>> >>>>>>> FileMapInfo::map_heap_data() -- >>>>>>> >>>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>>> 821 si->_allow_exec); >>>>>>> >>>>>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base)? If so, we need an assert here. >>>>>> >>>>>> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr.
>>>>>> >>>>>> if (base == NULL || base != addr) { >>>>>> // dealloc the regions from java heap >>>>>> dealloc_archive_heap_regions(regions, region_num); >>>>>> if (log_is_enabled(Info, cds)) { >>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap."); >>>>>> } >>>>>> return false; >>>>>> } >>>>>> >>>>>> >>>>>>> >>>>>>> constantPool.cpp >>>>>>> >>>>>>> Handle refs_handle; >>>>>>> ... >>>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>>> >>>>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles() >>>>>>> >>>>>>> I think it's more efficient if you merge these into a single statement >>>>>>> >>>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>>> >>>>>> Fixed. >>>>>> >>>>>>> >>>>>>> Is this experimental code? Maybe it should be removed? >>>>>>> >>>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>>> 665 #if 0 >>>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>>>> 669 if (k != NULL) { >>>>>>> 670 klass_at_put(index, k); >>>>>>> 671 } >>>>>>> 672 #endif >>>>>>> 673 } else >>>>>> >>>>>> Removed. >>>>>> >>>>>>> >>>>>>> cpCache.hpp: >>>>>>> >>>>>>> u8 _archived_references >>>>>>> >>>>>>> shouldn't this be declared as an narrowOop to avoid the type casts when it's used? >>>>>> >>>>>> Ok. >>>>>> >>>>>>> >>>>>>> cpCache.cpp: >>>>>>> >>>>>>> add assert so that one of these is used only at dump time and the other only at run time? >>>>>>> >>>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>>> 612 } >>>>>>> 613 >>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>>> 616 } >>>>>> >>>>>> Ok. 
>>>>>> >>>>>> Thanks! >>>>>> >>>>>> Jiangli >>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> - Ioi >>>>>>> >>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below. >>>>>>>> >>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. >>>>>>>> >>>>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself. >>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>>> >>>>>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. >>>>>>>> >>>>>>>> Types of Pinned G1 Heap Regions >>>>>>>> >>>>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>>>>>>>> >>>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>>> 01000 0 [16] Old Mask >>>>>>>> 10000 0 [32] Archive Mask >>>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>>>>>> >>>>>>>> >>>>>>>> Pinned Regions >>>>>>>> >>>>>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. >>>>>>>> >>>>>>>> Archive Regions >>>>>>>> >>>>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. >>>>>>>> >>>>>>>> An archive region is also an old region by design. >>>>>>>> >>>>>>>> Open Archive (GC-RW) Regions >>>>>>>> >>>>>>>> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. >>>>>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC. >>>>>>>> >>>>>>>> Adjustable Outgoing Pointers >>>>>>>> >>>>>>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. >>>>>>>> >>>>>>>> Closed Archive (GC-RO) Regions >>>>>>>> >>>>>>>> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. 
Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. >>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>>> >>>>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. >>>>>>>> >>>>>>>> Dormant Objects >>>>>>>> >>>>>>>> Dormant objects are unreachable java objects within the open archive heap region. >>>>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. >>>>>>>> >>>>>>>> Object State Transition >>>>>>>> >>>>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. 
That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>>> >>>>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >>>>>>>> >>>>>>>> Caching Java Objects at Archive Dump Time >>>>>>>> >>>>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. >>>>>>>> >>>>>>>> Caching Constant Pool resolved_references Array >>>>>>>> >>>>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. 
Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >>>>>>>> >>>>>>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >>>>>>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. >>>>>>>> >>>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>>> >>>>>>>> >>>>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. 
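The region type encoding listed under "Types of Pinned G1 Heap Regions" above can be sketched as a handful of constants and predicates. This is only an illustration built from the numbers in the table: the names Tag, PinnedMask, OpenArchiveTag and the is_* helpers are assumptions made for this sketch and are not copied from the HotSpot sources in the webrev.

```cpp
#include <cstdint>

// Illustrative sketch only. All names in this namespace are hypothetical;
// the numeric values are the ones given in the region type table.
namespace sketch {

using Tag = std::uint32_t;

constexpr Tag PinnedMask  = 8;   // 00100 0
constexpr Tag OldMask     = 16;  // 01000 0
constexpr Tag ArchiveMask = 32;  // 10000 0

// Both archive types carry the archive, pinned and old bits;
// the lowest bit distinguishes closed (1) from open (0).
constexpr Tag OpenArchiveTag   = ArchiveMask | PinnedMask | OldMask;  // 56
constexpr Tag ClosedArchiveTag = OpenArchiveTag + 1;                  // 57

constexpr bool is_pinned(Tag t)         { return (t & PinnedMask) != 0; }
constexpr bool is_old(Tag t)            { return (t & OldMask) != 0; }
constexpr bool is_archive(Tag t)        { return (t & ArchiveMask) != 0; }
constexpr bool is_open_archive(Tag t)   { return t == OpenArchiveTag; }
constexpr bool is_closed_archive(Tag t) { return t == ClosedArchiveTag; }

// Compile-time checks of the relationships stated in the description:
// every archive region is also a pinned region and an old region.
static_assert(OpenArchiveTag == 56, "open archive tag matches the table");
static_assert(ClosedArchiveTag == 57, "closed archive tag matches the table");
static_assert(is_pinned(OpenArchiveTag) && is_old(OpenArchiveTag),
              "open archive regions are pinned and old");
static_assert(is_pinned(ClosedArchiveTag) && is_old(ClosedArchiveTag),
              "closed archive regions are pinned and old");

}  // namespace sketch
```

With this layout a single mask test answers "is this any kind of archive region", while the open/closed distinction needs an exact tag comparison.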
>>>>>>>>
>>>>>>>> Preliminary test execution and status:
>>>>>>>>
>>>>>>>> JPRT: passed
>>>>>>>> Tier2-rt: passed
>>>>>>>> Tier2-gc: passed
>>>>>>>> Tier2-comp: passed
>>>>>>>> Tier3-rt: passed
>>>>>>>> Tier3-gc: passed
>>>>>>>> Tier3-comp: passed
>>>>>>>> Tier4-rt: passed
>>>>>>>> Tier4-gc: passed
>>>>>>>> Tier4-comp: 6 jobs timed out, all other tests passed
>>>>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed
>>>>>>>> Tier5-gc: passed
>>>>>>>> Tier5-comp: running
>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>>>>>>> vm.gc: passed
>>>>>>>> vm.gc in CDS mode: passed
>>>>>>>> Kitchensink: passed
>>>>>>>> Kitchensink in CDS mode: passed
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiangli
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From thomas.schatzl at oracle.com Wed Aug 9 11:59:07 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 09 Aug 2017 13:59:07 +0200
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <638FE232-DCA9-4B46-83FA-F4A81A80949B@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1502117570.2950.49.camel@oracle.com> <638FE232-DCA9-4B46-83FA-F4A81A80949B@oracle.com>
Message-ID: <1502279947.2367.15.camel@oracle.com>

Hi Jiangli,

  going to look at the new webrev in the other email again, answering some questions here. Thanks for considering all these suggestions.

On Mon, 2017-08-07 at 16:39 -0700, Jiangli Zhou wrote:
> Hi Thomas,
>
> Thanks a lot for the review!
>
> > On Aug 7, 2017, at 7:52 AM, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Thu, 2017-08-03 at 17:15 -0700, Jiangli Zhou wrote:
> > > Here are the updated webrevs.
> > >
> > > http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/
> > > http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/
> > >
> > > Changes in the updated webrevs include:
> > > Merge with Ioi's recent shared space auto-sizing change (8072061)
> > > Addressed all feedbacks from Ioi and Coleen (Thanks for detailed
> > > review!)
> >
> > - the comment in g1Allocator.hpp:326 needs to be updated. I would
> > merge the information from the _open member here. I.e. define what
> > "open" and "closed" archive are.
> Good catch. I updated the comments as the following:
>
> // G1ArchiveAllocator is used to allocate memory in archive
> // regions. Such regions are not scavenged nor compacted by GC.
> // There are two types of archive regions, which are
> // differ in the kind of references allowed for the contained
> // objects:
> //
> // - 'Closed' archive region contain no references outside of archive

better: other "closed" archive

> //   regions. The region is immutable by GC. GC does not mark object
> //   header in 'closed' archive region.
> // - An 'open' archive region may contain references pointing to
> //   non-archive heap region. GC can adjust pointers and mark object
> //   header in 'open' archive region.
> [...]
> > - g1CollectedHeap.cpp:750 + 756: maybe make a static method to
> > avoid repetition here.
> I changed the code to be following. New static function is a little
> overkill since the usage is very limited. :-)

Okay :)

> > - G1NoteEndOfConcMarkClosure::doHeapRegion(): would it be too much
> > work to make an extra CR out of this change? It is a change that
> > fixes an existing bug unrelated to this change after all (not doing
> > remembered set cleanup work for archive regions).
> Using separate CR to track this sounds good. I just created JDK-8185924.
> Since we have been testing the fix with other changes
> together, I'll integrate them together with both CRs.

Since you assigned this to me, do you want me to post the RFR?
> >
> > - g1HeapVerifier.cpp: there is a verbose flag passed around. Not
> > sure if it should be kept, as it seems to be some code that has
> > been used for debugging this feature, but can't be activated anyway
> > without code changes.
> Removed. Thanks.
> > - heapRegion.inline.hpp:125: I think the existing code of faking
> > open archive regions as all-live does not work as implemented.
> > Consider the case when a new object in there is made live, and
> > references in there set to refer to some object outside this
> > region, and is the only reference (and it has not been marked live
> > yet): if there is a remembered set entry to that, and it is about
> > to be scanned.
> >
> > The current implementation of HeapRegion::is_obj_dead() will
> > consider it dead, so we will enter the code path at line 125.
> > Block_size_using bitmap() will jump over that object, but the
> > return values of is_obj_dead_with_size() method will indicate the
> > caller to not iterate over this object anyway, potentially missing
> > that reference.
> >
> > HeapRegion::is_obj_dead() needs to return that the object is not
> > dead for open archive regions. I think for now the safest way is to
> > add !is_open_archive() to the condition calculated there. That will
> > obviate the need for that existing hack to the assert too.
> >
> > It may have some perf impact though - actually recently there has
> > been some effort to remove that is_archive() check from that code
> > (the one that is now the is_closed_archive() assert). I do not see
> > an easy way to fix this. :( (i.e. there is likely no perf impact
> > vs. jdk9 so it's not that bad)
> >
> > This suggestion also only works with the assumption laid out in the
> > CR that there is no way that a live object can not become dormant
> > again, and the objects in the open archive regions are always
> > parsable (never contain junk data).
> Thank you for the analysis.
> I changed HeapRegion::is_obj_dead() with
> added !is_open_archive() condition as you suggested. I'm glad to get
> rid of the is_open_archive() change from is_obj_dead_with_size()
> assert.
>
> Thinking more about the case you described above, when an object (A)
> in the open archive just becomes live, there would be no reference to
> any other non-archive region at the moment. The object A only
> contains references to the archive (open or closed) regions
> initially. Scanning has no issue at the moment. When a new object (B)
> is allocated in the java heap and the reference is set in A. B is
> considered live, scanning would update the A->B reference
> accordingly. Is that correct?

  that is exactly the case this suggested change covers. Otherwise the not-yet-marked object would be skipped, and the reference not updated.

As mentioned I will look again at the new webrev.

Thomas

From harold.seigel at oracle.com Wed Aug 9 14:01:47 2017
From: harold.seigel at oracle.com (harold seigel)
Date: Wed, 9 Aug 2017 10:01:47 -0400
Subject: RFR 8177741: Fix hotspot tests to use --patch-module instead of -Xmodule
Message-ID:

Hi,

Please review this JDK-10 change to replace "-Xmodule" with "--patch-module" in the test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler class and in affected tests. The option name "-Xmodule" was used briefly for java and javac during JDK-9 development but was eventually replaced with "--patch-module". However, InMemoryJavaCompiler did not get updated to use "--patch-module".

Open Webrevs:
http://cr.openjdk.java.net/~hseigel/bug_8177741.test/webrev/
http://cr.openjdk.java.net/~hseigel/bug_8177741.hs/webrev/

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8177741

The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util and other tests, the co-located NSK tests, and with JPRT.
Thanks, Harold

From thomas.schatzl at oracle.com Wed Aug 9 15:12:33 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 09 Aug 2017 17:12:33 +0200
Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive
In-Reply-To: <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com>
References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com>
Message-ID: <1502291553.14972.19.camel@oracle.com>

Hi,

On Tue, 2017-08-08 at 17:33 -0700, Jiangli Zhou wrote:
> Here is the incremental webrev that has all the changes incorporated
> with suggestions from Coleen, Ioi and Thomas:
>
> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/
>
> Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/
>

 - (just repeating) g1Allocator.hpp:328: "Closed" archive region contain no references outside of archive ... -> outside of other closed archive regions.
As for the description of open archive regions, it may be useful to allow references to any other regions (afaik there is no restriction on pointing to other open archive regions, and none to closed).

 - g1Allocator.inline.hpp:68 and 71: argument alignment.

 - filemap.cpp:473: missing space before the opening bracket

 - filemap.cpp:475: s/consqusective/consecutive

 - g1HeapVerifier.cpp:252: comment should say "Should be closed archive region"

 - filemap.cpp:675: this is just a note about the big comment. It mentions "archive string regions" and "string regions" (also seen "metaspace string region") which may or may not be defined elsewhere. Maybe it is useful to consolidate these terms.
The text also mentions "regions", which is a G1 specific term. Not sure if there is a better way to define these areas in this context.

The comment in metaspaceShared.cpp seems to carefully avoid this by using "shared strings" and "shared archive heap space" (i.e. no mention of "regions") most of the time. Sometimes "regions" is also used as "area".

Consolidating this may avoid confusion between "G1 regions" and region as "area" and help a reader.

Or maybe I'm just reading too much into this :)

I have no particular opinion on whether you want to change anything here.

 - FileMapInfo::map_heap_data(): there are quite a few occurrences of this construct:

 761     if (log_is_enabled(Info, cds)) {
 762       log_info(cds)("UseSharedSpaces: Unable to allocate region, "

Afaik the log_info() call already only prints out data if log_is_enabled(Info, cds), so it seems superfluous. The use of log_is_enabled() call seems only useful if the block inside contains some expensive computation, which does not seem the case here.

Sorry for bringing up these mostly cosmetic issues that late. I do not need a re-review for the comment changes (and removal of the log_is_enabled()).

Thanks,
  Thomas

From coleen.phillimore at oracle.com Wed Aug 9 17:39:46 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 9 Aug 2017 13:39:46 -0400
Subject: RFR (XS) 8068317: No_Safepoint_Verifier is not necessary in Rewriter::scan_method
Message-ID: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com>

Summary: remove NSV, Method* can't move or be redefined while being rewritten

One down, lots more to look at.

open webrev at http://cr.openjdk.java.net/~coleenp/8068317.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8068317

Tested with tier1 testing on linux/x64. Will run JPRT.
Thanks, Coleen From shade at redhat.com Wed Aug 9 17:45:24 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 9 Aug 2017 19:45:24 +0200 Subject: RFR (XS) 8068317: No_Safepoint_Verifier is not necessary in Rewriter::scan_method In-Reply-To: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> References: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> Message-ID: On 08/09/2017 07:39 PM, coleen.phillimore at oracle.com wrote: > open webrev at http://cr.openjdk.java.net/~coleenp/8068317.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8068317 Ok, so code_base mentioned in comment is still there, but allocated in Metaspace. Looks good. Thanks, -Aleksey From jiangli.zhou at Oracle.COM Wed Aug 9 17:46:30 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Wed, 9 Aug 2017 10:46:30 -0700 Subject: RFR (XS) 8068317: No_Safepoint_Verifier is not necessary in Rewriter::scan_method In-Reply-To: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> References: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> Message-ID: Looks good. Jiangli > On Aug 9, 2017, at 10:39 AM, coleen.phillimore at oracle.com wrote: > > Summary: remove NSV, Method* can't move or be redefined while being rewritten > > One down, lots more to look at. > > open webrev at http://cr.openjdk.java.net/~coleenp/8068317.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8068317 > > Tested with tier1 testing on linux/x64. Will run JPRT. 
> > Thanks, > Coleen From coleen.phillimore at oracle.com Wed Aug 9 17:47:11 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Aug 2017 13:47:11 -0400 Subject: RFR (XS) 8068317: No_Safepoint_Verifier is not necessary in Rewriter::scan_method In-Reply-To: References: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> Message-ID: <5027e10d-d14d-d7fc-043f-0be7d3403e9c@oracle.com> On 8/9/17 1:45 PM, Aleksey Shipilev wrote: > On 08/09/2017 07:39 PM, coleen.phillimore at oracle.com wrote: >> open webrev at http://cr.openjdk.java.net/~coleenp/8068317.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8068317 > Ok, so code_base mentioned in comment is still there, but allocated in Metaspace. Looks good. Yes. Metaspace objects don't move (except during CDS dumping). Thanks for the quick review. Coleen > > Thanks, > -Aleksey > From jiangli.zhou at oracle.com Wed Aug 9 18:39:03 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 9 Aug 2017 11:39:03 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <1502291553.14972.19.camel@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> <1502291553.14972.19.camel@oracle.com> Message-ID: <164DFAA0-26A9-49BA-85D2-CA0442C24A5E@oracle.com> Hi Thomas, Thank you so much for looking at the update in great detail! All points were taken and fixed as suggested. 
> On Aug 9, 2017, at 8:12 AM, Thomas Schatzl wrote:
>
> Hi,
>
> On Tue, 2017-08-08 at 17:33 -0700, Jiangli Zhou wrote:
>> Here is the incremental webrev that has all the changes incorporated
>> with suggestions from Coleen, Ioi and Thomas:
>>
>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/
>>
>> Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/
>>
>
> - (just repeating) g1Allocator.hpp:328: "Closed" archive region
> contain no references outside of archive ... -> outside of other closed
> archive regions.
> As for the description of open archive regions, it may be useful to
> allow references to any other regions (afaik there is no restriction on
> pointing to other open archive regions, and none to closed).

Fixed.

>
> - g1Allocator.inline.hpp:68 and 71: argument alignment.

Fixed.

>
> - filemap.cpp:473: missing space before the opening bracket

Fixed.

>
> - filemap.cpp:475: s/consqusective/consecutive

Fixed.

>
> - g1HeapVerifier.cpp:252: comment should say "Should be closed archive
> region"

Fixed.

>
> - filemap.cpp:675: this is just a note about the big comment. It
> mentions "archive string regions" and "string regions" (also seen
> "metaspace string region") which may or may not be defined elsewhere.
> Maybe it is useful to consolidate these terms.
>
> The text also mentions "regions", which is a G1 specific term. Not sure
> if there is a better way to define these areas in this context.
>
> The comment in metaspaceShared.cpp seems to carefully avoid this by
> using "shared strings" and "shared archive heap space" (i.e. no mention
> of "regions") most of the time. Sometimes "regions" is also used as
> "area".
>
> Consolidating this may avoid confusion between "G1 regions" and region
> as "area" and help a reader.
>
> Or maybe I'm just reading too much into this :)
>
> I have no particular opinion on whether you want to change anything
> here.
I changed the comments and replaced with shared string object and open archive heap objects (or open archive heap data). > > - FileMapInfo::map_heap_data(): there are quite a few occurrances of > this construct: > > 761 if (log_is_enabled(Info, cds)) { > 762 log_info(cds)("UseSharedSpaces: Unable to allocate region, " > > Afaik the log_info() call already only prints out data if > log_is_enabled(Info, cds), so it seems superfluous. The use of > log_is_enabled() call seems only useful if the block inside contains > some expensive computation, which does not seem the case here. Removed those unnecessary log_is_enabled(Info, cds) checks. > > Sorry for bringing up these mostly cosmetic issues that late. I do not > need a re-review for the comment changes (and removal of the > log_is_enabled()). Thanks! Jiangli > > Thanks, > Thomas From coleen.phillimore at oracle.com Wed Aug 9 19:52:10 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Aug 2017 15:52:10 -0400 Subject: RFR (XS) 8068317: No_Safepoint_Verifier is not necessary in Rewriter::scan_method In-Reply-To: References: <11804acd-4f29-6906-4e23-e66c3f04ce90@oracle.com> Message-ID: <6fd37dc1-70e5-c23b-5deb-29b7e95ebea0@oracle.com> Thanks, Jiangli! Coleen On 8/9/17 1:46 PM, Jiangli Zhou wrote: > Looks good. > > Jiangli > >> On Aug 9, 2017, at 10:39 AM, coleen.phillimore at oracle.com wrote: >> >> Summary: remove NSV, Method* can't move or be redefined while being rewritten >> >> One down, lots more to look at. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8068317.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8068317 >> >> Tested with tier1 testing on linux/x64. Will run JPRT. 
>> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Wed Aug 9 20:02:51 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Aug 2017 16:02:51 -0400 Subject: RFR (XS) 8186044: [TESTBUG] DumpSharedDictionary test sometimes fails in JPRT Message-ID: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> Summary: wrap test in CDSTestUtils.isUnableToMap(out) bug link https://bugs.openjdk.java.net/browse/JDK-8186044 local webrev at http://oklahoma.us.oracle.com/~cphillim/webrev/8186044.01/webrev Reran macos jprt testing, and local linux x64 testing. thanks, Coleen (Ioi, is this right?) From ioi.lam at oracle.com Wed Aug 9 20:17:50 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 9 Aug 2017 13:17:50 -0700 Subject: RFR (XS) 8186044: [TESTBUG] DumpSharedDictionary test sometimes fails in JPRT In-Reply-To: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> References: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> Message-ID: <5283a8a7-a79f-0430-5a73-3f555d6ccc42@oracle.com> It looks good. Thanks Coleen. - Ioi On 8/9/17 1:02 PM, coleen.phillimore at oracle.com wrote: > Summary: wrap test in CDSTestUtils.isUnableToMap(out) > > bug link https://bugs.openjdk.java.net/browse/JDK-8186044 > local webrev at > http://oklahoma.us.oracle.com/~cphillim/webrev/8186044.01/webrev > > Reran macos jprt testing, and local linux x64 testing. > > thanks, > Coleen > > (Ioi, is this right?) From coleen.phillimore at oracle.com Wed Aug 9 20:25:41 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Aug 2017 16:25:41 -0400 Subject: RFR (XS) 8186044: [TESTBUG] DumpSharedDictionary test sometimes fails in JPRT In-Reply-To: <5283a8a7-a79f-0430-5a73-3f555d6ccc42@oracle.com> References: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> <5283a8a7-a79f-0430-5a73-3f555d6ccc42@oracle.com> Message-ID: <1ec06d49-65af-9378-bbdf-296c929b10c1@oracle.com> On 8/9/17 4:17 PM, Ioi Lam wrote: > It looks good. Thanks Coleen. 
Thanks, Ioi for the advice how to fix it. Coleen > > > - Ioi > > > On 8/9/17 1:02 PM, coleen.phillimore at oracle.com wrote: >> Summary: wrap test in CDSTestUtils.isUnableToMap(out) >> >> bug link https://bugs.openjdk.java.net/browse/JDK-8186044 >> local webrev at >> http://oklahoma.us.oracle.com/~cphillim/webrev/8186044.01/webrev >> >> Reran macos jprt testing, and local linux x64 testing. >> >> thanks, >> Coleen >> >> (Ioi, is this right?) > From mikhailo.seledtsov at oracle.com Wed Aug 9 22:08:19 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Wed, 9 Aug 2017 15:08:19 -0700 Subject: RFR (XS) 8186044: [TESTBUG] DumpSharedDictionary test sometimes fails in JPRT In-Reply-To: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> References: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> Message-ID: <312a6036-a145-0ecf-d875-56e21a985617@oracle.com> Looks good to me. Thank you for fixing this. Misha On 08/09/2017 01:02 PM, coleen.phillimore at oracle.com wrote: > Summary: wrap test in CDSTestUtils.isUnableToMap(out) > > bug link https://bugs.openjdk.java.net/browse/JDK-8186044 > local webrev at > http://oklahoma.us.oracle.com/~cphillim/webrev/8186044.01/webrev > > Reran macos jprt testing, and local linux x64 testing. > > thanks, > Coleen > > (Ioi, is this right?) From coleen.phillimore at oracle.com Wed Aug 9 22:49:38 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Aug 2017 18:49:38 -0400 Subject: RFR (XS) 8186044: [TESTBUG] DumpSharedDictionary test sometimes fails in JPRT In-Reply-To: <312a6036-a145-0ecf-d875-56e21a985617@oracle.com> References: <88aa1c11-abb2-3a00-31ea-e4a91b0236fc@oracle.com> <312a6036-a145-0ecf-d875-56e21a985617@oracle.com> Message-ID: <7efbe8fc-f554-6e77-d7ac-d5c73f55b5ff@oracle.com> Thank you Misha. Coleen On 8/9/17 6:08 PM, mikhailo wrote: > Looks good to me. Thank you for fixing this. 
>
>
> Misha
>
>
> On 08/09/2017 01:02 PM, coleen.phillimore at oracle.com wrote:
>> Summary: wrap test in CDSTestUtils.isUnableToMap(out)
>>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8186044
>> local webrev at
>> http://oklahoma.us.oracle.com/~cphillim/webrev/8186044.01/webrev
>>
>> Reran macos jprt testing, and local linux x64 testing.
>>
>> thanks,
>> Coleen
>>
>> (Ioi, is this right?)
>

From goetz.lindenmaier at sap.com Thu Aug 10 04:43:12 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 10 Aug 2017 04:43:12 +0000
Subject: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests
In-Reply-To:
References: <77fc019ea90b4c7e8d8df99fe32710ce@sap.com> <597BB24D.5080903@oracle.com> <2bedb0a4c3524a749ef5d330c345d287@sap.com> <597F90A0.4030706@oracle.com> <2fa1eb81b40f4df5a0ce039d03206de9@sap.com> <3914ccbe-3c4d-5c66-94e4-5ee2d88d0afb@oracle.com> <5984CC70.80209@oracle.com> <412edbf307a64909a010ce5f83deac5b@sap.com> <59889970.1080301@oracle.com> <5988E311.1080605@oracle.com> <70dc302307ad46c98bc65e45b98682b6@sap.com>,
Message-ID: <3DCE200B-2C07-4AD5-B256-F7CA1E8380F2@sap.com>

Hi mikhailo,

Thanks for sponsoring!

Best regards,
götz

> On 08.08.2017 at 20:05, mikhailo wrote:
>
> Hi Goetz,
>
> I will test your patch and then will do a sponsor push.
>
>
> Thank you,
>
> Mikhailo
>
>
>> On 08/08/2017 02:00 AM, Lindenmaier, Goetz wrote:
>> Hi Mikhailo,
>>
>> yes, I please need a sponsor.
>> Thanks for the help with working on this change!
>> I added Ioi as reviewer in the webrev, so the patches
>> can be pushed as is.
>>
>> Thanks,
>> Goetz.
>>
>>> -----Original Message-----
>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com]
>>> Sent: Tuesday, 8.
August 2017 00:01 >>> To: Ioi Lam >>> Cc: Lindenmaier, Goetz ; Igor Ignatyev >>> ; hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable cds tests >>> >>> Hi Goetz, >>> >>> Please let me know if you need a sponsor for this change. >>> >>> Mikhailo >>> >>>> On 8/7/17, 10:44 AM, Ioi Lam wrote: >>>> Looks good to me, too. Reviewed. >>>> >>>> Thanks >>>> >>>> - Ioi >>>> >>>> >>>> >>>>> On 8/7/17 9:46 AM, Mikhailo Seledtsov wrote: >>>>> The change looks good to me, >>>>> >>>>> Thank you, >>>>> Mikhailo >>>>> >>>>>> On 8/7/17, 1:02 AM, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> webrev with Whitebox: >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04/ >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.04-hs/ >>>>>> >>>>>> I don't see so much of a difference to throwing an exception, if >>>>>> Whitebox is not properly implemented you get one, anyways: >>>>>> Exception in thread "main" java.lang.UnsatisfiedLinkError: >>>>>> sun.hotspot.WhiteBox.isCDSIncludedInVmBuild()Z >>>>>> at sun.hotspot.WhiteBox.isCDSIncludedInVmBuild(Native Method) >>>>>> Maybe it's a bit less likely to break, though. >>>>>> >>>>>> I'm fine with this, too. >>>>>> >>>>>> Best regards, >>>>>> Goetz., >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>> Sent: Freitag, 4. August 2017 21:35 >>>>>>> To: Ioi Lam; Lindenmaier, Goetz >>>>>>> ; Igor Ignatyev >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>>> cds tests >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have an alternative solution that is IMO rather simple, >>>>>>> reliable >>>>>>> and will >>>>>>> solve some issues we discussed (e.g. no need to throw >>>>>>> exceptions, no >>>>>>> need to handle failure to map an archive). 
>>>>>>> The proposed solution uses White Box test API to determine >>>>>>> whether VM >>>>>>> is compiled with INCLUDE_CDS on or off. >>>>>>> I implemented and tested it today, it works for me. >>>>>>> >>>>>>> The patch is attached. Please let me know what you think. >>>>>>> >>>>>>> Thank you, >>>>>>> Mikhailo >>>>>>> >>>>>>>> On 8/3/17, 11:39 PM, Ioi Lam wrote: >>>>>>>> Hi Goetz, >>>>>>>> >>>>>>>> Instead of testing -Xshare:on, I think you should test with >>>>>>>> -Xshare:auto, which sets the flags >>>>>>>> >>>>>>>> UseSharedSpaces = true >>>>>>>> RequireSharedSpaces = false >>>>>>>> >>>>>>>> and will reliably print "Shared spaces are not supported in this VM" >>>>>>>> if-and-only-if INCLUDE_CDS is disabled (see arguments.cpp): >>>>>>>> >>>>>>>> >>>>>>>> #if !INCLUDE_CDS >>>>>>>> if (DumpSharedSpaces || RequireSharedSpaces) { >>>>>>>> jio_fprintf(defaultStream::error_stream(), >>>>>>>> "Shared spaces are not supported in this VM\n"); >>>>>>>> return JNI_ERR; >>>>>>>> } >>>>>>>> if ((UseSharedSpaces&& FLAG_IS_CMDLINE(UseSharedSpaces)) || >>>>>>>> log_is_enabled(Info, cds)) { >>>>>>>> warning("Shared spaces are not supported in this VM"); >>>>>>>> FLAG_SET_DEFAULT(UseSharedSpaces, false); >>>>>>>> LogConfiguration::configure_stdout(LogLevel::Off, true, >>>>>>>> LOG_TAGS(cds)); >>>>>>>> } >>>>>>>> no_shared_spaces("CDS Disabled"); >>>>>>>> #endif // INCLUDE_CDS >>>>>>>> >>>>>>>> >>>>>>>> That way, you don't need to test any other output message or exit >>>>>>>> conditions(such as mapping error). 
>>>>>>>> >>>>>>>> >>>>>>>> E.g.: >>>>>>>> >>>>>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>>>>> -Xshare:auto >>>>>>>> -version >>>>>>>> java version "10-internal" >>>>>>>> Java(TM) SE Runtime Environment (build >>>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>>>>> Java HotSpot(TM) 64-Bit Server VM (build >>>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>>>>> >>>>>>>> >>>>>>>> ioilinux /jdk/iter/build/linux-x64$ ./images/jdk/bin/java >>>>>>>> -XXaltjvm=minimal -Xshare:auto -version >>>>>>>> Java HotSpot(TM) 64-Bit Minimal VM warning: Shared spaces are not >>>>>>>> supported in this VM >>>>>>>> java version "10-internal" >>>>>>>> Java(TM) SE Runtime Environment (build >>>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter) >>>>>>>> Java HotSpot(TM) 64-Bit Minimal VM (build >>>>>>>> 10-internal+0-2017-08-04-0614567.iklam.iter, mixed mode) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>>> On 8/3/17 10:58 PM, Lindenmaier, Goetz wrote: >>>>>>>>> Hi Mikhailo, >>>>>>>>> >>>>>>>>> I put in your version of vmCDS() into this new webrev. >>>>>>>>> I also had to update the list of tests marked in hotspot, >>>>>>>>> as tests were removed and added in between, and resolved >>>>>>>>> it against the aot change: >>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03/ >>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.03-hs/ >>>>>>>>> >>>>>>>>> I don't think it's a good idea to swallow the exception silently >>>>>>>>> as you propose. >>>>>>>>> In our test setup, the tests would just be switched off if something >>>>>>>>> breaks, and no one will see that. If they fail though, it's an easy >>>>>>>>> and quick fix. I would at least switch them on, then one sees the >>>>>>>>> failing tests in case switching them on was the wrong guess. >>>>>>>>> Also, below, the method dump() throws an exception. 
>>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Goetz >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: mikhailo [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>> Sent: Tuesday, August 01, 2017 11:49 PM >>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to disable >>>>>>>>>> cds tests >>>>>>>>>> >>>>>>>>>> Hi Goetz, >>>>>>>>>> >>>>>>>>>> I have reviewed your updated changes, and they overall look good to >>>>>>> me. >>>>>>>>>> However, I have some comments + concerns regarding >>>>>>> VMProps.vmCDS(): >>>>>>>>>> 1. Throwing exceptions from within the vmCDS() method. >>>>>>>>>> >>>>>>>>>> The VMProps properties are evaluated at the start of each >>>>>>>>>> run. If >>>>>>>>>> the exception is thrown here the whole test run will fail (not >>>>>>>>>> just the >>>>>>>>>> test that uses '@requires vm.cds'). The failure will occur shortly >>>>>>>>>> after >>>>>>>>>> the start of jtreg test run with a message: >>>>>>>>>> "java.lang.RuntimeException: Can not start VM to >>>>>>>>>> test to >>>>>>>>>> find out it's features. Switching off class data sharing (CDS)." >>>>>>>>>> >>>>>>>>>> Your method has 2 throw statements: "new >>>>>>>>>> RuntimeException("Can >>>>>>>>>> not >>>>>>>>>> start VM..." and "java.lang.RuntimeException: Can not start VM >>>>>>>>>> to test >>>>>>>>>> to...". I would recommend a more graceful way to fail, e.g. to >>>>>>>>>> print >>>>>>>>>> the >>>>>>>>>> message and to return "false" instead. This way the rest of the >>>>>>>>>> test >>>>>>>>>> run >>>>>>>>>> will continue, but the tests requiring vm.cds will be skipped with >>>>>>>>>> qualification of "not selected". >>>>>>>>>> >>>>>>>>>> 2. The check for "An error has occurred while processing the shared >>>>>>>>>> archive file." assumes that archive was not created prior to the >>>>>>>>>> execution of this evaluation code. 
However, there are test modes >>>>>>> where >>>>>>>>>> archive is created prior to test run. We use such mode on regular >>>>>>>>>> basis. >>>>>>>>>> In such cases the code will fail. >>>>>>>>>> I recommend to run "-Xshare:on -version", and check the >>>>>>>>>> following match that would result in return of "true": >>>>>>>>>> "Java HotSpot.*sharing" >>>>>>>>>> >>>>>>>>>> 3. On occasion the mapping of shared archive region to a specified >>>>>>>>>> address will fail (due to system configuration, space already >>>>>>>>>> occupied, >>>>>>>>>> ASLR, etc.) >>>>>>>>>> >>>>>>>>>> Hence I recommend checking for such conditions as well: >>>>>>>>>> >>>>>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>>>>> archive >>>>>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>>>>> return "true"; >>>>>>>>>> } >>>>>>>>>> I am returning true here because seeing this output means >>>>>>>>>> that >>>>>>>>>> CDS >>>>>>>>>> feature is supported, however in this particular instance archive >>>>>>>>>> failed >>>>>>>>>> to map. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The rest of the changes looks good to me. >>>>>>>>>> >>>>>>>>>> See for my version of VMProps.vmCDS() below. Let me know what you >>>>>>>>>> think. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thank you, >>>>>>>>>> >>>>>>>>>> Mikhailo >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ================== my update of VMProps.vmCDS() >>>>>>>>>> >>>>>>>>>> protected String vmCDS() { >>>>>>>>>> System.setProperty("test.jdk", >>>>>>>>>> System.getProperty("java.home")); >>>>>>>>>> ProcessBuilder pb = >>>>>>>>>> ProcessTools.createJavaProcessBuilder("-Xshare:on", "-version"); >>>>>>>>>> OutputAnalyzer output; >>>>>>>>>> >>>>>>>>>> try { >>>>>>>>>> output = new OutputAnalyzer(pb.start()); >>>>>>>>>> } catch (IOException e) { >>>>>>>>>> System.err.println( "Can not start VM to test to >>>>>>>>>> find out >>>>>>>>>> it's features. 
" + >>>>>>>>>> "Switching off class data >>>>>>>>>> sharing (CDS)." + e); >>>>>>>>>> return "false"; >>>>>>>>>> } >>>>>>>>>> if (output.firstMatch("Shared spaces are not >>>>>>>>>> supported in >>>>>>>>>> this >>>>>>>>>> VM") != null) { >>>>>>>>>> return "false"; >>>>>>>>>> } >>>>>>>>>> if (output.firstMatch("An error has occurred while >>>>>>>>>> processing >>>>>>>>>> the shared archive file.") != null) { >>>>>>>>>> return "true"; >>>>>>>>>> } >>>>>>>>>> if (output.firstMatch("Java HotSpot.*sharing") != >>>>>>>>>> null) { >>>>>>>>>> return "true"; >>>>>>>>>> } >>>>>>>>>> if (output.firstMatch("Unable to map") != null) { >>>>>>>>>> System.out.println("VMProps.vmCDS() encountered an >>>>>>>>>> archive >>>>>>>>>> mapping failure, still proceeding with vm.cds=true"); >>>>>>>>>> return "true"; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> return "false"; >>>>>>>>>> } >>>>>>>>>> ================== >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 08/01/2017 07:20 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I made new webrevs implementing the change with @requires: >>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02/ >>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436-cdsKey/webrev.02- >>>>>>> hs/ >>>>>>>>>>> I also changed the bug description and synopsis. >>>>>>>>>>> >>>>>>>>>>> For the jtreg runner I would propose to set the property test.jdk >>>>>>>>>>> so that it is available in VMProps. Igor also ran into this >>>>>>>>>>> issue. >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Goetz. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: Mikhailo Seledtsov [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>>>> Sent: Montag, 31. 
Juli 2017 22:19 >>>>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>>>> disable cds >>>>>>>>>> tests >>>>>>>>>>>> Hi Goetz, >>>>>>>>>>>> >>>>>>>>>>>> I have an idea on how to address your second use case. >>>>>>>>>>>> The idea is to define a special test property (e.g. >>>>>>>>>>>> test.cds.disable.cds.support) which will override logic inside >>>>>>>>>>>> the >>>>>>>>>>>> VMProps.vmCDSSupported(). If this property is defined to >>>>>>>>>>>> "true" in >>>>>>>>>>>> test >>>>>>>>>>>> invocation command then vmCDSSupported() returns false (CDS is >>>>>>>>>> disabled, >>>>>>>>>>>> not supported), and all tests marked with "@requires >>>>>>>>>>>> vm.cds.supported" >>>>>>>>>>>> will be skipped. >>>>>>>>>>>> >>>>>>>>>>>> How to use it: >>>>>>>>>>>> jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>>>>> E.g.: jtreg -Dtest.cds.disable.cds.support=true >>>>>>>>>>>> >>>>>>> hs/hotspot/test/runtime/SharedArchiveFile/ArchiveDoesNotExist.java >>>>>>>>>>>> I prototyped this approach, it works for me. I have attached the >>>>>>>>>>>> diff. >>>>>>>>>>>> Let me know whether this works for your use case, or if you >>>>>>>>>>>> have any >>>>>>>>>>>> questions. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thank you, >>>>>>>>>>>> Mikhailo >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On 7/31/17, 1:45 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>>>> Hi Mikhailo, >>>>>>>>>>>>> >>>>>>>>>>>>> Basically I'm fine with using the @requires property. >>>>>>>>>>>>> But is there a way to overrule the outcome of the method >>>>>>>>>>>>> implemented In VMProps.java computing the property? >>>>>>>>>>>>> I have two use cases for the key I want to introduce. >>>>>>>>>>>>> >>>>>>>>>>>>> First, our internal VM (we are Oracle licensees) is compiled >>>>>>>>>>>>> without >>>>>>>>>>>>> CDS support. Thus we don't want to run the CDS tests. 
Currently >>>>>>>>>>>>> we have them all listed in the ProblemList, but that's not nice, >>>>>>>>>>>>> especially >>>>>>>>>>>>> because we have to adapt it whenever a new test is added. >>>>>>>>>>>>> As I understand, the @requires property works fine, here. >>>>>>>>>>>>> >>>>>>>>>>>>> Second, we also test the two ports we contributed (ppc and >>>>>>>>>>>>> s390). >>>>>>>>>>>>> These >>>>>>>>>>>> contain >>>>>>>>>>>>> rudimentary cds support and so far passed all tests. >>>>>>>>>>>>> Unfortunately it >>>>>>>>>> broke >>>>>>>>>>>>> lately in jdk10. Instead of fixing it (our people are >>>>>>>>>>>>> working on >>>>>>>>>>>>> finishing >>>>>>>>>> our >>>>>>>>>>>>> internal Java 9 port) I would like to switch off all cds tests. >>>>>>>>>>>>> As I can set the key on the command line of jtreg, I easily can >>>>>>>>>>>>> do that. >>>>>>>>>>>>> Is there a way to do similar with the @requires property? >>>>>>>>>>>>> >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> Goetz. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>> From: Mikhailo Seledtsov >>> [mailto:mikhailo.seledtsov at oracle.com] >>>>>>>>>>>>>> Sent: Freitag, 28. Juli 2017 23:53 >>>>>>>>>>>>>> To: Lindenmaier, Goetz >>>>>>>>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>>>>>>>> Subject: Re: RFR(M): 8185436: jtreg: introduce keyword to >>>>>>>>>>>>>> disable cds >>>>>>>>>>>> tests >>>>>>>>>>>>>> Hi Goetz, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am a HotSpot SQE Engineer at Oracle. I have >>>>>>>>>>>>>> discussed your >>>>>>>>>> proposed >>>>>>>>>>>>>> fix with Igor Ignatyev (also VM SQE Engineer), and we have the >>>>>>>>>> following >>>>>>>>>>>>>> feedback on this change. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. As part of streamlining and simplifying SQE process and the >>>>>>>>>>>>>> use of >>>>>>>>>>>>>> test tools we have narrowed down the test selection mechanisms. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. 
Our preferred test selection mechanism is use of "@requires" >>>>>>>>>>>>>> and a >>>>>>>>>>>>>> corresponding test/jtreg-ext/requires/VMProps.java. Even though >>>>>>>>>> JTREG >>>>>>>>>>>>>> supports use of "@key", we prefer the use of "@requires" as a >>>>>>>>>>>>>> first >>>>>>>>>>>> choice. >>>>>>>>>>>>>> 3. If it is not possible to use "@requires" for a given >>>>>>>>>>>>>> situation then >>>>>>>>>>>>>> use "@key" mechanism. We would ask you if you could explore >>> the >>>>>>>>>>>>>> possibility of implementing this change via @requires first. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here are several hints that may help: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Take a look at/test/jtreg-ext/requires/VMProps.java. >>>>>>>>>>>>>> The >>>>>>>>>>>>>> value >>>>>>>>>>>>>> of a given "requires property" is evaluated inside this file >>>>>>>>>>>>>> and >>>>>>>>>>>>>> placed >>>>>>>>>>>>>> into a map (see public call() method). Add your evaluation code >>>>>>>>>>>>>> here, >>>>>>>>>>>>>> and then follow the pattern used for other properties. Create a >>>>>>>>>> property >>>>>>>>>>>>>> (e.g. vm.cds.supported, with values of true/false). Create a >>>>>>> method >>>>>>>>>> that >>>>>>>>>>>>>> evaluates the property value (e.g. isCDSSupported() or >>>>>>>>>>>>>> similar). >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2. The method could use several options to evaluate whether CDS >>>>>>> is >>>>>>>>>>>>>> supported. >>>>>>>>>>>>>> A. WhiteBox API. Create a new WB test API method >>>>>>>>>>>>>> which can >>>>>>>>>> return >>>>>>>>>>>>>> true if CDS_ compiler flag is defined, otherwise false. >>>>>>>>>>>>>> Call WB API from VMProps.java. See >>>>>>>>>>>>>> WB.getBooleanVMFlag("EnableJVMCI") as an example. Or create >>>>>>> your >>>>>>>>>>>> own >>>>>>>>>>>>>> WB.isCDSSupported() >>>>>>>>>>>>>> WhiteBox.java resides in >>>>>>>>>>>>>> test/lib/sun/hotspot/WhiteBox.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> B. 
Another options is to evaluate by running VM with >>>>>>>>>>>>>> sharing on and >>>>>>>>>>>>>> checking the return (may be not as reliable as option A) >>>>>>>>>>>>>> C. Other ideas welcome. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 3. Include "@requres vm.cds.supported == true" to the >>>>>>> appropriate >>>>>>>>>> tests. >>>>>>>>>>>>>> Let me know if you have any questions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>> Mikhailo >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 7/28/17, 12:58 AM, Lindenmaier, Goetz wrote: >>>>>>>>>>>>>>> Hi >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> we compile the VM without CDS support. Thus the CDS tests >>>>>>>>>>>>>>> fail. This change introduces a keyword 'cds' and marks >>>>>>>>>>>>>>> the tests accordingly. >>>>>>>>>>>>>>> This change also fixes the keywords specified in >>>>>>>>>>>>>> gc/g1/TestSharedArchiveWithPreTouch.java. >>>>>>>>>>>>>>> There may only be one @key keyword in the test specification. >>>>>>>>>>>>>>> In runtime/CompressedOops/CompressedClassPointers.java >>> only >>>>>>> one >>>>>>>>>>>> test >>>>>>>>>>>>>>> case required CDS. I changed this sub case to succeed if >>>>>>>>>>>>>>> CDS is >>>>>>>>>>>>>>> not >>>>>>>>>>>>>>> available. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review this change. I please need a sponsor. >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8185436- >>>>>>> cdsKey/webrev.01/ >>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>> Goetz. > From thomas.stuefe at gmail.com Thu Aug 10 12:35:08 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 10 Aug 2017 14:35:08 +0200 Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64 In-Reply-To: References: Message-ID: Ping... May I please have a second review? Thank you! On Wed, Aug 2, 2017 at 11:17 AM, Thomas St?fe wrote: > Hi all, > > may I please have a review for this small fix. 
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706
> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/
>
> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335.
> Ioi Lam did a good job analyzing the original
> problem. On windows x64, the native compiler generates code which does not
> use the frame pointer (regardless whether we set -Oy-). Only in rare cases
> a frame pointer is used - e.g. for alloca()-functions - and, as Ioi pointed
> out, no guarantee either that RBP is actually the frame pointer.
>
> So, in os::platform_print_native_stack()
> we walk the stack using StackWalk64(), extract the pc from each frame and
> print that, like normal windows coding. However, we still test for the
> frame pointer being NULL, and abort stack tracing if it is. This causes
> stack dumping to fail quite often, and unnecessarily.
>
> For example, test: java.exe -XX:ErrorHandlerTest=12
>
> Sometimes it works, but more out of accident - as Ioi pointed out in this
> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html.
> If there are java frames above the crashing
> native frame, we still may have RBP set to some value (does not matter
> which) and os::platform_print_native_stack()
> does not abort frame printing.
>
> Kind Regards, Thomas
>
>
From zgu at redhat.com Thu Aug 10 13:09:14 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 10 Aug 2017 09:09:14 -0400
Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64
In-Reply-To:
References:
Message-ID: <10333e6d-4374-50d9-0fa0-198e5ddc8667@redhat.com>

Look good to me.

-Zhengyu

On 08/10/2017 08:35 AM, Thomas Stüfe wrote:
> Ping... May I please have a second review?
>
> Thank you!
>
> On Wed, Aug 2, 2017 at 11:17 AM, Thomas Stüfe
> wrote:
>
>> Hi all,
>>
>> may I please have a review for this small fix.
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706
>> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/
>>
>> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335.
>> Ioi Lam did a good job analyzing the original
>> problem. On windows x64, the native compiler generates code which does not
>> use the frame pointer (regardless whether we set -Oy-). Only in rare cases
>> a frame pointer is used - e.g. for alloca()-functions - and, as Ioi pointed
>> out, no guarantee either that RBP is actually the frame pointer.
>>
>> So, in os::platform_print_native_stack()
>> we walk the stack using StackWalk64(), extract the pc from each frame and
>> print that, like normal windows coding. However, we still test for the
>> frame pointer being NULL, and abort stack tracing if it is. This causes
>> stack dumping to fail quite often, and unnecessarily.
>>
>> For example, test: java.exe -XX:ErrorHandlerTest=12
>>
>> Sometimes it works, but more out of accident - as Ioi pointed out in this
>> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html.
>> If there are java frames above the crashing
>> native frame, we still may have RBP set to some value (does not matter
>> which) and os::platform_print_native_stack()
>> does not abort frame printing.
>>
>> Kind Regards, Thomas
>>
>>
>>
From thomas.stuefe at gmail.com Thu Aug 10 13:16:00 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 10 Aug 2017 15:16:00 +0200
Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64
In-Reply-To: <10333e6d-4374-50d9-0fa0-198e5ddc8667@redhat.com>
References: <10333e6d-4374-50d9-0fa0-198e5ddc8667@redhat.com>
Message-ID:

Thank you, Zhengyu!

On Thu, Aug 10, 2017 at 3:09 PM, Zhengyu Gu wrote:

> Look good to me.
>
> -Zhengyu
>
> On 08/10/2017 08:35 AM, Thomas Stüfe wrote:
>
>> Ping...
>> May I please have a second review?
>>
>> Thank you!
>>
>> On Wed, Aug 2, 2017 at 11:17 AM, Thomas Stüfe
>> wrote:
>>
>>> Hi all,
>>>
>>> may I please have a review for this small fix.
>>>
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706
>>> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/
>>>
>>> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335.
>>> Ioi Lam did a good job analyzing the original
>>> problem. On windows x64, the native compiler generates code which does not
>>> use the frame pointer (regardless whether we set -Oy-). Only in rare cases
>>> a frame pointer is used - e.g. for alloca()-functions - and, as Ioi pointed
>>> out, no guarantee either that RBP is actually the frame pointer.
>>>
>>> So, in os::platform_print_native_stack()
>>> we walk the stack using StackWalk64(), extract the pc from each frame and
>>> print that, like normal windows coding. However, we still test for the
>>> frame pointer being NULL, and abort stack tracing if it is. This causes
>>> stack dumping to fail quite often, and unnecessarily.
>>>
>>> For example, test: java.exe -XX:ErrorHandlerTest=12
>>>
>>> Sometimes it works, but more out of accident - as Ioi pointed out in this
>>> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html.
>>> If there are java frames above the crashing
>>> native frame, we still may have RBP set to some value (does not matter
>>> which) and os::platform_print_native_stack()
>>> does not abort frame printing.
>>> >>> Kind Regards, Thomas >>> >>> >>> >>> From coleen.phillimore at oracle.com Thu Aug 10 14:43:30 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 10 Aug 2017 10:43:30 -0400 Subject: RFR 8177741: Fix hotspot tests to use --patch-module instead of -Xmodule In-Reply-To: References: Message-ID: <4d163436-cfe6-e6a7-cb15-ba96ad84cc25@oracle.com> This looks good. Coleen On 8/9/17 10:01 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to replace "-Xmodule" with > "--patch-module" in the > test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler class and in > affected tests. The option name "-Xmodule" was used briefly for java > and javac during JDK-9 development but was eventually replaced with > "--patch-module". However, InMemoryJavaCompiler did not get updated > to use "--patch-module". > > Open Webrevs: > > http://cr.openjdk.java.net/~hseigel/bug_8177741.test/webrev/ > > http://cr.openjdk.java.net/~hseigel/bug_8177741.hs/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8177741 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util and other tests, the co-located > NSK tests, and with JPRT. > > Thanks, Harold > From harold.seigel at oracle.com Thu Aug 10 14:45:42 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 10 Aug 2017 10:45:42 -0400 Subject: RFR 8177741: Fix hotspot tests to use --patch-module instead of -Xmodule In-Reply-To: <4d163436-cfe6-e6a7-cb15-ba96ad84cc25@oracle.com> References: <4d163436-cfe6-e6a7-cb15-ba96ad84cc25@oracle.com> Message-ID: <4836b223-b869-ccaf-f138-1662c4ce92db@oracle.com> Thanks Coleen! Harold On 8/10/2017 10:43 AM, coleen.phillimore at oracle.com wrote: > This looks good. 
> Coleen > > On 8/9/17 10:01 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to replace "-Xmodule" with >> "--patch-module" in the >> test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler class and in >> affected tests. The option name "-Xmodule" was used briefly for java >> and javac during JDK-9 development but was eventually replaced with >> "--patch-module". However, InMemoryJavaCompiler did not get updated >> to use "--patch-module". >> >> Open Webrevs: >> >> http://cr.openjdk.java.net/~hseigel/bug_8177741.test/webrev/ >> >> http://cr.openjdk.java.net/~hseigel/bug_8177741.hs/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8177741 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util and other tests, the >> co-located NSK tests, and with JPRT. >> >> Thanks, Harold >> > From george.triantafillou at oracle.com Thu Aug 10 14:48:34 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Thu, 10 Aug 2017 10:48:34 -0400 Subject: RFR 8177741: Fix hotspot tests to use --patch-module instead of -Xmodule In-Reply-To: References: Message-ID: <0931cff5-3dd8-2a36-1488-72b56ac6974a@oracle.com> Hi Harold, This looks good. -George On 8/9/2017 10:01 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to replace "-Xmodule" with > "--patch-module" in the > test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler class and in > affected tests. The option name "-Xmodule" was used briefly for java > and javac during JDK-9 development but was eventually replaced with > "--patch-module". However, InMemoryJavaCompiler did not get updated > to use "--patch-module". 
> > Open Webrevs: > > http://cr.openjdk.java.net/~hseigel/bug_8177741.test/webrev/ > > http://cr.openjdk.java.net/~hseigel/bug_8177741.hs/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8177741 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util and other tests, the co-located > NSK tests, and with JPRT. > > Thanks, Harold > From harold.seigel at oracle.com Thu Aug 10 15:07:02 2017 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 10 Aug 2017 11:07:02 -0400 Subject: RFR 8177741: Fix hotspot tests to use --patch-module instead of -Xmodule In-Reply-To: <0931cff5-3dd8-2a36-1488-72b56ac6974a@oracle.com> References: <0931cff5-3dd8-2a36-1488-72b56ac6974a@oracle.com> Message-ID: Thanks George! Harold On 8/10/2017 10:48 AM, George Triantafillou wrote: > Hi Harold, > > This looks good. > > -George > > On 8/9/2017 10:01 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to replace "-Xmodule" with >> "--patch-module" in the >> test/lib/jdk/test/lib/compiler/InMemoryJavaCompiler class and in >> affected tests. The option name "-Xmodule" was used briefly for java >> and javac during JDK-9 development but was eventually replaced with >> "--patch-module". However, InMemoryJavaCompiler did not get updated >> to use "--patch-module". >> >> Open Webrevs: >> >> http://cr.openjdk.java.net/~hseigel/bug_8177741.test/webrev/ >> >> http://cr.openjdk.java.net/~hseigel/bug_8177741.hs/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8177741 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util and other tests, the >> co-located NSK tests, and with JPRT. 
>>
>> Thanks, Harold
>>
>
From ioi.lam at oracle.com Thu Aug 10 16:15:25 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 10 Aug 2017 09:15:25 -0700
Subject: RFR(xxs): 8185706: Native callstacks unreliable under Windows x64
In-Reply-To:
References: <10333e6d-4374-50d9-0fa0-198e5ddc8667@redhat.com>
Message-ID: <699588cd-cfa8-c311-3569-3df63f7479b8@oracle.com>

Hi Thomas,

I will sponsor the changes. Thanks for the contribution!

- Ioi

On 8/10/17 6:16 AM, Thomas Stüfe wrote:
> Thank you, Zhengyu!
>
> On Thu, Aug 10, 2017 at 3:09 PM, Zhengyu Gu wrote:
>
>> Look good to me.
>>
>> -Zhengyu
>>
>> On 08/10/2017 08:35 AM, Thomas Stüfe wrote:
>>
>>> Ping... May I please have a second review?
>>>
>>> Thank you!
>>>
>>> On Wed, Aug 2, 2017 at 11:17 AM, Thomas Stüfe
>>> wrote:
>>>
>>>> Hi all,
>>>> may I please have a review for this small fix.
>>>>
>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185706
>>>> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185706-Native-callstacks-unreliable-under-Windows-x64/webrev.00/webrev/
>>>>
>>>> This can be seen as an addon to https://bugs.openjdk.java.net/browse/JDK-8022335.
>>>> Ioi Lam did a good job analyzing the original
>>>> problem. On windows x64, the native compiler generates code which does not
>>>> use the frame pointer (regardless whether we set -Oy-). Only in rare cases
>>>> a frame pointer is used - e.g. for alloca()-functions - and, as Ioi pointed
>>>> out, no guarantee either that RBP is actually the frame pointer.
>>>>
>>>> So, in os::platform_print_native_stack()
>>>> we walk the stack using StackWalk64(), extract the pc from each frame and
>>>> print that, like normal windows coding. However, we still test for the
>>>> frame pointer being NULL, and abort stack tracing if it is. This causes
>>>> stack dumping to fail quite often, and unnecessarily.
>>>>
>>>> For example, test: java.exe -XX:ErrorHandlerTest=12
>>>>
>>>> Sometimes it works, but more out of accident - as Ioi pointed out in this
>>>> mail thread: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2013-August/009063.html.
>>>> If there are java frames above the crashing
>>>> native frame, we still may have RBP set to some value (does not matter
>>>> which) and os::platform_print_native_stack()
>>>> does not abort frame printing.
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>>
From vladimir.kozlov at oracle.com Thu Aug 10 16:46:57 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 10 Aug 2017 09:46:57 -0700
Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines
In-Reply-To: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default>
References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default>
Message-ID:

CCing to Runtime.

Can you add comment explaining why it set to true on SPARC?

Thanks,
Vladimir

On 8/10/17 6:26 AM, Poonam Parhar wrote:
> Hello,
>
> Please review this simple patch:
>
> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines
>
> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/
>
> This change enables AssumeMP by default on SPARC machines. On Sparc T7,
> to finalize BIS instructions the server compiler needs to add
> a 'membar' instruction at the end. But the generation of 'membar' is guarded
> by os::is_MP(), and os::is_MP() returns false when there is a single cpu
> available on the system. Now, in virtualized/container environments, the
> number of processors allocated to a virtual machine can dynamically change
> during the application runtime. That could lead to incorrect generation
> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes
> is_MP() always return true on SPARC systems.
>
> In future, we may consider making generation of 'membar' unconditional
> with the enhancement request: JDK-8150715.
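[Editor's sketch] The interaction between is_MP(), AssumeMP and the trailing 'membar' described above can be modelled with a toy Java example. Everything here is hypothetical and for illustration only: the class AssumeMpModel, the formula inside isMP() (inferred from the description in the thread, not copied from HotSpot) and the placeholder instruction names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names exist in HotSpot.
class AssumeMpModel {
    // Assumed rule, inferred from the thread: the VM considers itself
    // multi-processor unless exactly one CPU is visible and AssumeMP is off.
    static boolean isMP(int processorCount, boolean assumeMP) {
        return assumeMP || processorCount != 1;
    }

    // Stand-in for code generation: the trailing membar is emitted only if
    // the VM looked multi-processor at the moment the code was generated.
    static List<String> emitBisSequence(boolean mp) {
        List<String> code = new ArrayList<>();
        code.add("bis_store");      // placeholder for the BIS instructions
        if (mp) {
            code.add("membar");     // finalizes the BIS stores
        }
        return code;
    }

    public static void main(String[] args) {
        // Code generated while the container sees one CPU keeps its shape
        // even if more CPUs are assigned later.
        System.out.println("1 CPU, AssumeMP off: " + emitBisSequence(isMP(1, false)));
        System.out.println("1 CPU, AssumeMP on:  " + emitBisSequence(isMP(1, true)));
    }
}
```

The point of the model is that the decision is frozen into the generated code, so a later change in the processor count cannot add the missing membar; forcing the multi-processor answer up front avoids that.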
>
> Thanks,
>
> Poonam
>
From poonam.bajaj at oracle.com Thu Aug 10 17:44:09 2017
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Thu, 10 Aug 2017 10:44:09 -0700 (PDT)
Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines
In-Reply-To:
References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default>
Message-ID: <0285fdb7-8548-48e5-b019-3de62f66cffe@default>

Thanks Vladimir.

Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these.

Adding my comments from the previous mail here again for better readability:
-------------------------------------
Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines
Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/

This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems.

In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715.

Thanks,
Poonam

> -----Original Message-----
> From: Vladimir Kozlov
> Sent: Thursday, August 10, 2017 9:47 AM
> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net
> Cc: hotspot-runtime-dev at openjdk.java.net runtime
> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC
> machines
>
> CCing to Runtime.
>
> Can you add comment explaining why it set to true on SPARC?
>
> Thanks,
> Vladimir
>
> On 8/10/17 6:26 AM, Poonam Parhar wrote:
> > Hello,
> >
> > Please review this simple patch:
> >
> > Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines
> >
> > Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/
> >
> > This change enables AssumeMP by default on SPARC machines. On Sparc
> > T7, to finalize BIS instructions the server compiler needs to add
> > a 'membar' instruction at the end. But the generation of 'membar' is
> > guarded by os::is_MP(), and os::is_MP() returns false when there is a
> > single cpu available on the system.
> > Now, in virtualized/container environments, the number
> > of processors allocated to a virtual machine can dynamically change
> > during the application runtime. That could lead to incorrect generation
> > of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes
> > is_MP() always return true on SPARC systems.
> >
> > In future, we may consider making generation of 'membar' unconditional
> > with the enhancement request: JDK-8150715.
> >
> > Thanks,
> >
> > Poonam
> >
From vladimir.kozlov at oracle.com Thu Aug 10 18:23:54 2017
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 10 Aug 2017 11:23:54 -0700
Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines
In-Reply-To: <0285fdb7-8548-48e5-b019-3de62f66cffe@default>
References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default>
Message-ID: <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com>

Poonam,

I mean to add a small (one or two sentences) comment to the code. Some thing like next but may better wording:

+ if (FLAG_IS_DEFAULT(AssumeMP)) {
+   // BIS instructions require 'membar' instruction regardless number of CPU.
+   // Otherwise in virtualized/container environments which use only 1 cpu BIS instructions may produce incorrect results.
+ FLAG_SET_DEFAULT(AssumeMP, true); Thanks, Vladimir On 8/10/17 10:44 AM, Poonam Parhar wrote: > Thanks Vladimir. > > Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these. > > Adding my comments from the previous mail here again for better readability: > ------------------------------------- > Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines > Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > > This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems. > > In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715. > > Thanks, > Poonam > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Thursday, August 10, 2017 9:47 AM >> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >> Cc: hotspot-runtime-dev at openjdk.java.net runtime >> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >> machines >> >> CCing to Runtime. >> >> Can you add comment explaining why it set to true on SPARC? 
>> >> Thanks, >> Vladimir >> >> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>> Hello, >>> >>> Please review this simple patch: >>> >>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>> >>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>> >>> This change enables AssumeMP by default on SPARC machines. On Sparc >>> T7, to finalize BIS instructions the server compiler needs to add >>> a 'membar' instruction at the end. But the generation of 'membar' is >>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>> single cpu available on the system. >>> Now, in virtualized/container environments, the number >>> of processors allocated to a virtual machine can dynamically change >>> during the application runtime. That could lead to incorrect generation >>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>> is_MP() always return true on SPARC systems. >>> >>> In future, we may consider making generation of 'membar' unconditional >>> with the enhancement request: JDK-8150715. >>> >>> Thanks, >>> >>> Poonam >>> From bob.vandette at oracle.com Thu Aug 10 18:53:30 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 10 Aug 2017 14:53:30 -0400 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> Message-ID: <38ABC51B-73E3-4DEF-B5E1-B44B8B25160D@oracle.com> Can we just always run with AssumeMP true for all platforms these days? Surely single CPU systems are rare now. We might have issues with Docker containers that have a limit 1 CPU on a large mp system which may cause issues. Bob. > On Aug 10, 2017, at 2:23 PM, Vladimir Kozlov wrote: > > Poonam, > > I mean to add a small (one or two sentences) comment to the code.
Some thing like next but may better wording: > > + if (FLAG_IS_DEFAULT(AssumeMP)) { > + // BIS instructions require 'membar' instruction regardless number of CPU. > + // Otherwise in virtualized/container environments which use only 1 cpu BIS instructions may produce incorrect results. > + FLAG_SET_DEFAULT(AssumeMP, true); > > Thanks, > Vladimir > > On 8/10/17 10:44 AM, Poonam Parhar wrote: >> Thanks Vladimir. >> Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these. >> Adding my comments from the previous mail here again for better readability: >> ------------------------------------- >> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines >> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >> This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems. >> In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715. >> Thanks, >> Poonam >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Thursday, August 10, 2017 9:47 AM >>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >>> machines >>> >>> CCing to Runtime. 
>>> >>> Can you add comment explaining why it set to true on SPARC? >>> >>> Thanks, >>> Vladimir >>> >>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>> Hello, >>>> >>>> Please review this simple patch: >>>> >>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>> >>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>> >>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>> T7, to finalize BIS instructions the server compiler needs to add >>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>> single cpu available on the system. >>>> Now, in virtualized/container environments, the number >>>> of processors allocated to a virtual machine can dynamically change >>>> during the application runtime. That could lead to incorrect generation >>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>> is_MP() always return true on SPARC systems. >>>> >>>> In future, we may consider making generation of 'membar' unconditional >>>> with the enhancement request: JDK-8150715.
>>>> >>>> Thanks, >>>> >>>> Poonam >>>> From poonam.bajaj at oracle.com Thu Aug 10 18:55:15 2017 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Thu, 10 Aug 2017 11:55:15 -0700 (PDT) Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> Message-ID: Updated the webrev with comments: http://cr.openjdk.java.net/~poonam/8185572/webrev.01/ Thanks, Poonam > -----Original Message----- > From: Vladimir Kozlov > Sent: Thursday, August 10, 2017 11:24 AM > To: hotspot-compiler-dev at openjdk.java.net; Poonam Bajaj Parhar > Cc: hotspot-runtime-dev at openjdk.java.net runtime > Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC > machines > > Poonam, > > I mean to add a small (one or two sentences) comment to the code. Some > thing like next but may better wording: > > + if (FLAG_IS_DEFAULT(AssumeMP)) { > + // BIS instructions require 'membar' instruction regardless number > of CPU. > + // Otherwise in virtualized/container environments which use only > 1 > cpu BIS instructions may produce incorrect results. > + FLAG_SET_DEFAULT(AssumeMP, true); > > Thanks, > Vladimir > > On 8/10/17 10:44 AM, Poonam Parhar wrote: > > Thanks Vladimir. > > > > Since the SPARC machines are always multi-cores, we can safely set > AssumeMP to true on these. > > > > Adding my comments from the previous mail here again for better > readability: > > ------------------------------------- > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable > AssumeMP > > by default on SPARC machines > > Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > > > > This change enables AssumeMP by default on SPARC machines. On Sparc > T7, to finalize BIS instructions the server compiler needs to add a > 'membar' instruction at the end. 
But the generation of 'membar' is > guarded by os::is_MP(), and os::is_MP() returns false when there is a > single cpu available on the system. Now, in virtualized/container > environments, the number of processors allocated to a virtual machine > can dynamically change during the application runtime. That could lead > to incorrect generation of BIS instructions and can cause JVM crashes. > Enabling AssumeMP makes is_MP() always return true on SPARC systems. > > > > In future, we may consider making generation of 'membar' > unconditional with the enhancement request: > https://bugs.openjdk.java.net/browse/JDK-8150715. > > > > Thanks, > > Poonam > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Thursday, August 10, 2017 9:47 AM > >> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net > >> Cc: hotspot-runtime-dev at openjdk.java.net runtime > >> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on > >> SPARC machines > >> > >> CCing to Runtime. > >> > >> Can you add comment explaining why it set to true on SPARC? > >> > >> Thanks, > >> Vladimir > >> > >> On 8/10/17 6:26 AM, Poonam Parhar wrote: > >>> Hello, > >>> > >>> Please review this simple patch: > >>> > >>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines > >>> > >>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > >>> > >>> This change enables AssumeMP by default on SPARC machines. On Sparc > >>> T7, to finalize BIS instructions the server compiler needs to add > >>> a 'membar' instruction at the end. But the generation of 'membar' is > >>> guarded by os::is_MP(), and os::is_MP() returns false when there is a > >>> single cpu available on the system.
> >>> Now, in virtualized/container environments, the number > >>> of processors allocated to a virtual machine can dynamically change > >>> during the application runtime. That could lead to incorrect generation > >>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes > >>> is_MP() always return true on SPARC systems. > >>> > >>> In future, we may consider making generation of 'membar' unconditional > >>> with the enhancement request: JDK-8150715. > >>> > >>> Thanks, > >>> > >>> Poonam > >>> From vladimir.kozlov at oracle.com Thu Aug 10 19:10:37 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 10 Aug 2017 12:10:37 -0700 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <38ABC51B-73E3-4DEF-B5E1-B44B8B25160D@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> <38ABC51B-73E3-4DEF-B5E1-B44B8B25160D@oracle.com> Message-ID: <58512f71-b5e6-f2b0-f99e-005c239e911e@oracle.com> Bob, we have JDK-8185062 for that: https://bugs.openjdk.java.net/browse/JDK-8185062 IMHO this fix is intended for backports, should be simple and don't cause regression, for example on embedded platforms. But I am fine if runtime group think it is fine to enable it on all platforms in jdk 7, 8 and 9. I agree that due to problem with dynamic cpus configuration in containers it may be good to enable it on all platforms in previous releases too. Thanks, Vladimir On 8/10/17 11:53 AM, Bob Vandette wrote: > Can we just always run with AssumeMP true for all platforms these days? > Surely single CPU systems are rare now. > > We might have issues with Docker containers that have a limit 1 CPU > on a large mp system which may cause issues. > > Bob. > > >> On Aug 10, 2017, at 2:23 PM, Vladimir Kozlov wrote: >> >> Poonam, >> >> I mean to add a small (one or two sentences) comment to the code.
Some thing like next but may better wording: >> >> + if (FLAG_IS_DEFAULT(AssumeMP)) { >> + // BIS instructions require 'membar' instruction regardless number of CPU. >> + // Otherwise in virtualized/container environments which use only 1 cpu BIS instructions may produce incorrect results. >> + FLAG_SET_DEFAULT(AssumeMP, true); >> >> Thanks, >> Vladimir >> >> On 8/10/17 10:44 AM, Poonam Parhar wrote: >>> Thanks Vladimir. >>> Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these. >>> Adding my comments from the previous mail here again for better readability: >>> ------------------------------------- >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines >>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>> This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems. >>> In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715. >>> Thanks, >>> Poonam >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Thursday, August 10, 2017 9:47 AM >>>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >>>> machines >>>> >>>> CCing to Runtime. 
>>>> >>>> Can you add comment explaining why it set to true on SPARC? >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>>> Hello, >>>>> >>>>> Please review this simple patch: >>>>> >>>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>>> >>>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>>> T7, to finalize BIS instructions the server compiler needs to add >>>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>>> single cpu available on the system. >>>>> Now, in virtualized/container environments, the number >>>>> of processors allocated to a virtual machine can dynamically change >>>>> during the application runtime. That could lead to incorrect generation >>>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>>> is_MP() always return true on SPARC systems. >>>>> >>>>> In future, we may consider making generation of 'membar' unconditional >>>>> with the enhancement request: JDK-8150715. >>>>> >>>>> Thanks, >>>>> >>>>> Poonam >>>>> > From vladimir.kozlov at oracle.com Thu Aug 10 19:11:39 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 10 Aug 2017 12:11:39 -0700 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> Message-ID: <72e1f78f-595c-b25f-e30f-98fd62f66093@oracle.com> Looks good. I think someone from Runtime have to review it too.
Thanks, Vladimir On 8/10/17 11:55 AM, Poonam Parhar wrote: > Updated the webrev with comments: > http://cr.openjdk.java.net/~poonam/8185572/webrev.01/ > > Thanks, > Poonam > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Thursday, August 10, 2017 11:24 AM >> To: hotspot-compiler-dev at openjdk.java.net; Poonam Bajaj Parhar >> Cc: hotspot-runtime-dev at openjdk.java.net runtime >> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >> machines >> >> Poonam, >> >> I mean to add a small (one or two sentences) comment to the code. Some >> thing like next but may better wording: >> >> + if (FLAG_IS_DEFAULT(AssumeMP)) { >> + // BIS instructions require 'membar' instruction regardless number >> of CPU. >> + // Otherwise in virtualized/container environments which use only >> 1 >> cpu BIS instructions may produce incorrect results. >> + FLAG_SET_DEFAULT(AssumeMP, true); >> >> Thanks, >> Vladimir >> >> On 8/10/17 10:44 AM, Poonam Parhar wrote: >>> Thanks Vladimir. >>> >>> Since the SPARC machines are always multi-cores, we can safely set >> AssumeMP to true on these. >>> >>> Adding my comments from the previous mail here again for better >> readability: >>> ------------------------------------- >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable >> AssumeMP >>> by default on SPARC machines >>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>> >>> This change enables AssumeMP by default on SPARC machines. On Sparc >> T7, to finalize BIS instructions the server compiler needs to add a >> 'membar' instruction at the end. But the generation of 'membar' is >> guarded by os::is_MP(), and os::is_MP() returns false when there is a >> single cpu available on the system. Now, in virtualized/container >> environments, the number of processors allocated to a virtual machine >> can dynamically change during the application runtime. That could lead >> to incorrect generation of BIS instructions and can cause JVM crashes. 
>> Enabling AssumeMP makes is_MP() always return true on SPARC systems. >>> >>> In future, we may consider making generation of 'membar' >> unconditional with the enhancement request: >> https://bugs.openjdk.java.net/browse/JDK-8150715. >>> >>> Thanks, >>> Poonam >>> >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Thursday, August 10, 2017 9:47 AM >>>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on >>>> SPARC machines >>>> >>>> CCing to Runtime. >>>> >>>> Can you add comment explaining why it set to true on SPARC? >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>>> Hello, >>>>> >>>>> Please review this simple patch: >>>>> >>>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>>> >>>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>>> T7, to finalize BIS instructions the server compiler needs to add >>>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>>> single cpu available on the system. >>>>> Now, in virtualized/container environments, the number >>>>> of processors allocated to a virtual machine can dynamically change >>>>> during the application runtime. That could lead to incorrect generation >>>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>>> is_MP() always return true on SPARC systems. >>>>> >>>>> In future, we may consider making generation of 'membar' unconditional >>>>> with the enhancement request: JDK-8150715.
>>>>> >>>>> Thanks, >>>>> >>>>> Poonam >>>>> From bob.vandette at oracle.com Thu Aug 10 19:13:46 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 10 Aug 2017 15:13:46 -0400 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <58512f71-b5e6-f2b0-f99e-005c239e911e@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> <38ABC51B-73E3-4DEF-B5E1-B44B8B25160D@oracle.com> <58512f71-b5e6-f2b0-f99e-005c239e911e@oracle.com> Message-ID: I don't think that we should backport it to JDK 7 and 8 since older single CPU systems may get a security update for these older releases and see a performance regression. Perhaps JDK9 and 10 would be ok. Bob. > On Aug 10, 2017, at 3:10 PM, Vladimir Kozlov wrote: > > Bob, we have JDK-8185062 for that: > > https://bugs.openjdk.java.net/browse/JDK-8185062 > > IMHO this fix is intended for backports, should be simple and don't cause regression, for example on embedded platforms. > > But I am fine if runtime group think it is fine to enable it on all platforms in jdk 7, 8 and 9. > > I agree that due to problem with dynamic cpus configuration in containers it may be good to enable it on all platforms in previous releases too. > > Thanks, > Vladimir > > On 8/10/17 11:53 AM, Bob Vandette wrote: >> Can we just always run with AssumeMP true for all platforms these days? >> Surely single CPU systems are rare now. >> We might have issues with Docker containers that have a limit 1 CPU >> on a large mp system which may cause issues. >> Bob. >>> On Aug 10, 2017, at 2:23 PM, Vladimir Kozlov wrote: >>> >>> Poonam, >>> >>> I mean to add a small (one or two sentences) comment to the code. Some thing like next but may better wording: >>> >>> + if (FLAG_IS_DEFAULT(AssumeMP)) { >>> + // BIS instructions require 'membar' instruction regardless number of CPU.
>>> + // Otherwise in virtualized/container environments which use only 1 cpu BIS instructions may produce incorrect results. >>> + FLAG_SET_DEFAULT(AssumeMP, true); >>> >>> Thanks, >>> Vladimir >>> >>> On 8/10/17 10:44 AM, Poonam Parhar wrote: >>>> Thanks Vladimir. >>>> Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these. >>>> Adding my comments from the previous mail here again for better readability: >>>> ------------------------------------- >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines >>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>> This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems. >>>> In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715. >>>> Thanks, >>>> Poonam >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Thursday, August 10, 2017 9:47 AM >>>>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>>>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>>>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >>>>> machines >>>>> >>>>> CCing to Runtime. >>>>> >>>>> Can you add comment explaining why it set to true on SPARC? 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>>>> Hello, >>>>>> >>>>>> Please review this simple patch: >>>>>> >>>>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>>>> >>>>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>>>> T7, to finalize BIS instructions the server compiler needs to add >>>>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>>>> single cpu available on the system. >>>>>> Now, in virtualized/container environments, the number >>>>>> of processors allocated to a virtual machine can dynamically change >>>>>> during the application runtime. That could lead to incorrect generation >>>>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>>>> is_MP() always return true on SPARC systems. >>>>>> >>>>>> In future, we may consider making generation of 'membar' unconditional >>>>>> with the enhancement request: JDK-8150715. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Poonam >>>>>> From coleen.phillimore at oracle.com Thu Aug 10 19:22:02 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 10 Aug 2017 15:22:02 -0400 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <72e1f78f-595c-b25f-e30f-98fd62f66093@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> <72e1f78f-595c-b25f-e30f-98fd62f66093@oracle.com> Message-ID: <80fb8a79-aa48-b0e1-729e-a33bd32bf7d4@oracle.com> This looks good. It'll be nice when the whole thing gets cleaned up. Coleen On 8/10/17 3:11 PM, Vladimir Kozlov wrote: > Looks good. I think someone from Runtime have to review it too.
> > Thanks, > Vladimir > > On 8/10/17 11:55 AM, Poonam Parhar wrote: >> Updated the webrev with comments: >> http://cr.openjdk.java.net/~poonam/8185572/webrev.01/ >> >> Thanks, >> Poonam >> >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Thursday, August 10, 2017 11:24 AM >>> To: hotspot-compiler-dev at openjdk.java.net; Poonam Bajaj Parhar >>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >>> machines >>> >>> Poonam, >>> >>> I mean to add a small (one or two sentences) comment to the code. Some >>> thing like next but may better wording: >>> >>> + if (FLAG_IS_DEFAULT(AssumeMP)) { >>> + // BIS instructions require 'membar' instruction regardless number >>> of CPU. >>> + // Otherwise in virtualized/container environments which use only >>> 1 >>> cpu BIS instructions may produce incorrect results. >>> + FLAG_SET_DEFAULT(AssumeMP, true); >>> >>> Thanks, >>> Vladimir >>> >>> On 8/10/17 10:44 AM, Poonam Parhar wrote: >>>> Thanks Vladimir. >>>> >>>> Since the SPARC machines are always multi-cores, we can safely set >>> AssumeMP to true on these. >>>> >>>> Adding my comments from the previous mail here again for better >>> readability: >>>> ------------------------------------- >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable >>> AssumeMP >>>> by default on SPARC machines >>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>> >>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>> T7, to finalize BIS instructions the server compiler needs to add a >>> 'membar' instruction at the end. But the generation of 'membar' is >>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>> single cpu available on the system. Now, in virtualized/container >>> environments, the number of processors allocated to a virtual machine >>> can dynamically change during the application runtime. 
That could lead >>> to incorrect generation of BIS instructions and can cause JVM crashes. >>> Enabling AssumeMP makes is_MP() always return true on SPARC systems. >>>> >>>> In future, we may consider making generation of 'membar' >>> unconditional with the enhancement request: >>> https://bugs.openjdk.java.net/browse/JDK-8150715. >>>> >>>> Thanks, >>>> Poonam >>>> >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Thursday, August 10, 2017 9:47 AM >>>>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>>>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>>>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on >>>>> SPARC machines >>>>> >>>>> CCing to Runtime. >>>>> >>>>> Can you add comment explaining why it set to true on SPARC? >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>>>> Hello, >>>>>> >>>>>> Please review this simple patch: >>>>>> >>>>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>>>> >>>>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>>>> >>>>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>>>> T7, to finalize BIS instructions the server compiler needs to add >>>>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>>>> single cpu available on the system. >>>>>> Now, in virtualized/container environments, the number >>>>>> of processors allocated to a virtual machine can dynamically change >>>>>> during the application runtime. That could lead to incorrect generation >>>>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>>>> is_MP() always return true on SPARC systems. >>>>>> >>>>>> In future, we may consider making generation of 'membar' unconditional >>>>>> with the enhancement request: JDK-8150715.
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Poonam >>>>>> From ioi.lam at oracle.com Thu Aug 10 21:15:21 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 10 Aug 2017 14:15:21 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> Message-ID: <20bfa3ff-b2cd-028d-efa9-bddad4a6ff7c@oracle.com> Hi Jiangli, The changes look good to me. Thanks for considering my suggestions. - Ioi On 8/8/17 5:33 PM, Jiangli Zhou wrote: > Here is the incremental webrev that has all the changes incorporated > with suggestions from Coleen, Ioi and Thomas: > > http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ > > > Updated full webrev: > http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ > > > Thanks again for Coleen's, Ioi's and Thomas' review! > Jiangli > >> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou wrote: >> >> Hi Ioi, >> >> Thanks for getting back to me. >> >>> On Aug 7, 2017, at 5:45 PM, Ioi Lam wrote: >>> >>> On 8/4/17 10:19 PM, Jiangli Zhou wrote: >>> >>>> Hi Ioi, >>>> >>>> Thanks for looking again. >>>> >>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam wrote: >>>>> >>>>> Hi Jiangli, >>>>> >>>>> The code looks good in general.
I just have a few pet peeves for >>>>> readability: >>>>> >>>>> >>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>>>> >>>>> 704 assert(UseG1GC, "Only support G1 GC"); >>>>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>>>> 706 "Only support UseCompressedOops and >>>>> UseCompressedClassPointers enabled"); >>>>> >>>>> 1615 assert(UseG1GC, "Only support G1 GC"); >>>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>>>> 1617 "Only support UseCompressedOops and >>>>> UseCompressedClassPointers enabled"); >>>>> >>>>> Maybe it's better to combine them into a single function like >>>>> MetaspaceShared::assert_vm_flags() so they don't get out of sync? >>>> >>>> There is a MetaspaceShared::allow_archive_heap_object(), which >>>> checks for UseG1GC, UseCompressedOops and >>>> UseCompressedClassPointers combined. It does not seem worth adding >>>> another separate API for asserting the required flags. I'll use >>>> that in the assert. >>>> >>>>> >>>>> >>>>> >>>>> (2) FileMapInfo::write_archive_heap_regions() >>>>> >>>>> I still find this code very hard to read, especially due to the loop. >>>>> >>>>> First, the comments are not consistent with the code: >>>>> >>>>> 498 assert(arr_len <= max_num_regions, "number of memory >>>>> regions exceeds maximum"); >>>>> >>>>> but the comment says: "The rest are consecutive full GC regions" >>>>> which means there's a chance for max_num_regions to be more than 2 >>>>> (which will be the case with Calvin's java-loader dumping changes >>>>> using very small heap size). So the code is actually wrong. >>>> >>>> The max_num_regions is the maximum number of regions for each >>>> archived heap space (the string space, or open archive space). We >>>> only run into the case where the MemRegion array size is larger >>>> than max_num_regions with Calvin's pending change.
As part of >>>> Calvin?s change, he will change the assert into a check and bail >>>> out if the number of MemRegions are larger than max_num_regions due >>>> to heap fragmentation. >>>> >>>> >>> Your latest patch assumes that arr_len <= 2, but the implementation >>> of G1CollectedHeap::heap()->begin_archive_alloc_range() / >>> G1CollectedHeap::heap()->end_archive_alloc_range() actually allows >>> more than 2 regions to returned. So simply putting an assert there >>> seems risky (unless you have analyzed all possible scenarios to >>> prove that's impossible). >>> >>> Instead of trying to come up with a complicated proof, I think it's >>> much safer to disable the archived string region if the arr_len > 2. >>> Also, if the string region is disabled, you should also disable the >>> open_archive_heap_region >>> >>> I think this is a general issue with the mapped heap regions, and it >>> just happens to be revealed by Calvin's patch. So we should fix it >>> now and not wait for Calvin's patch. >> >> Ok. I?ll change the assert to be a check. >> >>> >>> >>>>> >>>>> The word "region" is used in these parameters, but they don't mean >>>>> the same thing. >>>>> >>>>> GrowableArray *regions >>>>> int first_region, int max_num_regions, >>>>> >>>>> >>>>> How about regions -> g1_regions_list >>>>> first_region -> first_region_in_archive >>>> >>>> The GrowableArray above is the MemRegions that GC code gives back >>>> to us. The GC code combines multiple G1 regions. The comments >>>> probably are over-explaining the details, which are hidden in the >>>> GC code. Probably that?s the confusing source. I?ll make the >>>> comment more clear. >>>> >>>> Using g1_regions_list would also be confusing, since >>>> write_archive_heap_regions does not handle G1 regions directly. It >>>> processes the MemRegion array that GC code returns. How about >>>> changing ?regions? to ?mem_regions? or ?archive_regions'? >>>> >>> How about heap_regions? 
These are regions in the active Java heap, >>> which current has not mapped anything from the CDS archive. >> >> Ok. >> >> I?m updating my changes and will send out a consolidated webrev. >> >> Thanks! >> Jiangli >> >>> >>> >>>>> >>>>> >>>>> In the comments, I find the phrase 'the current archive heap >>>>> region' ambiguous. It could be (erroneously) interpreted as "a >>>>> region from the currently mapped archive? >>>>> >>>>> To make it unambiguous, how about changing >>>>> >>>>> >>>>> 464 // Write the current archive heap region, which contains one >>>>> or multiple GC(G1) regions. >>>>> >>>>> >>>>> to >>>>> >>>>> // Write the given list of G1 memory regions into the archive, >>>>> starting at >>>>> // first_region_in_archive. >>>> >>>> >>>> Ok. How about the following: >>>> >>>> // Write the given list of java heap memory regions into the >>>> archive, starting at >>>> // first_region_in_archive. >>>> >>> Sounds good. >>> >>> Thanks >>> - Ioi >>> >>>>> >>>>> >>>>> Also, for the explanation of how the G1 regions are written into >>>>> the archive, how about: >>>>> >>>>> // The G1 regions in the list are sorted in ascending address >>>>> order. When there are more objects >>>>> // than the capacity of a single G1 region, the bottom-most G1 >>>>> region may be partially filled, and the >>>>> // remaining G1 region(s) are consecutively allocated and fully >>>>> filled. >>>>> // >>>>> // Thus, the bottom-most G1 region (if not empty) is written >>>>> into first_region_in_archive. >>>>> // The remaining G1 regions (if exist) are coalesced and >>>>> written as a single block >>>>> // into (first_region_in_archive + 1) >>>>> >>>>> // Here's the mapping from (g1 regions) -> (archive regions). >>>>> >>>>> >>>>> All this function needs to do is to decide the values for >>>>> >>>>> r0_start, r0_top >>>>> r1_start, r1_top >>>>> >>>>> I think it would be much better to not use the loop, and not use >>>>> the max_num_regions parameter (it's always 2 anyway). 
>>>>> >>>>> *r0_start = *r0_top = NULL; >>>>> *r1_start = *r1_top = NULL; >>>>> >>>>> if (arr_len >= 1) { >>>>> *r0_start = regions->at(0).start(); >>>>> *r0_top = *r0_start + regions->at(0).byte_size(); >>>>> } >>>>> if (arr_len >= 2) { >>>>> int last = arr_len - 1; >>>>> *r1_start = regions->at(1).start(); >>>>> *r1_top = regions->at(last).start() + regions->at(last).byte_size(); >>>>> } >>>>> >>>>> what do you think? >>>> >>>> We need to write out all archive regions including the empty ones. >>>> The loop using max_num_regions is the easiest way. I'd like to >>>> remove the code that deals with r0_* and r1_* explicitly. Let me try >>>> that. >>>> >>>>> >>>>> >>>>> >>>>> (3) metaspace.cpp >>>>> >>>>> 3350 // Map the archived heap regions after compressed >>>>> pointers >>>>> 3351 // because it relies on compressed class pointers >>>>> setting to work >>>>> >>>>> do you mean this? >>>>> >>>>> // Archived heap regions depend on the parameters of >>>>> compressed class pointers, so >>>>> // they must be mapped after such parameters have been decided >>>>> in the above call. >>>> >>>> Hmmm, maybe use 'arguments' instead of 'parameters'? >>>> >>>>> >>>>> >>>>> (4) I found this name not strictly grammatical. How about this: >>>>> >>>>> allow_archive_heap_object -> is_heap_object_archiving_allowed >>>> >>>> Ok. >>>> >>>>> >>>>> (5) in most of your code, 'archive' is used as a noun, except in >>>>> StringTable::archive_string() where it's used as a verb. >>>>> >>>>> archive_string could also be interpreted erroneously as "return a >>>>> string that's already in the archive". So to be consistent and >>>>> unambiguous, I think it's better to rename it to >>>>> StringTable::create_archived_string() >>>> >>>> Ok. >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> >>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>>>> Here are the updated webrevs.
>>>>>> >>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>>>> >>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>>>> >>>>>> Changes in the updated webrevs include: >>>>>> >>>>>> * Merge with Ioi?s recent shared space auto-sizing change (8072061) >>>>>> * Addressed all feedbacks from Ioi and Coleen (Thanks for >>>>>> detailed review!) >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>> >>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou >>>>>>> wrote: >>>>>>> >>>>>>> Hi Ioi, >>>>>>> >>>>>>> Thank you so much for reviewing this. I?ve addressed all your >>>>>>> feedbacks. Please see details below. I?ll updated the webrev >>>>>>> after addressing Coleen?s comments. >>>>>>> >>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>>>> >>>>>>>> Hi Jiangli, >>>>>>>> >>>>>>>> Here are my comments. I've not reviewed the GC code and I'll >>>>>>>> leave that to the GC experts :-) >>>>>>>> >>>>>>>> stringTable.cpp: StringTable::archive_string >>>>>>>> >>>>>>>> add assert for DumpSharedSpaces only >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>>> >>>>>>>> filemap.cpp >>>>>>>> >>>>>>>> 525 void >>>>>>>> FileMapInfo::write_archive_heap_regions(GrowableArray >>>>>>>> *regions, >>>>>>>> 526 int first_region, int num_regions) { >>>>>>>> >>>>>>>> When I first read this function, I found it hard to follow, >>>>>>>> especially this part that coalesces the trailing regions: >>>>>>>> >>>>>>>> 537 int len = regions->length(); >>>>>>>> 538 if (len > 1) { >>>>>>>> 539 start = (char*)regions->at(1).start(); >>>>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>>>> 541 } >>>>>>>> 542 } >>>>>>>> >>>>>>>> The rest of filemap.cpp always perform identical operations on >>>>>>>> MemRegion arrays, which are either 1 or 2 in size. However, >>>>>>>> this function doesn't follow that pattern; it also has a very >>>>>>>> different notion of "region", and the confusing part is >>>>>>>> regions->size() is not the same as num_regions. 
>>>>>>>> >>>>>>>> How about we change the API to something like the following? >>>>>>>> Before calling this API, the caller needs to coalesce the >>>>>>>> trailing G1 regions into a single MemRegion. >>>>>>>> >>>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int >>>>>>>> first, int num_regions) { >>>>>>>> if (first == MetaspaceShared::first_string) { >>>>>>>> assert(num_regons <= MetaspaceShared::max_strings, "..."); >>>>>>>> } else { >>>>>>>> assert(first == >>>>>>>> MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>>>> assert(num_regons <= >>>>>>>> MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>>>> } >>>>>>>> .... >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> I?ve reworked the function and simplified the code. >>>>>>> >>>>>>>> >>>>>>>> 756 if (!string_data_mapped) { >>>>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, >>>>>>>> "sanity"); >>>>>>>> 759 } >>>>>>>> 760 >>>>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>>>> 763 } else { >>>>>>>> 764 assert(open_archive_heap_ranges == NULL && >>>>>>>> num_open_archive_heap_ranges == 0, "sanity"); >>>>>>>> 765 } >>>>>>>> >>>>>>>> Maybe the two "if" statements should be more consistent? >>>>>>>> Instead of StringTable::ignore_shared_strings, how >>>>>>>> about StringTable::set_shared_strings_region_mapped()? >>>>>>> >>>>>>> Fixed. >>>>>>> >>>>>>>> >>>>>>>> FileMapInfo::map_heap_data() -- >>>>>>>> >>>>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>>>> 819 char* base = os::map_memory(_fd, _full_path, >>>>>>>> si->_file_offset, >>>>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>>>> 821 si->_allow_exec); >>>>>>>> >>>>>>>> What happens when the first region succeeds to map but the >>>>>>>> second region fails to map? Will both regions be unmapped? 
I >>>>>>>> don't see where you store the return value (base) from >>>>>>>> os::map_memory(). Does it mean the code assumes that (addr == >>>>>>>> base). If so, we need an assert here. >>>>>>> >>>>>>> If any of the region fails to map, we bail out and call >>>>>>> dealloc_archive_heap_regions(), which handles the deallocation >>>>>>> of any regions specified. If second region fails to map, all >>>>>>> memory ranges specified by ?regions? array are deallocated. We >>>>>>> don?t unmap the memory here since it is part of the java heap. >>>>>>> Unmapping of heap memory are handled by GC code. The ?if? check >>>>>>> below makes sure base == addr. >>>>>>> >>>>>>> if (base == NULL || base != addr) { >>>>>>> // dealloc the regions from java heap >>>>>>> dealloc_archive_heap_regions(regions, region_num); >>>>>>> if (log_is_enabled(Info, cds)) { >>>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required >>>>>>> address in java heap."); >>>>>>> } >>>>>>> return false; >>>>>>> } >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> constantPool.cpp >>>>>>>> >>>>>>>> Handle refs_handle; >>>>>>>> ... >>>>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>>>> >>>>>>>> This will first create a NULL handle, then construct a >>>>>>>> temporary handle, and then assign the temp handle back to the >>>>>>>> null handle. This means two handles will be pushed onto >>>>>>>> THREAD->metadata_handles() >>>>>>>> >>>>>>>> I think it's more efficient if you merge these into a single >>>>>>>> statement >>>>>>>> >>>>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>>>> >>>>>>> Fixed. >>>>>>> >>>>>>>> >>>>>>>> Is this experimental code? Maybe it should be removed? 
>>>>>>>> >>>>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>>>> 665 #if 0 >>>>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, >>>>>>>> THREAD); >>>>>>>> 669 if (k != NULL) { >>>>>>>> 670 klass_at_put(index, k); >>>>>>>> 671 } >>>>>>>> 672 #endif >>>>>>>> 673 } else >>>>>>> >>>>>>> Removed. >>>>>>> >>>>>>>> >>>>>>>> cpCache.hpp: >>>>>>>> >>>>>>>> u8 _archived_references >>>>>>>> >>>>>>>> shouldn't this be declared as an narrowOop to avoid the type >>>>>>>> casts when it's used? >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>>> >>>>>>>> cpCache.cpp: >>>>>>>> >>>>>>>> add assert so that one of these is used only at dump time >>>>>>>> and the other only at run time? >>>>>>>> >>>>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>>>> 611 return >>>>>>>> oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>>>> 612 } >>>>>>>> 613 >>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>>>> 616 } >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> Jiangli >>>>>>> >>>>>>>> >>>>>>>> Thanks! >>>>>>>> - Ioi >>>>>>>> >>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>>>> Sorry, the mail didn?t handle the rich text well. I fixed the >>>>>>>>> format below. >>>>>>>>> >>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve >>>>>>>>> constant pool string entries and cache resolved_reference >>>>>>>>> arrays in CDS archive). Currently, the CDS archive can contain >>>>>>>>> cached class metadata, interned java.lang.String objects. This >>>>>>>>> RFE adds the constant pool ?resolved_references? arrays >>>>>>>>> (hotspot specific) to the archive for startup/runtime >>>>>>>>> performance enhancement. The ?resolved_references' arrays are >>>>>>>>> used to hold references of resolved constant pool entries >>>>>>>>> including Strings, mirrors, etc. 
With >>>>>>>>> the 'resolved_references' being cached, string constants in >>>>>>>>> shared classes can now be resolved to existing interned >>>>>>>>> java.lang.Strings at CDS dump time. G1 and 64-bit platforms >>>>>>>>> are required. >>>>>>>>> >>>>>>>>> The GC changes in the RFE were discussed and guided by Thomas >>>>>>>>> Schatzl and GC team. Part of the changes were contributed by >>>>>>>>> Thomas himself. >>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>>>> hotspot: >>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>>>> whitebox: >>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>>>> >>>>>>>>> Please see below for details of supporting cached >>>>>>>>> 'resolved_references' and pre-resolving string constants. >>>>>>>>> >>>>>>>>> Types of Pinned G1 Heap Regions >>>>>>>>> >>>>>>>>> The pinned region type is a super type of all archive region >>>>>>>>> types, which include the open archive type and the closed >>>>>>>>> archive type. >>>>>>>>> >>>>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>>>> 01000 0 [16] Old Mask >>>>>>>>> 10000 0 [32] Archive Mask >>>>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | >>>>>>>>> OldMask + 1 >>>>>>>>> >>>>>>>>> >>>>>>>>> Pinned Regions >>>>>>>>> >>>>>>>>> Objects within the region are 'pinned', which means GC does >>>>>>>>> not move any live objects. GC scans and marks objects in the >>>>>>>>> pinned region as normal, but skips forwarding live objects. >>>>>>>>> Pointers in live objects are updated. Dead objects >>>>>>>>> (unreachable) can be collected and freed. >>>>>>>>> >>>>>>>>> Archive Regions >>>>>>>>> >>>>>>>>> The archive types are sub-types of 'pinned'. There are two >>>>>>>>> types of archive region currently, open archive and closed >>>>>>>>> archive. Both can support caching java heap objects via the >>>>>>>>> CDS archive.
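As a rough illustration of the tag layout in the table above, the encoding can be modeled as plain bit masks. This is only a sketch: the names RegionTag, is_pinned, is_old, is_archive and is_closed_archive are hypothetical, chosen for this example rather than taken from the webrev; only the numeric values come from the table.

```cpp
#include <cstdint>

// Illustrative model of the region type tags listed above.
// All names are hypothetical; only the numeric values come from the table.
namespace RegionTag {
  constexpr std::uint32_t PinnedMask    = 8;   // 00100 0
  constexpr std::uint32_t OldMask       = 16;  // 01000 0
  constexpr std::uint32_t ArchiveMask   = 32;  // 10000 0
  // Open archive sets all three mask bits.
  constexpr std::uint32_t OpenArchive   = ArchiveMask | PinnedMask | OldMask;  // 56
  // Closed archive is the open archive value plus one.
  constexpr std::uint32_t ClosedArchive = OpenArchive + 1;                     // 57
}

// A tag belongs to a category when the category's mask bit is set.
constexpr bool is_pinned(std::uint32_t tag)  { return (tag & RegionTag::PinnedMask) != 0; }
constexpr bool is_old(std::uint32_t tag)     { return (tag & RegionTag::OldMask) != 0; }
constexpr bool is_archive(std::uint32_t tag) { return (tag & RegionTag::ArchiveMask) != 0; }

// The two archive sub-types are distinguished by their exact value.
constexpr bool is_closed_archive(std::uint32_t tag) { return tag == RegionTag::ClosedArchive; }

// Compile-time checks against the values in the table.
static_assert(RegionTag::OpenArchive == 56, "open archive tag value");
static_assert(RegionTag::ClosedArchive == 57, "closed archive tag value");
static_assert(is_pinned(RegionTag::OpenArchive) && is_pinned(RegionTag::ClosedArchive),
              "both archive types are pinned");
static_assert(is_old(RegionTag::OpenArchive) && is_old(RegionTag::ClosedArchive),
              "both archive types are old");
```

With this layout a single mask test answers "is this region pinned?" for every archive sub-type, which is what "super type" means in the description above.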
>>>>>>>>> >>>>>>>>> An archive region is also an old region by design. >>>>>>>>> >>>>>>>>> Open Archive (GC-RW) Regions >>>>>>>>> >>>>>>>>> Open archive region is GC writable. GC scans & marks objects >>>>>>>>> within the region and adjusts (updates) pointers in live >>>>>>>>> objects the same way as a pinned region. Live objects >>>>>>>>> (reachable) are pinned and not forwarded by GC. >>>>>>>>> Open archive region does not have 'dead' objects. Unreachable >>>>>>>>> objects are 'dormant' objects. Dormant objects are not >>>>>>>>> collected and freed by GC. >>>>>>>>> >>>>>>>>> Adjustable Outgoing Pointers >>>>>>>>> >>>>>>>>> As GC can adjust pointers within the live objects in open >>>>>>>>> archive heap region, objects can have outgoing pointers to >>>>>>>>> another java heap region, including closed archive region, >>>>>>>>> open archive region, pinned (or humongous) region, and normal >>>>>>>>> generational region. When a referenced object is moved by GC, >>>>>>>>> the pointer within the open archive region is updated accordingly. >>>>>>>>> >>>>>>>>> Closed Archive (GC-RO) Regions >>>>>>>>> >>>>>>>>> The closed archive region is GC read-only region. GC cannot >>>>>>>>> write into the region. Objects are not scanned and marked by >>>>>>>>> GC. Objects are pinned and not forwarded. Pointers are not >>>>>>>>> updated by GC either. Hence, objects within the archive region >>>>>>>>> cannot have any outgoing pointers to another java heap region. >>>>>>>>> Objects however can still have pointers to other objects >>>>>>>>> within the closed archive regions (we might allow pointers to >>>>>>>>> open archive regions in the future). That restricts the type >>>>>>>>> of java objects that can be supported by the archive region. >>>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>>>> >>>>>>>>> The GC-readonly archive region makes java heap memory sharable >>>>>>>>> among different JVM processes. 
NOTE: synchronization on the >>>>>>>>> objects within the archive heap region can still cause writes >>>>>>>>> to the memory page. >>>>>>>>> >>>>>>>>> Dormant Objects >>>>>>>>> >>>>>>>>> Dormant objects are unreachable java objects within the open >>>>>>>>> archive heap region. >>>>>>>>> A java object in the open archive heap region is a live object >>>>>>>>> if it can be reached during scanning. Some of the java objects >>>>>>>>> in the region may not be reachable during scanning. Those >>>>>>>>> objects are considered as dormant, but not dead. For example, >>>>>>>>> a constant pool 'resolved_references' array is reachable via >>>>>>>>> the klass root if its container klass (shared) is already >>>>>>>>> loaded at the time during GC scanning. If a shared klass is >>>>>>>>> not yet loaded, the klass root is not scanned and it's >>>>>>>>> constant pool 'resolved_reference' array (A) in the open >>>>>>>>> archive region is not reachable. Then A is a dormant object. >>>>>>>>> >>>>>>>>> Object State Transition >>>>>>>>> >>>>>>>>> All java objects are initially dormant objects when open >>>>>>>>> archive heap regions are mapped to the runtime java heap. A >>>>>>>>> dormant object becomes live object when the associated shared >>>>>>>>> class is loaded at runtime. Explicit call >>>>>>>>> to G1SATBCardTableModRefBS::enqueue() needs to be made when a >>>>>>>>> dormant object becomes live. That should be the case >>>>>>>>> for cached objects with strong roots as well, since strong >>>>>>>>> roots are only scanned at the start of GC marking (the initial >>>>>>>>> marking) but not during Remarking/Final marking. If a cached >>>>>>>>> object becomes live during concurrent marking phase, G1 may >>>>>>>>> not find it and mark it live unless a call to >>>>>>>>> G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>>>> >>>>>>>>> Currently, a live object in the open archive heap region >>>>>>>>> cannot become dormant again. 
This restriction simplifies GC >>>>>>>>> requirement and guarantees all outgoing pointers are updated >>>>>>>>> by GC correctly. Only objects for shared classes from the >>>>>>>>> builtin class loaders (boot, PlatformClassLoaders, and >>>>>>>>> AppClassLoaders) are supported for caching. >>>>>>>>> >>>>>>>>> Caching Java Objects at Archive Dump Time >>>>>>>>> >>>>>>>>> The closed archive and open archive regions are allocated near >>>>>>>>> the top of the dump time java heap. Archived java objects >>>>>>>>> are copied into the designated archive heap regions. For >>>>>>>>> example, String objects and the underlying 'value' arrays are >>>>>>>>> copied into the closed archive regions. All references to the >>>>>>>>> archived objects (from shared class metadata, string table, >>>>>>>>> etc) are set to the new heap locations. A hash table is used >>>>>>>>> to keep track of all archived java objects during the copying >>>>>>>>> process to make sure java object is not archived more than >>>>>>>>> once if reached from different roots. It also makes sure >>>>>>>>> references to the same archived object are updated using the >>>>>>>>> same new address location. >>>>>>>>> >>>>>>>>> Caching Constant Pool resolved_references Array >>>>>>>>> >>>>>>>>> The 'resolved_references' is an array that holds references of >>>>>>>>> resolved constant pool entries including Strings, mirrors >>>>>>>>> and methodTypes, etc. Each loaded class has one >>>>>>>>> 'resolved_references' array (in ConstantPoolCache). The >>>>>>>>> 'resolved_references' arrays are copied into the open archive >>>>>>>>> regions during dump process. Prior to copying the >>>>>>>>> 'resolved_references' arrays, JVM iterates through constant >>>>>>>>> pool entries and resolves all JVM_CONSTANT_String entries to >>>>>>>>> existing interned Strings for all archived classes. When >>>>>>>>> resolving, JVM only looks up the string table and finds >>>>>>>>> existing interned Strings without inserting new ones. 
If >>>>>>>>> a string entry cannot be resolved to an existing interned >>>>>>>>> String, the constant pool entry remains unresolved. That >>>>>>>>> prevents memory waste if a constant pool string entry is never >>>>>>>>> used at runtime. >>>>>>>>> >>>>>>>>> All String objects referenced by the string table are copied >>>>>>>>> first into the closed archive regions. The string table entry >>>>>>>>> is updated with the new location when each String object is >>>>>>>>> archived. The JVM updates the resolved constant pool string >>>>>>>>> entries with the new object locations when copying the >>>>>>>>> 'resolved_references' arrays to the open archive regions. >>>>>>>>> References to the 'resolved_references' arrays in the >>>>>>>>> ConstantPoolCache are also updated. >>>>>>>>> At runtime, as part of the ConstantPool::restore_unshareable_info() >>>>>>>>> work, the JVM calls G1SATBCardTableModRefBS::enqueue() to let GC >>>>>>>>> know the 'resolved_references' array is becoming live. A handle is >>>>>>>>> created for the cached object and added to the loader_data's >>>>>>>>> handles. >>>>>>>>> >>>>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>>>> >>>>>>>>> >>>>>>>>> The closed archive regions (the string regions) and open >>>>>>>>> archive regions are mapped to the runtime java heap at the >>>>>>>>> same offsets as the dump time offsets from the runtime java >>>>>>>>> heap base.
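The dump-time bookkeeping described under "Caching Java Objects at Archive Dump Time" (a hash table that ensures an object is archived at most once and that every reference is redirected to the same copy) might be sketched as follows. This is a minimal model under stated assumptions, not the HotSpot implementation: Obj and Archiver are hypothetical types, and a std::vector stands in for an archive heap region.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a Java heap object; not a HotSpot type.
struct Obj {
  int payload;
};

// Hypothetical archiver: copies each object at most once and remembers
// where the copy lives, so that every root reaching the same object is
// redirected to the same archived location.
class Archiver {
 public:
  // Returns the index of the archived copy; copies only on first sight.
  std::size_t archive(const Obj* o) {
    auto it = _archived.find(o);
    if (it != _archived.end()) {
      return it->second;            // already archived: reuse the location
    }
    _region.push_back(*o);          // copy into the (modeled) archive region
    std::size_t idx = _region.size() - 1;
    _archived.emplace(o, idx);      // remember original -> archived location
    return idx;
  }

  // Number of objects actually copied into the region.
  std::size_t copies() const { return _region.size(); }

 private:
  std::vector<Obj> _region;                               // models an archive region
  std::unordered_map<const Obj*, std::size_t> _archived;  // original -> copy index
};
```

Calling archive() twice on the same object yields the same index and leaves copies() unchanged, which models the "not archived more than once if reached from different roots" guarantee.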
>>>>>>>>> >>>>>>>>> Preliminary test execution and status: >>>>>>>>> >>>>>>>>> JPRT: passed >>>>>>>>> Tier2-rt: passed >>>>>>>>> Tier2-gc: passed >>>>>>>>> Tier2-comp: passed >>>>>>>>> Tier3-rt: passed >>>>>>>>> Tier3-gc: passed >>>>>>>>> Tier3-comp: passed >>>>>>>>> Tier4-rt: passed >>>>>>>>> Tier4-gc: passed >>>>>>>>> Tier4-comp: 6 jobs timed out, all other tests passed >>>>>>>>> Tier5-rt: one test failed but passed when running locally, all >>>>>>>>> other tests passed >>>>>>>>> Tier5-gc: passed >>>>>>>>> Tier5-comp: running >>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >>>>>>>>> vm.gc: passed >>>>>>>>> vm.gc in CDS mode: passed >>>>>>>>> Kitchensink: passed >>>>>>>>> Kitchensink in CDS mode: passed >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From poonam.bajaj at oracle.com Thu Aug 10 21:32:39 2017 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Thu, 10 Aug 2017 14:32:39 -0700 (PDT) Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <80530dcf-57f4-8d8f-e084-360e770520a6@oracle.com> <38ABC51B-73E3-4DEF-B5E1-B44B8B25160D@oracle.com> <58512f71-b5e6-f2b0-f99e-005c239e911e@oracle.com> Message-ID: <10fb6c8d-104d-4fc3-94e3-e82db3fa8273@default> We have customer reports of crashes with 8u on T7 SPARC systems. So the change needs to be made in 8, and possibly in 7 too.
Thanks, Poonam > -----Original Message----- > From: Bob Vandette > Sent: Thursday, August 10, 2017 12:14 PM > To: Vladimir Kozlov > Cc: hotspot-compiler-dev at openjdk.java.net; Poonam Bajaj Parhar; > hotspot-runtime-dev at openjdk.java.net runtime > Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC > machines > > I don?t think that we should backport it to JDK 7 and 8 since older > single CPU systems may get a security update for these older releases > and see a performance regression. > Perhaps JDK9 and 10 would be ok. > > Bob. > > > > On Aug 10, 2017, at 3:10 PM, Vladimir Kozlov > wrote: > > > > Bob, we have JDK-8185062 for that: > > > > https://bugs.openjdk.java.net/browse/JDK-8185062 > > > > IMHO this fix is intended for backports, should be simple and don't > cause regression, for example on embedded platforms. > > > > But I am fine if runtime group think it is fine to enable it on all > platforms in jdk 7, 8 and 9. > > > > I agree that due to problem with dynamic cpus configuration in > containers it may be good to enable it on all platforms in previous > releases too. > > > > Thanks, > > Vladimir > > > > On 8/10/17 11:53 AM, Bob Vandette wrote: > >> Can we just always run with AssumeMP true for all platforms these > days? > >> Surely single CPU systems are rare now. > >> We might have issues with Docker containers that have a limit 1 CPU > >> on a large mp system which may cause issues. > >> Bob. > >>> On Aug 10, 2017, at 2:23 PM, Vladimir Kozlov > wrote: > >>> > >>> Poonam, > >>> > >>> I mean to add a small (one or two sentences) comment to the code. > Some thing like next but may better wording: > >>> > >>> + if (FLAG_IS_DEFAULT(AssumeMP)) { > >>> + // BIS instructions require 'membar' instruction regardless > number of CPU. > >>> + // Otherwise in virtualized/container environments which use > only 1 cpu BIS instructions may produce incorrect results. 
> >>> + FLAG_SET_DEFAULT(AssumeMP, true); > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 8/10/17 10:44 AM, Poonam Parhar wrote: > >>>> Thanks Vladimir. > >>>> Since the SPARC machines are always multi-cores, we can safely set > AssumeMP to true on these. > >>>> Adding my comments from the previous mail here again for better > readability: > >>>> ------------------------------------- > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable > >>>> AssumeMP by default on SPARC machines > >>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > >>>> This change enables AssumeMP by default on SPARC machines. On > Sparc T7, to finalize BIS instructions the server compiler needs to add > a 'membar' instruction at the end. But the generation of 'membar' is > guarded by os::is_MP(), and os::is_MP() returns false when there is a > single cpu available on the system. Now, in virtualized/container > environments, the number of processors allocated to a virtual machine > can dynamically change during the application runtime. That could lead > to incorrect generation of BIS instructions and can cause JVM crashes. > Enabling AssumeMP makes is_MP() always return true on SPARC systems. > >>>> In future, we may consider making generation of 'membar' > unconditional with the enhancement request: > https://bugs.openjdk.java.net/browse/JDK-8150715. > >>>> Thanks, > >>>> Poonam > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Thursday, August 10, 2017 9:47 AM > >>>>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net > >>>>> Cc: hotspot-runtime-dev at openjdk.java.net runtime > >>>>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on > >>>>> SPARC machines > >>>>> > >>>>> CCing to Runtime. > >>>>> > >>>>> Can you add comment explaining why it set to true on SPARC? 
> >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> On 8/10/17 6:26 AM, Poonam Parhar wrote: > >>>>>> Hello, > >>>>>> > >>>>>> Please review this simple patch: > >>>>>> > >>>>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines > >>>>>> > >>>>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > >>>>>> > >>>>>> This change enables AssumeMP by default on SPARC machines. On > >>>>>> Sparc T7, to finalize BIS instructions the server compiler needs > >>>>>> to add a 'membar' instruction at the end. But the generation > >>>>>> of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false > >>>>>> when there is a single cpu available on the system. > >>>>>> Now, in virtualized/container environments, the number > >>>>>> of processors allocated to a virtual machine can dynamically change > >>>>>> during the application runtime. That could lead to incorrect generation > >>>>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP > >>>>>> makes is_MP() always return true on SPARC systems. > >>>>>> > >>>>>> In future, we may consider making generation > >>>>>> of 'membar' unconditional > >>>>>> with the enhancement request: JDK-8150715. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Poonam > >>>>>> > From david.holmes at oracle.com Thu Aug 10 21:57:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Aug 2017 07:57:24 +1000 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <0285fdb7-8548-48e5-b019-3de62f66cffe@default> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> Message-ID: <46b484c8-e080-8c6a-a6bb-8471ce5efd9f@oracle.com> Hi Poonam, On 11/08/2017 3:44 AM, Poonam Parhar wrote: > Thanks Vladimir. > > Since the SPARC machines are always multi-cores, we can safely set AssumeMP to true on these. I'm still unclear about the reported problem here.
As Vladimir pointed out in the bug report the is_MP checks uses _processor_count which is set to the number of cpus on the machine _not_ the number of cpus currently available to the VM. void os::Solaris::initialize_system_info() { set_processor_count(sysconf(_SC_NPROCESSORS_CONF)); So all this discussion about containers and dynamic changes to available cpus should be moot. So the only way this can fail is if the number of configured processors on the machine dynamically changed _and_ it was initially 1 - which seems to me to be impossible with sparc unless the hardware info is being incorrectly reported (ie virtualization bug?) That aside I have no issue with the fix as we will likely be assuming MP always in the future. Thanks, David > Adding my comments from the previous mail here again for better readability: > ------------------------------------- > Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP by default on SPARC machines > Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ > > This change enables AssumeMP by default on SPARC machines. On Sparc T7, to finalize BIS instructions the server compiler needs to add a 'membar' instruction at the end. But the generation of 'membar' is guarded by os::is_MP(), and os::is_MP() returns false when there is a single cpu available on the system. Now, in virtualized/container environments, the number of processors allocated to a virtual machine can dynamically change during the application runtime. That could lead to incorrect generation of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes is_MP() always return true on SPARC systems. > > In future, we may consider making generation of 'membar' unconditional with the enhancement request: https://bugs.openjdk.java.net/browse/JDK-8150715. 
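To make the guard discussed in this thread concrete, here is a simplified model of how an is_MP-style check could gate 'membar' generation. It is a sketch under stated assumptions, not HotSpot source: OsModel, its fields and needs_membar are hypothetical names, and the rule "AssumeMP or more than one configured processor" is inferred from the description quoted above.

```cpp
// Simplified model of the guard discussed in this thread; not HotSpot source.
// OsModel, its fields and needs_membar are hypothetical names.
struct OsModel {
  int  processor_count;  // configured CPU count, sampled once at startup
  bool assume_mp;        // models the AssumeMP flag

  // Multiprocessor is assumed when the flag is set or when more than
  // one processor was configured at startup.
  constexpr bool is_MP() const { return assume_mp || processor_count != 1; }
};

// In this model the compiler emits 'membar' only when is_MP() holds.
constexpr bool needs_membar(const OsModel& os) { return os.is_MP(); }

// One configured CPU and AssumeMP off: no 'membar' is emitted.
static_assert(!needs_membar(OsModel{1, false}), "single cpu, flag off");
// Enabling AssumeMP forces 'membar' regardless of the CPU count.
static_assert(needs_membar(OsModel{1, true}), "single cpu, flag on");
// More than one configured CPU already implies 'membar'.
static_assert(needs_membar(OsModel{4, false}), "multiple cpus, flag off");
```

The first case is the only one in which 'membar' is skipped, which is why the discussion turns on whether a SPARC system can ever report exactly one configured processor.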
> > Thanks, > Poonam > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Thursday, August 10, 2017 9:47 AM >> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >> Cc: hotspot-runtime-dev at openjdk.java.net runtime >> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >> machines >> >> CCing to Runtime. >> >> Can you add comment explaining why it set to true on SPARC? >> >> Thanks, >> Vladimir >> >> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>> Hello, >>> >>> Please review this simple patch: >>> >>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>> >>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>> >>> This change enables AssumeMP by default on SPARC machines. On Sparc >>> T7, to finalize BIS instructions the server compiler needs to add >>> a 'membar' instruction at the end. But the generation of 'membar' is >>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>> single cpu available on the system. >>> Now, in virtualized/container environments, the number >>> of processors allocated to a virtual machine can dynamically change >>> during the application runtime. That could lead to incorrect generation >>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>> is_MP() always return true on SPARC systems. >>> >>> In future, we may consider making generation of 'membar' unconditional >>> with the enhancement request: JDK-8150715.
>>> >>> Thanks, >>> >>> Poonam >>> From jiangli.zhou at oracle.com Thu Aug 10 23:51:52 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 10 Aug 2017 16:51:52 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <20bfa3ff-b2cd-028d-efa9-bddad4a6ff7c@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> <20bfa3ff-b2cd-028d-efa9-bddad4a6ff7c@oracle.com> Message-ID: Thanks, Ioi! Jiangli > On Aug 10, 2017, at 2:15 PM, Ioi Lam wrote: > > Hi Jiangli, > > > The changes look good to me. Thanks for considering my suggestions. > > > - Ioi > > On 8/8/17 5:33 PM, Jiangli Zhou wrote: >> Here is the incremental webrev that has all the changes incorporated with suggestions from Coleen, Ioi and Thomas: >> >> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ >> >> Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ >> >> Thanks again for Coleen's, Ioi's and Thomas' review! >> Jiangli >> >>> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou wrote: >>> >>> Hi Ioi, >>> >>> Thanks for getting back to me. >>> >>>> On Aug 7, 2017, at 5:45 PM, Ioi Lam wrote: >>>> >>>> On 8/4/17 10:19 PM, Jiangli Zhou wrote: >>>>> Hi Ioi, >>>>> >>>>> Thanks for looking again. >>>>> >>>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam wrote: >>>>>> >>>>>> Hi Jiangli, >>>>>> >>>>>> The code looks good in general.
I just have a few pet peeves for readability: >>>>>> >>>>>> >>>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>>>>> >>>>>> 704 assert(UseG1GC, "Only support G1 GC"); >>>>>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>> 706 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>>>> >>>>>> 1615 assert(UseG1GC, "Only support G1 GC"); >>>>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>> 1617 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>>>> >>>>>> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync? >>>>> >>>>> There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem worth adding another separate API for asserting the required flags. I'll use that in the assert. >>>>> >>>>>> >>>>>> >>>>>> >>>>>> (2) FileMapInfo::write_archive_heap_regions() >>>>>> >>>>>> I still find this code very hard to read, especially due to the loop. >>>>>> >>>>>> First, the comments are not consistent with the code: >>>>>> >>>>>> 498 assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum"); >>>>>> >>>>>> but the comment says: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong. >>>>> >>>>> The max_num_regions is the maximum number of regions for each archived heap space (the string space, or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin's pending change. As part of Calvin's change, he will change the assert into a check and bail out if the number of MemRegions is larger than max_num_regions due to heap fragmentation.
>>>>> >>>>> >>>> Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to be returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible). >>>> >>>> Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2. Also, if the string region is disabled, you should also disable the open_archive_heap_region >>>> >>>> I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch. >>> >>> Ok. I'll change the assert to be a check. >>> >>>> >>>> >>>>>> >>>>>> The word "region" is used in these parameters, but they don't mean the same thing. >>>>>> >>>>>> GrowableArray *regions >>>>>> int first_region, int max_num_regions, >>>>>> >>>>>> >>>>>> How about regions -> g1_regions_list >>>>>> first_region -> first_region_in_archive >>>>> >>>>> The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. Probably that's the source of the confusion. I'll make the comment more clear. >>>>> >>>>> Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing 'regions' to 'mem_regions' or 'archive_regions'? >>>>> >>>> How about heap_regions? These are regions in the active Java heap, which currently has not mapped anything from the CDS archive. >>> >>> Ok. >>> >>> I'm updating my changes and will send out a consolidated webrev. >>> >>> Thanks!
>>> Jiangli >>> >>>> >>>> >>>>>> >>>>>> >>>>>> In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive". >>>>> >>>>>> >>>>>> To make it unambiguous, how about changing >>>>>> >>>>>> >>>>>> 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions. >>>>>> >>>>>> >>>>>> to >>>>>> >>>>>> // Write the given list of G1 memory regions into the archive, starting at >>>>>> // first_region_in_archive. >>>>> >>>>> >>>>> Ok. How about the following: >>>>> >>>>> // Write the given list of java heap memory regions into the archive, starting at >>>>> // first_region_in_archive. >>>>> >>>> Sounds good. >>>> >>>> Thanks >>>> - Ioi >>>> >>>>>> >>>>>> >>>>>> Also, for the explanation of how the G1 regions are written into the archive, how about: >>>>>> >>>>>> // The G1 regions in the list are sorted in ascending address order. When there are more objects >>>>>> // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the >>>>>> // remaining G1 region(s) are consecutively allocated and fully filled. >>>>>> // >>>>>> // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive. >>>>>> // The remaining G1 regions (if exist) are coalesced and written as a single block >>>>>> // into (first_region_in_archive + 1) >>>>>> >>>>>> // Here's the mapping from (g1 regions) -> (archive regions). >>>>>> >>>>>> >>>>>> All this function needs to do is to decide the values for >>>>>> >>>>>> r0_start, r0_top >>>>>> r1_start, r1_top >>>>>> >>>>>> I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway).
>>>>>> >>>>>> *r0_start = *r0_top = NULL; >>>>>> *r1_start = *r1_top = NULL; >>>>>> >>>>>> if (arr_len >= 1) { >>>>>> *r0_start = regions->at(0).start(); >>>>>> *r0_top = *r0_start + regions->at(0).byte_size(); >>>>>> } >>>>>> if (arr_len >= 2) { >>>>>> int last = arr_len - 1; >>>>>> *r1_start = regions->at(1).start(); >>>>>> *r1_top = regions->at(last).start() + regions->at(last).byte_size(); >>>>>> } >>>>>> >>>>>> what do you think? >>>>> >>>>> We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. I'd like to remove the code that deals with r0_* and r1_* explicitly. Let me try that. >>>>> >>>>>> >>>>>> >>>>>> >>>>>> (3) metaspace.cpp >>>>>> >>>>>> 3350 // Map the archived heap regions after compressed pointers >>>>>> 3351 // because it relies on compressed class pointers setting to work >>>>>> >>>>>> do you mean this? >>>>>> >>>>>> // Archived heap regions depend on the parameters of compressed class pointers, so >>>>>> // they must be mapped after such parameters have been decided in the above call. >>>>> >>>>> Hmmm, maybe use 'arguments' instead of 'parameters'? >>>>> >>>>>> >>>>>> >>>>>> (4) I found this name not strictly grammatical. How about this: >>>>>> >>>>>> allow_archive_heap_object -> is_heap_object_archiving_allowed >>>>> >>>>> Ok. >>>>> >>>>>> >>>>>> (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb. >>>>>> >>>>>> archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string() >>>>> >>>>> Ok. >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>>> >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> >>>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>>>>> Here are the updated webrevs.
>>>>>>> >>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>>>>> >>>>>>> Changes in the updated webrevs include: >>>>>>> Merge with Ioi's recent shared space auto-sizing change (8072061) >>>>>>> Addressed all feedback from Ioi and Coleen (Thanks for detailed review!) >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>> >>>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote: >>>>>>>> >>>>>>>> Hi Ioi, >>>>>>>> >>>>>>>> Thank you so much for reviewing this. I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments. >>>>>>>> >>>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>>>>> >>>>>>>>> Hi Jiangli, >>>>>>>>> >>>>>>>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) >>>>>>>>> >>>>>>>>> stringTable.cpp: StringTable::archive_string >>>>>>>>> >>>>>>>>> add assert for DumpSharedSpaces only >>>>>>>> >>>>>>>> Ok. >>>>>>>> >>>>>>>>> >>>>>>>>> filemap.cpp >>>>>>>>> >>>>>>>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions, >>>>>>>>> 526 int first_region, int num_regions) { >>>>>>>>> >>>>>>>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: >>>>>>>>> >>>>>>>>> 537 int len = regions->length(); >>>>>>>>> 538 if (len > 1) { >>>>>>>>> 539 start = (char*)regions->at(1).start(); >>>>>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>>>>> 541 } >>>>>>>>> 542 } >>>>>>>>> >>>>>>>>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. >>>>>>>>> >>>>>>>>> How about we change the API to something like the following?
Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion. >>>>>>>>> >>>>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) { >>>>>>>>> if (first == MetaspaceShared::first_string) { >>>>>>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>>>>>> } else { >>>>>>>>> assert(first == MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>>>>> assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>>>>> } >>>>>>>>> .... >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> I've reworked the function and simplified the code. >>>>>>>> >>>>>>>>> >>>>>>>>> 756 if (!string_data_mapped) { >>>>>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity"); >>>>>>>>> 759 } >>>>>>>>> 760 >>>>>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>>>>> 763 } else { >>>>>>>>> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity"); >>>>>>>>> 765 } >>>>>>>>> >>>>>>>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()? >>>>>>>> >>>>>>>> Fixed. >>>>>>>> >>>>>>>>> >>>>>>>>> FileMapInfo::map_heap_data() -- >>>>>>>>> >>>>>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>>>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>>>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>>>>> 821 si->_allow_exec); >>>>>>>>> >>>>>>>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base)? If so, we need an assert here.
>>>>>>>> >>>>>>>> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr. >>>>>>>> >>>>>>>> if (base == NULL || base != addr) { >>>>>>>> // dealloc the regions from java heap >>>>>>>> dealloc_archive_heap_regions(regions, region_num); >>>>>>>> if (log_is_enabled(Info, cds)) { >>>>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap."); >>>>>>>> } >>>>>>>> return false; >>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> constantPool.cpp >>>>>>>>> >>>>>>>>> Handle refs_handle; >>>>>>>>> ... >>>>>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>>>>> >>>>>>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles() >>>>>>>>> >>>>>>>>> I think it's more efficient if you merge these into a single statement >>>>>>>>> >>>>>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>>>>> >>>>>>>> Fixed. >>>>>>>> >>>>>>>>> >>>>>>>>> Is this experimental code? Maybe it should be removed? >>>>>>>>> >>>>>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>>>>> 665 #if 0 >>>>>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>>>>>> 669 if (k != NULL) { >>>>>>>>> 670 klass_at_put(index, k); >>>>>>>>> 671 } >>>>>>>>> 672 #endif >>>>>>>>> 673 } else >>>>>>>> >>>>>>>> Removed.
>>>>>>>> >>>>>>>>> >>>>>>>>> cpCache.hpp: >>>>>>>>> >>>>>>>>> u8 _archived_references >>>>>>>>> >>>>>>>>> shouldn't this be declared as a narrowOop to avoid the type casts when it's used? >>>>>>>> >>>>>>>> Ok. >>>>>>>> >>>>>>>>> >>>>>>>>> cpCache.cpp: >>>>>>>>> >>>>>>>>> add assert so that one of these is used only at dump time and the other only at run time? >>>>>>>>> >>>>>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>>>>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>>>>> 612 } >>>>>>>>> 613 >>>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>>>>> 616 } >>>>>>>> >>>>>>>> Ok. >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> Jiangli >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below. >>>>>>>>>> >>>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. >>>>>>>>>> >>>>>>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and the GC team. Part of the changes were contributed by Thomas himself.
>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>>>>> >>>>>>>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. >>>>>>>>>> >>>>>>>>>> Types of Pinned G1 Heap Regions >>>>>>>>>> >>>>>>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. >>>>>>>>>> >>>>>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>>>>> 01000 0 [16] Old Mask >>>>>>>>>> 10000 0 [32] Archive Mask >>>>>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Pinned Regions >>>>>>>>>> >>>>>>>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. >>>>>>>>>> >>>>>>>>>> Archive Regions >>>>>>>>>> >>>>>>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. >>>>>>>>>> >>>>>>>>>> An archive region is also an old region by design. >>>>>>>>>> >>>>>>>>>> Open Archive (GC-RW) Regions >>>>>>>>>> >>>>>>>>>> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. >>>>>>>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
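The region type encoding listed under "Types of Pinned G1 Heap Regions" can be sketched as plain bit masks. This is only an illustration of the table: the numeric values come from the message, while the enum and helper names are made up here and are not taken from the webrev.

```cpp
#include <cstdint>

// Illustrative names only; the values follow the table in the message.
enum RegionTag : std::uint32_t {
    PinnedMask    = 8,   // 00100 0
    OldMask       = 16,  // 01000 0
    ArchiveMask   = 32,  // 10000 0
    OpenArchive   = ArchiveMask | PinnedMask | OldMask,        // 11100 0 = 56
    ClosedArchive = (ArchiveMask | PinnedMask | OldMask) + 1   // 11100 1 = 57
};

// A tag belongs to a super type when the corresponding mask bit is set.
constexpr bool is_pinned(std::uint32_t tag)  { return (tag & PinnedMask) != 0; }
constexpr bool is_old(std::uint32_t tag)     { return (tag & OldMask) != 0; }
constexpr bool is_archive(std::uint32_t tag) { return (tag & ArchiveMask) != 0; }

// Both archive types are pinned and old, as the description states.
static_assert(OpenArchive == 56, "open archive tag");
static_assert(ClosedArchive == 57, "closed archive tag");
static_assert(is_pinned(OpenArchive) && is_old(OpenArchive), "open archive is pinned and old");
static_assert(is_pinned(ClosedArchive) && is_old(ClosedArchive), "closed archive is pinned and old");
```

With this layout a single mask test answers "is this region pinned?" for plain pinned regions and for both archive sub-types.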
>>>>>>>>>> >>>>>>>>>> Adjustable Outgoing Pointers >>>>>>>>>> >>>>>>>>>> As GC can adjust pointers within the live objects in an open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. >>>>>>>>>> >>>>>>>>>> Closed Archive (GC-RO) Regions >>>>>>>>>> >>>>>>>>>> The closed archive region is a GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the closed archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. >>>>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>>>>> >>>>>>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. >>>>>>>>>> >>>>>>>>>> Dormant Objects >>>>>>>>>> >>>>>>>>>> Dormant objects are unreachable java objects within the open archive heap region. >>>>>>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time of GC scanning.
If a shared klass is not yet loaded, the klass root is not scanned and its constant pool 'resolved_references' array (A) in the open archive region is not reachable. Then A is a dormant object. >>>>>>>>>> >>>>>>>>>> Object State Transition >>>>>>>>>> >>>>>>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes a live object when the associated shared class is loaded at runtime. An explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during the concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>>>>> >>>>>>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies the GC requirements and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >>>>>>>>>> >>>>>>>>>> Caching Java Objects at Archive Dump Time >>>>>>>>>> >>>>>>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc.) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure a java object is not archived more than once if reached from different roots.
It also makes sure references to the same archived object are updated using the same new address location. >>>>>>>>>> >>>>>>>>>> Caching Constant Pool resolved_references Array >>>>>>>>>> >>>>>>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during the dump process. Prior to copying the 'resolved_references' arrays, the JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, the JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remains unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >>>>>>>>>> >>>>>>>>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >>>>>>>>>> At runtime, as part of the ConstantPool::restore_unshareable_info() work, the JVM calls G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles.
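The dump-time bookkeeping described under "Caching Java Objects at Archive Dump Time" (a hash table that ensures each object is copied once and that every reference gets the same new location) might look roughly like the sketch below. All types and names are hypothetical stand-ins: an "object" is just an int payload and the archive region is a vector, nothing here is HotSpot code.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical model of an archive region plus its "already archived" table.
struct ArchiveRegion {
    std::vector<int> storage;                              // copied objects
    std::unordered_map<const int*, std::size_t> archived;  // original address -> slot in storage

    // Copies obj into the region the first time it is seen. Later calls for
    // the same original object return the same slot, so references reached
    // from different roots all end up pointing at one archived copy.
    std::size_t archive_object(const int* obj) {
        auto it = archived.find(obj);
        if (it != archived.end()) {
            return it->second;           // already archived: reuse the location
        }
        storage.push_back(*obj);         // first visit: copy into the region
        std::size_t slot = storage.size() - 1;
        archived.emplace(obj, slot);
        return slot;
    }
};
```

The important property is that the lookup happens before the copy, which is what keeps an object reachable from several roots from being archived twice.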
>>>>>>>>>> >>>>>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. >>>>>>>>>> >>>>>>>>>> Preliminary test execution and status: >>>>>>>>>> >>>>>>>>>> JPRT: passed >>>>>>>>>> Tier2-rt: passed >>>>>>>>>> Tier2-gc: passed >>>>>>>>>> Tier2-comp: passed >>>>>>>>>> Tier3-rt: passed >>>>>>>>>> Tier3-gc: passed >>>>>>>>>> Tier3-comp: passed >>>>>>>>>> Tier4-rt: passed >>>>>>>>>> Tier4-gc: passed >>>>>>>>>> Tier4-comp: 6 jobs timed out, all other tests passed >>>>>>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed >>>>>>>>>> Tier5-gc: passed >>>>>>>>>> Tier5-comp: running >>>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >>>>>>>>>> vm.gc: passed >>>>>>>>>> vm.gc in CDS mode: passed >>>>>>>>>> Kitchensink: passed >>>>>>>>>> Kitchensink in CDS mode: passed >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jiangli >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From coleen.phillimore at oracle.com Fri Aug 11 15:46:05 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 11 Aug 2017 11:46:05 -0400 Subject: RFR (M) 8186042: Optimize OopMapCache lookup Message-ID: <7c36b086-5959-753b-1626-415817d53525@oracle.com> Summary: Use lock free access to oopMapCache Contributed-by: frederic.parain at oracle.com, coleen.phillimore at oracle.com The OopMapCache::lookup() function took out a mutex to protect access between the GC threads that are running concurrently. See bug for more info. The function lookup() is run by multiple GC threads concurrently. If there's a collision in the hashtable, this uses atomic cmpxchg to add the entry to a list to be cleaned up after the safepoint is over.
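The "atomic cmpxchg to add the entry to a list" idea can be sketched with a standard lock-free push onto a singly linked list. This is not the code from the webrev: it uses std::atomic rather than HotSpot's Atomic::cmpxchg, and Entry, old_entries, enqueue_for_cleanup and cleanup_old_entries are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a displaced cache entry.
struct Entry {
    int id;
    Entry* next = nullptr;
    explicit Entry(int i) : id(i) {}
};

// Head of the deferred-cleanup list, shared by all threads.
static std::atomic<Entry*> old_entries{nullptr};

// Called concurrently: push e with a compare-and-swap loop instead of a mutex.
void enqueue_for_cleanup(Entry* e) {
    assert(e != nullptr);
    Entry* head = old_entries.load(std::memory_order_relaxed);
    do {
        e->next = head;  // on CAS failure, head is refreshed and we retry
    } while (!old_entries.compare_exchange_weak(head, e,
                                                std::memory_order_release,
                                                std::memory_order_relaxed));
}

// Called by a single thread once no concurrent lookups are running:
// detach the whole list in one step, free it, and report how many were freed.
std::size_t cleanup_old_entries() {
    Entry* e = old_entries.exchange(nullptr, std::memory_order_acquire);
    std::size_t freed = 0;
    while (e != nullptr) {
        Entry* next = e->next;
        delete e;
        e = next;
        ++freed;
    }
    return freed;
}
```

Because entries are only pushed while lookups are running and only freed afterwards by one thread, the sketch avoids the ABA and use-after-free hazards that a concurrent pop would introduce.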
GC isn't doing lookup at that point. This change is contributed by Frederic Parain, with some cleanup and logging from me. open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8186042 Tested with RBT equivalent of nightly on linux x64. Also ran dacapo with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration however) in favor of new logging. A linked CSR request is pending. Thanks, Coleen From daniel.daugherty at oracle.com Fri Aug 11 16:58:11 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 10:58:11 -0600 Subject: RFR(XXS): quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java (8186149) Message-ID: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> Greetings, I'm quarantining gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java as it continues to fail in the JDK10-hs nightly. 8186149 quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java https://bugs.openjdk.java.net/browse/JDK-8186149 $ hg diff test/ProblemList.txt diff -r 52f2a3a13ed1 test/ProblemList.txt --- a/test/ProblemList.txt Thu Aug 10 18:09:19 2017 -0700 +++ b/test/ProblemList.txt Fri Aug 11 10:56:44 2017 -0600 @@ -125,6 +125,7 @@ gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all +gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java 8177765 generic-all ############################################################################# This fix is targeted to JDK10/hs. 
Dan From jesper.wilhelmsson at oracle.com Fri Aug 11 17:04:53 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Fri, 11 Aug 2017 19:04:53 +0200 Subject: RFR(XXS): quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java (8186149) In-Reply-To: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> References: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> Message-ID: Looks good! /Jesper > On 11 Aug 2017, at 18:58, Daniel D. Daugherty wrote: > > Greetings, > > I'm quarantining gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > as it continues to fail in the JDK10-hs nightly. > > 8186149 quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > https://bugs.openjdk.java.net/browse/JDK-8186149 > > $ hg diff test/ProblemList.txt > diff -r 52f2a3a13ed1 test/ProblemList.txt > --- a/test/ProblemList.txt Thu Aug 10 18:09:19 2017 -0700 > +++ b/test/ProblemList.txt Fri Aug 11 10:56:44 2017 -0600 > @@ -125,6 +125,7 @@ > gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all > gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all > gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all > +gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java 8177765 generic-all > > ############################################################################# > > This fix is targeted to JDK10/hs. > > Dan > From daniel.daugherty at oracle.com Fri Aug 11 17:07:57 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 11:07:57 -0600 Subject: RFR(XXS): quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java (8186149) In-Reply-To: References: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> Message-ID: Thanks Jesper! Dan On 8/11/17 11:04 AM, jesper.wilhelmsson at oracle.com wrote: > Looks good! > /Jesper > >> On 11 Aug 2017, at 18:58, Daniel D. 
Daugherty wrote: >> >> Greetings, >> >> I'm quarantining gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> as it continues to fail in the JDK10-hs nightly. >> >> 8186149 quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> https://bugs.openjdk.java.net/browse/JDK-8186149 >> >> $ hg diff test/ProblemList.txt >> diff -r 52f2a3a13ed1 test/ProblemList.txt >> --- a/test/ProblemList.txt Thu Aug 10 18:09:19 2017 -0700 >> +++ b/test/ProblemList.txt Fri Aug 11 10:56:44 2017 -0600 >> @@ -125,6 +125,7 @@ >> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >> gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >> +gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java 8177765 generic-all >> >> ############################################################################# >> >> This fix is targeted to JDK10/hs. >> >> Dan >> From serguei.spitsyn at oracle.com Fri Aug 11 17:24:03 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 11 Aug 2017 10:24:03 -0700 Subject: RFR(XXS): quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java (8186149) In-Reply-To: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> References: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> Message-ID: <43799f11-c796-6c9b-9900-2e1ef092bd66@oracle.com> Looks good. Thanks, Serguei On 8/11/17 09:58, Daniel D. Daugherty wrote: > Greetings, > > I'm quarantining > gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > as it continues to fail in the JDK10-hs nightly. 
> > 8186149 quarantine > gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > https://bugs.openjdk.java.net/browse/JDK-8186149 > > $ hg diff test/ProblemList.txt > diff -r 52f2a3a13ed1 test/ProblemList.txt > --- a/test/ProblemList.txt Thu Aug 10 18:09:19 2017 -0700 > +++ b/test/ProblemList.txt Fri Aug 11 10:56:44 2017 -0600 > @@ -125,6 +125,7 @@ > gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all > gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all > gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all > +gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java > 8177765 generic-all > > ############################################################################# > > > This fix is targeted to JDK10/hs. > > Dan > From daniel.daugherty at oracle.com Fri Aug 11 17:25:37 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 11:25:37 -0600 Subject: RFR(XXS): try simple @build fix in compiler/jsr292/PollutedTrapCounts.java (8186151) Message-ID: <39ca265b-20d2-4372-fdb6-a6beaf40bd91@oracle.com> Greetings, I was about to quarantine compiler/jsr292/PollutedTrapCounts.java in JDK10-hs when Ioi suggested the following simple @build fix: $ hg diff test/compiler/jsr292/PollutedTrapCounts.java diff -r 7b88d261529b test/compiler/jsr292/PollutedTrapCounts.java --- a/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 11:20:37 2017 -0600 +++ b/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 11:22:13 2017 -0600 @@ -26,7 +26,8 @@ * @bug 8074551 * @modules java.base/jdk.internal.misc * @library /test/lib - * + * @build jdk.test.lib.* + * @build jdk.test.lib.process.* * @run driver compiler.jsr292.PollutedTrapCounts */ 8186151 try simple @build fix in compiler/jsr292/PollutedTrapCounts.java https://bugs.openjdk.java.net/browse/JDK-8186151 This fix is targeted to JDK10/hs. 
Dan From daniel.daugherty at oracle.com Fri Aug 11 17:52:39 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 11:52:39 -0600 Subject: RFR(XXS): quarantine gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java (8186149) In-Reply-To: <43799f11-c796-6c9b-9900-2e1ef092bd66@oracle.com> References: <970d172a-7592-f8ef-c52d-109046e80366@oracle.com> <43799f11-c796-6c9b-9900-2e1ef092bd66@oracle.com> Message-ID: <8d870bc2-7487-490b-76f2-5e5f342cc110@oracle.com> Thanks Serguei! Dan On 8/11/17 11:24 AM, serguei.spitsyn at oracle.com wrote: > Looks good. > > Thanks, > Serguei > > On 8/11/17 09:58, Daniel D. Daugherty wrote: >> Greetings, >> >> I'm quarantining >> gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> as it continues to fail in the JDK10-hs nightly. >> >> 8186149 quarantine >> gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> https://bugs.openjdk.java.net/browse/JDK-8186149 >> >> $ hg diff test/ProblemList.txt >> diff -r 52f2a3a13ed1 test/ProblemList.txt >> --- a/test/ProblemList.txt Thu Aug 10 18:09:19 2017 -0700 >> +++ b/test/ProblemList.txt Fri Aug 11 10:56:44 2017 -0600 >> @@ -125,6 +125,7 @@ >> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all >> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all >> gc/stress/gclocker/TestGCLockerWithG1.java 8179226 generic-all >> +gc/survivorAlignment/TestPromotionFromSurvivorToTenuredAfterMinorGC.java >> 8177765 generic-all >> >> ############################################################################# >> >> >> This fix is targeted to JDK10/hs. >> >> Dan >> > From daniel.daugherty at oracle.com Fri Aug 11 17:55:05 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 11 Aug 2017 11:55:05 -0600 Subject: RFR(XXS): quarantine sun/management/jdp/JdpOffTest.java (8186152) Message-ID: <48af379f-8c41-f2fa-5f68-052976bbf517@oracle.com> Greetings, I'm quarantining sun/management/jdp/JdpOffTest.java as it continues to fail in the JDK10-hs nightly. 8186152 quarantine sun/management/jdp/JdpOffTest.java https://bugs.openjdk.java.net/browse/JDK-8186152 $ hg diff test/ProblemList.txt diff -r 5ebbdc94be6d test/ProblemList.txt --- a/test/ProblemList.txt Tue Aug 08 22:55:42 2017 +0200 +++ b/test/ProblemList.txt Fri Aug 11 11:51:51 2017 -0600 @@ -153,6 +153,7 @@ com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java 8030957 aix-all com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java 8030957 aix-all sun/management/HotspotRuntimeMBean/GetSafepointSyncTime.java 8174734 generic-all +sun/management/jdp/JdpOffTest.java 8175542 generic-all ############################################################################ This fix is targeted to JDK10/hs. Dan From vladimir.kozlov at oracle.com Fri Aug 11 17:59:04 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 11 Aug 2017 10:59:04 -0700 Subject: RFR(XXS): try simple @build fix in compiler/jsr292/PollutedTrapCounts.java (8186151) In-Reply-To: <39ca265b-20d2-4372-fdb6-a6beaf40bd91@oracle.com> References: <39ca265b-20d2-4372-fdb6-a6beaf40bd91@oracle.com> Message-ID: Okay, lets try it. Vladimir On 8/11/17 10:25 AM, Daniel D. 
Daugherty wrote: > Greetings, > > I was about to quarantine compiler/jsr292/PollutedTrapCounts.java in > JDK10-hs when Ioi suggested the following simple @build fix: > > $ hg diff test/compiler/jsr292/PollutedTrapCounts.java > diff -r 7b88d261529b test/compiler/jsr292/PollutedTrapCounts.java > --- a/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 > 11:20:37 2017 -0600 > +++ b/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 > 11:22:13 2017 -0600 > @@ -26,7 +26,8 @@ > * @bug 8074551 > * @modules java.base/jdk.internal.misc > * @library /test/lib > - * > + * @build jdk.test.lib.* > + * @build jdk.test.lib.process.* > * @run driver compiler.jsr292.PollutedTrapCounts > */ > > > 8186151 try simple @build fix in compiler/jsr292/PollutedTrapCounts.java > https://bugs.openjdk.java.net/browse/JDK-8186151 > > This fix is targeted to JDK10/hs. > > Dan From daniel.daugherty at oracle.com Fri Aug 11 18:00:27 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 12:00:27 -0600 Subject: RFR(XXS): try simple @build fix in compiler/jsr292/PollutedTrapCounts.java (8186151) In-Reply-To: References: <39ca265b-20d2-4372-fdb6-a6beaf40bd91@oracle.com> Message-ID: <74ce2543-f4fa-5e72-ff81-efced65303d1@oracle.com> Thanks Vladimir! Dan On 8/11/17 11:59 AM, Vladimir Kozlov wrote: > Okay, lets try it. > > Vladimir > > On 8/11/17 10:25 AM, Daniel D. 
Daugherty wrote: >> Greetings, >> >> I was about to quarantine compiler/jsr292/PollutedTrapCounts.java in >> JDK10-hs when Ioi suggested the following simple @build fix: >> >> $ hg diff test/compiler/jsr292/PollutedTrapCounts.java >> diff -r 7b88d261529b test/compiler/jsr292/PollutedTrapCounts.java >> --- a/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 >> 11:20:37 2017 -0600 >> +++ b/test/compiler/jsr292/PollutedTrapCounts.java Fri Aug 11 >> 11:22:13 2017 -0600 >> @@ -26,7 +26,8 @@ >> * @bug 8074551 >> * @modules java.base/jdk.internal.misc >> * @library /test/lib >> - * >> + * @build jdk.test.lib.* >> + * @build jdk.test.lib.process.* >> * @run driver compiler.jsr292.PollutedTrapCounts >> */ >> >> >> 8186151 try simple @build fix in compiler/jsr292/PollutedTrapCounts.java >> https://bugs.openjdk.java.net/browse/JDK-8186151 >> >> This fix is targeted to JDK10/hs. >> >> Dan From coleen.phillimore at oracle.com Fri Aug 11 18:22:29 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 11 Aug 2017 14:22:29 -0400 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> <20bfa3ff-b2cd-028d-efa9-bddad4a6ff7c@oracle.com> Message-ID: <1132858e-5c1b-34f4-831c-ccc595ade738@oracle.com> These incremental changes look good to me. Thanks, Coleen On 8/10/17 7:51 PM, Jiangli Zhou wrote: > Thanks, Ioi! > > Jiangli > >> On Aug 10, 2017, at 2:15 PM, Ioi Lam > > wrote: >> >> Hi Jiangli, >> >> >> The changes look good to me. Thanks for considering my suggestions. 
>> >> >> - Ioi >> >> >> On 8/8/17 5:33 PM, Jiangli Zhou wrote: >>> Here is the incremental webrev that has all the changes incorporated >>> with suggestions from Coleen, Ioi and Thomas: >>> >>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ >>> >>> >>> Updated full webrev: >>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ >>> >>> >>> Thanks again for Coleen's, Ioi's and Thomas' review! >>> Jiangli >>> >>>> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou >>> wrote: >>>> >>>> Hi Ioi, >>>> >>>> Thanks for getting back to me. >>>> >>>>> On Aug 7, 2017, at 5:45 PM, Ioi Lam >>>> wrote: >>>>> >>>>> On 8/4/17 10:19 PM, Jiangli Zhou wrote: >>>>> >>>>>> Hi Ioi, >>>>>> >>>>>> Thanks for looking again. >>>>>> >>>>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam >>>>>> wrote: >>>>>>> >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> The code looks good in general. I just have a few pet peeves for >>>>>>> readability: >>>>>>> >>>>>>> >>>>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>>>>>> >>>>>>> 704 assert(UseG1GC, "Only support G1 GC"); >>>>>>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>>> 706 "Only support UseCompressedOops and >>>>>>> UseCompressedClassPointers enabled"); >>>>>>> >>>>>>> 1615 assert(UseG1GC, "Only support G1 GC"); >>>>>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>>> 1617 "Only support UseCompressedOops and >>>>>>> UseCompressedClassPointers enabled"); >>>>>>> >>>>>>> Maybe it's better to combine them into a single function like >>>>>>> MetaspaceShared::assert_vm_flags() so they don't get out of sync? >>>>>> >>>>>> There is a MetaspaceShared::allow_archive_heap_object(), which >>>>>> checks for UseG1GC, UseCompressedOops and >>>>>> UseCompressedClassPointers combined. It does not seem worth >>>>>> adding another separate API for asserting the required flags. I'll >>>>>> use that in the assert.
>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> (2) FileMapInfo::write_archive_heap_regions() >>>>>>> >>>>>>> I still find this code very hard to read, especially due to the >>>>>>> loop. >>>>>>> >>>>>>> First, the comments are not consistent with the code: >>>>>>> >>>>>>> 498 assert(arr_len <= max_num_regions, "number of memory regions >>>>>>> exceeds maximum"); >>>>>>> >>>>>>> but the comment says: "The rest are consecutive full GC >>>>>>> regions" which means there's a chance for max_num_regions to be >>>>>>> more than 2 (which will be the case with Calvin's java-loader >>>>>>> dumping changes using very small heap size). So the code is >>>>>>> actually wrong. >>>>>> >>>>>> The max_num_regions is the maximum number of regions for each >>>>>> archived heap space (the string space, or open archive space). We >>>>>> only run into the case where the MemRegion array size is larger >>>>>> than max_num_regions with Calvin's pending change. As part of >>>>>> Calvin's change, he will change the assert into a check and bail >>>>>> out if the number of MemRegions is larger than max_num_regions >>>>>> due to heap fragmentation. >>>>>> >>>>>> >>>>> Your latest patch assumes that arr_len <= 2, but the >>>>> implementation of >>>>> G1CollectedHeap::heap()->begin_archive_alloc_range() / >>>>> G1CollectedHeap::heap()->end_archive_alloc_range() actually allows >>>>> more than 2 regions to be returned. So simply putting an assert there >>>>> seems risky (unless you have analyzed all possible scenarios to >>>>> prove that's impossible). >>>>> >>>>> Instead of trying to come up with a complicated proof, I think >>>>> it's much safer to disable the archived string region if the >>>>> arr_len > 2. Also, if the string region is disabled, you should >>>>> also disable the open_archive_heap_region. >>>>> >>>>> I think this is a general issue with the mapped heap regions, and >>>>> it just happens to be revealed by Calvin's patch. So we should fix >>>>> it now and not wait for Calvin's patch.
>>>> >>>> Ok. I'll change the assert to be a check. >>>> >>>>> >>>>> >>>>>>> >>>>>>> The word "region" is used in these parameters, but they don't >>>>>>> mean the same thing. >>>>>>> >>>>>>> GrowableArray<MemRegion> *regions >>>>>>> int first_region, int max_num_regions, >>>>>>> >>>>>>> >>>>>>> How about regions -> g1_regions_list >>>>>>> first_region -> first_region_in_archive >>>>>> >>>>>> The GrowableArray above holds the MemRegions that GC code gives back >>>>>> to us. The GC code combines multiple G1 regions. The comments >>>>>> probably are over-explaining the details, which are hidden in the >>>>>> GC code. That's probably the source of the confusion. I'll make the >>>>>> comment clearer. >>>>>> >>>>>> Using g1_regions_list would also be confusing, since >>>>>> write_archive_heap_regions does not handle G1 regions directly. >>>>>> It processes the MemRegion array that GC code returns. How about >>>>>> changing 'regions' to 'mem_regions' or 'archive_regions'? >>>>>> >>>>> How about heap_regions? These are regions in the active Java heap, >>>>> which currently has not mapped anything from the CDS archive. >>>> >>>> Ok. >>>> >>>> I'm updating my changes and will send out a consolidated webrev. >>>> >>>> Thanks! >>>> Jiangli >>>> >>>>> >>>>> >>>>>>> >>>>>>> >>>>>>> In the comments, I find the phrase 'the current archive heap >>>>>>> region' ambiguous. It could be (erroneously) interpreted as "a >>>>>>> region from the currently mapped archive". >>>>>>> >>>>>>> To make it unambiguous, how about changing >>>>>>> >>>>>>> >>>>>>> 464 // Write the current archive heap region, which contains >>>>>>> one or multiple GC(G1) regions. >>>>>>> >>>>>>> >>>>>>> to >>>>>>> >>>>>>> // Write the given list of G1 memory regions into the >>>>>>> archive, starting at >>>>>>> // first_region_in_archive. >>>>>> >>>>>> >>>>>> Ok. How about the following: >>>>>> >>>>>> // Write the given list of java heap memory regions into the >>>>>> archive, starting at >>>>>> // first_region_in_archive.
>>>>>> >>>>> Sounds good. >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>>>> >>>>>>> >>>>>>> Also, for the explanation of how the G1 regions are written into >>>>>>> the archive, how about: >>>>>>> >>>>>>> // The G1 regions in the list are sorted in ascending address >>>>>>> order. When there are more objects >>>>>>> // than the capacity of a single G1 region, the bottom-most >>>>>>> G1 region may be partially filled, and the >>>>>>> // remaining G1 region(s) are consecutively allocated and >>>>>>> fully filled. >>>>>>> // >>>>>>> // Thus, the bottom-most G1 region (if not empty) is written >>>>>>> into first_region_in_archive. >>>>>>> // The remaining G1 regions (if any) are coalesced and >>>>>>> written as a single block >>>>>>> // into (first_region_in_archive + 1) >>>>>>> >>>>>>> // Here's the mapping from (g1 regions) -> (archive regions). >>>>>>> >>>>>>> >>>>>>> All this function needs to do is to decide the values for >>>>>>> >>>>>>> r0_start, r0_top >>>>>>> r1_start, r1_top >>>>>>> >>>>>>> I think it would be much better to not use the loop, and not use >>>>>>> the max_num_regions parameter (it's always 2 anyway). >>>>>>> >>>>>>> *r0_start = *r0_top = NULL; >>>>>>> *r1_start = *r1_top = NULL; >>>>>>> >>>>>>> if (arr_len >= 1) { >>>>>>> *r0_start = regions->at(0).start(); >>>>>>> *r0_top = *r0_start + regions->at(0).byte_size(); >>>>>>> } >>>>>>> if (arr_len >= 2) { >>>>>>> int last = arr_len - 1; >>>>>>> *r1_start = regions->at(1).start(); >>>>>>> *r1_top = regions->at(last).start() + >>>>>>> regions->at(last).byte_size(); >>>>>>> } >>>>>>> >>>>>>> What do you think? >>>>>> >>>>>> We need to write out all archive regions including the empty >>>>>> ones. The loop using max_num_regions is the easiest way. I'd like >>>>>> to remove the code that deals with r0_* and r1_* explicitly. Let >>>>>> me try that.
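
The mapping discussed in point (2), from the list of heap regions handed back by GC to at most two archive regions, can be sketched roughly as follows. This is a standalone illustration and not the HotSpot code: `Region`, `ArchiveSlot` and `coalesce_heap_regions` are hypothetical names, addresses are modelled as plain integers, and the input is assumed to be sorted by address with the trailing regions contiguous. Both slots are always returned, so a caller could still write out the empty ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a heap memory range (not HotSpot's MemRegion).
struct Region {
  std::uintptr_t start;
  std::size_t    byte_size;
  std::uintptr_t end() const { return start + byte_size; }
};

// One region slot in the archive; an empty slot has byte_size == 0.
struct ArchiveSlot {
  std::uintptr_t start = 0;
  std::size_t    byte_size = 0;
};

// Slot 0 receives the bottom-most heap region (which may be partially
// filled). Slot 1 receives all remaining regions coalesced into one block.
inline std::vector<ArchiveSlot> coalesce_heap_regions(
    const std::vector<Region>& regions) {
  std::vector<ArchiveSlot> slots(2);  // both slots start out empty
  if (!regions.empty()) {
    slots[0].start     = regions.front().start;
    slots[0].byte_size = regions.front().byte_size;
  }
  if (regions.size() >= 2) {
    // Assumption of this sketch: regions 1..n-1 are adjacent in memory.
    for (std::size_t i = 2; i < regions.size(); i++) {
      assert(regions[i - 1].end() == regions[i].start);
    }
    slots[1].start     = regions[1].start;
    slots[1].byte_size = regions.back().end() - regions[1].start;
  }
  return slots;
}
```

Whether to assert, or to check and bail out, when the trailing regions cannot be coalesced is exactly the question debated in this thread; the sketch only asserts.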
>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> (3) metaspace.cpp >>>>>>> >>>>>>> 3350 // Map the archived heap regions after compressed >>>>>>> pointers >>>>>>> 3351 // because it relies on compressed class pointers >>>>>>> setting to work >>>>>>> >>>>>>> Do you mean this? >>>>>>> >>>>>>> // Archived heap regions depend on the parameters of >>>>>>> compressed class pointers, so >>>>>>> // they must be mapped after such parameters have been >>>>>>> decided in the above call. >>>>>> >>>>>> Hmmm, maybe use 'arguments' instead of 'parameters'? >>>>>> >>>>>>> >>>>>>> >>>>>>> (4) I found this name not strictly grammatical. How about this: >>>>>>> >>>>>>> allow_archive_heap_object -> is_heap_object_archiving_allowed >>>>>> >>>>>> Ok. >>>>>> >>>>>>> >>>>>>> (5) In most of your code, 'archive' is used as a noun, except in >>>>>>> StringTable::archive_string() where it's used as a verb. >>>>>>> >>>>>>> archive_string could also be interpreted erroneously as "return >>>>>>> a string that's already in the archive". So to be consistent and >>>>>>> unambiguous, I think it's better to rename it to >>>>>>> StringTable::create_archived_string() >>>>>> >>>>>> Ok. >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>>>>>> Here are the updated webrevs. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>>>>>> >>>>>>>> Changes in the updated webrevs include: >>>>>>>> >>>>>>>> * Merge with Ioi's recent shared space auto-sizing change >>>>>>>> (8072061) >>>>>>>> * Addressed all feedback from Ioi and Coleen (Thanks for >>>>>>>> the detailed review!) >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli >>>>>>>> >>>>>>>> >>>>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi Ioi, >>>>>>>>> >>>>>>>>> Thank you so much for reviewing this.
I've addressed all your >>>>>>>>> feedback. Please see details below. I'll update the webrev >>>>>>>>> after addressing Coleen's comments. >>>>>>>>> >>>>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>>>>>> >>>>>>>>>> Hi Jiangli, >>>>>>>>>> >>>>>>>>>> Here are my comments. I've not reviewed the GC code and I'll >>>>>>>>>> leave that to the GC experts :-) >>>>>>>>>> >>>>>>>>>> stringTable.cpp: StringTable::archive_string >>>>>>>>>> >>>>>>>>>> add assert for DumpSharedSpaces only >>>>>>>>> >>>>>>>>> Ok. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> filemap.cpp >>>>>>>>>> >>>>>>>>>> 525 void >>>>>>>>>> FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> >>>>>>>>>> *regions, >>>>>>>>>> 526 int first_region, int num_regions) { >>>>>>>>>> >>>>>>>>>> When I first read this function, I found it hard to follow, >>>>>>>>>> especially this part that coalesces the trailing regions: >>>>>>>>>> >>>>>>>>>> 537 int len = regions->length(); >>>>>>>>>> 538 if (len > 1) { >>>>>>>>>> 539 start = (char*)regions->at(1).start(); >>>>>>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>>>>>> 541 } >>>>>>>>>> 542 } >>>>>>>>>> >>>>>>>>>> The rest of filemap.cpp always performs identical operations >>>>>>>>>> on MemRegion arrays, which are either 1 or 2 in size. >>>>>>>>>> However, this function doesn't follow that pattern; it also >>>>>>>>>> has a very different notion of "region", and the confusing >>>>>>>>>> part is regions->size() is not the same as num_regions. >>>>>>>>>> >>>>>>>>>> How about we change the API to something like the following? >>>>>>>>>> Before calling this API, the caller needs to coalesce the >>>>>>>>>> trailing G1 regions into a single MemRegion.
>>>>>>>>>> >>>>>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, >>>>>>>>>> int first, int num_regions) { >>>>>>>>>> if (first == MetaspaceShared::first_string) { >>>>>>>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>>>>>>> } else { >>>>>>>>>> assert(first == >>>>>>>>>> MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>>>>>> assert(num_regions <= >>>>>>>>>> MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>>>>>> } >>>>>>>>>> .... >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> I've reworked the function and simplified the code. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> 756 if (!string_data_mapped) { >>>>>>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, >>>>>>>>>> "sanity"); >>>>>>>>>> 759 } >>>>>>>>>> 760 >>>>>>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>>>>>> 763 } else { >>>>>>>>>> 764 assert(open_archive_heap_ranges == NULL && >>>>>>>>>> num_open_archive_heap_ranges == 0, "sanity"); >>>>>>>>>> 765 } >>>>>>>>>> >>>>>>>>>> Maybe the two "if" statements should be more consistent? >>>>>>>>>> Instead of StringTable::ignore_shared_strings, how >>>>>>>>>> about StringTable::set_shared_strings_region_mapped()? >>>>>>>>> >>>>>>>>> Fixed. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> FileMapInfo::map_heap_data() -- >>>>>>>>>> >>>>>>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>>>>>> 819 char* base = os::map_memory(_fd, _full_path, >>>>>>>>>> si->_file_offset, >>>>>>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>>>>>> 821 si->_allow_exec); >>>>>>>>>> >>>>>>>>>> What happens when the first region maps successfully but the >>>>>>>>>> second region fails to map? Will both regions be unmapped? I >>>>>>>>>> don't see where you store the return value (base) from >>>>>>>>>> os::map_memory(). Does it mean the code assumes that (addr == >>>>>>>>>> base)? If so, we need an assert here.
>>>>>>>>> >>>>>>>>> If any of the regions fails to map, we bail out and call >>>>>>>>> dealloc_archive_heap_regions(), which handles the deallocation >>>>>>>>> of any regions specified. If the second region fails to map, all >>>>>>>>> memory ranges specified by the 'regions' array are deallocated. We >>>>>>>>> don't unmap the memory here since it is part of the java heap. >>>>>>>>> Unmapping of heap memory is handled by GC code. The 'if' >>>>>>>>> check below makes sure base == addr. >>>>>>>>> >>>>>>>>> if (base == NULL || base != addr) { >>>>>>>>> // dealloc the regions from java heap >>>>>>>>> dealloc_archive_heap_regions(regions, region_num); >>>>>>>>> if (log_is_enabled(Info, cds)) { >>>>>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required >>>>>>>>> address in java heap."); >>>>>>>>> } >>>>>>>>> return false; >>>>>>>>> } >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> constantPool.cpp >>>>>>>>>> >>>>>>>>>> Handle refs_handle; >>>>>>>>>> ... >>>>>>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>>>>>> >>>>>>>>>> This will first create a NULL handle, then construct a >>>>>>>>>> temporary handle, and then assign the temp handle back to the >>>>>>>>>> null handle. This means two handles will be pushed onto >>>>>>>>>> THREAD->metadata_handles() >>>>>>>>>> >>>>>>>>>> I think it's more efficient if you merge these into a single >>>>>>>>>> statement >>>>>>>>>> >>>>>>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>>>>>> >>>>>>>>> Fixed. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Is this experimental code? Maybe it should be removed? >>>>>>>>>> >>>>>>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>>>>>> 665 #if 0 >>>>>>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>>>>>>> 669 if (k != NULL) { >>>>>>>>>> 670 klass_at_put(index, k); >>>>>>>>>> 671 } >>>>>>>>>> 672 #endif >>>>>>>>>> 673 } else >>>>>>>>> >>>>>>>>> Removed.
>>>>>>>>> >>>>>>>>>> >>>>>>>>>> cpCache.hpp: >>>>>>>>>> >>>>>>>>>> u8 _archived_references >>>>>>>>>> >>>>>>>>>> shouldn't this be declared as a narrowOop to avoid the type >>>>>>>>>> casts when it's used? >>>>>>>>> >>>>>>>>> Ok. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> cpCache.cpp: >>>>>>>>>> >>>>>>>>>> add assert so that one of these is used only at dump time >>>>>>>>>> and the other only at run time? >>>>>>>>>> >>>>>>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>>>>>> 611 return >>>>>>>>>> oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>>>>>> 612 } >>>>>>>>>> 613 >>>>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>>>>>> 616 } >>>>>>>>> >>>>>>>>> Ok. >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> - Ioi >>>>>>>>>> >>>>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>>>>>> Sorry, the mail didn't handle the rich text well. I fixed >>>>>>>>>>> the format below. >>>>>>>>>>> >>>>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve >>>>>>>>>>> constant pool string entries and cache resolved_reference >>>>>>>>>>> arrays in CDS archive). Currently, the CDS archive can >>>>>>>>>>> contain cached class metadata and interned java.lang.String >>>>>>>>>>> objects. This RFE adds the constant pool >>>>>>>>>>> 'resolved_references' arrays (hotspot specific) to the >>>>>>>>>>> archive for startup/runtime performance enhancement. >>>>>>>>>>> The 'resolved_references' arrays are used to hold references >>>>>>>>>>> of resolved constant pool entries including Strings, >>>>>>>>>>> mirrors, etc. With the 'resolved_references' being cached, >>>>>>>>>>> string constants in shared classes can now be resolved to >>>>>>>>>>> existing interned java.lang.Strings at CDS dump time. G1 and >>>>>>>>>>> 64-bit platforms are required.
>>>>>>>>>>> >>>>>>>>>>> The GC changes in the RFE were discussed and guided by >>>>>>>>>>> Thomas Schatzl and GC team. Part of the changes were >>>>>>>>>>> contributed by Thomas himself. >>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>>>>>> hotspot: >>>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>>>>>> whitebox: >>>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>>>>>> >>>>>>>>>>> Please see below for details of supporting cached >>>>>>>>>>> 'resolved_references' and pre-resolving string constants. >>>>>>>>>>> >>>>>>>>>>> Types of Pinned G1 Heap Regions >>>>>>>>>>> >>>>>>>>>>> The pinned region type is a super type of all archive region >>>>>>>>>>> types, which include the open archive type and the closed >>>>>>>>>>> archive type. >>>>>>>>>>> >>>>>>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>>>>>> 01000 0 [16] Old Mask >>>>>>>>>>> 10000 0 [32] Archive Mask >>>>>>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | >>>>>>>>>>> OldMask + 1 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Pinned Regions >>>>>>>>>>> >>>>>>>>>>> Objects within the region are 'pinned', which means GC does >>>>>>>>>>> not move any live objects. GC scans and marks objects in the >>>>>>>>>>> pinned region as normal, but skips forwarding live objects. >>>>>>>>>>> Pointers in live objects are updated. Dead objects >>>>>>>>>>> (unreachable) can be collected and freed. >>>>>>>>>>> >>>>>>>>>>> Archive Regions >>>>>>>>>>> >>>>>>>>>>> The archive types are sub-types of 'pinned'. There are two >>>>>>>>>>> types of archive region currently, open archive and closed >>>>>>>>>>> archive. Both can support caching java heap objects via the >>>>>>>>>>> CDS archive. >>>>>>>>>>> >>>>>>>>>>> An archive region is also an old region by design. >>>>>>>>>>> >>>>>>>>>>> Open Archive (GC-RW) Regions >>>>>>>>>>> >>>>>>>>>>> Open archive region is GC writable.
GC scans & marks objects >>>>>>>>>>> within the region and adjusts (updates) pointers in live >>>>>>>>>>> objects the same way as a pinned region. Live objects >>>>>>>>>>> (reachable) are pinned and not forwarded by GC. >>>>>>>>>>> Open archive region does not have 'dead' objects. >>>>>>>>>>> Unreachable objects are 'dormant' objects. Dormant objects >>>>>>>>>>> are not collected and freed by GC. >>>>>>>>>>> >>>>>>>>>>> Adjustable Outgoing Pointers >>>>>>>>>>> >>>>>>>>>>> As GC can adjust pointers within the live objects in open >>>>>>>>>>> archive heap region, objects can have outgoing pointers to >>>>>>>>>>> another java heap region, including closed archive region, >>>>>>>>>>> open archive region, pinned (or humongous) region, and >>>>>>>>>>> normal generational region. When a referenced object is >>>>>>>>>>> moved by GC, the pointer within the open archive region is >>>>>>>>>>> updated accordingly. >>>>>>>>>>> >>>>>>>>>>> Closed Archive (GC-RO) Regions >>>>>>>>>>> >>>>>>>>>>> The closed archive region is GC read-only region. GC cannot >>>>>>>>>>> write into the region. Objects are not scanned and marked by >>>>>>>>>>> GC. Objects are pinned and not forwarded. Pointers are not >>>>>>>>>>> updated by GC either. Hence, objects within the archive >>>>>>>>>>> region cannot have any outgoing pointers to another java >>>>>>>>>>> heap region. Objects however can still have pointers to >>>>>>>>>>> other objects within the closed archive regions (we might >>>>>>>>>>> allow pointers to open archive regions in the future). That >>>>>>>>>>> restricts the type of java objects that can be supported by >>>>>>>>>>> the archive region. >>>>>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>>>>>> >>>>>>>>>>> The GC-readonly archive region makes java heap memory >>>>>>>>>>> sharable among different JVM processes. NOTE: >>>>>>>>>>> synchronization on the objects within the archive heap >>>>>>>>>>> region can still cause writes to the memory page. 
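
The region type encoding listed under "Types of Pinned G1 Heap Regions" in this message can be sketched as a set of bit masks. The numeric values below are taken from that table; the namespace, the predicate names and `verify_tag` are hypothetical and are not claimed to match the identifiers in the G1 sources.

```cpp
#include <cassert>
#include <cstdint>

namespace region_tag {

// Values from the table in the RFE description.
constexpr std::uint32_t PinnedMask    = 8;
constexpr std::uint32_t OldMask       = 16;
constexpr std::uint32_t ArchiveMask   = 32;
constexpr std::uint32_t OpenArchive   = ArchiveMask | PinnedMask | OldMask;        // 56
constexpr std::uint32_t ClosedArchive = (ArchiveMask | PinnedMask | OldMask) + 1;  // 57

constexpr bool is_pinned(std::uint32_t tag)  { return (tag & PinnedMask) != 0; }
constexpr bool is_old(std::uint32_t tag)     { return (tag & OldMask) != 0; }
constexpr bool is_archive(std::uint32_t tag) { return (tag & ArchiveMask) != 0; }

constexpr bool is_open_archive(std::uint32_t tag)   { return tag == OpenArchive; }
constexpr bool is_closed_archive(std::uint32_t tag) { return tag == ClosedArchive; }

// Debug-time sanity check: every archive tag is also pinned and old,
// which is what "archive is a sub-type of pinned" and "an archive region
// is also an old region" amount to in this encoding.
inline void verify_tag(std::uint32_t tag) {
  if (is_archive(tag)) {
    assert(is_pinned(tag) && is_old(tag));
  }
}

static_assert(OpenArchive == 56, "matches the table");
static_assert(ClosedArchive == 57, "matches the table");

}  // namespace region_tag
```

Because both archive tags share the pinned and old bits, a single mask test answers "is this region pinned?" for open and closed archive regions alike; only the low bit tells the two archive kinds apart.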
>>>>>>>>>>> >>>>>>>>>>> Dormant Objects >>>>>>>>>>> >>>>>>>>>>> Dormant objects are unreachable java objects within the open >>>>>>>>>>> archive heap region. >>>>>>>>>>> A java object in the open archive heap region is a live >>>>>>>>>>> object if it can be reached during scanning. Some of the >>>>>>>>>>> java objects in the region may not be reachable during >>>>>>>>>>> scanning. Those objects are considered dormant, but not >>>>>>>>>>> dead. For example, a constant pool 'resolved_references' >>>>>>>>>>> array is reachable via the klass root if its container klass >>>>>>>>>>> (shared) is already loaded at the time of GC scanning. >>>>>>>>>>> If a shared klass is not yet loaded, the klass root is not >>>>>>>>>>> scanned and its constant pool 'resolved_references' array >>>>>>>>>>> (A) in the open archive region is not reachable. Then A is a >>>>>>>>>>> dormant object. >>>>>>>>>>> >>>>>>>>>>> Object State Transition >>>>>>>>>>> >>>>>>>>>>> All java objects are initially dormant objects when open >>>>>>>>>>> archive heap regions are mapped to the runtime java heap. A >>>>>>>>>>> dormant object becomes a live object when the associated >>>>>>>>>>> shared class is loaded at runtime. An explicit call >>>>>>>>>>> to G1SATBCardTableModRefBS::enqueue() needs to be made when >>>>>>>>>>> a dormant object becomes live. That should be the case >>>>>>>>>>> for cached objects with strong roots as well, since strong >>>>>>>>>>> roots are only scanned at the start of GC marking (the >>>>>>>>>>> initial marking) but not during Remarking/Final marking. If >>>>>>>>>>> a cached object becomes live during the concurrent marking >>>>>>>>>>> phase, G1 may not find it and mark it live unless a call to >>>>>>>>>>> G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>>>>>> >>>>>>>>>>> Currently, a live object in the open archive heap region >>>>>>>>>>> cannot become dormant again.
This restriction simplifies GC >>>>>>>>>>> requirement and guarantees all outgoing pointers are updated >>>>>>>>>>> by GC correctly. Only objects for shared classes from the >>>>>>>>>>> builtin class loaders (boot, PlatformClassLoaders, and >>>>>>>>>>> AppClassLoaders) are supported for caching. >>>>>>>>>>> >>>>>>>>>>> Caching Java Objects at Archive Dump Time >>>>>>>>>>> >>>>>>>>>>> The closed archive and open archive regions are allocated >>>>>>>>>>> near the top of the dump time java heap. Archived java >>>>>>>>>>> objects are copied into the designated archive heap regions. >>>>>>>>>>> For example, String objects and the underlying 'value' >>>>>>>>>>> arrays are copied into the closed archive regions. All >>>>>>>>>>> references to the archived objects (from shared class >>>>>>>>>>> metadata, string table, etc) are set to the new heap >>>>>>>>>>> locations. A hash table is used to keep track of all >>>>>>>>>>> archived java objects during the copying process to make >>>>>>>>>>> sure java object is not archived more than once if reached >>>>>>>>>>> from different roots. It also makes sure references to the >>>>>>>>>>> same archived object are updated using the same new address >>>>>>>>>>> location. >>>>>>>>>>> >>>>>>>>>>> Caching Constant Pool resolved_references Array >>>>>>>>>>> >>>>>>>>>>> The 'resolved_references' is an array that holds references >>>>>>>>>>> of resolved constant pool entries including Strings, mirrors >>>>>>>>>>> and methodTypes, etc. Each loaded class has one >>>>>>>>>>> 'resolved_references' array (in ConstantPoolCache). The >>>>>>>>>>> 'resolved_references' arrays are copied into the open >>>>>>>>>>> archive regions during dump process. Prior to copying the >>>>>>>>>>> 'resolved_references' arrays, JVM iterates through constant >>>>>>>>>>> pool entries and resolves all JVM_CONSTANT_String entries to >>>>>>>>>>> existing interned Strings for all archived classes. 
When >>>>>>>>>>> resolving, the JVM only looks up the string table and finds >>>>>>>>>>> existing interned Strings without inserting new ones. If >>>>>>>>>>> a string entry cannot be resolved to an existing interned >>>>>>>>>>> String, the constant pool entry remains unresolved. That >>>>>>>>>>> prevents memory waste if a constant pool string entry is >>>>>>>>>>> never used at runtime. >>>>>>>>>>> >>>>>>>>>>> All String objects referenced by the string table are copied >>>>>>>>>>> first into the closed archive regions. The string table >>>>>>>>>>> entry is updated with the new location when each String >>>>>>>>>>> object is archived. The JVM updates the resolved constant >>>>>>>>>>> pool string entries with the new object locations when >>>>>>>>>>> copying the 'resolved_references' arrays to the open archive >>>>>>>>>>> regions. References to the 'resolved_references' arrays in >>>>>>>>>>> the ConstantPoolCache are also updated. >>>>>>>>>>> At runtime, as part of the >>>>>>>>>>> ConstantPool::restore_unshareable_info() work, the JVM calls >>>>>>>>>>> G1SATBCardTableModRefBS::enqueue() to let GC know the >>>>>>>>>>> 'resolved_references' array is becoming live. A handle is created >>>>>>>>>>> for the cached object and added to the loader_data's handles. >>>>>>>>>>> >>>>>>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The closed archive regions (the string regions) and open >>>>>>>>>>> archive regions are mapped to the runtime java heap at the >>>>>>>>>>> same offsets as the dump time offsets from the runtime java >>>>>>>>>>> heap base.
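
The lookup-only pre-resolution described under "Caching Constant Pool resolved_references Array" can be illustrated with a small model. Everything here is hypothetical: the string table is modelled as a map from literal text to an integer id standing in for an interned String, and `StringEntry` and `pre_resolve_strings` are made-up names rather than HotSpot functions. The point of the sketch is only that the table is consulted, never extended, so unmatched entries stay unresolved.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of the interned string table: literal -> object id.
using StringTable = std::unordered_map<std::string, int>;

// Hypothetical model of a JVM_CONSTANT_String constant pool entry.
struct StringEntry {
  std::string literal;
  int resolved_id = -1;  // -1 means "still unresolved"
};

// Resolve entries against strings that are already interned. The table is
// taken by const reference, so this function cannot insert new strings.
// Returns the number of entries that were resolved.
inline int pre_resolve_strings(const StringTable& table,
                               std::vector<StringEntry>& entries) {
  int resolved = 0;
  for (StringEntry& e : entries) {
    auto it = table.find(e.literal);
    if (it != table.end()) {
      e.resolved_id = it->second;
      ++resolved;
    }
    // Otherwise leave the entry unresolved; it is resolved lazily at
    // runtime only if it is ever used.
  }
  return resolved;
}
```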
>>>>>>>>>>> >>>>>>>>>>> Preliminary test execution and status: >>>>>>>>>>> >>>>>>>>>>> JPRT: passed >>>>>>>>>>> Tier2-rt: passed >>>>>>>>>>> Tier2-gc: passed >>>>>>>>>>> Tier2-comp: passed >>>>>>>>>>> Tier3-rt: passed >>>>>>>>>>> Tier3-gc: passed >>>>>>>>>>> Tier3-comp: passed >>>>>>>>>>> Tier4-rt: passed >>>>>>>>>>> Tier4-gc: passed >>>>>>>>>>> Tier4-comp:6 jobs timed out, all other tests passed >>>>>>>>>>> Tier5-rt: one test failed but passed when running locally, >>>>>>>>>>> all other tests passed >>>>>>>>>>> Tier5-gc: passed >>>>>>>>>>> Tier5-comp: running >>>>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests >>>>>>>>>>> passed >>>>>>>>>>> vm.gc: passed >>>>>>>>>>> vm.gc in CDS mode: passed >>>>>>>>>>> Kichensink: passed >>>>>>>>>>> Kichensink in CDS mode: passed >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Jiangli >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From jiangli.zhou at oracle.com Fri Aug 11 18:29:48 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 11 Aug 2017 11:29:48 -0700 Subject: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive In-Reply-To: <1132858e-5c1b-34f4-831c-ccc595ade738@oracle.com> References: <74D26CA6-E3A8-4ABB-A6E9-D37E3AD2BAD6@oracle.com> <32C9BE41-D4C3-4242-A8D7-C1E1A5B2E0F3@oracle.com> <4172849a-55e6-7133-90d6-2c75b58b0391@oracle.com> <9E1C5146-2327-4F97-A2B7-DC500D545D1C@oracle.com> <98EF8CF9-FD28-46BC-8D3D-52DEA205EBD5@oracle.com> <1b55ca78-679c-64c5-cd4c-a1dc2c032bc7@oracle.com> <5F35B435-FB72-4A21-8896-645E0291C5DF@oracle.com> <4DEE0393-59DA-4E01-ACDB-F2DB0F9ED6D9@oracle.com> <20bfa3ff-b2cd-028d-efa9-bddad4a6ff7c@oracle.com> <1132858e-5c1b-34f4-831c-ccc595ade738@oracle.com> Message-ID: <0175D81D-FB40-4E73-B94D-46CE48A888C8@oracle.com> Thank you, Coleen! 
Jiangli > On Aug 11, 2017, at 11:22 AM, coleen.phillimore at oracle.com wrote: > > > These incremental changes look good to me. > Thanks, > Coleen > > On 8/10/17 7:51 PM, Jiangli Zhou wrote: >> Thanks, Ioi! >> >> Jiangli >> >>> On Aug 10, 2017, at 2:15 PM, Ioi Lam wrote: >>> >>> Hi Jiangli, >>> >>> >>> The changes look good to me. Thanks for considering my suggestions. >>> >>> >>> - Ioi >>> >>> On 8/8/17 5:33 PM, Jiangli Zhou wrote: >>>> Here is the incremental webrev that has all the changes incorporated with suggestions from Coleen, Ioi and Thomas: >>>> >>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ >>>> >>>> Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ >>>> >>>> Thanks again for Coleen's, Ioi's and Thomas' review! >>>> Jiangli >>>> >>>>> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou wrote: >>>>> >>>>> Hi Ioi, >>>>> >>>>> Thanks for getting back to me. >>>>> >>>>>> On Aug 7, 2017, at 5:45 PM, Ioi Lam wrote: >>>>>> >>>>>> On 8/4/17 10:19 PM, Jiangli Zhou wrote: >>>>>>> Hi Ioi, >>>>>>> >>>>>>> Thanks for looking again. >>>>>>> >>>>>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam wrote: >>>>>>>> >>>>>>>> Hi Jiangli, >>>>>>>> >>>>>>>> The code looks good in general. I just have a few pet peeves for readability: >>>>>>>> >>>>>>>> >>>>>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts >>>>>>>> >>>>>>>> 704 assert(UseG1GC, "Only support G1 GC"); >>>>>>>> 705 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>>>> 706 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>>>>>> >>>>>>>> 1615 assert(UseG1GC, "Only support G1 GC"); >>>>>>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers, >>>>>>>> 1617 "Only support UseCompressedOops and UseCompressedClassPointers enabled"); >>>>>>>> >>>>>>>> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync?
>>>>>>> >>>>>>> There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem worth adding another separate API for asserting the required flags. I'll use that in the assert. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> (2) FileMapInfo::write_archive_heap_regions() >>>>>>>> >>>>>>>> I still find this code very hard to read, especially due to the loop. >>>>>>>> >>>>>>>> First, the comments are not consistent with the code: >>>>>>>> >>>>>>>> 498 assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum"); >>>>>>>> >>>>>>>> but the comment says: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong. >>>>>>> >>>>>>> The max_num_regions is the maximum number of regions for each archived heap space (the string space, or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin's pending change. As part of Calvin's change, he will change the assert into a check and bail out if the number of MemRegions is larger than max_num_regions due to heap fragmentation. >>>>>>> >>>>>>> >>>>>> Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to be returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible). >>>>>> >>>>>> Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2.
Also, if the string region is disabled, you should also disable the open_archive_heap_region >>>>>> >>>>>> I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch. >>>>> >>>>> Ok. I'll change the assert to be a check. >>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>> The word "region" is used in these parameters, but they don't mean the same thing. >>>>>>>> >>>>>>>> GrowableArray *regions >>>>>>>> int first_region, int max_num_regions, >>>>>>>> >>>>>>>> >>>>>>>> How about regions -> g1_regions_list >>>>>>>> first_region -> first_region_in_archive >>>>>>> >>>>>>> The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. That's probably the source of the confusion. I'll make the comment clearer. >>>>>>> >>>>>>> Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing 'regions' to 'mem_regions' or 'archive_regions'? >>>>>>> >>>>>> How about heap_regions? These are regions in the active Java heap, which currently has not mapped anything from the CDS archive. >>>>> >>>>> Ok. >>>>> >>>>> I'm updating my changes and will send out a consolidated webrev. >>>>> >>>>> Thanks! >>>>> Jiangli >>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>> >>>>>>>> In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive". >>>>>>> >>>>>>>> >>>>>>>> To make it unambiguous, how about changing >>>>>>>> >>>>>>>> >>>>>>>> 464 // Write the current archive heap region, which contains one or multiple GC(G1) regions.
>>>>>>>> >>>>>>>> >>>>>>>> to >>>>>>>> >>>>>>>> // Write the given list of G1 memory regions into the archive, starting at >>>>>>>> // first_region_in_archive. >>>>>>> >>>>>>> >>>>>>> Ok. How about the following: >>>>>>> >>>>>>> // Write the given list of java heap memory regions into the archive, starting at >>>>>>> // first_region_in_archive. >>>>>>> >>>>>> Sounds good. >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Also, for the explanation of how the G1 regions are written into the archive, how about: >>>>>>>> >>>>>>>> // The G1 regions in the list are sorted in ascending address order. When there are more objects >>>>>>>> // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the >>>>>>>> // remaining G1 region(s) are consecutively allocated and fully filled. >>>>>>>> // >>>>>>>> // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive. >>>>>>>> // The remaining G1 regions (if exist) are coalesced and written as a single block >>>>>>>> // into (first_region_in_archive + 1) >>>>>>>> >>>>>>>> // Here's the mapping from (g1 regions) -> (archive regions). >>>>>>>> >>>>>>>> >>>>>>>> All this function needs to do is to decide the values for >>>>>>>> >>>>>>>> r0_start, r0_top >>>>>>>> r1_start, r1_top >>>>>>>> >>>>>>>> I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway). >>>>>>>> >>>>>>>> *r0_start = *r0_top = NULL; >>>>>>>> *r1_start = *r1_top = NULL; >>>>>>>> >>>>>>>> if (arr_len >= 1) { >>>>>>>> *r0_start = regions->at(0).start(); >>>>>>>> *r0_end = *r0_start + regions->at(0).byte_size(); >>>>>>>> } >>>>>>>> if (arr_len >= 2) { >>>>>>>> int last = arr_len - 1; >>>>>>>> *r1_start = regions->at(1).start(); >>>>>>>> *r1_end = regions->at(last).start() + regions->at(last).byte_size(); >>>>>>>> } >>>>>>>> >>>>>>>> what do you think? 
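A minimal, self-contained sketch of the loop-free mapping suggested above may make the proposal easier to compare with the existing code. It assumes the input ranges are sorted by ascending address and that every range after the first is contiguous with its neighbour. HeapRange, ArchiveRanges and coalesce_heap_regions are hypothetical names used only for illustration; they are not HotSpot types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap memory range: [start, start + byte_size).
struct HeapRange {
  const char* start;
  std::size_t byte_size;
  const char* end() const { return start + byte_size; }
};

// At most two ranges are written per archived heap space.
struct ArchiveRanges {
  HeapRange r0{nullptr, 0};  // bottom-most (possibly partially filled) range
  HeapRange r1{nullptr, 0};  // all remaining ranges, coalesced into one block
};

// Assumption: 'regions' is sorted by ascending address and
// regions[1..n-1] are consecutive, so they can be written as one block.
ArchiveRanges coalesce_heap_regions(const std::vector<HeapRange>& regions) {
  ArchiveRanges out;
  if (!regions.empty()) {
    out.r0 = regions.front();
  }
  if (regions.size() >= 2) {
    const char* start = regions[1].start;
    const char* end = regions.back().end();
    out.r1 = HeapRange{start, static_cast<std::size_t>(end - start)};
  }
  return out;
}
```

With an empty input both ranges stay empty, which matches the requirement raised in the reply that empty archive regions still have to be written out.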
>>>>>>> >>>>>>> We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. I'd like to remove the code that deals with r0_* and r1_* explicitly. Let me try that. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> (3) metaspace.cpp >>>>>>>> >>>>>>>> 3350 // Map the archived heap regions after compressed pointers >>>>>>>> 3351 // because it relies on compressed class pointers setting to work >>>>>>>> >>>>>>>> do you mean this? >>>>>>>> >>>>>>>> // Archived heap regions depend on the parameters of compressed class pointers, so >>>>>>>> // they must be mapped after such parameters have been decided in the above call. >>>>>>> >>>>>>> Hmmm, maybe use 'arguments' instead of 'parameters'? >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> (4) I found this name not strictly grammatical. How about this: >>>>>>>> >>>>>>>> allow_archive_heap_object -> is_heap_object_archiving_allowed >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>>> >>>>>>>> (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb. >>>>>>>> >>>>>>>> archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string() >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> >>>>>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote: >>>>>>>>> Here are the updated webrevs. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ >>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ >>>>>>>>> >>>>>>>>> Changes in the updated webrevs include: >>>>>>>>> Merge with Ioi's recent shared space auto-sizing change (8072061) >>>>>>>>> Addressed all feedback from Ioi and Coleen (Thanks for detailed review!)
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou wrote: >>>>>>>>>> >>>>>>>>>> Hi Ioi, >>>>>>>>>> >>>>>>>>>> Thank you so much for reviewing this. I've addressed all your feedback. Please see details below. I'll update the webrev after addressing Coleen's comments. >>>>>>>>>> >>>>>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Jiangli, >>>>>>>>>>> >>>>>>>>>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-) >>>>>>>>>>> >>>>>>>>>>> stringTable.cpp: StringTable::archive_string >>>>>>>>>>> >>>>>>>>>>> add assert for DumpSharedSpaces only >>>>>>>>>> >>>>>>>>>> Ok. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> filemap.cpp >>>>>>>>>>> >>>>>>>>>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray *regions, >>>>>>>>>>> 526 int first_region, int num_regions) { >>>>>>>>>>> >>>>>>>>>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions: >>>>>>>>>>> >>>>>>>>>>> 537 int len = regions->length(); >>>>>>>>>>> 538 if (len > 1) { >>>>>>>>>>> 539 start = (char*)regions->at(1).start(); >>>>>>>>>>> 540 size = (char*)regions->at(len - 1).end() - start; >>>>>>>>>>> 541 } >>>>>>>>>>> 542 } >>>>>>>>>>> >>>>>>>>>>> The rest of filemap.cpp always performs identical operations on MemRegion arrays, which are either 1 or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions. >>>>>>>>>>> >>>>>>>>>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.
>>>>>>>>>>> >>>>>>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) { >>>>>>>>>>> if (first == MetaspaceShared::first_string) { >>>>>>>>>>> assert(num_regions <= MetaspaceShared::max_strings, "..."); >>>>>>>>>>> } else { >>>>>>>>>>> assert(first == MetaspaceShared::first_open_archive_heap_region, "..."); >>>>>>>>>>> assert(num_regions <= MetaspaceShared::max_open_archive_heap_region, "..."); >>>>>>>>>>> } >>>>>>>>>>> .... >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I've reworked the function and simplified the code. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 756 if (!string_data_mapped) { >>>>>>>>>>> 757 StringTable::ignore_shared_strings(true); >>>>>>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity"); >>>>>>>>>>> 759 } >>>>>>>>>>> 760 >>>>>>>>>>> 761 if (open_archive_heap_data_mapped) { >>>>>>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped(); >>>>>>>>>>> 763 } else { >>>>>>>>>>> 764 assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity"); >>>>>>>>>>> 765 } >>>>>>>>>>> >>>>>>>>>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()? >>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> FileMapInfo::map_heap_data() -- >>>>>>>>>>> >>>>>>>>>>> 818 char* addr = (char*)regions[i].start(); >>>>>>>>>>> 819 char* base = os::map_memory(_fd, _full_path, si->_file_offset, >>>>>>>>>>> 820 addr, regions[i].byte_size(), si->_read_only, >>>>>>>>>>> 821 si->_allow_exec); >>>>>>>>>>> >>>>>>>>>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base)? If so, we need an assert here.
>>>>>>>>>> >>>>>>>>>> If any of the regions fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If the second region fails to map, all memory ranges specified by the 'regions' array are deallocated. We don't unmap the memory here since it is part of the java heap. Unmapping of heap memory is handled by GC code. The 'if' check below makes sure base == addr. >>>>>>>>>> >>>>>>>>>> if (base == NULL || base != addr) { >>>>>>>>>> // dealloc the regions from java heap >>>>>>>>>> dealloc_archive_heap_regions(regions, region_num); >>>>>>>>>> if (log_is_enabled(Info, cds)) { >>>>>>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap."); >>>>>>>>>> } >>>>>>>>>> return false; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> constantPool.cpp >>>>>>>>>>> >>>>>>>>>>> Handle refs_handle; >>>>>>>>>>> ... >>>>>>>>>>> refs_handle = Handle(THREAD, (oop)archived); >>>>>>>>>>> >>>>>>>>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles() >>>>>>>>>>> >>>>>>>>>>> I think it's more efficient if you merge these into a single statement >>>>>>>>>>> >>>>>>>>>>> Handle refs_handle(THREAD, (oop)archived); >>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Is this experimental code? Maybe it should be removed? >>>>>>>>>>> >>>>>>>>>>> 664 if (tag_at(index).is_unresolved_klass()) { >>>>>>>>>>> 665 #if 0 >>>>>>>>>>> 666 CPSlot entry = cp->slot_at(index); >>>>>>>>>>> 667 Symbol* name = entry.get_symbol(); >>>>>>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD); >>>>>>>>>>> 669 if (k != NULL) { >>>>>>>>>>> 670 klass_at_put(index, k); >>>>>>>>>>> 671 } >>>>>>>>>>> 672 #endif >>>>>>>>>>> 673 } else >>>>>>>>>> >>>>>>>>>> Removed.
>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> cpCache.hpp: >>>>>>>>>>> >>>>>>>>>>> u8 _archived_references >>>>>>>>>>> >>>>>>>>>>> shouldn't this be declared as a narrowOop to avoid the type casts when it's used? >>>>>>>>>> >>>>>>>>>> Ok. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> cpCache.cpp: >>>>>>>>>>> >>>>>>>>>>> add assert so that one of these is used only at dump time and the other only at run time? >>>>>>>>>>> >>>>>>>>>>> 610 oop ConstantPoolCache::archived_references() { >>>>>>>>>>> 611 return oopDesc::decode_heap_oop((narrowOop)_archived_references); >>>>>>>>>>> 612 } >>>>>>>>>>> 613 >>>>>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) { >>>>>>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o); >>>>>>>>>>> 616 } >>>>>>>>>> >>>>>>>>>> Ok. >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> >>>>>>>>>> Jiangli >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks! >>>>>>>>>>> - Ioi >>>>>>>>>>> >>>>>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote: >>>>>>>>>>>> Sorry, the mail didn't handle the rich text well. I fixed the format below. >>>>>>>>>>>> >>>>>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata and interned java.lang.String objects. This RFE adds the constant pool 'resolved_references' arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The 'resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references' being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. >>>>>>>>>>>> >>>>>>>>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself.
>>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302 >>>>>>>>>>>> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ >>>>>>>>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ >>>>>>>>>>>> >>>>>>>>>>>> Please see below for details of supporting cached 'resolved_references' and pre-resolving string constants. >>>>>>>>>>>> >>>>>>>>>>>> Types of Pinned G1 Heap Regions >>>>>>>>>>>> >>>>>>>>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type. >>>>>>>>>>>> >>>>>>>>>>>> 00100 0 [ 8] Pinned Mask >>>>>>>>>>>> 01000 0 [16] Old Mask >>>>>>>>>>>> 10000 0 [32] Archive Mask >>>>>>>>>>>> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask >>>>>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Pinned Regions >>>>>>>>>>>> >>>>>>>>>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed. >>>>>>>>>>>> >>>>>>>>>>>> Archive Regions >>>>>>>>>>>> >>>>>>>>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive. >>>>>>>>>>>> >>>>>>>>>>>> An archive region is also an old region by design. >>>>>>>>>>>> >>>>>>>>>>>> Open Archive (GC-RW) Regions >>>>>>>>>>>> >>>>>>>>>>>> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC. >>>>>>>>>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects.
Dormant objects are not collected and freed by GC. >>>>>>>>>>>> >>>>>>>>>>>> Adjustable Outgoing Pointers >>>>>>>>>>>> >>>>>>>>>>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly. >>>>>>>>>>>> >>>>>>>>>>>> Closed Archive (GC-RO) Regions >>>>>>>>>>>> >>>>>>>>>>>> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region. >>>>>>>>>>>> In JDK 9 we support archive Strings with the archive regions. >>>>>>>>>>>> >>>>>>>>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page. >>>>>>>>>>>> >>>>>>>>>>>> Dormant Objects >>>>>>>>>>>> >>>>>>>>>>>> Dormant objects are unreachable java objects within the open archive heap region. >>>>>>>>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. 
For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object. >>>>>>>>>>>> >>>>>>>>>>>> Object State Transition >>>>>>>>>>>> >>>>>>>>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object. >>>>>>>>>>>> >>>>>>>>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching. >>>>>>>>>>>> >>>>>>>>>>>> Caching Java Objects at Archive Dump Time >>>>>>>>>>>> >>>>>>>>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. 
A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location. >>>>>>>>>>>> >>>>>>>>>>>> Caching Constant Pool resolved_references Array >>>>>>>>>>>> >>>>>>>>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste if a constant pool string entry is never used at runtime. >>>>>>>>>>>> >>>>>>>>>>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated. >>>>>>>>>>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles. 
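The dump-time bookkeeping described above (a hash table that ensures each object is archived once and that every reference to it receives the same new address) can be sketched roughly as follows. This is only an illustrative model under stated assumptions: Obj and Archiver are hypothetical names, objects are copied recursively through their references, and nothing here reflects the actual HotSpot data structures or copy order.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical toy object: a payload plus references to other objects.
struct Obj {
  int payload;
  std::vector<Obj*> refs;
};

// Copies objects into an "archive" and remembers where each one went.
class Archiver {
 public:
  // Returns the archived copy of 'src', creating it on first use.
  Obj* archive(Obj* src) {
    if (src == nullptr) return nullptr;
    auto it = archived_.find(src);
    if (it != archived_.end()) {
      return it->second;  // already archived: reuse the same new location
    }
    storage_.push_back(std::unique_ptr<Obj>(new Obj{src->payload, {}}));
    Obj* copy = storage_.back().get();
    // Record the mapping before following references so that cycles
    // and objects reachable from several roots are copied only once.
    archived_.emplace(src, copy);
    for (Obj* r : src->refs) {
      copy->refs.push_back(archive(r));
    }
    return copy;
  }

  std::size_t archived_count() const { return archived_.size(); }

 private:
  std::unordered_map<const Obj*, Obj*> archived_;  // original -> archived copy
  std::vector<std::unique_ptr<Obj>> storage_;      // owns the archived copies
};
```

Archiving two roots that share a referenced object yields a single copy of the shared object, and both archived roots point at that one copy.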
>>>>>>>>>>>> >>>>>>>>>>>> Runtime Java Heap With Cached Java Objects >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base. >>>>>>>>>>>> >>>>>>>>>>>> Preliminary test execution and status: >>>>>>>>>>>> >>>>>>>>>>>> JPRT: passed >>>>>>>>>>>> Tier2-rt: passed >>>>>>>>>>>> Tier2-gc: passed >>>>>>>>>>>> Tier2-comp: passed >>>>>>>>>>>> Tier3-rt: passed >>>>>>>>>>>> Tier3-gc: passed >>>>>>>>>>>> Tier3-comp: passed >>>>>>>>>>>> Tier4-rt: passed >>>>>>>>>>>> Tier4-gc: passed >>>>>>>>>>>> Tier4-comp:6 jobs timed out, all other tests passed >>>>>>>>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed >>>>>>>>>>>> Tier5-gc: passed >>>>>>>>>>>> Tier5-comp: running >>>>>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed >>>>>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed >>>>>>>>>>>> vm.gc: passed >>>>>>>>>>>> vm.gc in CDS mode: passed >>>>>>>>>>>> Kichensink: passed >>>>>>>>>>>> Kichensink in CDS mode: passed >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Jiangli >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From eric.caspole at oracle.com Fri Aug 11 19:10:41 2017 From: eric.caspole at oracle.com (Eric Caspole) Date: Fri, 11 Aug 2017 15:10:41 -0400 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods Message-ID: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Hi, Please review this very small change to add more info to +LogTouchedMethods including the touched count and whether the method has loops. This extra info has been handy while doing experiments to create the most efficient AOT library for quick startup. http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ https://bugs.openjdk.java.net/browse/JDK-8186154 Passed JPRT hotspot tests. 
Thanks, Eric From vladimir.kozlov at oracle.com Fri Aug 11 19:23:45 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 11 Aug 2017 12:23:45 -0700 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: Good. Thanks, Vladimir On 8/11/17 12:10 PM, Eric Caspole wrote: > Hi, > Please review this very small change to add more info to > +LogTouchedMethods including the touched count and whether the method > has loops. This extra info has been handy while doing experiments to > create the most efficient AOT library for quick startup. > > http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186154 > > Passed JPRT hotspot tests. > > Thanks, > Eric From serguei.spitsyn at oracle.com Fri Aug 11 19:41:46 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 11 Aug 2017 12:41:46 -0700 Subject: RFR(XXS): quarantine sun/management/jdp/JdpOffTest.java (8186152) In-Reply-To: <48af379f-8c41-f2fa-5f68-052976bbf517@oracle.com> References: <48af379f-8c41-f2fa-5f68-052976bbf517@oracle.com> Message-ID: Looks good. Thanks, Serguei On 8/11/17 10:55, Daniel D. Daugherty wrote: > Greetings, > > I'm quarantining sun/management/jdp/JdpOffTest.java > as it continues to fail in the JDK10-hs nightly. 
> > 8186152 quarantine sun/management/jdp/JdpOffTest.java > https://bugs.openjdk.java.net/browse/JDK-8186152 > > $ hg diff test/ProblemList.txt > diff -r 5ebbdc94be6d test/ProblemList.txt > --- a/test/ProblemList.txt Tue Aug 08 22:55:42 2017 +0200 > +++ b/test/ProblemList.txt Fri Aug 11 11:51:51 2017 -0600 > @@ -153,6 +153,7 @@ > com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java > 8030957 aix-all > com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java > 8030957 aix-all > sun/management/HotspotRuntimeMBean/GetSafepointSyncTime.java 8174734 > generic-all > +sun/management/jdp/JdpOffTest.java 8175542 generic-all > > ############################################################################ > > > This fix is targeted to JDK10/hs. > > Dan From daniel.daugherty at oracle.com Fri Aug 11 19:48:04 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 11 Aug 2017 13:48:04 -0600 Subject: RFR(XXS): quarantine sun/management/jdp/JdpOffTest.java (8186152) In-Reply-To: References: <48af379f-8c41-f2fa-5f68-052976bbf517@oracle.com> Message-ID: <06180a0c-b0f1-d158-1e1d-9a5bd0a5aa3d@oracle.com> Thanks Serguei! Dan On 8/11/17 1:41 PM, serguei.spitsyn at oracle.com wrote: > Looks good. > > Thanks, > Serguei > > > On 8/11/17 10:55, Daniel D. Daugherty wrote: >> Greetings, >> >> I'm quarantining sun/management/jdp/JdpOffTest.java >> as it continues to fail in the JDK10-hs nightly. 
>> >> 8186152 quarantine sun/management/jdp/JdpOffTest.java >> https://bugs.openjdk.java.net/browse/JDK-8186152 >> >> $ hg diff test/ProblemList.txt >> diff -r 5ebbdc94be6d test/ProblemList.txt >> --- a/test/ProblemList.txt Tue Aug 08 22:55:42 2017 +0200 >> +++ b/test/ProblemList.txt Fri Aug 11 11:51:51 2017 -0600 >> @@ -153,6 +153,7 @@ >> com/sun/management/OperatingSystemMXBean/GetProcessCpuLoad.java >> 8030957 aix-all >> com/sun/management/OperatingSystemMXBean/GetSystemCpuLoad.java >> 8030957 aix-all >> sun/management/HotspotRuntimeMBean/GetSafepointSyncTime.java 8174734 >> generic-all >> +sun/management/jdp/JdpOffTest.java 8175542 generic-all >> >> ############################################################################ >> >> >> This fix is targeted to JDK10/hs. >> >> Dan > From david.holmes at oracle.com Fri Aug 11 23:36:15 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 12 Aug 2017 09:36:15 +1000 Subject: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC machines In-Reply-To: <46b484c8-e080-8c6a-a6bb-8471ce5efd9f@oracle.com> References: <6e333994-ee8b-4fb4-b9a9-59afc751217d@default> <0285fdb7-8548-48e5-b019-3de62f66cffe@default> <46b484c8-e080-8c6a-a6bb-8471ce5efd9f@oracle.com> Message-ID: On 11/08/2017 7:57 AM, David Holmes wrote: > Hi Poonam, > > On 11/08/2017 3:44 AM, Poonam Parhar wrote: >> Thanks Vladimir. >> >> Since the SPARC machines are always multi-cores, we can safely set >> AssumeMP to true on these. > > I'm still unclear about the reported problem here. As Vladimir pointed > out in the bug report the is_MP checks uses _processor_count which is > set to the number of cpus on the machine _not_ the number of cpus > currently available to the VM. > > void os::Solaris::initialize_system_info() { > set_processor_count(sysconf(_SC_NPROCESSORS_CONF)); > > So all this discussion about containers and dynamic changes to available > cpus should be moot. 
So the only way this can fail is if the number of > configured processors on the machine dynamically changed _and_ it was > initially 1 - which seems to me to be impossible with sparc unless the > hardware info is being incorrectly reported (ie virtualization bug?) No it isn't impossible as a LDOM can be configured with 1 "cpu", and 1 is what sysconf(_SC_NPROCESSORS_CONF) will return. A change to the number of cpus is not the issue here, but the simple fact that the BIS instructions have to have a membar issued, and that is elided when there is only 1 processor - unless AssumeMP is set true. Sorry for the noise. David > That aside I have no issue with the fix as we will likely be assuming MP > always in the future. > > Thanks, > David > >> Adding my comments from the previous mail here again for better >> readability: >> ------------------------------------- >> Bug: https://bugs.openjdk.java.net/browse/JDK-8185572: Enable AssumeMP >> by default on SPARC machines >> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >> >> This change enables AssumeMP by default on SPARC machines. On Sparc >> T7, to finalize BIS instructions the server compiler needs to add a >> 'membar' instruction at the end. But the generation of 'membar' is >> guarded by os::is_MP(), and os::is_MP() returns false when there is a >> single cpu available on the system. Now, in virtualized/container >> environments, the number of processors allocated to a virtual machine >> can dynamically change during the application runtime. That could lead >> to incorrect generation of BIS instructions and can cause JVM crashes. >> Enabling AssumeMP makes is_MP() always return true on SPARC systems. >> >> In future, we may consider making generation of 'membar' unconditional >> with the enhancement request: >> https://bugs.openjdk.java.net/browse/JDK-8150715. 
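For readers following the thread, the guard under discussion can be modelled roughly as follows. This is a simplified C++ sketch, not HotSpot source: CpuInfo, is_mp and emits_membar are illustrative names, and the only behaviour taken from the messages above is that the 'membar' is skipped when a single configured cpu is reported and AssumeMP is off.

```cpp
#include <cassert>

// Hypothetical, simplified model of the check discussed in this thread.
struct CpuInfo {
  int  configured_cpus;  // e.g. the value reported by sysconf(_SC_NPROCESSORS_CONF) at startup
  bool assume_mp;        // models the AssumeMP flag
};

// Reports "multiprocessor" when the flag is set or more than one cpu is configured.
bool is_mp(const CpuInfo& info) {
  assert(info.configured_cpus >= 1);
  return info.assume_mp || info.configured_cpus != 1;
}

// In this model the barrier after the BIS sequence is emitted only when is_mp() is true.
bool emits_membar(const CpuInfo& info) {
  return is_mp(info);
}
```

With a single configured cpu and the flag off, emits_membar() is false, which is the case the fix is meant to rule out on SPARC by defaulting the flag to true.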
>> >> Thanks, >> Poonam >> >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Thursday, August 10, 2017 9:47 AM >>> To: Poonam Parhar; hotspot-compiler-dev at openjdk.java.net >>> Cc: hotspot-runtime-dev at openjdk.java.net runtime >>> Subject: Re: [10] RFR(S): 8185572: Enable AssumeMP by default on SPARC >>> machines >>> >>> CCing to Runtime. >>> >>> Can you add comment explaining why it set to true on SPARC? >>> >>> Thanks, >>> Vladimir >>> >>> On 8/10/17 6:26 AM, Poonam Parhar wrote: >>>> Hello, >>>> >>>> Please review this simple patch: >>>> >>>> Bug: JDK-8185572: Enable AssumeMP by default on SPARC machines >>>> >>>> Webrev: http://cr.openjdk.java.net/~poonam/8185572/webrev.00/ >>>> >>>> This change enables AssumeMP by default on SPARC machines. On Sparc >>>> T7, to finalize BIS instructions the server compiler needs to add >>>> a 'membar' instruction at the end. But the generation of 'membar' is >>>> guarded by os::is_MP(), and os::is_MP() returns false when there is a >>>> single cpu available on the system. >>>> Now, in virtualized/container environments, the number >>>> of processors allocated to a virtual machine can dynamically change >>>> during the application runtime. That could lead to incorrect generation >>>> of BIS instructions and can cause JVM crashes. Enabling AssumeMP makes >>>> is_MP() always return true on SPARC systems. >>>> >>>> In future, we may consider making generation of 'membar' unconditional >>>> with the enhancement request: JDK-8150715.
>>>> >>>> Thanks, >>>> >>>> Poonam >>>> From milan.mimica at gmail.com Sat Aug 12 09:15:44 2017 From: milan.mimica at gmail.com (Milan Mimica) Date: Sat, 12 Aug 2017 09:15:44 +0000 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: Hi, I believe each VM option should have a unique description: diagnostic(bool, PrintTouchedMethodsAtExit, false, \ "Print all methods that have been ever touched in runtime") \ \ + diagnostic(bool, PrintTouchedMethodCount, false, \ + "Print all methods that have been ever touched in runtime") \ + On Fri, 11 Aug 2017 at 21:11, Eric Caspole wrote: > Hi, > Please review this very small change to add more info to > +LogTouchedMethods including the touched count and whether the method > has loops. This extra info has been handy while doing experiments to > create the most efficient AOT library for quick startup. > > http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186154 > > Passed JPRT hotspot tests. > > Thanks, > Eric > From ioi.lam at oracle.com Sun Aug 13 01:42:12 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Sat, 12 Aug 2017 18:42:12 -0700 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: Hi Eric, I am not sure if _touch_count can accurately represent 'how often is this method used'.
In the interpreter, Method::log_touched is called only once, when building the Method::method_counters() #define GET_METHOD_COUNTERS(res) \ res = METHOD->method_counters(); \ if (res == NULL) { \ CALL_VM(res = InterpreterRuntime::build_method_counters(THREAD, METHOD), handle_exception); \ } In addition, whenever the compiler references a method (when compiling a method, or when inlining a method into another, etc), it calls Method::log_touched ciMethod::ciMethod(const methodHandle& h_m, ciInstanceKlass* holder) : ciMetadata(h_m()), _holder(holder) { assert(h_m() != NULL, "no null method"); if (LogTouchedMethods) { h_m()->log_touched(Thread::current()); } So, a high count of _touch_count is more correlated to how often this method is seen by the compiler. Thanks - Ioi On 8/11/17 12:10 PM, Eric Caspole wrote: > Hi, > Please review this very small change to add more info to > +LogTouchedMethods including the touched count and whether the method > has loops. This extra info has been handy while doing experiments to > create the most efficient AOT library for quick startup. > > http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186154 > > Passed JPRT hotspot tests. > > Thanks, > Eric From thomas.stuefe at gmail.com Sun Aug 13 15:14:30 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sun, 13 Aug 2017 17:14:30 +0200 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder Message-ID: Dear all, May I please have reviews for the following patch. Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ This fix makes the native symbol decoder on Windows more robust. This increases the chance of getting useful error files. It also refactors the code and contains some useful new features. 
--- Native symbol resolution on Windows is done via the group of "SymXX" APIs exported from dbghelp.dll. Currently this is done via a layer of abstraction in decoder.cpp, and a "WindowsDecoder" class in decoder_windows.cpp. This class can be instantiated twice, so there can exist two objects of this class. The bugs: 1) Functions from dbghelp.dll are not threadsafe; calls to them must be synchronized. But dbghelp.dll could be used by the two live WindowsDecoder objects in parallel from different threads. In addition to that, dbghelp.dll functions are used from within os_windows.cpp, when writing minidumps. 2) The SymXX APIs need to be initialized by a call to SymInitialize(). This can only happen once. The way it is now, each of the two WindowsDecoder objects will call SymInitialize. The second object to do this will fail and hence not be usable. In practice this means if in the VM someone called e.g. dll_from_address_name() - which will initialize and use the shared WindowsDecoder object - and then the VM crashes, the hs-err file will contain no useful stack because that would require the second WindowsDecoder object, which would get initialized after the shared one, and its initialization would fail. 3) Initialization dependencies: - when building the pdb search path (WindowsDecoder::initialize), Arguments::get_java_home() is used to deduce the jdk bin directory containing the jdk shared objects. This will crash if invoked before the system properties are set. - Decoder::decode() calls (via the Monitor lock) Thread::current(). This means the code will not work during initialization and where Thread::current is not set (e.g. in an unattached thread). Admittedly a bit theoretical, as this only affects the shared WindowsDecoder object, which is not used during error reporting. 4) WindowsDecoder::initialize(), pdb search path: Code uses MAX_PATH in various places, which is wrong. NTFS paths can be longer than that.
- when calling SymGetSearchPath, code assumes a maximum path size of MAX_PATH for the *combined* size of all directory names, which may be way too small. Truncation is not handled, code will silently fail if output buffer is too small, resulting in the existing pdb path to be overwritten instead of being preserved. - GetModuleFileName(): similar problem, output buffer is MAX_PATH len, which may be too small. Truncation will lead to wrong results. - Throughout the function "strncat" is used to assemble the search path, but is used wrong and does not guard against buffer overflows (if that was the intent - otherwise, why not just use strcat?). The last argument to strncat should be the remaining space in the destination buffer, not the length of the source string. 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() using an output buffer of size MAX_PATH - which makes no sense at all, as we are retrieving a symbol name, not a file name. Truncation is not handled. This is actually dangerous, because SymGetSymFromAddr64() handles truncations sloppily, it will return success and fill the output buffer completely, so on truncation the symbol name will not be zero terminated. In addition to the bugs, there are a number of things which could be done better or simpler: a) Setting the search path could be simplified and made more useful if we would just add the directory of every loaded module to the search path (currently we just add the two jdk bin directories, which also exposes us to initialization dependency, see (3)). This would be more useful as a common convention is to put pdb files beside binaries, and that way we would catch all those. Including, but not limited to, our own jdk pdb files. b) Dlls can be loaded and unloaded, and it would be nice to have an updated pdb search path - e.g. in case a late-loaded third party JNI library crashes. 
So, it would be nice to run (a) whenever a library is loaded or unloaded, and for the process to be fast if nothing changed (e.g. if the new DLL was loaded from a directory which is already part of the search path). c) It would be nice to have file name and line number in the callstack, too. d) As pointed out in JDK-8144855, the function Decoder::can_decode_C_frame_in_vm() is not necessary. This should be handled in WindowsDecoder instead. --- What I did in this fix: - I pulled out dbghelp.dll handling into its own centralized singleton (DbgHelpLoader). This class takes care of loading the Dll and synchronizes access to all its functions, thus solving (1). DbgHelpLoader now replaces all places where before the DLL was loaded manually. - On top of DbgHelpLoader there is another new singleton class, "SymbolEngine", which wraps the life cycle of the symbol APIs. This solves problem (2). In addition to that, this class is the new interface to the SymXX APIs. - This means that the existing code in decoder.cpp, which instantiates two Decoder objects and synchronizes access to one of them, makes no sense for Windows. That whole layer is now bypassed in Windows - the static Decoder::xxx() functions are now directly implemented in decoder_windows.cpp and access the SymbolEngine singleton without any added layers in between. - SymbolEngine contains a new improved way to assemble the pdb search path as described in (a) and (b). We iterate loaded modules, extract directories and add them to the search path. Implementation takes care to not do unnecessary work (avoiding changing the path if list of loaded modules is unchanged, or if only Dlls from known directories were loaded). - We now recalculate the pdb search path if a DLL was loaded (via os::dll_load). That makes it possible to debug third party dlls which do not reside in jdk directories and were loaded after the SymbolEngine was initialized.
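As an aside on bug (4): the strncat() point can be illustrated with a small, self-contained C++ sketch. append_bounded is a hypothetical helper, not code from the webrev; it assumes the destination already holds a NUL-terminated string and that dst_size is the full capacity of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Appends 'src' to the NUL-terminated string in 'dst'.
// 'dst_size' is the total capacity of 'dst' in bytes, terminator included.
// Returns true if all of 'src' fit, false if it had to be truncated.
bool append_bounded(char* dst, std::size_t dst_size, const char* src) {
  assert(dst != nullptr && src != nullptr && dst_size > 0);
  const std::size_t used = std::strlen(dst);
  assert(used < dst_size);
  // The bound passed to strncat is the space left in the destination
  // (minus one byte for the terminator), not the length of the source.
  const std::size_t room = dst_size - used - 1;
  // strncat copies at most 'room' characters and then always writes a NUL.
  std::strncat(dst, src, room);
  return std::strlen(src) <= room;
}
```

Passing strlen(src) as the third argument instead would never limit the copy, which is the misuse the item describes. Returning a flag on truncation lets the caller notice a search path that did not fit rather than silently using a shortened one.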
- When decoding a symbol from an unknown address and the decode fails, we attempt to rebuild the pdb search path and attempt again. - Care was taken for this code to be robust. It may be called pre-initialization, post-shutdown or under error conditions (low memory, out of stack etc), so no VM infrastructure was used and memory use is frugal (dynamic buffers allocated off the stack and reused where possible). - I did away with the "Decoder::can_decode_C_frame_in_vm()" code. See argumentation in JDK-8144855. To me, it makes no sense. The function is a workaround for the problem that, when faced with stripped binaries which only have debug symbols for public functions, symbol info may be confusing (public symbols + large offsets). But this is not done consistently, and most programmers know not to trust symbols with large offsets. - Finally, we now have source info in the callstack :) Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 (vmerror.cpp:1683) V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe (vmerror.cpp:1632) V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a (gtestmain.cpp:114) V [jvm.dll+0x829b] testing::internal::TestEventRepeater::OnTestStart+0x5b (gtest.cc:2979) V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) V [jvm.dll+0xbe9e] testing::internal::UnitTestImpl::RunAllTests+0x25e (gtest.cc:4316) V [jvm.dll+0x2a410] testing::internal::HandleSehExceptionsInMethodIfSupported+0x40 (gtest.cc:2063) V [jvm.dll+0x2493e] testing::internal::HandleExceptionsInMethodIfSupported+0x5e (gtest.cc:2114) V [jvm.dll+0xb0e9]
testing::UnitTest::Run+0xe9 (gtest.cc:3929) V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122 (crtexe.c:555) C [KERNEL32.DLL+0x138f4] C [ntdll.dll+0x65de3] C [ntdll.dll+0x65dae] .. which is implemented for Windows, but pipes are laid to plug in other platforms as well (Decoder::get_source_info()) ---- Tell me what you think. Thanks, and Kind Regards, Thomas From david.holmes at oracle.com Mon Aug 14 00:29:38 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 14 Aug 2017 10:29:38 +1000 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: <7c36b086-5959-753b-1626-415817d53525@oracle.com> References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> Message-ID: <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> Hi Coleen, This looks good to me - Reviewed. A couple of minor comments below. On 12/08/2017 1:46 AM, coleen.phillimore at oracle.com wrote: > Summary: Use lock free access to oopMapCache > Contributed-by: frederic.parain at oracle.com, coleen.phillimore at oracle.com > > The OopMapCache::lookup() function took out a mutex to protect access > between the GC threads that are running concurrently. See bug for more > info. The function lookup() is run by multiple GC threads > concurrently. If there's a collision in the hashtable, this uses atomic > cmpxchg to add the entry to a list to be cleaned up after the safepoint > is over. GC isn't doing lookup at that point. > > This change is contributed by Frederic Parain, with some cleanup and > logging from me. 
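The collision path described in the quoted summary (a displaced entry is pushed onto a list with an atomic compare-and-exchange and freed later) might look roughly like this in standalone C++. This is a sketch using std::atomic, not the HotSpot code: Entry, g_old_entries, enqueue_for_cleanup and flush_old_entries are made-up names, and the assumption that the flush runs only when no thread is pushing is taken from the summary's remark that GC is not doing lookups at that point.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a displaced cache entry.
struct Entry {
  int    id;
  Entry* next;
};

// Head of the list of displaced entries awaiting cleanup.
static std::atomic<Entry*> g_old_entries{nullptr};

// Pushes 'e' onto the list without taking a lock; safe for concurrent callers.
void enqueue_for_cleanup(Entry* e) {
  assert(e != nullptr);
  Entry* head = g_old_entries.load(std::memory_order_relaxed);
  do {
    e->next = head;  // link to the head we last observed
    // On failure, compare_exchange_weak reloads 'head' with the current value.
  } while (!g_old_entries.compare_exchange_weak(head, e,
                                                std::memory_order_release,
                                                std::memory_order_relaxed));
}

// Detaches the whole list and frees it; returns the number of entries freed.
// Assumed to run only when no thread is pushing.
std::size_t flush_old_entries() {
  Entry* e = g_old_entries.exchange(nullptr, std::memory_order_acquire);
  std::size_t freed = 0;
  while (e != nullptr) {
    Entry* next = e->next;
    delete e;
    e = next;
    ++freed;
  }
  return freed;
}
```

Deferring the delete to a point where no reader can still hold a pointer to a displaced entry is what lets the lookup itself avoid the mutex.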
> > open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8186042 src/share/vm/interpreter/oopMapCache.cpp The logging here doesn't follow our usual pattern of only making the logging calls if the appropriate logging level is enabled. ?? src/share/vm/interpreter/oopMapCache.hpp Nit: OopMapCacheEntry* volatile* _array; Space after volatile please as it refers to the OMCE* not the *_array. > Tested with RBT equivalent of nightly on linux x64. Also ran dacapo > with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also Do you have some performance improvement numbers for this optimization? Or is this a pre-emptive strike against a potential locking bottleneck? > removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration > however) in favor of new logging. A linked CSR request is pending. Ok. Thanks, David > Thanks, > Coleen > From kim.barrett at oracle.com Mon Aug 14 02:43:41 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Sun, 13 Aug 2017 22:43:41 -0400 Subject: RFR (M): 8184334: Generalizing Atomic with templates In-Reply-To: References: <72F3D5DE-E517-4E3A-81F3-556F5A2222DA@oracle.com> <596E288B.90706@oracle.com> <773a0b64-8273-1a58-a46a-22ae18f927b7@redhat.com> <5970B3FD.607@oracle.com> <8e1d9a32-7e01-b917-df1d-595c4000e4b2@redhat.com> <597226E2.6090803@oracle.com> <2c637e28-8d95-cbaf-c29e-d01867ed9157@redhat.com> <5979FEB6.10506@oracle.com> <3C8BDDFA-1044-47F6-B15C-6DE2085ACF7C@oracle.com> <597B5FC6.5020702@oracle.com> <4e10660a-4dee-78bb-333e-fe99a9c2295d@redhat.com> <13B920D8-8190-4F8A-A18A-16C5BF473732@oracle.com> <5c6b3711-ab1b-1ea1-50e9-1b916eb8a5dc@redhat.com> Message-ID: <1E672A08-6768-4039-B4A5-2CA7DC99CA0C@oracle.com> > On Aug 8, 2017, at 4:25 PM, Kim Barrett wrote: > My plan at this point is to focus on finishing cmpxchg, and put just > that (with the associated infrastructure) out for review.
Then circle > back to deal with the other operations, using the new infrastructure, > approach, and any additional lessons learned from cmpxchg. That > should also make the handoff of remaining work back to Erik go more > smoothly when he comes back from vacation and I start mine. I've just sent to hotspot-dev an RFR for 8186166: Generalize Atomic::cmpxchg with templates. From coleen.phillimore at oracle.com Mon Aug 14 12:07:46 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 08:07:46 -0400 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: <570107e5-92d6-4838-a878-edf0640d6781@oracle.com> Hi, I don't think you should add another option to print the count. You should do it unconditionally. Since it's a diagnostic option, we don't have to support parsers of the output. I was going to type that you should use UL for this output instead, but that's a bigger task. It would be nice if the compiler options used UL and didn't have these Print options anymore. In any case, please don't add another Print* option. thanks, Coleen On 8/11/17 3:10 PM, Eric Caspole wrote: > Hi, > Please review this very small change to add more info to > +LogTouchedMethods including the touched count and whether the method > has loops. This extra info has been handy while doing experiments to > create the most efficient AOT library for quick startup. > > http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ > > https://bugs.openjdk.java.net/browse/JDK-8186154 > > Passed JPRT hotspot tests.
> > Thanks, > Eric From coleen.phillimore at oracle.com Mon Aug 14 12:17:44 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 08:17:44 -0400 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> Message-ID: On 8/13/17 8:29 PM, David Holmes wrote: > Hi Coleen, > > This looks good to me - Reviewed. A couple of minor comments below. > > On 12/08/2017 1:46 AM, coleen.phillimore at oracle.com wrote: >> Summary: Use lock free access to oopMapCache >> Contributed-by: frederic.parain at oracle.com, coleen.phillimore at oracle.com >> >> The OopMapCache::lookup() function took out a mutex to protect access >> between the GC threads that are running concurrently. See bug for >> more info. The function lookup() is run by multiple GC threads >> concurrently. If there's a collision in the hashtable, this uses >> atomic cmpxchg to add the entry to a list to be cleaned up after the >> safepoint is over. GC isn't doing lookup at that point. >> >> This change is contributed by Frederic Parain, with some cleanup and >> logging from me. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8186042 > > src/share/vm/interpreter/oopMapCache.cpp > > The logging here doesn't follow our usual pattern of only making the > logging calls if the appropriate logging level is enabled. ?? I'm not sure what you're talking about. I think this is the new pattern of doing logging: Log(interpreter, oopmap) logv; LogStream st(logv.trace()); Otherwise log_debug(interpreter, oopmap)("*** collision in oopmap cache - flushing item ***"); checks for whether debug level logging is on. 
You only need this: if (log_is_enabled(Debug, interpreter, oopmap)) { static int count = 0; ResourceMark rm; log_debug(interpreter, oopmap) ("%d - Computing oopmap at bci %d for %s at hash %d", ++count, bci, method()->name_and_sig_as_C_string(), probe); } If you're creating a ResourceMark. Unless logging changed (which I think it did). > > src/share/vm/interpreter/oopMapCache.hpp > > Nit: OopMapCacheEntry* volatile* _array; > > Space after volatile please as it refers to the OMCE* not the *_array. Like this? OopMapCacheEntry* volatile * _array; Per coding standard, we don't put the '*' without * in front of the name of the variable, only the type, like OopMapCacheEntry* blah; vs OopMapCacheEntry *blah; > >> Tested with RBT equivalent of nightly on linux x64. Also ran dacapo >> with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also > > Do you have some performance improvement numbers for this > optimization? Or is this a pre-emptive strike against a potential > locking bottleneck? Erik Osterlund ran benchmarks but I don't have access to them and he is on vacation. Hearsay is that the bottleneck observed for this lock is gone. > >> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration >> however) in favor of new logging. A linked CSR request is pending. > > Ok. Thanks for commenting on the CSR and the review. Coleen > > > Thanks, > David > >> Thanks, >> Coleen >> From david.holmes at oracle.com Mon Aug 14 12:29:53 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 14 Aug 2017 22:29:53 +1000 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> Message-ID: <240a0e8c-9257-2e1c-7eba-400f97a05f05@oracle.com> On 14/08/2017 10:17 PM, coleen.phillimore at oracle.com wrote: > On 8/13/17 8:29 PM, David Holmes wrote: >> Hi Coleen, >> >> This looks good to me - Reviewed. A couple of minor comments below.
>> >> On 12/08/2017 1:46 AM, coleen.phillimore at oracle.com wrote: >>> Summary: Use lock free access to oopMapCache >>> Contributed-by: frederic.parain at oracle.com, coleen.phillimore at oracle.com >>> >>> The OopMapCache::lookup() function took out a mutex to protect access >>> between the GC threads that are running concurrently. See bug for >>> more info. The function lookup() is run by multiple GC threads >>> concurrently. If there's a collision in the hashtable, this uses >>> atomic cmpxchg to add the entry to a list to be cleaned up after the >>> safepoint is over. GC isn't doing lookup at that point. >>> >>> This change is contributed by Frederic Parain, with some cleanup and >>> logging from me. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042 >> >> src/share/vm/interpreter/oopMapCache.cpp >> >> The logging here doesn't follow our usual pattern of only making the >> logging calls if the appropriate logging level is enabled. ?? > > I'm not sure what you're talking about. > > I think this is the new pattern of doing logging: > > Log(interpreter, oopmap) logv; > LogStream st(logv.trace()); I'm not familiar with that form. I wonder how the subsequent unconditional st.print is handled? ie how much work do we do before determining logging is not enabled? > Otherwise > > log_debug(interpreter, oopmap)("*** collision in oopmap cache - > flushing item ***"); > > checks for whether debug level logging is on. Yes. > You only need this: > if (log_is_enabled(Debug, interpreter, oopmap)) { > static int count = 0; > ResourceMark rm; > log_debug(interpreter, oopmap) > ("%d - Computing oopmap at bci %d for %s at hash %d", > ++count, bci, > method()->name_and_sig_as_C_string(), probe); > } > > If you're creating a ResourceMark. 
I don't associate it only with use of ResourceMarks, but any time you have to do a bit of work to gather all the information that will be passed to the actual logging statement. The aim being not to do any of that if logging is not enabled. > Unless logging changed (which I think it did). Perhaps. In which case I'd like to understand how this new form minimises the wasted effort when logging is disabled. >> >> src/share/vm/interpreter/oopMapCache.hpp >> >> Nit: OopMapCacheEntry* volatile* _array; >> >> Space after volatile please as it refers to the OMCE* not the *_array. > > Like this? > > OopMapCacheEntry* volatile * _array; Yes. > Per coding standard, we don't put the '*' without * in front of the name > of the variable, only the type, like > OopMapCacheEntry* blah; > vs > OopMapCacheEntry *blah; Okay I see where this comes from. Personally I find it odd to have the "volatile*". Anyway minor issue. >> >>> Tested with RBT equivalent of nightly on linux x64. Also ran dacapo >>> with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also >> >> Do you have some performance improvement numbers for this >> optimization? Or is this a pre-emptive strike against a potential >> locking bottleneck? > > Erik Osterlund ran benchmarks but I don't have access to them and he is > on vacation. Hearsay is that the bottleneck observed for this lock is > gone. Okay we can get the gory details added to the bug later. Thanks, David >> >>> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration >>> however) in favor of new logging. A linked CSR request is pending. >> >> Ok. > > Thanks for commenting on the CSR and the review.
> > Coleen >> >> >> Thanks, >> David >> >>> Thanks, >>> Coleen >>> > From coleen.phillimore at oracle.com Mon Aug 14 12:45:32 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 08:45:32 -0400 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: <240a0e8c-9257-2e1c-7eba-400f97a05f05@oracle.com> References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> <240a0e8c-9257-2e1c-7eba-400f97a05f05@oracle.com> Message-ID: <5b712422-77cf-2e2f-ccd3-43759275cdb6@oracle.com> On 8/14/17 8:29 AM, David Holmes wrote: > On 14/08/2017 10:17 PM, coleen.phillimore at oracle.com wrote: >> On 8/13/17 8:29 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> This looks good to me - Reviewed. A couple of minor comments below. >>> >>> On 12/08/2017 1:46 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Use lock free access to oopMapCache >>>> Contributed-by: frederic.parain at oracle.com, >>>> coleen.phillimore at oracle.com >>>> >>>> The OopMapCache::lookup() function took out a mutex to protect >>>> access between the GC threads that are running concurrently. See >>>> bug for more info. The function lookup() is run by multiple GC >>>> threads concurrently. If there's a collision in the hashtable, >>>> this uses atomic cmpxchg to add the entry to a list to be cleaned >>>> up after the safepoint is over. GC isn't doing lookup at that point. >>>> >>>> This change is contributed by Frederic Parain, with some cleanup >>>> and logging from me. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042 >>> >>> src/share/vm/interpreter/oopMapCache.cpp >>> >>> The logging here doesn't follow our usual pattern of only making the >>> logging calls if the appropriate logging level is enabled. ?? >> >> I'm not sure what you're talking about. 
>> >> I think this is the new pattern of doing logging: >> >> Log(interpreter, oopmap) logv; >> LogStream st(logv.trace()); > > I'm not familiar with that form. I wonder how the subsequent > unconditional st.print is handled? ie how much work do we do before > determining logging is not enabled? I wish I knew how to get generated .s file with the new build. The printing code in the function OopMapCacheEntry::verify_mask isn't doing anything interesting (like getting as_C_string() etc). The implied check for log_is_enabled before all of the st.print()s can't be worse than the existing global checks for TraceOopMapGeneration && Verbose. It's a ton cleaner and the performance is a wash. > >> Otherwise >> >> log_debug(interpreter, oopmap)("*** collision in oopmap cache - >> flushing item ***"); >> >> checks for whether debug level logging is on. > > Yes. > >> You only need this: >> if (log_is_enabled(Debug, interpreter, oopmap)) { >> static int count = 0; >> ResourceMark rm; >> log_debug(interpreter, oopmap) >> ("%d - Computing oopmap at bci %d for %s at hash %d", >> ++count, bci, >> method()->name_and_sig_as_C_string(), probe); >> } >> >> If you're creating a ResourceMark. > > I don't associate it only with use of ResourceMarks, but any time you > have to do a bit of work to gather all the information that will be > passed to the actual logging statement. The aim being not to do any of > that if logging is not enabled. > >> Unless logging changed (which I think it did). > > Perhaps. In which case I'd like to understand how this new form > minimises the wasted effort when logging is disabled. > >>> >>> src/share/vm/interpreter/oopMapCache.hpp >>> >>> Nit: OopMapCacheEntry* volatile* _array; >>> >>> Space after volatile please as it refers to the OMCE* not the *_array. >> >> Like this? >> >> OopMapCacheEntry* volatile * _array; > > Yes. 
> >> Per coding standard, we don't put the '*' without * in front of the >> name of the variable, only the type, like >> OopMapCacheEntry* blah; >> vs >> OopMapCacheEntry *blah; > > Okay I see where this comes from. Personally I find it odd to have the > "volatile*". Anyway minor issue. > >>> >>>> Tested with RBT equivalent of nightly on linux x64. Also ran >>>> dacapo with -Xint -Xlog:interpreter+oopmap=debug to verify. This >>>> change also >>> >>> Do you have some performance improvement numbers for this >>> optimization? Or is this a pre-emptive strike against a potential >>> locking bottleneck? >> >> Erik Osterlund ran benchmarks but I don't have access to them and he >> is on vacation. Hearsay is that the bottleneck observed for this >> lock is gone. > > Okay we can get the gory details added to the bug later. Yes, RedHat will see the improvements also. Thanks, Coleen > > Thanks, > David > >>> >>>> removes -XX:+TraceOopMapGeneration (not >>>> -XX:+TraceNewOopMapGeneration however) in favor of new logging. A >>>> linked CSR request is pending. >>> >>> Ok. >> >> Thanks for commenting on the CSR and the review. >> >> Coleen >>> >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> Coleen >>>> >> From coleen.phillimore at oracle.com Mon Aug 14 14:07:03 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 10:07:03 -0400 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: <5b712422-77cf-2e2f-ccd3-43759275cdb6@oracle.com> References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> <80eb03d2-7921-a793-0019-d2b40270845d@oracle.com> <240a0e8c-9257-2e1c-7eba-400f97a05f05@oracle.com> <5b712422-77cf-2e2f-ccd3-43759275cdb6@oracle.com> Message-ID: On 8/14/17 8:45 AM, coleen.phillimore at oracle.com wrote: > > > On 8/14/17 8:29 AM, David Holmes wrote: >> On 14/08/2017 10:17 PM, coleen.phillimore at oracle.com wrote: >>> On 8/13/17 8:29 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> This looks good to me - Reviewed.
A couple of minor comments below. >>>> >>>> On 12/08/2017 1:46 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Use lock free access to oopMapCache >>>>> Contributed-by: frederic.parain at oracle.com, >>>>> coleen.phillimore at oracle.com >>>>> >>>>> The OopMapCache::lookup() function took out a mutex to protect >>>>> access between the GC threads that are running concurrently. See >>>>> bug for more info. The function lookup() is run by multiple GC >>>>> threads concurrently. If there's a collision in the hashtable, >>>>> this uses atomic cmpxchg to add the entry to a list to be cleaned >>>>> up after the safepoint is over. GC isn't doing lookup at that >>>>> point. >>>>> >>>>> This change is contributed by Frederic Parain, with some cleanup >>>>> and logging from me. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8186042 >>>> >>>> src/share/vm/interpreter/oopMapCache.cpp >>>> >>>> The logging here doesn't follow our usual pattern of only making >>>> the logging calls if the appropriate logging level is enabled. ?? >>> >>> I'm not sure what you're talking about. >>> >>> I think this is the new pattern of doing logging: >>> >>> Log(interpreter, oopmap) logv; >>> LogStream st(logv.trace()); >> >> I'm not familiar with that form. I wonder how the subsequent >> unconditional st.print is handled? ie how much work do we do before >> determining logging is not enabled? > > I wish I knew how to get generated .s file with the new build. The > printing code in the function OopMapCacheEntry::verify_mask isn't > doing anything interesting (like getting as_C_string() etc). The > implied check for log_is_enabled before all of the st.print()s can't > be worse than the existing global checks for TraceOopMapGeneration && > Verbose. It's a ton cleaner and the performance is a wash. 
The function verify_mask is called under an assert, so the log trace queries won't affect product performance. Coleen >> >>> Otherwise >>> >>> log_debug(interpreter, oopmap)("*** collision in oopmap cache >>> - flushing item ***"); >>> >>> checks for whether debug level logging is on. >> >> Yes. >> >>> You only need this: >>> if (log_is_enabled(Debug, interpreter, oopmap)) { >>> static int count = 0; >>> ResourceMark rm; >>> log_debug(interpreter, oopmap) >>> ("%d - Computing oopmap at bci %d for %s at hash %d", >>> ++count, bci, >>> method()->name_and_sig_as_C_string(), probe); >>> } >>> >>> If you're creating a ResourceMark. >> >> I don't associate it only with use of ResourceMarks, but any time you >> have to do a bit of work to gather all the information that will be >> passed to the actual logging statement. The aim being not to do any >> of that if logging is not enabled. >> >>> Unless logging changed (which I think it did). >> >> Perhaps. In which case I'd like to understand how this new form >> minimises the wasted effort when logging is disabled. >> >>>> >>>> src/share/vm/interpreter/oopMapCache.hpp >>>> >>>> Nit: OopMapCacheEntry* volatile* _array; >>>> >>>> Space after volatile please as it refers to the OMCE* not the *_array. >>> >>> Like this? >>> >>> OopMapCacheEntry* volatile * _array; >> >> Yes. >> >>> Per coding standard, we don't put the '*' without * in front of the >>> name of the variable, only the type, like >>> OopMapCacheEntry* blah; >>> vs >>> OopMapCacheEntry *blah; >> >> Okay I see where this comes from. Personally I find it odd to have >> the "volatile*". Anyway minor issue. >> >>>> >>>>> Tested with RBT equivalent of nightly on linux x64. Also ran >>>>> dacapo with -Xint -Xlog:interpreter+oopmap=debug to verify. This >>>>> change also >>>> >>>> Do you have some performance improvement numbers for this >>>> optimization? Or is this a pre-emptive strike against a potential >>>> locking bottleneck?
>>> >>> Erik Osterlund ran benchmarks but I don't have access to them and he >>> is on vacation. Hearsay is that the bottleneck observed for this >>> lock is gone. >> >> Okay we can get the gory details added to the bug later. > > Yes, RedHat will see the improvements also. > > Thanks, > Coleen > >> >> Thanks, >> David >> >>>> >>>>> removes -XX:+TraceOopMapGeneration (not >>>>> -XX:+TraceNewOopMapGeneration however) in favor of new logging. A >>>>> linked CSR request is pending. >>>> >>>> Ok. >>> >>> Thanks for commenting on the CSR and the review. >>> >>> Coleen >>>> >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Coleen >>>>> >>> > From eric.caspole at oracle.com Mon Aug 14 14:20:58 2017 From: eric.caspole at oracle.com (Eric Caspole) Date: Mon, 14 Aug 2017 10:20:58 -0400 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: Good catch. Eric On 08/12/2017 05:15 AM, Milan Mimica wrote: > Hi > > I believe each VM option should have a unique description: > > diagnostic(bool, PrintTouchedMethodsAtExit, false, \ > "Print all methods that have been ever touched in runtime") \ > > + diagnostic(bool, PrintTouchedMethodCount, false, > + "Print all methods that have been ever > touched in runtime") + > > > On Fri, 11 Aug 2017 at 21:11, Eric Caspole wrote: > >> Hi, >> Please review this very small change to add more info to >> +LogTouchedMethods including the touched count and whether the method >> has loops. This extra info has been handy while doing experiments to >> create the most efficient AOT library for quick startup. >> >> http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ >> >> https://bugs.openjdk.java.net/browse/JDK-8186154 >> >> Passed JPRT hotspot tests.
>> >> Thanks, >> Eric >> From vladimir.kozlov at oracle.com Mon Aug 14 15:07:43 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Aug 2017 08:07:43 -0700 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: <570107e5-92d6-4838-a878-edf0640d6781@oracle.com> References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> <570107e5-92d6-4838-a878-edf0640d6781@oracle.com> Message-ID: On 8/14/17 5:07 AM, coleen.phillimore at oracle.com wrote: > > Hi, > > I don't think you should add another option to print the count. You > should do it unconditionally. Since it's a diagnostic option, we don't > have to support parsers of the output. Not true. Please, don't complicate PrintTouchedMethodsAtExit output because AOT has to parse it ;) as Eric explained. If you want different output, use the -Xlog format. Vladimir > I was going to type that you should use UL for this output instead, but > that's a bigger task. It would be nice if the compiler options used UL > and didn't have these Print options anymore. In any case, please don't > add another Print* option. > > thanks, > Coleen > > On 8/11/17 3:10 PM, Eric Caspole wrote: >> Hi, >> Please review this very small change to add more info to >> +LogTouchedMethods including the touched count and whether the method >> has loops. This extra info has been handy while doing experiments to >> create the most efficient AOT library for quick startup. >> >> http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ >> >> https://bugs.openjdk.java.net/browse/JDK-8186154 >> >> Passed JPRT hotspot tests.
>> >> Thanks, >> Eric > From coleen.phillimore at oracle.com Mon Aug 14 15:13:46 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 11:13:46 -0400 Subject: RFR: 8180768: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: References: <1dad29d0-f962-52d6-7e24-6a1f723e5aa9@redhat.com> Message-ID: <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> This is already checked in but the bug is still open. Was this a raw push? Coleen On 7/25/17 10:55 AM, coleen.phillimore at oracle.com wrote: > > This looks good. I can sponsor this for you. > Coleen > > On 7/25/17 6:10 AM, Roman Kennke wrote: >> Hi all, >> >> please review this trivial change to deprecate MonitorInUseLists (as >> discussed earlier here on this list). >> >> http://cr.openjdk.java.net/~rkennke/8180768/webrev.00/ >> >> >> Bug entry: >> >> https://bugs.openjdk.java.net/browse/JDK-8180768 >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8180929 >> >> Thanks, >> Roman >> > From rkennke at redhat.com Mon Aug 14 15:17:42 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 14 Aug 2017 17:17:42 +0200 Subject: RFR: 8180768: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> References: <1dad29d0-f962-52d6-7e24-6a1f723e5aa9@redhat.com> <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> Message-ID: <3a78f381-2b3d-bb56-1ceb-83cef67b48fc@redhat.com> I think I made a mistake. I put the CSR ID in the commit msg instead of the bug. Roman On 14.08.2017 at 17:13, coleen.phillimore at oracle.com wrote: > This is already checked in but the bug is still open. Was this a raw > push? > > Coleen > > On 7/25/17 10:55 AM, coleen.phillimore at oracle.com wrote: >> >> This looks good. I can sponsor this for you. >> Coleen >> >> On 7/25/17 6:10 AM, Roman Kennke wrote: >>> Hi all, >>> >>> please review this trivial change to deprecate MonitorInUseLists (as >>> discussed earlier here on this list).
>>> >>> http://cr.openjdk.java.net/~rkennke/8180768/webrev.00/ >>> >>> >>> Bug entry: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8180768 >>> >>> CSR: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8180929 >>> >>> Thanks, >>> Roman >>> >> > From thomas.stuefe at gmail.com Mon Aug 14 15:23:09 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 14 Aug 2017 17:23:09 +0200 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH Message-ID: Dear all, please review this tiny fix: Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. Thanks, Thomas From coleen.phillimore at oracle.com Mon Aug 14 15:30:08 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 11:30:08 -0400 Subject: RFR: 8180768: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: <3a78f381-2b3d-bb56-1ceb-83cef67b48fc@redhat.com> References: <1dad29d0-f962-52d6-7e24-6a1f723e5aa9@redhat.com> <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> <3a78f381-2b3d-bb56-1ceb-83cef67b48fc@redhat.com> Message-ID: <5aae9323-ba61-701e-fa69-c95d2d80b62f@oracle.com> On 8/14/17 11:17 AM, Roman Kennke wrote: > I think I made a mistake. I put the CSR ID in the commit msg instead of > the bug. Ah, ok. I marked it as closed. I guess JPRT doesn't put the changeset in the CSR issue. Coleen > > Roman > > On 14.08.2017 at 17:13, coleen.phillimore at oracle.com wrote: >> This is already checked in but the bug is still open. Was this a raw >> push? >> >> Coleen >> >> On 7/25/17 10:55 AM, coleen.phillimore at oracle.com wrote: >>> This looks good. I can sponsor this for you.
>>> Coleen >>> >>> On 7/25/17 6:10 AM, Roman Kennke wrote: >>>> Hi all, >>>> >>>> please review this trivial change to deprecate MonitorInUseLists (as >>>> discussed earlier here on this list). >>>> >>>> http://cr.openjdk.java.net/~rkennke/8180768/webrev.00/ >>>> >>>> >>>> Bug entry: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8180768 >>>> >>>> CSR: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8180929 >>>> >>>> Thanks, >>>> Roman >>>> From george.triantafillou at oracle.com Mon Aug 14 15:31:13 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 14 Aug 2017 11:31:13 -0400 Subject: RFR 8149790: NegativeArraySizeException with hprof Message-ID: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> Please review this change to fix NegativeArraySizeException test failures with hprof: JBS: https://bugs.openjdk.java.net/browse/JDK-8149790 webrev: http://cr.openjdk.java.net/~gtriantafill/8149790-webrev/webrev/index.html The original patch was contributed by Andreas Eriksson. Tested locally on Linux-x64 and with RBT tiers 2 through 5. Thanks. -George From coleen.phillimore at oracle.com Mon Aug 14 15:32:26 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 11:32:26 -0400 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> <570107e5-92d6-4838-a878-edf0640d6781@oracle.com> Message-ID: On 8/14/17 11:07 AM, Vladimir Kozlov wrote: > On 8/14/17 5:07 AM, coleen.phillimore at oracle.com wrote: >> >> Hi, >> >> I don't think you should add another option to print the count. You >> should do it unconditionally. Since it's a diagnostic option, we >> don't have to support parsers of the output. > > Not true. Please, don't complicate PrintTouchedMethodsAtExit output > because AOT have to parse it ;) as Eric explained. I didn't see that. Can we also fix AOT to parse and/or ignore the count? 
I thought the purpose was for AOT to get the count to provide a better list to precompile. thanks, Coleen > > If you want an other output use -Xlog format. > > Vladimir > >> I was going to type that you should use UL for this output instead, >> but that's a bigger task. It would be nice if the compiler options >> used UL and didn't have these Print options anymore. In any case, >> please don't add another Print* option. >> >> thanks, >> Coleen >> >> On 8/11/17 3:10 PM, Eric Caspole wrote: >>> Hi, >>> Please review this very small change to add more info to >>> +LogTouchedMethods including the touched count and whether the >>> method has loops. This extra info has been handy while doing >>> experiments to create the most efficient AOT library for quick startup. >>> >>> http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ >>> >>> https://bugs.openjdk.java.net/browse/JDK-8186154 >>> >>> Passed JPRT hotspot tests. >>> >>> Thanks, >>> Eric >> From daniel.daugherty at oracle.com Mon Aug 14 16:01:36 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 14 Aug 2017 10:01:36 -0600 Subject: RFR: 8180768: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: <5aae9323-ba61-701e-fa69-c95d2d80b62f@oracle.com> References: <1dad29d0-f962-52d6-7e24-6a1f723e5aa9@redhat.com> <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> <3a78f381-2b3d-bb56-1ceb-83cef67b48fc@redhat.com> <5aae9323-ba61-701e-fa69-c95d2d80b62f@oracle.com> Message-ID: <0305d4e6-47a8-f41a-6ac2-26a16fb241f0@oracle.com> I've updated the bug to show it is fixed in "team". I've also fabricated an entry that looks like what "HG Updates" would add for those bug parser scripts... I'll try to keep an eye on this one when it hits JDK10/jdk10. It will need another manual update... Dan On 8/14/17 9:30 AM, coleen.phillimore at oracle.com wrote: > > > On 8/14/17 11:17 AM, Roman Kennke wrote: >> I think I made a mistake. I put the CSR ID in the commit msg instead of >> the bug. > > Ah, ok. 
I marked it as closed. I guess JPRT doesn't put the > changeset in the CSR issue. > > Coleen > >> >> Roman >> >> On 14.08.2017 at 17:13, coleen.phillimore at oracle.com wrote: >>> This is already checked in but the bug is still open. Was this a raw >>> push? >>> >>> Coleen >>> >>> On 7/25/17 10:55 AM, coleen.phillimore at oracle.com wrote: >>>> This looks good. I can sponsor this for you. >>>> Coleen >>>> >>>> On 7/25/17 6:10 AM, Roman Kennke wrote: >>>>> Hi all, >>>>> >>>>> please review this trivial change to deprecate MonitorInUseLists (as >>>>> discussed earlier here on this list). >>>>> >>>>> http://cr.openjdk.java.net/~rkennke/8180768/webrev.00/ >>>>> >>>>> >>>>> Bug entry: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8180768 >>>>> >>>>> CSR: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8180929 >>>>> >>>>> Thanks, >>>>> Roman >>>>> > > From eric.caspole at oracle.com Mon Aug 14 16:03:07 2017 From: eric.caspole at oracle.com (Eric Caspole) Date: Mon, 14 Aug 2017 12:03:07 -0400 Subject: RFR(XS): JDK-8186154 : Extra info with +LogTouchedMethods In-Reply-To: References: <7ad28462-176d-81c8-64dc-b9740a7a21f7@oracle.com> Message-ID: <480cfa97-840c-6a71-6707-73c096a00d12@oracle.com> Hi Ioi, Yes this is not a cheap invocation counter. I think the count is useful because it is related to how often the methods are considered for inlining etc, so the higher counts are methods that affect the hot paths in a way I admit I don't completely understand yet. That's why I want this stat :) Eric On 08/12/2017 09:42 PM, Ioi Lam wrote: > Hi Eric, > > I am not sure if _touch_count can accurately represent 'how often is > this method used'.
In the interpreter, Method::log_touched is called > only once, when building the Method::method_counters() > > #define GET_METHOD_COUNTERS(res) \ > res = METHOD->method_counters(); \ > if (res == NULL) { \ > CALL_VM(res = InterpreterRuntime::build_method_counters(THREAD, > METHOD), handle_exception); \ > } > > In addition, whenever the compiler references a method (when compiling a > method, or when inlining a method into another, etc), it calls > Method::log_touched > > ciMethod::ciMethod(const methodHandle& h_m, ciInstanceKlass* holder) : > ciMetadata(h_m()), > _holder(holder) > { > assert(h_m() != NULL, "no null method"); > > if (LogTouchedMethods) { > h_m()->log_touched(Thread::current()); > } > > So, a high count of _touch_count is more correlated to how often this > method is seen by the compiler. > > Thanks > > - Ioi > > > > On 8/11/17 12:10 PM, Eric Caspole wrote: >> Hi, >> Please review this very small change to add more info to >> +LogTouchedMethods including the touched count and whether the method >> has loops. This extra info has been handy while doing experiments to >> create the most efficient AOT library for quick startup. >> >> http://cr.openjdk.java.net/~ecaspole/JDK-8186154/webrev/ >> >> https://bugs.openjdk.java.net/browse/JDK-8186154 >> >> Passed JPRT hotspot tests. >> >> Thanks, >> Eric > From coleen.phillimore at oracle.com Mon Aug 14 16:20:06 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Aug 2017 12:20:06 -0400 Subject: RFR: 8180768: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: <0305d4e6-47a8-f41a-6ac2-26a16fb241f0@oracle.com> References: <1dad29d0-f962-52d6-7e24-6a1f723e5aa9@redhat.com> <24be1349-b835-bae1-684a-bc44816cde72@oracle.com> <3a78f381-2b3d-bb56-1ceb-83cef67b48fc@redhat.com> <5aae9323-ba61-701e-fa69-c95d2d80b62f@oracle.com> <0305d4e6-47a8-f41a-6ac2-26a16fb241f0@oracle.com> Message-ID: <31ea40e9-4407-c493-9b4a-14842e95c45b@oracle.com> On 8/14/17 12:01 PM, Daniel D. 
Daugherty wrote: > I've updated the bug to show it is fixed in "team". I've also > fabricated an entry that looks like what "HG Updates" would > add for those bug parser scripts... > > I'll try to keep an eye on this one when it hits JDK10/jdk10. > It will need another manual update... Thanks, Dan! Coleen > > Dan > > > On 8/14/17 9:30 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/14/17 11:17 AM, Roman Kennke wrote: >>> I think I made a mistake. I put the CSR ID in the commit msg instead of >>> the bug. >> >> Ah, ok. I marked it as closed. I guess JPRT doesn't put the >> changeset in the CSR issue. >> >> Coleen >> >>> >>> Roman >>> >>> On 14.08.2017 at 17:13, coleen.phillimore at oracle.com wrote: >>>> This is already checked in but the bug is still open. Was this a raw >>>> push? >>>> >>>> Coleen >>>> >>>> On 7/25/17 10:55 AM, coleen.phillimore at oracle.com wrote: >>>>> This looks good. I can sponsor this for you. >>>>> Coleen >>>>> >>>>> On 7/25/17 6:10 AM, Roman Kennke wrote: >>>>>> Hi all, >>>>>> >>>>>> please review this trivial change to deprecate MonitorInUseLists (as >>>>>> discussed earlier here on this list). >>>>>> >>>>>> http://cr.openjdk.java.net/~rkennke/8180768/webrev.00/ >>>>>> >>>>>> >>>>>> Bug entry: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8180768 >>>>>> >>>>>> CSR: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8180929 >>>>>> >>>>>> Thanks, >>>>>> Roman >>>>>> >> >> > From lois.foltan at oracle.com Mon Aug 14 20:08:09 2017 From: lois.foltan at oracle.com (Lois Foltan) Date: Mon, 14 Aug 2017 16:08:09 -0400 Subject: RFR 8149790: NegativeArraySizeException with hprof In-Reply-To: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> References: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> Message-ID: <50da66cd-b2c9-9149-60ce-44ea200077a3@oracle.com> George, I think this looks good. Copyright needs updating in FileReadBuffer.java, MappedReadBuffer.java & ReadBuffer.java.
Thanks, Lois On 8/14/2017 11:31 AM, George Triantafillou wrote: > Please review this change to fix NegativeArraySizeException test > failures with hprof: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8149790 > webrev: > http://cr.openjdk.java.net/~gtriantafill/8149790-webrev/webrev/index.html > > > > The original patch was contributed by Andreas Eriksson. Tested > locally on Linux-x64 and with RBT tiers 2 through 5. > > Thanks. > > -George > From george.triantafillou at oracle.com Mon Aug 14 20:21:47 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 14 Aug 2017 16:21:47 -0400 Subject: RFR 8149790: NegativeArraySizeException with hprof In-Reply-To: <50da66cd-b2c9-9149-60ce-44ea200077a3@oracle.com> References: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> <50da66cd-b2c9-9149-60ce-44ea200077a3@oracle.com> Message-ID: <72e0c578-c656-c893-9d3c-429ff7d98b2e@oracle.com> Hi Lois, Thanks for the review. I'll update the copyrights. -George On 8/14/2017 4:08 PM, Lois Foltan wrote: > George, > > I think this looks good. Copyright needs updating in > FileReadBuffer.java, MappedReadBuffer.java & ReadBuffer.java. > > Thanks, > Lois > > On 8/14/2017 11:31 AM, George Triantafillou wrote: >> Please review this change to fix NegativeArraySizeException test >> failures with hprof: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8149790 >> webrev: >> http://cr.openjdk.java.net/~gtriantafill/8149790-webrev/webrev/index.html >> >> >> >> The original patch was contributed by Andreas Eriksson. Tested >> locally on Linux-x64 and with RBT tiers 2 through 5. >> >> Thanks. 
>> >> -George >> > From david.holmes at oracle.com Tue Aug 15 00:57:57 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Aug 2017 10:57:57 +1000 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: <830b3b99-6c81-38f6-2b5a-eae243f81e8d@oracle.com> Hi Thomas, On 15/08/2017 1:23 AM, Thomas Stüfe wrote: > Dear all, > > please review this tiny fix: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal > handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. Seems fine - Reviewed. Thanks, David > Thanks, Thomas > From thomas.stuefe at gmail.com Tue Aug 15 03:59:32 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 15 Aug 2017 03:59:32 +0000 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: <830b3b99-6c81-38f6-2b5a-eae243f81e8d@oracle.com> References: <830b3b99-6c81-38f6-2b5a-eae243f81e8d@oracle.com> Message-ID: Thanks, David! On Tue 15. Aug 2017 at 02:58, David Holmes wrote: > Hi Thomas, > > On 15/08/2017 1:23 AM, Thomas Stüfe wrote: > > Dear all, > > > > please review this tiny fix: > > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > > webrev: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal > > handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. > > Seems fine - Reviewed.
> > Thanks, > David > > > Thanks, Thomas > > > From thomas.stuefe at gmail.com Tue Aug 15 07:17:04 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 15 Aug 2017 09:17:04 +0200 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: Modified the change a bit to avoid theoretical problems should we ever want C++ exceptions. This also is more consistent with the way we use __try/__except for CreateJavaVM. New Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.01/webrev/ Thanks, Thomas On Mon, Aug 14, 2017 at 5:23 PM, Thomas Stüfe wrote: > Dear all, > > please review this tiny fix: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal > handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. > > Thanks, Thomas > From david.holmes at oracle.com Tue Aug 15 11:26:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Aug 2017 21:26:24 +1000 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: <661134fa-f577-b344-c993-bf7e8ca0027e@oracle.com> On 15/08/2017 5:17 PM, Thomas Stüfe wrote: > Modified the change a bit to avoid theoretical problems should we ever > want C++ exceptions. This also is more consistent with the way we use > __try/__except for CreateJavaVM. Okay ... not sure what difference it makes though.
David > New Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.01/webrev/ > > Thanks, Thomas > > > > On Mon, Aug 14, 2017 at 5:23 PM, Thomas Stüfe > wrote: > > Dear all, > > please review this tiny fix: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > > webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without > signal handler (well, SE handler) during the invocation of > JNI_DestroyJavaVM. > > Thanks, Thomas > > From harold.seigel at oracle.com Tue Aug 15 13:43:02 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 15 Aug 2017 09:43:02 -0400 Subject: RFR 8186089: Move Arena to its own header file Message-ID: Hi, Please review this JDK-10 change to move class Arena into its own header file for reasons described in the bug. Note that since the SA doesn't use class Chunk, it need no longer be exported to it. Also, class Chunk is in the new arena.hpp header file (as opposed to just the .cpp file) because class HandleMark uses it in handles.hpp. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 The change was tested with the JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util, and other tests, the co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. Thanks, Harold From coleen.phillimore at oracle.com Tue Aug 15 14:25:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 15 Aug 2017 10:25:56 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: References: Message-ID: Harold, This looks great! Thank you for doing this. Coleen On 8/15/17 9:43 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to move class Arena into its own > header file for reasons described in the bug.
> > Note that since the SA doesn't use class Chunk, it need no longer be > exported to it. Also, class Chunk is in the new arena.hpp header file > (as opposed to just the .cpp file) because class HandleMark uses it in > handles.hpp. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util, and other tests, the > co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. > > Thanks, Harold > From harold.seigel at oracle.com Tue Aug 15 14:58:53 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 15 Aug 2017 10:58:53 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: References: Message-ID: <909c4217-7db1-b777-1a1c-ca9ed162117f@oracle.com> Thanks Coleen! Harold On 8/15/2017 10:25 AM, coleen.phillimore at oracle.com wrote: > > Harold, > > This looks great! Thank you for doing this. > > Coleen > > On 8/15/17 9:43 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to move class Arena into its own >> header file for reasons described in the bug. >> >> Note that since the SA doesn't use class Chunk, it need no longer be >> exported to it. Also, class Chunk is in the new arena.hpp header >> file (as opposed to just the .cpp file) because class HandleMark uses >> it in handles.hpp. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests, the >> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. 
>> >> Thanks, Harold >> > From george.triantafillou at oracle.com Tue Aug 15 15:47:33 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 15 Aug 2017 11:47:33 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: References: Message-ID: Hi Harold, This looks good. -George On 8/15/2017 9:43 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to move class Arena into its own > header file for reasons described in the bug. > > Note that since the SA doesn't use class Chunk, it need no longer be > exported to it. Also, class Chunk is in the new arena.hpp header file > (as opposed to just the .cpp file) because class HandleMark uses it in > handles.hpp. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util, and other tests, the > co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. > > Thanks, Harold > From harold.seigel at oracle.com Tue Aug 15 15:53:38 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 15 Aug 2017 11:53:38 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: References: Message-ID: <9cfb9c62-1f36-78fc-e956-672a74953822@oracle.com> Thanks George! Harold On 8/15/2017 11:47 AM, George Triantafillou wrote: > Hi Harold, > > This looks good. > > -George > > On 8/15/2017 9:43 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to move class Arena into its own >> header file for reasons described in the bug. >> >> Note that since the SA doesn't use class Chunk, it need no longer be >> exported to it. Also, class Chunk is in the new arena.hpp header >> file (as opposed to just the .cpp file) because class HandleMark uses >> it in handles.hpp. 
>> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests, the >> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. >> >> Thanks, Harold >> > From coleen.phillimore at oracle.com Tue Aug 15 17:00:54 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 15 Aug 2017 13:00:54 -0400 Subject: RFR (M) 8186042: Optimize OopMapCache lookup In-Reply-To: References: <7c36b086-5959-753b-1626-415817d53525@oracle.com> Message-ID: On 8/15/17 12:55 PM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > It looks good in general. Thank you Serguei for the code review. > One comment though. > > http://cr.openjdk.java.net/~coleenp/8186042.01/webrev/src/share/vm/interpreter/oopMapCache.cpp.frames.html > > 516 // Entry is not in hashtable. > 517 // Compute entry > 518 > 519 OopMapCacheEntry* tmp = NEW_C_HEAP_OBJ(OopMapCacheEntry, mtClass); > 520 tmp->initialize(); > 521 tmp->fill(method, bci); > 522 entry_for->resource_copy(tmp); > 523 > 524 if (method->should_not_be_cached()) { > 525 // It is either not safe or not a good idea to cache this Method* > 526 // at this time. We give the caller of lookup() a copy of the > 527 // interesting info via parameter entry_for, but we don't add it to > 528 // the cache. See the gory details in Method*.cpp. > 529 FREE_C_HEAP_OBJ(tmp); > 530 return; > 531 } > The OopMapCacheEntry we allocate at 519 is used to fill in the entry_for passed in at line 522, so it can't be moved. The fill() method does the abstract interpretation, filling in tmp and that's used to fill in entry_for. entry_for is an InterpreterOopMap, which is derived from OopMapCacheEntry. It's sort of odd code, ie not straightforward. Thanks, Coleen > Would it better to move the fragment 524-531 above the line 516? 
> Then the line (529 FREE_C_HEAP_OBJ(tmp);) could be removed. > > > Thanks, > Serguei > > > On 8/11/17 08:46, coleen.phillimore at oracle.com wrote: >> Summary: Use lock free access to oopMapCache >> Contributed-by: frederic.parain at oracle.com, coleen.phillimore at oracle.com >> >> The OopMapCache::lookup() function took out a mutex to protect access >> between the GC threads that are running concurrently. See bug for >> more info. The function lookup() is run by multiple GC threads >> concurrently. If there's a collision in the hashtable, this uses >> atomic cmpxchg to add the entry to a list to be cleaned up after the >> safepoint is over. GC isn't doing lookup at that point. >> >> This change is contributed by Frederic Parain, with some cleanup and >> logging from me. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8186042.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8186042 >> >> Tested with RBT equivalent of nightly on linux x64. Also ran dacapo >> with -Xint -Xlog:interpreter+oopmap=debug to verify. This change also >> removes -XX:+TraceOopMapGeneration (not -XX:+TraceNewOopMapGeneration >> however) in favor of new logging. A linked CSR request is pending. >> >> Thanks, >> Coleen >> > From christian.tornqvist at oracle.com Tue Aug 15 20:28:47 2017 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Tue, 15 Aug 2017 16:28:47 -0400 Subject: RFR 8149790: NegativeArraySizeException with hprof In-Reply-To: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> References: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> Message-ID: Hi George, This looks good. Thanks, Christian > On Aug 14, 2017, at 11:31 AM, George Triantafillou wrote: > > Please review this change to fix NegativeArraySizeException test failures with hprof: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8149790 > webrev: http://cr.openjdk.java.net/~gtriantafill/8149790-webrev/webrev/index.html > The original patch was contributed by Andreas Eriksson. 
Tested locally on Linux-x64 and with RBT tiers 2 through 5. > Thanks. > > -George From george.triantafillou at oracle.com Tue Aug 15 20:30:14 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 15 Aug 2017 16:30:14 -0400 Subject: RFR 8149790: NegativeArraySizeException with hprof In-Reply-To: References: <416dc040-f076-a89a-a2c5-35d477f2f32d@oracle.com> Message-ID: <8aa661c7-515f-2244-291e-f5965c91cc93@oracle.com> Thanks Christian. -George On 8/15/2017 4:28 PM, Christian Tornqvist wrote: > Hi George, > > This looks good. > > Thanks, > Christian > >> On Aug 14, 2017, at 11:31 AM, George Triantafillou >> > > wrote: >> >> Please review this change to fix NegativeArraySizeException test >> failures with hprof: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8149790 >> webrev: >> http://cr.openjdk.java.net/~gtriantafill/8149790-webrev/webrev/index.html >> >> >> The original patch was contributed by Andreas Eriksson. Tested >> locally on Linux-x64 and with RBT tiers 2 through 5. >> >> Thanks. >> >> -George >> > From harold.seigel at oracle.com Tue Aug 15 20:42:07 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 15 Aug 2017 16:42:07 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: References: Message-ID: <5ab0e51a-e50b-77ea-43fa-1fe7222f7320@oracle.com> Hi George, Is there a closed webrev for this change that changes tonga/testlist/vm.heapdump.testlist to unquarantine test heapdump/JMapHeap ? Also, have you run all the tests in vm.heapdump.testlist with these changes? Thanks, Harold On 8/15/2017 9:43 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to move class Arena into its own > header file for reasons described in the bug. > > Note that since the SA doesn't use class Chunk, it need no longer be > exported to it. Also, class Chunk is in the new arena.hpp header file > (as opposed to just the .cpp file) because class HandleMark uses it in > handles.hpp. 
> > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 > > The change was tested with the JCK Lang and VM tests, the JTreg > hotspot, java/io, java/lang, java/util, and other tests, the > co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. > > Thanks, Harold > From harold.seigel at oracle.com Tue Aug 15 20:45:42 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 15 Aug 2017 16:45:42 -0400 Subject: RFR 8186089: Move Arena to its own header file In-Reply-To: <5ab0e51a-e50b-77ea-43fa-1fe7222f7320@oracle.com> References: <5ab0e51a-e50b-77ea-43fa-1fe7222f7320@oracle.com> Message-ID: Please ignore this email. Thanks, Harold On 8/15/2017 4:42 PM, harold seigel wrote: > Hi George, > > Is there a closed webrev for this change that changes > tonga/testlist/vm.heapdump.testlist to unquarantine test > heapdump/JMapHeap ? > > Also, have you run all the tests in vm.heapdump.testlist with these > changes? > > Thanks, Harold > > > On 8/15/2017 9:43 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to move class Arena into its own >> header file for reasons described in the bug. >> >> Note that since the SA doesn't use class Chunk, it need no longer be >> exported to it. Also, class Chunk is in the new arena.hpp header >> file (as opposed to just the .cpp file) because class HandleMark uses >> it in handles.hpp. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8186089/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8186089 >> >> The change was tested with the JCK Lang and VM tests, the JTreg >> hotspot, java/io, java/lang, java/util, and other tests, the >> co-located NSK tests, JPRT, and with RBT tier2 - tier5 tests. 
>> >> Thanks, Harold >> > From jiangli.zhou at oracle.com Tue Aug 15 21:59:43 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 15 Aug 2017 14:59:43 -0700 Subject: RFR (S): 8186238: The constant pool entry to empty string ("") should not be pre-resolved during CDS dump time Message-ID: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> Hi, Please review the following fix for JDK-8186238. Empty string "" is excluded from the shared string table at CDS dump time. During pre-resolving the constant pool string entries at dump time, the entries to the empty string should be skipped, otherwise different instances might be returned when calling intern() on empty string at runtime. webrev: http://cr.openjdk.java.net/~jiangli/8186238/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8186238 Tested with the failed tests shown in nightly. Thanks, Jiangli From jiangli.zhou at oracle.com Tue Aug 15 22:02:53 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 15 Aug 2017 15:02:53 -0700 Subject: RFR (S): 8186238: The constant pool entry to empty string ("") should not be pre-resolved during CDS dump time In-Reply-To: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> References: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> Message-ID: <086CF710-40F6-404A-BDA2-A3702108324D@oracle.com> I also want to thank Ioi for helping narrow down and diagnose the issue quickly! Jiangli > On Aug 15, 2017, at 2:59 PM, Jiangli Zhou wrote: > > Hi, > > Please review the following fix for JDK-8186238. Empty string "" is excluded from the shared string table at CDS dump time. During pre-resolving the constant pool string entries at dump time, the entries to the empty string should be skipped, otherwise different instances might be returned when calling intern() on empty string at runtime. > > webrev: http://cr.openjdk.java.net/~jiangli/8186238/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8186238 > > Tested with the failed tests shown in nightly.
> > Thanks, > Jiangli > From daniel.daugherty at oracle.com Tue Aug 15 22:07:33 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 15 Aug 2017 16:07:33 -0600 Subject: RFR (S): 8186238: The constant pool entry to empty string ("") should not be pre-resolved during CDS dump time In-Reply-To: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> References: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> Message-ID: <6a729f11-10bb-919b-7013-dd6b2a38468c@oracle.com> On 8/15/17 3:59 PM, Jiangli Zhou wrote: > Hi, > > Please review the following fix for JDK-8186238. Empty string "" is excluded from the shared string table at CDS dump time. During pre-resolving the constant pool string entries at dump time, the entries to the empty string should be skipped, otherwise different instances might be returned when calling intern() on empty string at runtime. > > webrev: http://cr.openjdk.java.net/~jiangli/8186238/webrev.00/ src/share/vm/oops/constantPool.cpp No comments. Thumbs up! Ioi said to list him as a reviewer since he saw the fix in another e-mail thread. I think the HotSpot trivial fix rule applies and you do not have to wait for 24 hours. Dan > bug: https://bugs.openjdk.java.net/browse/JDK-8186238 > > Tested with the failed tests shown in nightly. > > Thanks, > Jiangli > From jiangli.zhou at oracle.com Tue Aug 15 22:10:52 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 15 Aug 2017 15:10:52 -0700 Subject: RFR (S): 8186238: The constant pool entry to empty string ("") should not be pre-resolved during CDS dump time In-Reply-To: <6a729f11-10bb-919b-7013-dd6b2a38468c@oracle.com> References: <0C3CD156-9C20-4108-96CC-6922AA6B5189@oracle.com> <6a729f11-10bb-919b-7013-dd6b2a38468c@oracle.com> Message-ID: Thank you, Dan! Jiangli > On Aug 15, 2017, at 3:07 PM, Daniel D. Daugherty wrote: > > On 8/15/17 3:59 PM, Jiangli Zhou wrote: >> Hi, >> >> Please review the following fix for JDK-8186238. Empty string ""
is excluded from the shared string table at CDS dump time. During pre-resolving the constant pool string entries at dump time, the entries to the empty string should be skipped, otherwise different instances might be returned when calling intern() on empty string at runtime. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8186238/webrev.00/ > > src/share/vm/oops/constantPool.cpp > No comments. > > Thumbs up! > > Ioi said to list him as a reviewer since he saw the fix in > another e-mail thread. > > I think the HotSpot trivial fix rule applies and you do not > have to wait for 24 hours. > > Dan > > >> bug: https://bugs.openjdk.java.net/browse/JDK-8186238 >> >> Tested with the failed tests shown in nightly. >> >> Thanks, >> Jiangli >> > From goetz.lindenmaier at sap.com Wed Aug 16 10:22:50 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 16 Aug 2017 10:22:50 +0000 Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing. Message-ID: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> Hi, dll_build_name builds the proper path to a library given a list of paths separated by path_separator and a library name. It adds in the platform specific endings etc. It is documented to return whether the file exists, but only does so if a path_separator exists in the path. Especially if the path is empty, it just returns 'true'. Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches in well known unix/windows locations. This is intended in the two cases where dll_build_name is called with an empty path. I added a second variant of dll_build_name without the path argument that adds the path from system property java.library.path and use that in these two cases. I changed the original function to actually check file availability in all cases, and to check the current directory (.) if the path is empty. Please review this change. I also need a sponsor, please.
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/ Best regards, Goetz. From thomas.stuefe at gmail.com Wed Aug 16 11:00:55 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Aug 2017 13:00:55 +0200 Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> Message-ID: Hi Goetz, On Wed, Aug 16, 2017 at 12:22 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi, > > dll_build_name builds the proper path to a library given a list of paths > separated by > path_separator and a library name. It adds in the platform specific > endings etc. > It is documented to return whether the file exists, but only does so if a > path_separator > exists in the path. > > good catch! > Especially if the path is empty, it just returns 'true'. > Dll_build_name is usually used before calling dll_load. If dll_load does > not get a full path it searches > in well known unix/windows locations. This is intended in the two cases > where dll_build_name > is called with an empty path. > So, for both cases (thread.cpp, jvmtiExport.cpp), before, we would call os::dll_build_name() with an empty string for the path which, for relative paths, would result in feeding that path unexpanded to dlopen(), which would use whatever the OS does in those cases (LIBPATH, LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily include searching the current directory. With your change, we now use java.library.path, which is not necessarily the same? (BTW, I think the old comments in thread.cpp and jvmtiExport.cpp were wrong: "// Try the local directory" - if "local" means "current", this is not what happened). > I added a second variant of dll_build_name without the path argument that > adds the path > from system property java.library.path and use that in these two cases.
> I changed the original function to actually check file availability in all > cases, > and to check . if the path is empty. > > I think that may be a bit confusing. We would then have three options: - call os::dll_build_name with a real ";;.." PATH and get a file name resolved from that path - call os::dll_build_name with "" for the PATH and get OS dll resolution - call your new overloaded version of os::dll_build_name(), which uses -Djava.library.path. > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/ > > Best regards, > Goetz. > > Kind Regards, Thomas From thomas.stuefe at gmail.com Wed Aug 16 14:18:10 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Aug 2017 16:18:10 +0200 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: Ping.. could I please have a second review and a sponsor? Thank you! Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.01/webrev/ ..Thomas On Mon, Aug 14, 2017 at 5:23 PM, Thomas Stüfe wrote: > Dear all, > > please review this tiny fix: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal > handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. > > Thanks, Thomas > From goetz.lindenmaier at sap.com Wed Aug 16 14:24:09 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 16 Aug 2017 14:24:09 +0000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes Message-ID: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> Hi, TestOptionWithRanges causes the vm on aix to crash on some machines. This is because huge stack sizes are not treated properly.
On linux, pthread_attr_setstacksize succeeds if called with huge values, but pthread_create() then fails. On Aix, pthread_attr_setstacksize fails if passed a value exceeding the limits and leaves the minimal system thread stack size in attr. Thus thread creation succeeds and leads to crashes after thread creation when the guard pages shall be protected but don't fit on the tiny stack created. Please review this small, aix-only change. http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.01/ Best regards, Goetz. From thomas.stuefe at gmail.com Wed Aug 16 14:43:20 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Aug 2017 16:43:20 +0200 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> Message-ID: Hi Goetz, thanks for this fix! Some small nits: "+ // On linux, pthread_attr_setstacksize succeeds with huge values, but + // pthread_create() fails. No need to mention Linux here. + On Aix, this fails and leaves the minimal system + // thread size in attr." We do not know this :) We only know it leaves the pthread_attr structure untouched. It was filled by pthread_attr_init() with default values. Stack default size may be minimally possible size, but I rather doubt this. How about instead: "On Aix, this fails and leaves the pthread_attr structure untouched" - pthread_attr_setguardsize(&attr, os::Aix::default_guard_size(thr_type)); + ret = pthread_attr_setguardsize(&attr, os::Aix::default_guard_size(thr_type)); I would not bother checking the setting of the guard page. The only reason it makes even sense to set the guard page size is for java threads, where we set it to zero because we do not want to spend memory for a system guard page if we have our own guard pages. See os::Aix::default_guard_size. For all other threads, this is supposed to be one page. 
But as a stack page may be 4 or 64k, this is flaky and may fail. In fact, thinking about this, I think a better way to do this would be to call pthread_attr_setguardsize() only to disable system guard pages, and to otherwise live with the system guard page default. -- The rest is fine. I leave it up to you if you take my proposals. Do not need another webrev. ..Thomas On Wed, Aug 16, 2017 at 4:24 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi, > > TestOptionWithRanges causes the vm on aix to crash on some machines. > This is because huge stack sizes are not treated properly. > > On linux, pthread_attr_setstacksize succeeds if called with huge values, > but > pthread_create() then fails. On Aix, pthread_attr_setstacksize fails if > passed a value exceeding the limits and leaves the minimal system > thread stack size in attr. Thus thread creation succeeds and leads to > crashes after thread creation when the guard pages shall be protected > but don't fit on the tiny stack created. > > Please review this small, aix-only change. > http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.01/ > > Best regards, > Goetz. > From ioi.lam at oracle.com Wed Aug 16 14:43:46 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 16 Aug 2017 07:43:46 -0700 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder In-Reply-To: References: Message-ID: <657af5bc-185b-ddba-920c-c18da652dd71@oracle.com> Hi Thomas, I am planning to review this but there's a lot of changes so it would take me a little while :-( One quick question - what kind of testing has been done on these changes? My main worry is (1) stability during normal execution, (2) impacts to hs_err file generation. I can see the improvements in the hs_err call stack, but would there be any regression in corner cases? How can we test for that? Thanks - Ioi On 8/13/17 8:14 AM, Thomas Stüfe wrote: > Dear all, > > May I please have reviews for the following patch.
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ > > This fix makes the native symbol decoder on Windows more robust. This > increases the chance of getting useful error files. It also refactors the > code and contains some useful new features. > > --- > > Native symbol resolution on windows is done via the group of "SymXX" APIs > exported from dbghelp.dll. Currently this is done via a layer of > abstraction in decoder.cpp, and a "WindowsDecoder" class in > decoder_windows.cpp. This class can be instantiated twice, so there can > exist two objects of this class. > > The bugs: > > 1) Functions from dbghelp.dll are not threadsafe; calls to them must be > synchronized. But dbghelp.dll could be used by the two live WindowsDecoder > objects in parallel from different threads. In addition to that, > dbghelp.dll functions are used from within os_windows.cpp, when writing > minidumps. > > 2) the SymXX APIs need to be initialized by a call to SymInitialize(). This > can only happen once. The way it is now, each of the two WindowsDecoder > objects will call SymInitialize. The second object to do this will fail and > hence not be usable. In practice this means if in the VM someone called > e.g. dll_from_address_name() - which will initialize and use the shared > WindowsDecoder object - and then the VM crashes, the hs-err file will > contain no useful stack because that would require the second > WindowsDecoder object, which would get initialized after the shared one, > and its initialization would fail. > > 3) Initialization dependencies: > - when building the pdb search path (WindowsDecoder::initialize), > Arguments::get_java_home() is used to deduce the jdk bin directory > containing the jdk shared objects. This will crash if invoked before the > system properties are set. > - Decoder::decode() calls (via the Monitor lock) Thread::current().
This > means the code will not work during initialization and where > Thread::current is not set (e.g. in an unattached thread). Admittedly a bit > theoretical, as this only affects the shared WindowsDecoder object, which > is not used during error reporting. > > 4) WindowsDecoder::initialize(), pdb search path: Code uses MAX_PATH in > various places, which is wrong. NTFS paths can be longer than that. > - when calling SymGetSearchPath, code assumes a maximum path size of > MAX_PATH for the *combined* size of all directory names, which may be way > too small. Truncation is not handled, code will silently fail if output > buffer is too small, resulting in the existing pdb path to be overwritten > instead of being preserved. > - GetModuleFileName(): similar problem, output buffer is MAX_PATH len, > which may be too small. Truncation will lead to wrong results. > - Throughout the function "strncat" is used to assemble the search path, > but is used wrong and does not guard against buffer overflows (if that was > the intent - otherwise, why not just use strcat?). The last argument to > strncat should be the remaining space in the destination buffer, not the > length of the source string. > > 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() using an output > buffer of size MAX_PATH - which makes no sense at all, as we are retrieving > a symbol name, not a file name. Truncation is not handled. This is actually > dangerous, because SymGetSymFromAddr64() handles truncations sloppily, it > will return success and fill the output buffer completely, so on truncation > the symbol name will not be zero terminated. 
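To illustrate the strncat point in (4): a minimal sketch, not taken from the webrev, of appending to a fixed-size buffer with the bound computed from the space left in the destination rather than from the length of the source. The helper name append_bounded and the 16-byte buffer in the note below are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from HotSpot): appends src to the NUL-terminated
// string already in dst without writing past dst[cap - 1].
// Returns false if src did not fit completely, so the caller can detect
// truncation instead of silently using a cut-off path.
static bool append_bounded(char* dst, size_t cap, const char* src) {
  assert(dst != NULL && src != NULL && cap > 0);
  const size_t used = strlen(dst);
  if (used + 1 >= cap) {
    // Buffer is already full; only an empty append "fits".
    return src[0] == '\0';
  }
  // Space left for characters, keeping one byte for the terminating NUL.
  const size_t remaining = cap - used - 1;
  // strncat appends at most 'remaining' characters and always writes a NUL.
  strncat(dst, src, remaining);
  return strlen(src) <= remaining;
}
```

With a 16-byte buffer that holds "C:\jdk", appending ";C:\bin" fits and returns true. Appending a longer directory afterwards is cut off at 15 characters and returns false.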
> > In addition to the bugs, there are a number of things which could be done > better or simpler: > > a) Setting the search path could be simplified and made more useful if we > would just add the directory of every loaded module to the search path > (currently we just add the two jdk bin directories, which also exposes us > to initialization dependency, see (3)). This would be more useful as a > common convention is to put pdb files beside binaries, and that way we > would catch all those. Including, but not limited to, our own jdk pdb files. > > b) Dlls can be loaded and unloaded, and it would be nice to have an updated > pdb search path - e.g. in case a late-loaded third party JNI library > crashes. So, it would be nice to run (a) whenever a library is loaded or > unloaded, and for the process to be fast if nothing changed (e.g. if the > new DLL was loaded from a directory which is already part of the search > path). > > c) It would be nice to have file name and line number in the callstack, too. > > d) As pointed out in JDK-8144855, the function > Decoder::can_decode_C_frame_in_vm() is not necessary. This should be > handled in WindowsDecoder instead. > > --- > > What I did in this fix: > > - I pulled out dbghelp.dll handling into its own centralized singleton > (DbgHelpLoader). This class takes care of loading the Dll and synchronizes > access to all its functions, thus solving (1). DbgHelpLoader now replaces > all places where before the DLL was loaded manually. > > - Atop of DbgHelpLoader there is another new singleton class, > "SymbolEngine", which wraps the life cycle of the symbol APIs. This solves > problem (2). In addition to that, this class is the new interface to the > SymXX APIs. > > - this means that the existing code in decoder.cpp, which instantiates two > Decoder objects and synchronizes access to one of them, makes no sense for > Windows.
That whole layer is now bypassed in Windows - the static > Decoder::xxx() functions are now directly implemented in > decoder_windows.cpp and access the SymbolEngine singleton without any added > layers in between. > > - SymbolEngine contains a new improved way to assemble the pdb search path > as described in (a) and (b). We iterate loaded modules, extract directories > and add them to the search path. Implementation takes care to not do > unnecessary work (avoiding changing the path if list of loaded modules is > unchanged, or if only Dlls from known directories were loaded). > > - We now recalculate the pdb search path if a DLL was loaded (via > os::dll_load). That makes it possible to debug third party dlls which do > not reside in jdk directories and were loaded after the SymbolEngine was > initialized. > > - When decoding a symbol from an unknown address and the decode fails, we > attempt to rebuild the pdb search path and attempt again. > > - Care was taken for this code to be robust. It may be called > pre-initialization, post-shutdown or under error conditions (low memory, > out of stack etc), so no VM infrastructure was used and memory use is > frugal (dynamic buffers allocated off the stack and reused where possible). > > - I did away with the "Decoder::can_decode_C_frame_in_vm()" code. See > argumentation in JDK-8144855. To me, it makes no sense. The function is a > workaround for the problem that, when faced with stripped binaries which only > have debug symbols for public functions, symbol info may be confusing > (public symbols + large offsets). But this is not done consistently, and > most programmers know not to trust symbols with large offsets.
> > - Finally, we now have source info in the callstack :) > > Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k > Native frames: (J=compiled Java code, A=aot compiled Java code, > j=interpreted, Vv=VM code, C=native code) > V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 (vmerror.cpp:1683) > V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe (vmerror.cpp:1632) > V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) > V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) > V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) > V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a > (gtestmain.cpp:114) > V [jvm.dll+0x829b] testing::internal::TestEventRepeater::OnTestStart+0x5b > (gtest.cc:2979) > V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) > V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) > V [jvm.dll+0xbe9e] testing::internal::UnitTestImpl::RunAllTests+0x25e > (gtest.cc:4316) > V [jvm.dll+0x2a410] > testing::internal::HandleSehExceptionsInMethodIfSupported+0x40 > (gtest.cc:2063) > V [jvm.dll+0x2493e] > testing::internal::HandleExceptionsInMethodIfSupported+0x5e > (gtest.cc:2114) > V [jvm.dll+0xb0e9] testing::UnitTest::Run+0xe9 (gtest.cc:3929) > V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) > V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) > V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) > C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) > C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122 (crtexe.c:555) > C [KERNEL32.DLL+0x138f4] > C [ntdll.dll+0x65de3] > C [ntdll.dll+0x65dae] > > .. which is implemented for Windows, but pipes are laid to plug in other > platforms as well (Decoder::get_source_info()) > > ---- > > Tell me what you think. 
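The DbgHelpLoader idea described above (one process-wide object that owns the library and serializes every call into it) can be sketched roughly as follows. This is not the code from the webrev: SerializedLibrary and unsafe_lookup are hypothetical stand-ins, and std::mutex is used only for brevity; the actual patch may use a different locking primitive.

```cpp
#include <cassert>
#include <mutex>

// Stand-in for a library entry point that must not be called concurrently.
// Entirely hypothetical; it is not a dbghelp.dll function.
static int unsafe_lookup(int address) { return address * 2; }

// Sketch of a process-wide wrapper: one instance, one lock, and every call
// into the non-thread-safe library goes through it.
class SerializedLibrary {
 public:
  static SerializedLibrary& instance() {
    // Function-local static: constructed once, thread-safe since C++11.
    static SerializedLibrary the_instance;
    return the_instance;
  }

  // All callers funnel through here, so calls never overlap.
  int lookup(int address) {
    std::lock_guard<std::mutex> guard(_lock);
    ++_calls;
    return unsafe_lookup(address);
  }

  // Number of lookups made so far (also taken under the lock).
  int calls() {
    std::lock_guard<std::mutex> guard(_lock);
    return _calls;
  }

 private:
  SerializedLibrary() : _calls(0) {}
  SerializedLibrary(const SerializedLibrary&) = delete;
  SerializedLibrary& operator=(const SerializedLibrary&) = delete;

  std::mutex _lock;
  int _calls;
};
```

Because there is exactly one instance, any one-time setup that the wrapped library requires can live in the constructor. That is the same reasoning the mail gives for SymInitialize() and problem (2).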
> > Thanks, and Kind Regards, Thomas From thomas.stuefe at gmail.com Wed Aug 16 15:12:38 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Aug 2017 17:12:38 +0200 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder In-Reply-To: <657af5bc-185b-ddba-920c-c18da652dd71@oracle.com> References: <657af5bc-185b-ddba-920c-c18da652dd71@oracle.com> Message-ID: Hi Ioi, On Wed, Aug 16, 2017 at 4:43 PM, Ioi Lam wrote: > Hi Thomas, > > I am planning to review this but there's a lot of changes so it would take > me a little while :-( > > thank you! I currently wonder whether this is too massive and if I should split this up. At least the centralized dbghelp.dll handling could easily be moved into its own patch; that would make this patch easier to digest. What do you think? > One quick question - what kind of testing has been done on these changes? > My main worry is (1) stability during normal execution, (2) impacts to > hs_err file generation. > > I built windows x86 and x64, slowdebug and release. Ran gtests manually. Our nightly builds ran (TCK, hotspot jtreg) with no problems I could attribute to my patch. I also wrote a small selftest (SymbolEngine::selftest()) on top of the ones I added to test_os.cpp which lives in symbol_api.cpp and stresses the SymbolEngine class from the inside. I did not add this to the webrev. For one because the webrev was already large and also because this test was not a gtest but a selftest inside the VM and I did not think you guys would accept this. > I can see the improvements in the hs_err call stack, but would there be > any regression in corner cases? How can we test for that? I think the code with my patch is way more stable than before, that was the aim of this patch. One could write loads of regression tests for corner cases. We actually do this at SAP with our own fork - we have extensive tests for the hs-err file generation.
But from experience I know that these kinds of tests are difficult to get 100% solid. For instance, one thing we do at SAP is to crash the VM in many colorful ways and check the callstacks for validity. However, this is fragile and depends on many factors - how the compiler optimizes, whether we build with debug info etc. I hesitate to add tests like this to the OpenJDK that cause sporadic work for maintainers; everyone would hate me :) Thanks, Thomas > Thanks > > - Ioi > > > > On 8/13/17 8:14 AM, Thomas Stüfe wrote: > >> Dear all, >> >> May I please have reviews for the following patch. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 >> Webrev: >> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ >> >> This fix makes the native symbol decoder on Windows more robust. This >> increases the chance of getting useful error files. It also refactors the >> code and contains some useful new features. >> >> --- >> >> Native symbol resolution on windows is done via the group of "SymXX" APIs >> exported from dbghelp.dll. Currently this is done via a layer of >> abstraction in decoder.cpp, and a "WindowsDecoder" class in >> decoder_windows.cpp. This class can be instantiated twice, so there can >> exist two objects of this class. >> >> The bugs: >> >> 1) Functions from dbghelp.dll are not threadsafe; calls to them must be >> synchronized. But dbghelp.dll could be used by the two live WindowsDecoder >> objects in parallel from different threads. In addition to that, >> dbghelp.dll functions are used from within os_windows.cpp, when writing >> minidumps. >> >> 2) the SymXX APIs need to be initialized by a call to SymInitialize(). >> This >> can only happen once. The way it is now, each of the two WindowsDecoder >> objects will call SymInitialize. The second object to do this will fail >> and >> hence not be usable. In practice this means if in the VM someone called >> e.g.
dll_from_address_name() - which will initialize and use the shared >> WindowsDecoder object - and then the VM crashes, the hs-err file will >> contain no useful stack because that would require the second >> WindowsDecoder object, which would get initialized after the shared one, >> and its initialization would fail. >> >> 3) Initialization dependencies: >> - when building the pdb search path (WindowsDecoder::initialize), >> Arguments::get_java_home() is used to deduce the jdk bin directory >> containing the jdk shared objects. This will crash if invoked before the >> system properties are set. >> - Decoder::decode() calls (via the Monitor lock) Thread::current(). >> This >> means the code will not work during initialization and where >> Thread::current is not set (e.g. in an unattached thread). Admittedly a >> bit >> theoretical, as this only affects the shared WindowsDecoder object, which >> is not used during error reporting. >> >> 4) WindowsDecoder::initialize(), pdb search path: Code uses MAX_PATH in >> various places, which is wrong. NTFS paths can be longer than that. >> - when calling SymGetSearchPath, code assumes a maximum path size of >> MAX_PATH for the *combined* size of all directory names, which may be way >> too small. Truncation is not handled, code will silently fail if output >> buffer is too small, resulting in the existing pdb path to be overwritten >> instead of being preserved. >> - GetModuleFileName(): similar problem, output buffer is MAX_PATH len, >> which may be too small. Truncation will lead to wrong results. >> - Throughout the function "strncat" is used to assemble the search >> path, >> but is used wrong and does not guard against buffer overflows (if that was >> the intent - otherwise, why not just use strcat?). The last argument to >> strncat should be the remaining space in the destination buffer, not the >> length of the source string. 
>> >> 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() using an output >> buffer of size MAX_PATH - which makes no sense at all, as we are >> retrieving >> a symbol name, not a file name. Truncation is not handled. This is >> actually >> dangerous, because SymGetSymFromAddr64() handles truncations sloppily, it >> will return success and fill the output buffer completely, so on >> truncation >> the symbol name will not be zero terminated. >> >> In addition to the bugs, there are a number of things which could be done >> better or simpler: >> >> a) Setting the search path could be simplified and made more useful if we >> would just add the directory of every loaded module to the search path >> (currently we just add the two jdk bin directories, which also exposes us >> to initialization dependency, see (3)). This would be more useful as a >> common convention is to put pdb files beside binaries, and that way we >> would catch all those. Including, but not limited to, our own jdk pdb >> files. >> >> b) Dlls can be loaded and unloaded, and it would be nice to have an >> updated >> pdb search path - e.g. in case a late-loaded third party JNI library >> crashes. So, it would be nice to run (a) whenever a library is loaded or >> unloaded, and for the process to be fast if nothing changed (e.g. if the >> new DLL was loaded from a directory which is already part of the search >> path). >> >> c) It would be nice to have file name and line number in the callstack, >> too. >> >> d) As pointed out in JDK-8144855, the function >> Decoder::can_decode_C_frame_in_vm() is not necessary. This should be >> handled in WindowsDecoder instead. >> >> --- >> >> What I did in this fix: >> >> - I pulled out dbghelp.dll handling into an own centralized singleton >> (DbgHelpLoader). This class takes care of loading the Dll and synchonizes >> access to all its functions, thus solving (1). DbgHelpLoader now replaces >> all places where before the DLL was loaded manually. 
>> >> - Atop of DbgHelpLoader there is another new singleton class, >> "SymbolEngine", which wraps the life cycle of the symbol APIs. This solves >> problem (2). In addition to that, this class is the new interface to the >> SymXX APIs. >> >> - this means that the existing code in decoder.cpp, which instantiates two >> Decode objects and synchronizes access to one of them, makes no sense for >> Windows. That whole layer is now bypassed in Windows - the static >> Decoder::xxx() functions are now directly implemented in >> decoder_windows.cpp and access the SymbolEngine singleton without any >> added >> layers inbetween. >> >> - SymbolEngine contains a new improved way to assemble the pdb search path >> as described in (a) and (b). We iterate loaded modules, extract >> directories >> and add them to the search path. Implementation takes care to not do >> unnecessary work (avoiding changing the path if list of loaded modules is >> unchanged, or if only Dlls from known directories were loaded). >> >> - We now recalculate the pdb seach path if a DLL was loaded (via >> os::dll_load). That makes it possible to debug third party dlls which do >> not reside in jdk directories and were loaded after the SymbolEngine was >> initialized. >> >> - When decoding a symbol from an unknown address and the decode fails, we >> attempt to rebuild the pdb search path and attempt again. >> >> - Care was taken for this code to be robust. It may be called >> pre-initialization, post-shutdown or under error conditions (low memory, >> out of stack etc), so no VM infrastructure was used and memory use is >> frugal (dynamic buffers allocated off the stack and reused were possible). >> >> - I did away with the "Decoder::can_decode_C_frame_in_vm()" code. See >> argumentation in JDK-8144855. To me, it makes no sense. 
The function is a >> workaround the problem that, when faced with stripped binaries which only >> have debug symbols for public functions, symbol info may be confusing >> (public symbols + large offsets). But this is not done consistently, and >> most programmers know not to trust symbols with large offsets. >> >> - Finally, we now have source info in the callstack :) >> >> Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 >> (vmerror.cpp:1683) >> V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe >> (vmerror.cpp:1632) >> V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) >> V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) >> V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) >> V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a >> (gtestmain.cpp:114) >> V [jvm.dll+0x829b] testing::internal::TestEventRe >> peater::OnTestStart+0x5b >> (gtest.cc:2979) >> V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) >> V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) >> V [jvm.dll+0xbe9e] testing::internal::UnitTestImpl::RunAllTests+0x25e >> (gtest.cc:4316) >> V [jvm.dll+0x2a410] >> testing::internal::HandleSehExceptionsInMethodIfSupported< >> testing::internal::UnitTestImpl,bool>+0x40 >> (gtest.cc:2063) >> V [jvm.dll+0x2493e] >> testing::internal::HandleExceptionsInMethodIfSupported< >> testing::internal::UnitTestImpl,bool>+0x5e >> (gtest.cc:2114) >> V [jvm.dll+0xb0e9] testing::UnitTest::Run+0xe9 (gtest.cc:3929) >> V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) >> V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) >> V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) >> C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) >> C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122 (crtexe.c:555) >> C 
[KERNEL32.DLL+0x138f4] >> C [ntdll.dll+0x65de3] >> C [ntdll.dll+0x65dae] >> >> .. which is implemented for Windows, but pipes are laid to plug in other >> platforms as well (Decoder::get_source_info()) >> >> ---- >> >> Tell me what you think. >> >> Thanks, and Kind Regards, Thomas >> > > From ioi.lam at oracle.com Wed Aug 16 15:45:08 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 16 Aug 2017 08:45:08 -0700 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder In-Reply-To: References: <657af5bc-185b-ddba-920c-c18da652dd71@oracle.com> Message-ID: <162bc4ad-73e1-033c-5d1f-323157445d7d@oracle.com> On 8/16/17 8:12 AM, Thomas Stüfe wrote: > Hi Ioi, > > On Wed, Aug 16, 2017 at 4:43 PM, Ioi Lam > wrote: > > Hi Thomas, > > I am planning to review this but there's a lot of changes so it > would take me a little while :-( > > > thank you! > > I currently wonder whether this is too massive and if I should split > this up. At least the centralized dbghelp.dll handling could easily be > moved into an own patch, that would make this patch easier to digest. > What do you think? I think having multiple smaller patches would be better. I would do a subset of our nightly tests before pushing, and then we can let our nightly tests run for a couple of weeks to make sure no issues come up. Then we can try the next patch, and so forth. Given this is an area that's not used often, and must perform correctly when "used" (e.g., where a VM crash state needs to be correctly logged), I think we need to take extra precautions. I hope this wouldn't mean too much extra work for you. > > One quick question - what kind of testing has been done on these > changes? My main worry is (1) stability during normal execution, > (2) impacts to hs_err file generation. > > > I built windows x86 and x64, slowdebug and release. Ran gtests > manually. Our nightly builds ran (TCK, hotspot jtreg) with no problems > I could attribute to my patch.
> > I also wrote a small selftest (SymbolEngine::selftest()) atop of the > ones I added to test_os.cpp which lives in symbol_api.cpp and stresses > the SymbolEngine class from the inside. I did not add this to the > webrev. For one because the webrev was already large and also because > this test was not a gtest but a selftest inside the VM and I did not > think you guys would accept this. If the code size is not too big, maybe we can add it to whitebox? That way we can start it using jtreg. It would be kind of like the various ::verify() routines in the JVM, and can be restricted to non-product builds if necessary. Thanks - Ioi > > I can see the improvements in the hs_err call stack, but would > there be any regression in corner cases? How can we test for that? > > > I think the code with my patch is way more stable than before, that > was the aim of this patch. One could write loads of regression tests > for corner cases. > > We actually do this at SAP with our own fork - we have extensive tests > for the hs-err file generation. But from experience I know that these > kind of tests are difficult to get 100% solid. For instance, one thing > we do at SAP is to let the VM crash in many colorful ways and check the > callstacks for validity. However, this is fragile and depends on many > factors - how the compiler optimizes, whether we build with debug > infos etc. I hesitate to add tests like this to the OpenJDK which > causes sporadic work for maintainers, everyone would hate me :) > > Thanks, Thomas > > > Thanks > > - Ioi > > > > On 8/13/17 8:14 AM, Thomas Stüfe wrote: > > Dear all, > > May I please have reviews for the following patch. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > > Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ > > > This fix makes the native symbol decoder on Windows more > robust. This > increases the chance of getting useful error files.
It also > refactors the > code and contains some useful new features. > > --- > > Native symbol resolution on windows is done via the group of > "SymXX" APIs > exported from dbghelp.dll. Currently this is done via a layer of > abstraction in decoder.cpp, and a "WindowsDecoder" class in > decoder_windows.cpp. This class can be instantianted twice, so > there can > exist two objects of this class. > > The bugs: > > 1) Functions from dbghelp.dll are not threadsafe; calls to > them must be > synchronized. But dbghelp.dll could be used by the two live > WindowsDecoder > objects in parallel from different threads. In addition to that, > dbghelp.dll functions are used from within os_windows.cpp, > when writing > minidumps. > > 2) the SymXX APIs need to be initialized by a call to > SymInitialize(). This > can only happen once. The way it is now, each of the two > WindowsDecoder > objects will call SymInitialize. The second object to do this > will fail and > hence not be usable. In practice this means if in the VM > someone called > e.g. dll_from_address_name() - which will initialize and use > the shared > WindowsDecoder object - and then the VM crashes, the hs-err > file will > contain no useful stack because that would require the second > WindowsDecoder object, which would get initialized after the > shared one, > and its initialization would fail. > > 3) Initialization dependencies: > - when building the pdb search path > (WindowsDecoder::initialize), > Arguments::get_java_home() is used to deduce the jdk bin directory > containing the jdk shared objects. This will crash if invoked > before the > system properties are set. > - Decoder::decode() calls (via the Monitor lock) > Thread::current(). This > means the code will not work during initialization and where > Thread::current is not set (e.g. in an unattached thread). > Admittedly a bit > theoretical, as this only affects the shared WindowsDecoder > object, which > is not used during error reporting. 
> > 4) WindowsDecoder::initialize(), pdb search path: Code uses > MAX_PATH in > various places, which is wrong. NTFS paths can be longer than > that. > - when calling SymGetSearchPath, code assumes a maximum > path size of > MAX_PATH for the *combined* size of all directory names, which > may be way > too small. Truncation is not handled, code will silently fail > if output > buffer is too small, resulting in the existing pdb path to be > overwritten > instead of being preserved. > - GetModuleFileName(): similar problem, output buffer is > MAX_PATH len, > which may be too small. Truncation will lead to wrong results. > - Throughout the function "strncat" is used to assemble the > search path, > but is used wrong and does not guard against buffer overflows > (if that was > the intent - otherwise, why not just use strcat?). The last > argument to > strncat should be the remaining space in the destination > buffer, not the > length of the source string. > > 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() > using an output > buffer of size MAX_PATH - which makes no sense at all, as we > are retrieving > a symbol name, not a file name. Truncation is not handled. > This is actually > dangerous, because SymGetSymFromAddr64() handles truncations > sloppily, it > will return success and fill the output buffer completely, so > on truncation > the symbol name will not be zero terminated. > > In addition to the bugs, there are a number of things which > could be done > better or simpler: > > a) Setting the search path could be simplified and made more > useful if we > would just add the directory of every loaded module to the > search path > (currently we just add the two jdk bin directories, which also > exposes us > to initialization dependency, see (3)). This would be more > useful as a > common convention is to put pdb files beside binaries, and > that way we > would catch all those. Including, but not limited to, our own > jdk pdb files. 
> > b) Dlls can be loaded and unloaded, and it would be nice to > have an updated > pdb search path - e.g. in case a late-loaded third party JNI > library > crashes. So, it would be nice to run (a) whenever a library is > loaded or > unloaded, and for the process to be fast if nothing changed > (e.g. if the > new DLL was loaded from a directory which is already part of > the search > path). > > c) It would be nice to have file name and line number in the > callstack, too. > > d) As pointed out in JDK-8144855, the function > Decoder::can_decode_C_frame_in_vm() is not necessary. This > should be > handled in WindowsDecoder instead. > > --- > > What I did in this fix: > > - I pulled out dbghelp.dll handling into an own centralized > singleton > (DbgHelpLoader). This class takes care of loading the Dll and > synchonizes > access to all its functions, thus solving (1). DbgHelpLoader > now replaces > all places where before the DLL was loaded manually. > > - Atop of DbgHelpLoader there is another new singleton class, > "SymbolEngine", which wraps the life cycle of the symbol APIs. > This solves > problem (2). In addition to that, this class is the new > interface to the > SymXX APIs. > > - this means that the existing code in decoder.cpp, which > instantiates two > Decode objects and synchronizes access to one of them, makes > no sense for > Windows. That whole layer is now bypassed in Windows - the static > Decoder::xxx() functions are now directly implemented in > decoder_windows.cpp and access the SymbolEngine singleton > without any added > layers inbetween. > > - SymbolEngine contains a new improved way to assemble the pdb > search path > as described in (a) and (b). We iterate loaded modules, > extract directories > and add them to the search path. Implementation takes care to > not do > unnecessary work (avoiding changing the path if list of loaded > modules is > unchanged, or if only Dlls from known directories were loaded). 
> > - We now recalculate the pdb seach path if a DLL was loaded (via > os::dll_load). That makes it possible to debug third party > dlls which do > not reside in jdk directories and were loaded after the > SymbolEngine was > initialized. > > - When decoding a symbol from an unknown address and the > decode fails, we > attempt to rebuild the pdb search path and attempt again. > > - Care was taken for this code to be robust. It may be called > pre-initialization, post-shutdown or under error conditions > (low memory, > out of stack etc), so no VM infrastructure was used and memory > use is > frugal (dynamic buffers allocated off the stack and reused > were possible). > > - I did away with the "Decoder::can_decode_C_frame_in_vm()" > code. See > argumentation in JDK-8144855. To me, it makes no sense. The > function is a > workaround the problem that, when faced with stripped binaries > which only > have debug symbols for public functions, symbol info may be > confusing > (public symbols + large offsets). But this is not done > consistently, and > most programmers know not to trust symbols with large offsets. 
> > - Finally, we now have source info in the callstack :) > > Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k > Native frames: (J=compiled Java code, A=aot compiled Java code, > j=interpreted, Vv=VM code, C=native code) > V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 > (vmerror.cpp:1683) > V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe > (vmerror.cpp:1632) > V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) > V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) > V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) > V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a > (gtestmain.cpp:114) > V [jvm.dll+0x829b] > testing::internal::TestEventRepeater::OnTestStart+0x5b > (gtest.cc:2979) > V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) > V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) > V [jvm.dll+0xbe9e] > testing::internal::UnitTestImpl::RunAllTests+0x25e > (gtest.cc:4316) > V [jvm.dll+0x2a410] > testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x40 > (gtest.cc:2063) > V [jvm.dll+0x2493e] > testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x5e > (gtest.cc:2114) > V [jvm.dll+0xb0e9] testing::UnitTest::Run+0xe9 (gtest.cc:3929) > V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) > V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) > V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) > C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) > C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122 > (crtexe.c:555) > C [KERNEL32.DLL+0x138f4] > C [ntdll.dll+0x65de3] > C [ntdll.dll+0x65dae] > > .. which is implemented for Windows, but pipes are laid to > plug in other > platforms as well (Decoder::get_source_info()) > > ---- > > Tell me what you think.
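The strncat problem described under bug (4) in the quoted message can be illustrated with a small, self-contained sketch. This is not code from the webrev: append_bounded is a hypothetical helper and the buffer sizes are arbitrary. It only shows the intended use of the third argument, which is the space remaining in the destination and not the length of the source string.

```cpp
#include <cassert>   // only needed for the usage checks described below
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of HotSpot. Appends src to the
// NUL-terminated string held in dst, where dst has dst_size bytes in total.
// Returns false and leaves dst unchanged if src does not fit completely,
// so a search path is never silently truncated.
static bool append_bounded(char* dst, size_t dst_size, const char* src) {
  const size_t used = strlen(dst);
  if (used >= dst_size) {
    return false;  // dst is not a valid string for a buffer of this size
  }
  // Space left for characters, keeping one byte for the terminator.
  const size_t remaining = dst_size - used - 1;
  if (strlen(src) > remaining) {
    return false;  // report truncation instead of hiding it
  }
  // Third argument: remaining space in dst, NOT strlen(src).
  strncat(dst, src, remaining);
  return true;
}
```

With a 16-byte buffer holding "C:\jdk", appending ";C:\bin" succeeds, while a further append of a longer directory is rejected and the buffer keeps its previous content. A real implementation would probably grow the buffer rather than give up; that choice is left open here.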
> > Thanks, and Kind Regards, Thomas > > > From thomas.stuefe at gmail.com Wed Aug 16 15:52:24 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 16 Aug 2017 17:52:24 +0200 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder In-Reply-To: <162bc4ad-73e1-033c-5d1f-323157445d7d@oracle.com> References: <657af5bc-185b-ddba-920c-c18da652dd71@oracle.com> <162bc4ad-73e1-033c-5d1f-323157445d7d@oracle.com> Message-ID: On Wed, Aug 16, 2017 at 5:45 PM, Ioi Lam wrote: > > > On 8/16/17 8:12 AM, Thomas Stüfe wrote: > > Hi Ioi, > > On Wed, Aug 16, 2017 at 4:43 PM, Ioi Lam wrote: > >> Hi Thomas, >> >> I am planning to review this but there's a lot of changes so it would >> take me a little while :-( >> >> > thank you! > > I currently wonder whether this is too massive and if I should split this > up. At least the centralized dbghelp.dll handling could easily be moved into > an own patch, that would make this patch easier to digest. What do you > think? > > I think having multiple smaller patches would be better. I would do a > subset of our nightly tests before pushing, and then we can let our nightly > test run for a couple of weeks to make sure no issues come up. Then we can > try the next patch, and so forth. > > Okay, I will do this then. I withdraw this patch and will chop it up into digestible chunks. > Given this is an area that's not used often, and must perform correctly > when "used" (e.g., where a VM crash state needs to be correctly logged), I > think we need to take extra precautions. > Sure, I agree. Note that this code is not completely new. It is based on code we have used in our VM for many years. Our error reporting is very robust and quite a bit enhanced (as I wrote in another mail thread, for our support the hs-err file is a very important tool because of the difficulty of getting core dumps from customer sites).
Our VMs get used in the field, so at least the underlying techniques are well tested, if not in this particular form. > I hope this wouldn't mean too much extra work for you. > > > >> One quick question - what kind of testing has been done on these changes? >> My main worry is (1) stability during normal execution, (2) impacts to >> hs_err file generation. >> >> > I built windows x86 and x64, slodebug and release. Ran gtests manually. > Our nightly builds ran (TCK, hotspot jtreg) with no problems I could > attribute to my patch. > > I also wrote a small selftest (SymbolEngine::selftest()) atop of the ones > I added to test_os.cpp which lives in symbol_api.cpp and stresses the > SymbolEngine class from the inside. I did not add this to the webrev. For > one because the webrev was already large and also because this test was not > a gtest but a selftest inside the VM and I did not think you guys would > accept this. > > If the code size is not too big, maybe we can add it to whitebox? That way > we can start it using jtreg. It would be kind of like the various > ::verify() routines in the JVM, and can be restricted in non-product > builds if necessary. > That would be a possibility. I have to look at how this is done. Thanks, Thomas > Thanks > - Ioi > > > >> I can see the improvements in the hs_err call stack, but would there be >> any regression in corner cases? How can we test for that? > > > I think the code with my patch is way more stable than before, that was > the aim of this patch. One could write loads of regression tests for corner > cases. > > We actually do this at SAP with our own fork - we have extensive tests for > the hs-err file generation. But from experience I know that these kind of > tests are difficult to get 100% solid. For instance, one thing we do at SAP > is to the VM crash in many colorful ways and check the callstacks for > validity. 
However, this is fragile and depends on many factors - how the > compiler optimizes, whether we build with debug infos etc. I hesitate to > add tests like this to the OpenJDK which causes sporadic work for > maintainers, everyone would hate me :) > > Thanks, Thomas > > >> Thanks >> >> - Ioi >> >> >> >> On 8/13/17 8:14 AM, Thomas Stüfe wrote: >> >>> Dear all, >>> >>> May I please have reviews for the following patch. >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 >>> Webrev: >>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ >>> >>> This fix makes the native symbol decoder on Windows more robust. This >>> increases the chance of getting useful error files. It also refactors the >>> code and contains some useful new features. >>> >>> --- >>> >>> Native symbol resolution on windows is done via the group of "SymXX" APIs >>> exported from dbghelp.dll. Currently this is done via a layer of >>> abstraction in decoder.cpp, and a "WindowsDecoder" class in >>> decoder_windows.cpp. This class can be instantiated twice, so there can >>> exist two objects of this class. >>> >>> The bugs: >>> >>> 1) Functions from dbghelp.dll are not threadsafe; calls to them must be >>> synchronized. But dbghelp.dll could be used by the two live >>> WindowsDecoder >>> objects in parallel from different threads. In addition to that, >>> dbghelp.dll functions are used from within os_windows.cpp, when writing >>> minidumps. >>> >>> 2) the SymXX APIs need to be initialized by a call to SymInitialize(). >>> This >>> can only happen once. The way it is now, each of the two WindowsDecoder >>> objects will call SymInitialize. The second object to do this will fail >>> and >>> hence not be usable. In practice this means if in the VM someone called >>> e.g.
dll_from_address_name() - which will initialize and use the shared >>> WindowsDecoder object - and then the VM crashes, the hs-err file will >>> contain no useful stack because that would require the second >>> WindowsDecoder object, which would get initialized after the shared one, >>> and its initialization would fail. >>> >>> 3) Initialization dependencies: >>> - when building the pdb search path (WindowsDecoder::initialize), >>> Arguments::get_java_home() is used to deduce the jdk bin directory >>> containing the jdk shared objects. This will crash if invoked before the >>> system properties are set. >>> - Decoder::decode() calls (via the Monitor lock) Thread::current(). >>> This >>> means the code will not work during initialization and where >>> Thread::current is not set (e.g. in an unattached thread). Admittedly a >>> bit >>> theoretical, as this only affects the shared WindowsDecoder object, which >>> is not used during error reporting. >>> >>> 4) WindowsDecoder::initialize(), pdb search path: Code uses MAX_PATH in >>> various places, which is wrong. NTFS paths can be longer than that. >>> - when calling SymGetSearchPath, code assumes a maximum path size of >>> MAX_PATH for the *combined* size of all directory names, which may be way >>> too small. Truncation is not handled, code will silently fail if output >>> buffer is too small, resulting in the existing pdb path to be overwritten >>> instead of being preserved. >>> - GetModuleFileName(): similar problem, output buffer is MAX_PATH len, >>> which may be too small. Truncation will lead to wrong results. >>> - Throughout the function "strncat" is used to assemble the search >>> path, >>> but is used wrong and does not guard against buffer overflows (if that >>> was >>> the intent - otherwise, why not just use strcat?). The last argument to >>> strncat should be the remaining space in the destination buffer, not the >>> length of the source string. 
>>> >>> 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() using an >>> output >>> buffer of size MAX_PATH - which makes no sense at all, as we are >>> retrieving >>> a symbol name, not a file name. Truncation is not handled. This is >>> actually >>> dangerous, because SymGetSymFromAddr64() handles truncations sloppily, it >>> will return success and fill the output buffer completely, so on >>> truncation >>> the symbol name will not be zero terminated. >>> >>> In addition to the bugs, there are a number of things which could be done >>> better or simpler: >>> >>> a) Setting the search path could be simplified and made more useful if we >>> would just add the directory of every loaded module to the search path >>> (currently we just add the two jdk bin directories, which also exposes us >>> to initialization dependency, see (3)). This would be more useful as a >>> common convention is to put pdb files beside binaries, and that way we >>> would catch all those. Including, but not limited to, our own jdk pdb >>> files. >>> >>> b) Dlls can be loaded and unloaded, and it would be nice to have an >>> updated >>> pdb search path - e.g. in case a late-loaded third party JNI library >>> crashes. So, it would be nice to run (a) whenever a library is loaded or >>> unloaded, and for the process to be fast if nothing changed (e.g. if the >>> new DLL was loaded from a directory which is already part of the search >>> path). >>> >>> c) It would be nice to have file name and line number in the callstack, >>> too. >>> >>> d) As pointed out in JDK-8144855, the function >>> Decoder::can_decode_C_frame_in_vm() is not necessary. This should be >>> handled in WindowsDecoder instead. >>> >>> --- >>> >>> What I did in this fix: >>> >>> - I pulled out dbghelp.dll handling into an own centralized singleton >>> (DbgHelpLoader). This class takes care of loading the Dll and synchonizes >>> access to all its functions, thus solving (1). 
DbgHelpLoader now replaces >>> all places where before the DLL was loaded manually. >>> >>> - Atop of DbgHelpLoader there is another new singleton class, >>> "SymbolEngine", which wraps the life cycle of the symbol APIs. This >>> solves >>> problem (2). In addition to that, this class is the new interface to the >>> SymXX APIs. >>> >>> - this means that the existing code in decoder.cpp, which instantiates >>> two >>> Decode objects and synchronizes access to one of them, makes no sense for >>> Windows. That whole layer is now bypassed in Windows - the static >>> Decoder::xxx() functions are now directly implemented in >>> decoder_windows.cpp and access the SymbolEngine singleton without any >>> added >>> layers inbetween. >>> >>> - SymbolEngine contains a new improved way to assemble the pdb search >>> path >>> as described in (a) and (b). We iterate loaded modules, extract >>> directories >>> and add them to the search path. Implementation takes care to not do >>> unnecessary work (avoiding changing the path if list of loaded modules is >>> unchanged, or if only Dlls from known directories were loaded). >>> >>> - We now recalculate the pdb seach path if a DLL was loaded (via >>> os::dll_load). That makes it possible to debug third party dlls which do >>> not reside in jdk directories and were loaded after the SymbolEngine was >>> initialized. >>> >>> - When decoding a symbol from an unknown address and the decode fails, we >>> attempt to rebuild the pdb search path and attempt again. >>> >>> - Care was taken for this code to be robust. It may be called >>> pre-initialization, post-shutdown or under error conditions (low memory, >>> out of stack etc), so no VM infrastructure was used and memory use is >>> frugal (dynamic buffers allocated off the stack and reused were >>> possible). >>> >>> - I did away with the "Decoder::can_decode_C_frame_in_vm()" code. See >>> argumentation in JDK-8144855. To me, it makes no sense. 
The function is a >>> workaround the problem that, when faced with stripped binaries which only >>> have debug symbols for public functions, symbol info may be confusing >>> (public symbols + large offsets). But this is not done consistently, and >>> most programmers know not to trust symbols with large offsets. >>> >>> - Finally, we now have source info in the callstack :) >>> >>> Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k >>> Native frames: (J=compiled Java code, A=aot compiled Java code, >>> j=interpreted, Vv=VM code, C=native code) >>> V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 >>> (vmerror.cpp:1683) >>> V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe >>> (vmerror.cpp:1632) >>> V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) >>> V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) >>> V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) >>> V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a >>> (gtestmain.cpp:114) >>> V [jvm.dll+0x829b] testing::internal::TestEventRepeater::OnTestStart+0x5b >>> (gtest.cc:2979) >>> V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) >>> V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) >>> V [jvm.dll+0xbe9e] testing::internal::UnitTestImpl::RunAllTests+0x25e >>> (gtest.cc:4316) >>> V [jvm.dll+0x2a410] >>> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x40 >>> (gtest.cc:2063) >>> V [jvm.dll+0x2493e] >>> testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x5e >>> (gtest.cc:2114) >>> V [jvm.dll+0xb0e9] testing::UnitTest::Run+0xe9 (gtest.cc:3929) >>> V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) >>> V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) >>> V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) >>> C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) >>> C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122
(crtexe.c:555) >>> C [KERNEL32.DLL+0x138f4] >>> C [ntdll.dll+0x65de3] >>> C [ntdll.dll+0x65dae] >>> >>> .. which is implemented for Windows, but pipes are laid to plug in other >>> platforms as well (Decoder::get_source_info()) >>> >>> ---- >>> >>> Tell me what you think. >>> >>> Thanks, and Kind Regards, Thomas >>> >> >> > > From thomas.stuefe at gmail.com Wed Aug 16 15:53:07 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 16 Aug 2017 17:53:07 +0200 Subject: RFR(m): 8185712: [windows] Multiple issues with the native symbol decoder In-Reply-To: References: Message-ID: Hi all, after discussing this with Ioi, I decided to remove this RFR and split the patch up into multiple chunks, which are hopefully easier to review. Thanks, Thomas On Sun, Aug 13, 2017 at 5:14 PM, Thomas Stüfe wrote: > Dear all, > > May I please have reviews for the following patch. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 > Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ > 8185712-windows-improve-native-symbol-resolver/webrev.00/webrev/ > > This fix makes the native symbol decoder on Windows more robust. This > increases the chance of getting useful error files. It also refactors the > code and contains some useful new features. > > --- > > Native symbol resolution on windows is done via the group of "SymXX" APIs > exported from dbghelp.dll. Currently this is done via a layer of > abstraction in decoder.cpp, and a "WindowsDecoder" class in > decoder_windows.cpp. This class can be instantiated twice, so there can > exist two objects of this class. > > The bugs: > > 1) Functions from dbghelp.dll are not threadsafe; calls to them must be > synchronized. But dbghelp.dll could be used by the two live WindowsDecoder > objects in parallel from different threads. In addition to that, > dbghelp.dll functions are used from within os_windows.cpp, when writing > minidumps. > > 2) the SymXX APIs need to be initialized by a call to SymInitialize().
> This can only happen once. The way it is now, each of the two > WindowsDecoder objects will call SymInitialize. The second object to do > this will fail and hence not be usable. In practice this means if in the VM > someone called e.g. dll_from_address_name() - which will initialize and use > the shared WindowsDecoder object - and then the VM crashes, the hs-err file > will contain no useful stack because that would require the second > WindowsDecoder object, which would get initialized after the shared one, > and its initialization would fail. > > 3) Initialization dependencies: > - when building the pdb search path (WindowsDecoder::initialize), > Arguments::get_java_home() is used to deduce the jdk bin directory > containing the jdk shared objects. This will crash if invoked before the > system properties are set. > - Decoder::decode() calls (via the Monitor lock) Thread::current(). This > means the code will not work during initialization and where > Thread::current is not set (e.g. in an unattached thread). Admittedly a bit > theoretical, as this only affects the shared WindowsDecoder object, which > is not used during error reporting. > > 4) WindowsDecoder::initialize(), pdb search path: Code uses MAX_PATH in > various places, which is wrong. NTFS paths can be longer than that. > - when calling SymGetSearchPath, code assumes a maximum path size of > MAX_PATH for the *combined* size of all directory names, which may be way > too small. Truncation is not handled, code will silently fail if output > buffer is too small, resulting in the existing pdb path to be overwritten > instead of being preserved. > - GetModuleFileName(): similar problem, output buffer is MAX_PATH len, > which may be too small. Truncation will lead to wrong results. > - Throughout the function "strncat" is used to assemble the search path, > but is used wrong and does not guard against buffer overflows (if that was > the intent - otherwise, why not just use strcat?). 
The last argument to > strncat should be the remaining space in the destination buffer (less one > byte for the terminating NUL), not the length of the source string. > > 5) WindowsDecoder::decode(): We call SymGetSymFromAddr64() using an output > buffer of size MAX_PATH - which makes no sense at all, as we are retrieving > a symbol name, not a file name. Truncation is not handled. This is actually > dangerous, because SymGetSymFromAddr64() handles truncations sloppily: it > will return success and fill the output buffer completely, so on truncation > the symbol name will not be zero terminated. > > In addition to the bugs, there are a number of things which could be done > better or simpler: > > a) Setting the search path could be simplified and made more useful if we > would just add the directory of every loaded module to the search path > (currently we just add the two jdk bin directories, which also exposes us > to initialization dependency, see (3)). This would be more useful as a > common convention is to put pdb files beside binaries, and that way we > would catch all those. Including, but not limited to, our own jdk pdb files. > > b) Dlls can be loaded and unloaded, and it would be nice to have an > updated pdb search path - e.g. in case a late-loaded third party JNI > library crashes. So, it would be nice to run (a) whenever a library is > loaded or unloaded, and for the process to be fast if nothing changed (e.g. > if the new DLL was loaded from a directory which is already part of the > search path). > > c) It would be nice to have file name and line number in the callstack, > too. > > d) As pointed out in JDK-8144855, the function Decoder::can_decode_C_frame_in_vm() > is not necessary. This should be handled in WindowsDecoder instead. > > --- > > What I did in this fix: > > - I pulled out dbghelp.dll handling into its own centralized singleton > (DbgHelpLoader). This class takes care of loading the Dll and synchronizes > access to all its functions, thus solving (1).
DbgHelpLoader now replaces > all places where before the DLL was loaded manually. > > - On top of DbgHelpLoader there is another new singleton class, > "SymbolEngine", which wraps the life cycle of the symbol APIs. This solves > problem (2). In addition to that, this class is the new interface to the > SymXX APIs. > > - this means that the existing code in decoder.cpp, which instantiates two > Decoder objects and synchronizes access to one of them, makes no sense for > Windows. That whole layer is now bypassed in Windows - the static > Decoder::xxx() functions are now directly implemented in > decoder_windows.cpp and access the SymbolEngine singleton without any added > layers in between. > > - SymbolEngine contains a new improved way to assemble the pdb search path > as described in (a) and (b). We iterate loaded modules, extract directories > and add them to the search path. Implementation takes care to not do > unnecessary work (avoiding changing the path if list of loaded modules is > unchanged, or if only Dlls from known directories were loaded). > > - We now recalculate the pdb search path if a DLL was loaded (via > os::dll_load). That makes it possible to debug third party dlls which do > not reside in jdk directories and were loaded after the SymbolEngine was > initialized. > > - When decoding a symbol from an unknown address and the decode fails, we > attempt to rebuild the pdb search path and attempt again. > > - Care was taken for this code to be robust. It may be called > pre-initialization, post-shutdown or under error conditions (low memory, > out of stack etc), so no VM infrastructure was used and memory use is > frugal (dynamic buffers allocated off the stack and reused where possible). > > - I did away with the "Decoder::can_decode_C_frame_in_vm()" code. See > argumentation in JDK-8144855. To me, it makes no sense.
The function is a > workaround the problem that, when faced with stripped binaries which only > have debug symbols for public functions, symbol info may be confusing > (public symbols + large offsets). But this is not done consistently, and > most programmers know not to trust symbols with large offsets. > > - Finally, we now have source info in the callstack :) > > Stack: [0x00840000,0x00890000], sp=0x0088f884, free space=318k > Native frames: (J=compiled Java code, A=aot compiled Java code, > j=interpreted, Vv=VM code, C=native code) > V [jvm.dll+0xa26903] VMError::controlled_crash+0x2a3 (vmerror.cpp:1683) > V [jvm.dll+0xa2664e] VMError::test_error_handler+0xe (vmerror.cpp:1632) > V [jvm.dll+0x6af796] JNI_CreateJavaVM_inner+0x196 (jni.cpp:3988) > V [jvm.dll+0x6af5c2] JNI_CreateJavaVM+0x52 (jni.cpp:4036) > V [jvm.dll+0x2cdfc] init_jvm+0xdc (gtestmain.cpp:94) > V [jvm.dll+0x2d18a] JVMInitializerListener::OnTestStart+0x7a > (gtestmain.cpp:114) > V [jvm.dll+0x829b] testing::internal::TestEventRepeater::OnTestStart+0x5b > (gtest.cc:2979) > V [jvm.dll+0x6874] testing::TestInfo::Run+0x54 (gtest.cc:2312) > V [jvm.dll+0x6d8f] testing::TestCase::Run+0xbf (gtest.cc:2445) > V [jvm.dll+0xbe9e] testing::internal::UnitTestImpl::RunAllTests+0x25e > (gtest.cc:4316) > V [jvm.dll+0x2a410] testing::internal::HandleSehExceptionsInMethodIfS > upported+0x40 (gtest.cc:2063) > V [jvm.dll+0x2493e] testing::internal::HandleExceptionsInMethodIfSupp > orted+0x5e (gtest.cc:2114) > V [jvm.dll+0xb0e9] testing::UnitTest::Run+0xe9 (gtest.cc:3929) > V [jvm.dll+0x2d0bf] RUN_ALL_TESTS+0xf (gtest.h:2289) > V [jvm.dll+0x2cc76] runUnitTestsInner+0x1f6 (gtestmain.cpp:249) > V [jvm.dll+0x2c958] runUnitTests+0x48 (gtestmain.cpp:319) > C [gtestLauncher.exe+0x1011] main+0x11 (gtestlauncher.cpp:32) > C [gtestLauncher.exe+0x1182] __tmainCRTStartup+0x122 (crtexe.c:555) > C [KERNEL32.DLL+0x138f4] > C [ntdll.dll+0x65de3] > C [ntdll.dll+0x65dae] > > .. 
which is implemented for Windows, but pipes are laid to plug in other > platforms as well (Decoder::get_source_info()) > > ---- > > Tell me what you think. > > Thanks, and Kind Regards, Thomas > > > > > From bob.vandette at oracle.com Wed Aug 16 17:32:51 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 16 Aug 2017 13:32:51 -0400 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM Message-ID: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM to be used by the Heap. This just changes two int variables to doubles. RFE: https://bugs.openjdk.java.net/browse/JDK-8186248 Webrev: http://cr.openjdk.java.net/~bobv/8186248 Bob. From ioi.lam at oracle.com Wed Aug 16 17:37:17 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 16 Aug 2017 10:37:17 -0700 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> Message-ID: <65162580-104e-d505-dae3-040bb116e40b@oracle.com> It looks good to me, but I am wondering if we need to file an CSR for this type of change? https://wiki.openjdk.java.net/display/csr/CSR+FAQs Arguably, some scripts might have expected =MaxRAMFraction3.0 to fail ..... Thanks - Ioi On 8/16/17 10:32 AM, Bob Vandette wrote: > Please review this simple two line fix which allows more flexibility in selecting the % of system RAM > to be used by the Heap. This just changes two int variables to doubles. > > RFE: > https://bugs.openjdk.java.net/browse/JDK-8186248 > > Webrev: > http://cr.openjdk.java.net/~bobv/8186248 > > Bob. 
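A small sketch of why changing the type from int to double matters, assuming the maximum heap is derived as physical RAM divided by the MaxRAMFraction value. The helper names are hypothetical and this is not the HotSpot code. With an integer divisor only 1/1, 1/2, 1/3, ... of RAM can be selected; a double divisor such as 1.5 selects two thirds.

```c
#include <stdint.h>

/* Hypothetical helper (illustration only): heap limit with an integer
 * divisor. The caller is assumed to pass fraction >= 1. */
static uint64_t max_heap_int(uint64_t phys_bytes, unsigned fraction) {
    return phys_bytes / fraction;
}

/* Hypothetical helper (illustration only): heap limit with a double
 * divisor. The caller is assumed to pass fraction >= 1.0; a value
 * below 1.0 would ask for more heap than there is physical RAM. */
static uint64_t max_heap_double(uint64_t phys_bytes, double fraction) {
    return (uint64_t)((double)phys_bytes / fraction);
}
```

For example, with 6 GiB of RAM, max_heap_double(6ULL << 30, 1.5) gives 4 GiB, a share that no integer divisor can express.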
> From bob.vandette at oracle.com Wed Aug 16 17:47:44 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 16 Aug 2017 13:47:44 -0400 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <65162580-104e-d505-dae3-040bb116e40b@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <65162580-104e-d505-dae3-040bb116e40b@oracle.com> Message-ID: Since it?s backward compatible for all working cases, I thought I could skip that. If anyone disagrees, please let me know. Bob. > On Aug 16, 2017, at 1:37 PM, Ioi Lam wrote: > > It looks good to me, but I am wondering if we need to file an CSR for this type of change? > > https://wiki.openjdk.java.net/display/csr/CSR+FAQs > > Arguably, some scripts might have expected =MaxRAMFraction3.0 to fail ..... > > Thanks > > - Ioi > > > On 8/16/17 10:32 AM, Bob Vandette wrote: >> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM >> to be used by the Heap. This just changes two int variables to doubles. >> >> RFE: >> https://bugs.openjdk.java.net/browse/JDK-8186248 >> >> Webrev: >> http://cr.openjdk.java.net/~bobv/8186248 >> >> Bob. >> > From david.holmes at oracle.com Wed Aug 16 20:53:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Aug 2017 06:53:40 +1000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> Message-ID: On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: > Hi, > > TestOptionWithRanges causes the vm on aix to crash on some machines. > This is because huge stack sizes are not treated properly. > > On linux, pthread_attr_setstacksize succeeds if called with huge values, but > pthread_create() then fails. 
On Aix, pthread_attr_setstacksize fails if > passed a value exceeding the limits and leaves the minimal system The AIX behaviour is more in spirit with POSIX. > thread stack size in attr. Thus thread creation succeeds and leads to > crashes after thread creation when the guard pages shall be protected > but don't fit on the tiny stack created. > > Please review this small, aix-only change. > http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.01/ I wonder whether a fatal error "Error occurred during initialization of VM" would not be better than just logging a warning? As Thomas notes, no need to discuss/describe what may or may not happen on other platforms. Cheers, David > Best regards, > Goetz. > From david.holmes at oracle.com Wed Aug 16 21:04:36 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Aug 2017 07:04:36 +1000 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> Message-ID: <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> Hi Bob, On 17/08/2017 3:32 AM, Bob Vandette wrote: > Please review this simple two line fix which allows more flexibility in selecting the % of system RAM > to be used by the Heap. This just changes two int variables to doubles. > > RFE: > https://bugs.openjdk.java.net/browse/JDK-8186248 > > Webrev: > http://cr.openjdk.java.net/~bobv/8186248 Wouldn't you also want/need to change the type of InitialRAMFraction? Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10. Thanks, David > Bob. 
> From bob.vandette at oracle.com Thu Aug 17 03:29:32 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 16 Aug 2017 23:29:32 -0400 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> Message-ID: <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> I saw that but wasn't sure it needed the added flexibility since its probably ok that initial sizes are 50% or less. Bob. > On Aug 16, 2017, at 5:04 PM, David Holmes wrote: > > Hi Bob, > >> On 17/08/2017 3:32 AM, Bob Vandette wrote: >> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM >> to be used by the Heap. This just changes two int variables to doubles. >> RFE: >> https://bugs.openjdk.java.net/browse/JDK-8186248 >> Webrev: >> http://cr.openjdk.java.net/~bobv/8186248 > > Wouldn't you also want/need to change the type of InitialRAMFraction? > > Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10. > > Thanks, > David > >> Bob. From david.holmes at oracle.com Thu Aug 17 04:36:07 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Aug 2017 14:36:07 +1000 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> Message-ID: <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> On 17/08/2017 1:29 PM, Bob Vandette wrote: > I saw that but wasn't sure it needed the added flexibility since its probably ok that initial sizes are 50% or less. I'd go for consistency. Also now you will need to guard against values < 1, I think. 
There may be an option checking test that will need updating as well. Cheers, David > Bob. > > >> On Aug 16, 2017, at 5:04 PM, David Holmes wrote: >> >> Hi Bob, >> >>> On 17/08/2017 3:32 AM, Bob Vandette wrote: >>> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM >>> to be used by the Heap. This just changes two int variables to doubles. >>> RFE: >>> https://bugs.openjdk.java.net/browse/JDK-8186248 >>> Webrev: >>> http://cr.openjdk.java.net/~bobv/8186248 >> >> Wouldn't you also want/need to change the type of InitialRAMFraction? >> >> Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10. >> >> Thanks, >> David >> >>> Bob. > From goetz.lindenmaier at sap.com Thu Aug 17 06:52:48 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 17 Aug 2017 06:52:48 +0000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> Message-ID: <5fffc76d634742e2ab513ad5c75d63e0@sap.com> Hi David, > I wonder whether a fatal error "Error occurred during initialization of > VM" would not be better than just logging a warning? I skip thread creation if the error code is != 0 and return false as it happens on linux. So you see the exact same behavior. Only I print the additional message about the stack size because that is missing from the pthread attr which is reported (as setting it failed). I removed the reference to linux, although I find it useful to point out such unexpected differences. http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.02/ Best regards, Goetz. 
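The approach described above (skip thread creation when setting the stack size fails, and report the requested size) can be sketched in plain C as follows. This is a minimal illustration using only POSIX calls, not the actual os_aix.cpp change; the names start_thread_with_stack and thread_main are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Trivial thread body, used only for the illustration. */
static void *thread_main(void *arg) { return arg; }

/* Hypothetical helper: returns 0 on success, otherwise the pthread
 * error code. The thread is not created if the stack size is rejected. */
static int start_thread_with_stack(size_t stack_bytes, pthread_t *out) {
    pthread_attr_t attr;
    int ret = pthread_attr_init(&attr);
    if (ret != 0) {
        return ret;
    }

    /* Check the result instead of ignoring it: if the call fails, attr
     * still holds whatever pthread_attr_init() put there, so a later
     * pthread_create() could succeed with an unintended stack size. */
    ret = pthread_attr_setstacksize(&attr, stack_bytes);
    if (ret != 0) {
        fprintf(stderr, "The thread stack size specified is invalid: %zu bytes (error %d)\n",
                stack_bytes, ret);
        pthread_attr_destroy(&attr);
        return ret;
    }

    ret = pthread_create(out, &attr, thread_main, NULL);
    if (ret != 0) {
        fprintf(stderr, "pthread_create failed (error %d) for stack size %zu bytes\n",
                ret, stack_bytes);
    }
    pthread_attr_destroy(&attr);
    return ret;
}
```

Whether an oversized value is rejected by pthread_attr_setstacksize() or only later by pthread_create() depends on the platform, as discussed in this thread; the sketch reports an error in either case.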
> -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Wednesday, August 16, 2017 10:54 PM > To: Lindenmaier, Goetz ; hotspot-runtime- > dev at openjdk.java.net > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes > > On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: > > Hi, > > > > TestOptionWithRanges causes the vm on aix to crash on some machines. > > This is because huge stack sizes are not treated properly. > > > > On linux, pthread_attr_setstacksize succeeds if called with huge values, but > > pthread_create() then fails. On Aix, pthread_attr_setstacksize fails if > > passed a value exceeding the limits and leaves the minimal system > > The AIX behaviour is more in spirit with POSIX. > > > thread stack size in attr. Thus thread creation succeeds and leads to > > crashes after thread creation when the guard pages shall be protected > > but don't fit on the tiny stack created. > > > > Please review this small, aix-only change. > > http://cr.openjdk.java.net/~goetz/wr17/8186293- > aixHugeStack/webrev.01/ > > I wonder whether a fatal error "Error occurred during initialization of > VM" would not be better than just logging a warning? > > As Thomas notes, no need to discuss/describe what may or may not happen > on other platforms. > > Cheers, > David > > > Best regards, > > Goetz. > > From goetz.lindenmaier at sap.com Thu Aug 17 06:53:52 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 17 Aug 2017 06:53:52 +0000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> Message-ID: Hi Thomas, Thanks for the review! I adapted the comment and removed the check after setting the guard size. Best regards, Goetz. 
From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] Sent: Wednesday, August 16, 2017 4:43 PM To: Lindenmaier, Goetz Cc: hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes Hi Goetz, thanks for this fix! Some small nits: "+ // On linux, pthread_attr_setstacksize succeeds with huge values, but + // pthread_create() fails. No need to mention Linux here. + On Aix, this fails and leaves the minimal system + // thread size in attr." We do not know this :) We only know it leaves the pthread_attr structure untouched. It was filled by pthread_attr_init() with default values. The default stack size may be the minimal possible size, but I rather doubt this. How about instead: "On Aix, this fails and leaves the pthread_attr structure untouched" - pthread_attr_setguardsize(&attr, os::Aix::default_guard_size(thr_type)); + ret = pthread_attr_setguardsize(&attr, os::Aix::default_guard_size(thr_type)); I would not bother checking the setting of the guard page. The only reason it even makes sense to set the guard page size is for java threads, where we set it to zero because we do not want to spend memory for a system guard page if we have our own guard pages. See os::Aix::default_guard_size. For all other threads, this is supposed to be one page. But as a stack page may be 4 or 64k, this is flaky and may fail. In fact, thinking about this, I think a better way to do this would be to call pthread_attr_setguardsize() only to disable system guard pages, and to otherwise live with the system guard page default. -- The rest is fine. I leave it up to you if you take my proposals. Do not need another webrev. ..Thomas On Wed, Aug 16, 2017 at 4:24 PM, Lindenmaier, Goetz wrote: Hi, TestOptionWithRanges causes the vm on aix to crash on some machines. This is because huge stack sizes are not treated properly. On linux, pthread_attr_setstacksize succeeds if called with huge values, but pthread_create() then fails.
On Aix, pthread_attr_setstacksize fails if passed a value exceeding the limits and leaves the minimal system thread stack size in attr. Thus thread creation succeeds and leads to crashes after thread creation when the guard pages shall be protected but don't fit on the tiny stack created. Please review this small, aix-only change. http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.01/ Best regards, Goetz. From thomas.stuefe at gmail.com Thu Aug 17 07:07:18 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Aug 2017 09:07:18 +0200 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: <5fffc76d634742e2ab513ad5c75d63e0@sap.com> References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> <5fffc76d634742e2ab513ad5c75d63e0@sap.com> Message-ID: On Thu, Aug 17, 2017 at 8:52 AM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi David, > > > I wonder whether a fatal error "Error occurred during initialization of > > VM" would not be better than just logging a warning? > I skip thread creation if the error code is != 0 and return false as it > happens on linux. So you see the exact same behavior. Only I > print the additional message about the stack size because that > is missing from the pthread attr which is reported (as setting it > failed). > Platforms seem to have different behaviour now: AIX debug (with your patch): if either one of pthread_attr_setstacksize(), pthread_attr_setguardsize or pthread_create fails, we log a warning and return an error. AIX release: ditto. Linux debug: if pthread_attr_setstacksize fails, we assert. We ignore errors from pthread_attr_setguardsize. If pthread_create fails, we log a warning and return an error. Linux release: We ignore errors from both pthread_attr_setstacksize and pthread_attr_setguardsize. If pthread_create fails, we log a warning and return an error. I think it would be nice to have consistent behavior for all platforms. 
Not sure if this needs to be done with this patch here. As for an "Error occurred during initialization" (debug only or release too?) - can we be sure this only happens at initialization? Could it not be possible for java threads with different stack sizes to be started later, and triggering the same error? I think I prefer to always have a runtime error (both in debug and release builds), as Goetz did in his current AIX patch. And maybe have a sensible Exception text. Kind Regards, Thomas > > I removed the reference to linux, although I find it useful to > point out such unexpected differences. > http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.02/ > > Best regards, > Goetz. > > > > > > -----Original Message----- > > From: David Holmes [mailto:david.holmes at oracle.com] > > Sent: Wednesday, August 16, 2017 10:54 PM > > To: Lindenmaier, Goetz ; hotspot-runtime- > > dev at openjdk.java.net > > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack > sizes > > > > On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: > > > Hi, > > > > > > TestOptionWithRanges causes the vm on aix to crash on some machines. > > > This is because huge stack sizes are not treated properly. > > > > > > On linux, pthread_attr_setstacksize succeeds if called with huge > values, but > > > pthread_create() then fails. On Aix, pthread_attr_setstacksize fails > if > > > passed a value exceeding the limits and leaves the minimal system > > > > The AIX behaviour is more in spirit with POSIX. > > > > > thread stack size in attr. Thus thread creation succeeds and leads to > > > crashes after thread creation when the guard pages shall be protected > > > but don't fit on the tiny stack created. > > > > > > Please review this small, aix-only change. > > > http://cr.openjdk.java.net/~goetz/wr17/8186293- > > aixHugeStack/webrev.01/ > > > > I wonder whether a fatal error "Error occurred during initialization of > > VM" would not be better than just logging a warning? 
> > > > As Thomas notes, no need to discuss/describe what may or may not happen > > on other platforms. > > > > Cheers, > > David > > > > > Best regards, > > > Goetz. > > > > From david.holmes at oracle.com Thu Aug 17 07:21:08 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Aug 2017 17:21:08 +1000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: <5fffc76d634742e2ab513ad5c75d63e0@sap.com> References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> <5fffc76d634742e2ab513ad5c75d63e0@sap.com> Message-ID: <820e69b6-156a-544a-d707-04a8930a551b@oracle.com> Hi Goetz, On 17/08/2017 4:52 PM, Lindenmaier, Goetz wrote: > Hi David, > >> I wonder whether a fatal error "Error occurred during initialization of >> VM" would not be better than just logging a warning? > I skip thread creation if the error code is != 0 and return false as it > happens on linux. So you see the exact same behavior. Only I Sorry I overlooked that. > print the additional message about the stack size because that > is missing from the pthread attr which is reported (as setting it > failed). > > I removed the reference to linux, although I find it useful to > point out such unexpected differences. > http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.02/ Seems fine to me. Thanks, David > Best regards, > Goetz. > > > > >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Wednesday, August 16, 2017 10:54 PM >> To: Lindenmaier, Goetz ; hotspot-runtime- >> dev at openjdk.java.net >> Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes >> >> On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> TestOptionWithRanges causes the vm on aix to crash on some machines. >>> This is because huge stack sizes are not treated properly. >>> >>> On linux, pthread_attr_setstacksize succeeds if called with huge values, but >>> pthread_create() then fails. 
On Aix, pthread_attr_setstacksize fails if >>> passed a value exceeding the limits and leaves the minimal system >> >> The AIX behaviour is more in spirit with POSIX. >> >>> thread stack size in attr. Thus thread creation succeeds and leads to >>> crashes after thread creation when the guard pages shall be protected >>> but don't fit on the tiny stack created. >>> >>> Please review this small, aix-only change. >>> http://cr.openjdk.java.net/~goetz/wr17/8186293- >> aixHugeStack/webrev.01/ >> >> I wonder whether a fatal error "Error occurred during initialization of >> VM" would not be better than just logging a warning? >> >> As Thomas notes, no need to discuss/describe what may or may not happen >> on other platforms. >> >> Cheers, >> David >> >>> Best regards, >>> Goetz. >>> From david.holmes at oracle.com Thu Aug 17 07:23:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Aug 2017 17:23:09 +1000 Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes In-Reply-To: References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> <5fffc76d634742e2ab513ad5c75d63e0@sap.com> Message-ID: <4612484d-6185-bf0a-2055-db22f0e6ffcf@oracle.com> Hi Thomas, On 17/08/2017 5:07 PM, Thomas St?fe wrote: > > On Thu, Aug 17, 2017 at 8:52 AM, Lindenmaier, Goetz > > wrote: > > Hi David, > > > I wonder whether a fatal error "Error occurred during initialization of > > VM" would not be better than just logging a warning? > I skip thread creation if the error code is != 0 and return false as it > happens on linux. So you see the exact same behavior. Only I > print the additional message about the stack size because that > is missing from the pthread attr which is reported (as setting it > failed). > > > Platforms seem to have different behaviour now: > > AIX debug (with your patch): if either one of > pthread_attr_setstacksize(), pthread_attr_setguardsize or pthread_create > fails, we log a warning and return an error. > AIX release: ditto. 
> > Linux debug: if pthread_attr_setstacksize fails, we assert. We ignore > errors from pthread_attr_setguardsize. If pthread_create fails, we log a > warning and return an error. > Linux release: We ignore errors from both pthread_attr_setstacksize and > pthread_attr_setguardsize. If pthread_create fails, we log a warning and > return an error. To me the asserts are there to catch basic usage errors that indicate a general programming bug (eg something uninitialized). They will also catch an "illegal argument" if the API detects that, but that is secondary. If pthread_create fails we do as you say, but if this is a system problem (like bad -Xss) then it will fail the first JavaThread creation and we will get the "Error occurred during initialization". > I think it would be nice to have consistent behavior for all platforms. Yes consistency would be nice but if one platform has more error checking then another then we'd have to drop assertions, even if we don't ever expect a failure on that platform. > Not sure if this needs to be done with this patch here. As for an "Error > occurred during initialization" (debug only or release too?) - can we be > sure this only happens at initialization? Could it not be possible for > java threads with different stack sizes to be started later, and > triggering the same error? I would expect a large stacksize passed to j.l.Thread constructor to also be able to trigger this. But bad -Xss/-XX:ThreadStackSize will cause initialization to fail. > I think I prefer to always have a runtime error (both in debug and > release builds), as Goetz did in his current AIX patch. And maybe have a > sensible Exception text. If pthread_create fails we have no detail as to exactly why. If pthread_attr_setstacksize does then we have a reasonable idea. So not sure what you would suggest. But in any case, as you suggest, this would all be a separate enhancement request. 
Cheers, David > Kind Regards, Thomas > > > I removed the reference to linux, although I find it useful to > point out such unexpected differences. > http://cr.openjdk.java.net/~goetz/wr17/8186293-aixHugeStack/webrev.02/ > > > Best regards, > ? Goetz. > > > > > > -----Original Message----- > > From: David Holmes [mailto:david.holmes at oracle.com > ] > > Sent: Wednesday, August 16, 2017 10:54 PM > > To: Lindenmaier, Goetz >; hotspot-runtime- > > dev at openjdk.java.net > > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge > stack sizes > > > > On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: > > > Hi, > > > > > > TestOptionWithRanges causes the vm on aix to crash on some > machines. > > > This is because huge stack sizes are not treated properly. > > > > > > On linux, pthread_attr_setstacksize succeeds if called with > huge values, but > > > pthread_create() then fails. On Aix, pthread_attr_setstacksize > fails if > > > passed a value exceeding the limits and leaves the minimal system > > > > The AIX behaviour is more in spirit with POSIX. > > > > > thread stack size in attr. Thus thread creation succeeds and > leads to > > > crashes after thread creation when the guard pages shall be > protected > > > but don't fit on the tiny stack created. > > > > > > Please review this small, aix-only change. > > > http://cr.openjdk.java.net/~goetz/wr17/8186293- > > > aixHugeStack/webrev.01/ > > > > I wonder whether a fatal error "Error occurred during > initialization of > > VM" would not be better than just logging a warning? > > > > As Thomas notes, no need to discuss/describe what may or may not > happen > > on other platforms. > > > > Cheers, > > David > > > > > Best regards, > > >? ? Goetz. 

From goetz.lindenmaier at sap.com Thu Aug 17 08:41:52 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 17 Aug 2017 08:41:52 +0000
Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes
In-Reply-To: <4612484d-6185-bf0a-2055-db22f0e6ffcf@oracle.com>
References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> <5fffc76d634742e2ab513ad5c75d63e0@sap.com> <4612484d-6185-bf0a-2055-db22f0e6ffcf@oracle.com>
Message-ID:

Hi Thomas,

the purpose of my fix is to make the behavior more similar.

For -XX:ThreadStackSize=40000000000 you get in both dbg and opt:

on aix:
  [0.226s][warning][os,thread] The thread stack size specified is invalid: 40000000000k
  [0.226s][warning][os,thread] Failed to start thread - pthread_create failed (22=EINVAL) for attributes: stacksize: 192k, guardsize: 4k, detached.
  Error occurred during initialization of VM
  java.lang.OutOfMemoryError...

on linux:
  [0.225s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 400000000k, guardsize: 0k, detached.
  Error occurred during initialization of VM
  java.lang.OutOfMemoryError...

Note the stack sizes reported.
I think the assertion on linux is only theoretical, as the function
succeeds setting even impossible values. What else should
go wrong? attr != NULL is obvious.

You must remove the upper range limit from ThreadStackSize to reproduce this easily.
In jdk9, this upper range limit is missing.

Best regards,
  Goetz.

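The difference described above comes down to whether the return code of pthread_attr_setstacksize is checked before pthread_create is called. The following is a minimal sketch of that pattern, assuming plain POSIX threads. The names `create_thread_checked` and `thread_entry` are hypothetical and chosen for illustration; this is not the code from the webrev.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Trivial thread body, used only for this illustration.
static void* thread_entry(void*) { return nullptr; }

// Hypothetical helper (not HotSpot code): returns true if a thread with the
// requested stack size could be created and joined, false otherwise.
static bool create_thread_checked(size_t stack_bytes) {
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) {
    return false;
  }

  // Check the return code. If it is ignored and the call fails, attr keeps
  // its previous stack size and pthread_create may still succeed with it.
  int rc = pthread_attr_setstacksize(&attr, stack_bytes);
  if (rc != 0) {
    fprintf(stderr, "The thread stack size specified is invalid: %zuk (%s)\n",
            stack_bytes / 1024, strerror(rc));
    pthread_attr_destroy(&attr);
    return false;  // skip thread creation entirely
  }

  pthread_t tid;
  rc = pthread_create(&tid, &attr, thread_entry, nullptr);
  pthread_attr_destroy(&attr);
  if (rc != 0) {
    fprintf(stderr, "Failed to start thread - pthread_create failed (%s)\n",
            strerror(rc));
    return false;
  }
  pthread_join(tid, nullptr);
  return true;
}
```

With this shape, a rejected size is reported with the value that was requested, instead of with whatever size happened to remain in attr.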
> -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Thursday, August 17, 2017 9:23 AM > To: Thomas St?fe ; Lindenmaier, Goetz > > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes > > Hi Thomas, > > On 17/08/2017 5:07 PM, Thomas St?fe wrote: > > > > On Thu, Aug 17, 2017 at 8:52 AM, Lindenmaier, Goetz > > > > wrote: > > > > Hi David, > > > > > I wonder whether a fatal error "Error occurred during initialization of > > > VM" would not be better than just logging a warning? > > I skip thread creation if the error code is != 0 and return false as it > > happens on linux. So you see the exact same behavior. Only I > > print the additional message about the stack size because that > > is missing from the pthread attr which is reported (as setting it > > failed). > > > > > > Platforms seem to have different behaviour now: > > > > AIX debug (with your patch): if either one of > > pthread_attr_setstacksize(), pthread_attr_setguardsize or pthread_create > > fails, we log a warning and return an error. > > AIX release: ditto. > > > > Linux debug: if pthread_attr_setstacksize fails, we assert. We ignore > > errors from pthread_attr_setguardsize. If pthread_create fails, we log a > > warning and return an error. > > Linux release: We ignore errors from both pthread_attr_setstacksize and > > pthread_attr_setguardsize. If pthread_create fails, we log a warning and > > return an error. > > To me the asserts are there to catch basic usage errors that indicate a > general programming bug (eg something uninitialized). They will also > catch an "illegal argument" if the API detects that, but that is secondary. > > If pthread_create fails we do as you say, but if this is a system > problem (like bad -Xss) then it will fail the first JavaThread creation > and we will get the "Error occurred during initialization". 

From thomas.stuefe at gmail.com Thu Aug 17 10:56:07 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 17 Aug 2017 12:56:07 +0200
Subject: RFR(S): 8186293: [aix] Fix thread creation with huge stack sizes
In-Reply-To:
References: <5b5ad95c56944ef0b9e976d3eb27164f@sap.com> <5fffc76d634742e2ab513ad5c75d63e0@sap.com> <4612484d-6185-bf0a-2055-db22f0e6ffcf@oracle.com>
Message-ID:

Hi Goetz,

as I said before I think the patch is basically fine. So, this is just
discussion for discussion's sake, which can be fun too :) See my answer inline.

On Thu, Aug 17, 2017 at 10:41 AM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:

> Hi Thomas,
>
> the purpose of my fix is to make the behavior more similar.
>
> For -XX:ThreadStackSize=40000000000 you get in both dbg and opt:
>
> on aix:
> [0.226s][warning][os,thread] The thread stack size specified is invalid: 40000000000k
> [0.226s][warning][os,thread] Failed to start thread - pthread_create failed (22=EINVAL) for attributes: stacksize: 192k, guardsize: 4k, detached.
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError...
>
> on linux:
> [0.225s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 400000000k, guardsize: 0k, detached.
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError...
>
> Note the stack sizes reported.
> I think the assertion on linux is only theoretical, as the function
> succeeds setting even impossible values. What else should
> go wrong? attr != NULL is obvious.
>

For one, the value may be too small. Does glibc pthread_attr_setstacksize()
also silently accept a too small value?

And then, this relies on the libc implementation we use. We may link against
another libc (musl, eglibc, ...). So, error behaviour may change, and we
should only rely on POSIX behaviour.

But all this is theoretical and consolidating all platforms may be addressed
by a follow-up fix.

> > You must remove the upper range limit from ThreadStackSize to reproduce > this easily. > In jdk9, this upper range limit is missing. > > Or write a java program with a user-set stack size? ..Thomas > Best regards, > Goetz. > > > > -----Original Message----- > > From: David Holmes [mailto:david.holmes at oracle.com] > > Sent: Thursday, August 17, 2017 9:23 AM > > To: Thomas St?fe ; Lindenmaier, Goetz > > > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with huge stack > sizes > > > > Hi Thomas, > > > > On 17/08/2017 5:07 PM, Thomas St?fe wrote: > > > > > > On Thu, Aug 17, 2017 at 8:52 AM, Lindenmaier, Goetz > > > > > > wrote: > > > > > > Hi David, > > > > > > > I wonder whether a fatal error "Error occurred during > initialization of > > > > VM" would not be better than just logging a warning? > > > I skip thread creation if the error code is != 0 and return false > as it > > > happens on linux. So you see the exact same behavior. Only I > > > print the additional message about the stack size because that > > > is missing from the pthread attr which is reported (as setting it > > > failed). > > > > > > > > > Platforms seem to have different behaviour now: > > > > > > AIX debug (with your patch): if either one of > > > pthread_attr_setstacksize(), pthread_attr_setguardsize or > pthread_create > > > fails, we log a warning and return an error. > > > AIX release: ditto. > > > > > > Linux debug: if pthread_attr_setstacksize fails, we assert. We ignore > > > errors from pthread_attr_setguardsize. If pthread_create fails, we log > a > > > warning and return an error. > > > Linux release: We ignore errors from both pthread_attr_setstacksize and > > > pthread_attr_setguardsize. If pthread_create fails, we log a warning > and > > > return an error. > > > > To me the asserts are there to catch basic usage errors that indicate a > > general programming bug (eg something uninitialized). 
> > > > > > > > > > > > > > > > -----Original Message----- > > > > From: David Holmes [mailto:david.holmes at oracle.com > > > ] > > > > Sent: Wednesday, August 16, 2017 10:54 PM > > > > To: Lindenmaier, Goetz > > >; hotspot-runtime- > > > > dev at openjdk.java.net > > > > Subject: Re: RFR(S): 8186293: [aix] Fix thread creation with > huge > > > stack sizes > > > > > > > > On 17/08/2017 12:24 AM, Lindenmaier, Goetz wrote: > > > > > Hi, > > > > > > > > > > TestOptionWithRanges causes the vm on aix to crash on some > > > machines. > > > > > This is because huge stack sizes are not treated properly. > > > > > > > > > > On linux, pthread_attr_setstacksize succeeds if called with > > > huge values, but > > > > > pthread_create() then fails. On Aix, pthread_attr_setstacksize > > > fails if > > > > > passed a value exceeding the limits and leaves the minimal > system > > > > > > > > The AIX behaviour is more in spirit with POSIX. > > > > > > > > > thread stack size in attr. Thus thread creation succeeds and > > > leads to > > > > > crashes after thread creation when the guard pages shall be > > > protected > > > > > but don't fit on the tiny stack created. > > > > > > > > > > Please review this small, aix-only change. > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186293- > > > > > > > aixHugeStack/webrev.01/ > > > > > > > > I wonder whether a fatal error "Error occurred during > > > initialization of > > > > VM" would not be better than just logging a warning? > > > > > > > > As Thomas notes, no need to discuss/describe what may or may not > > > happen > > > > on other platforms. > > > > > > > > Cheers, > > > > David > > > > > > > > > Best regards, > > > > > Goetz. > > > > > > > > > > > > From martin.doerr at sap.com Thu Aug 17 10:56:56 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 17 Aug 2017 10:56:56 +0000 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: Hi Thomas, looks good. Thanks. 
I also like that DestroyJavaVM looks more like CreateJavaVM, now.

Best regards,
  Martin

-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas Stüfe
Sent: Wednesday, 16 August 2017 16:18
To: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH

Ping.. could I please have a second review and a sponsor? Thank you!

Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.01/webrev/

..Thomas

On Mon, Aug 14, 2017 at 5:23 PM, Thomas Stüfe wrote:

> Dear all,
>
> please review this tiny fix:
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8186199
> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/
>
> We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal
> handler (well, SE handler) during the invocation of JNI_DestroyJavaVM.
>
> Thanks, Thomas
>

From coleen.phillimore at oracle.com Thu Aug 17 11:08:06 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 17 Aug 2017 07:08:06 -0400
Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH
In-Reply-To:
References:
Message-ID:

Hi,

I will sponsor this for you. The code looks ok to me but I'm not really a windows expert.

Coleen

On 8/16/17 10:18 AM, Thomas Stüfe wrote:
> Ping.. could I please have a second review and a sponsor? Thank you!
> > Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199- > destroyjavavm-no-seh-handler/webrev.01/webrev/ > > ..Thomas > > On Mon, Aug 14, 2017 at 5:23 PM, Thomas St?fe > wrote: > >> Dear all, >> >> please review this tiny fix: >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 >> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ >> 8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ >> >> We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal >> handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. >> >> Thanks, Thomas >> From thomas.stuefe at gmail.com Thu Aug 17 11:34:17 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Aug 2017 13:34:17 +0200 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: Danke :) On Aug 17, 2017 12:56, "Doerr, Martin" wrote: > Hi Thomas, > > looks good. Thanks. I also like that DestroyJavaVM looks more like > CreateJavaVM, now. > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Thomas St?fe > Sent: Mittwoch, 16. August 2017 16:18 > To: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by > SEH > > Ping.. could I please have a second review and a sponsor? Thank you! 
> > Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199- > destroyjavavm-no-seh-handler/webrev.01/webrev/ > > ..Thomas > > On Mon, Aug 14, 2017 at 5:23 PM, Thomas St?fe > wrote: > > > Dear all, > > > > please review this tiny fix: > > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 > > webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ > > 8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ > > > > We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal > > handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. > > > > Thanks, Thomas > > > From goetz.lindenmaier at sap.com Thu Aug 17 11:35:44 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 17 Aug 2017 11:35:44 +0000 Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> Message-ID: <2151e4417d4d4bf5b9e368db68c75db2@sap.com> Hi Thomas, I reworked the whole thing. First, there is dll_build_name. It just does -> lib.so. Second, I renamed the legacy dll_build_name to dll_locate_lib. I merged all the unix variants to one in os_posix. I removed the buffer overflow check at the top. It's too restrictive because the path argument can contain several paths. I added the overflow checks into the single cases. Also, I first assemble the pure name using the new, simple dll_build_name. This is for reuse and readability. In case of an empty directory, I use get_current_directory to complete the path as indicated by the original documentation where it was called with "". Dll_locate_lib now always returns a name with a full path if the file exists. Also, on windows, I think I fixed a bug by reversing the order of checks. A path list ending in ':' or '\' would not have been recognized. 

On Bsd, I removed JNI_LIB_* because that already is defined
in jvm_bsd.h

New webrev:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/

Best regards,
  Goetz.

Find some comments inline:

> Especially if the path is empty, it just returns 'true'.
> Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
> in well known unix/windows locations. This is intended in the two cases where dll_build_name
> is called with an empty path.
>
> So, for both cases (thread.cpp, jvmtiExport.cpp),
>
> before, we would call os::dll_build_name() with an empty string for the path
> which, for relative paths, would result in feeding that path unexpanded to
> dlopen(), which would use whatever the OS does in those cases (LIBPATH,
> LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
> include searching the current directory.
Right. With changed dll_build_name it's again exactly as before.

> With your change, we now use java.library.path, which is not necessarily the
> same?
You are right, I overlooked that java.library.path can be overwritten.
Initially, it's set to the right thing.

> (BTW, I think the old comments in thread.cpp and jvmtiExport.cpp were wrong:"//
> Try the local directory" - if "local" means "current", this is not what did
> happen).
Right, I tried to adapt them, did I miss one?

> I added a second variant of dll_build_name without the path argument that adds the path
> from system property java.library.path and use that in these two cases.
> I changed the original function to actually check file availability in all cases,
> and to check . if the path is empty.
> I think that may be a bit confusing. We would then have three options:
>
> - call os::dll_build_name with a real ";;.."
PATH and get a file name > resolved from that path > - call os::dll_build_name with "" for the PATH and get OS dll resolution No, in that case, as I called file_exists(), it would only work if the dll is in the current working directory. But I changed this now, anyways. > - call your new overloaded version of os::dll_build_name(), which uses - > Djava.library.path. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8186072- > dllBuildName/webrev.01/ dllBuildName/webrev.01/> > > Best regards, > Goetz. > > > > > Kind Regards, Thomas From coleen.phillimore at oracle.com Thu Aug 17 11:43:20 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 17 Aug 2017 07:43:20 -0400 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: References: Message-ID: <616ca92d-84e4-a366-2278-ba6c4a1e446a@oracle.com> On 8/17/17 7:34 AM, Thomas St?fe wrote: > Danke :) Bitte (?) according to google translate. I forgot that the repo is shut down. I'll check it in when it's ok (unless M-T when I'm out of the office). Coleen > > On Aug 17, 2017 12:56, "Doerr, Martin" wrote: > >> Hi Thomas, >> >> looks good. Thanks. I also like that DestroyJavaVM looks more like >> CreateJavaVM, now. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >> bounces at openjdk.java.net] On Behalf Of Thomas St?fe >> Sent: Mittwoch, 16. August 2017 16:18 >> To: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by >> SEH >> >> Ping.. could I please have a second review and a sponsor? Thank you! 
>> >> Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199- >> destroyjavavm-no-seh-handler/webrev.01/webrev/ >> >> ..Thomas >> >> On Mon, Aug 14, 2017 at 5:23 PM, Thomas St?fe >> wrote: >> >>> Dear all, >>> >>> please review this tiny fix: >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 >>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ >>> 8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ >>> >>> We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal >>> handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. >>> >>> Thanks, Thomas >>> From thomas.stuefe at gmail.com Thu Aug 17 12:14:14 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Aug 2017 14:14:14 +0200 Subject: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered by SEH In-Reply-To: <616ca92d-84e4-a366-2278-ba6c4a1e446a@oracle.com> References: <616ca92d-84e4-a366-2278-ba6c4a1e446a@oracle.com> Message-ID: On Thu, Aug 17, 2017 at 1:43 PM, wrote: > > > On 8/17/17 7:34 AM, Thomas St?fe wrote: > >> Danke :) >> > > Bitte (?) according to google translate. I forgot that the repo is shut > down. I'll check it in when it's ok (unless M-T when I'm out of the > office). > > :) thats correct. Thank you! Thomas > Coleen > > >> On Aug 17, 2017 12:56, "Doerr, Martin" wrote: >> >> Hi Thomas, >>> >>> looks good. Thanks. I also like that DestroyJavaVM looks more like >>> CreateJavaVM, now. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Thomas St?fe >>> Sent: Mittwoch, 16. August 2017 16:18 >>> To: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(xxs): 8186199: [windows] JNI_DestroyJavaVM not covered >>> by >>> SEH >>> >>> Ping.. could I please have a second review and a sponsor? Thank you! 
>>> >>> Current webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186199- >>> destroyjavavm-no-seh-handler/webrev.01/webrev/ >>> >>> ..Thomas >>> >>> On Mon, Aug 14, 2017 at 5:23 PM, Thomas St?fe >>> wrote: >>> >>> Dear all, >>>> >>>> please review this tiny fix: >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8186199 >>>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ >>>> 8186199-destroyjavavm-no-seh-handler/webrev.00/webrev/ >>>> >>>> We miss __try{ } __except in JNI_DestroyJavaVM, so we run without signal >>>> handler (well, SE handler) during the invocation of JNI_DestroyJavaVM. >>>> >>>> Thanks, Thomas >>>> >>>> > From thomas.stuefe at gmail.com Thu Aug 17 13:48:25 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Aug 2017 15:48:25 +0200 Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: <2151e4417d4d4bf5b9e368db68c75db2@sap.com> References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> <2151e4417d4d4bf5b9e368db68c75db2@sap.com> Message-ID: Hi Goetz, On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Hi Thomas, > > I reworked the whole thing. > > First, there is dll_build_name. It just does -> lib.so. > > Second, I renamed the legacy dll_build_name to dll_locate_lib. > > I merged all the unix variants to one in os_posix. > > I removed the buffer overflow check at the top. > It's too restrictive because the path argument > can contain several paths. I added the overflow > checks into the single cases. > > Also, I first assemble the pure name using the new, simple > dll_build_name. This is for reuse and readability. > > In case of an empty directory, I use get_current_directory > to complete the path as indicated by the original documentation > where it was called with "". > Dll_locate_lib now always returns a name with a full path if > the file exists. > > Also, on windows, I think I fixed a bug by reversing the order > of checks. 
A path list ending in ':' or '\' would not have
> been recognized.
>
> On Bsd, I removed JNI_LIB_* because that already is defined
> in jvm_bsd.h
>
> New webrev:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>
> Best regards,
> Goetz.
>

I like this better than before. Remarks:

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html

+ // Builds the platform-specific name of a library.
+ // Returns false on __buffer overflow__.

Hopefully not! :D How about: "Returns false on truncation" instead.

+ // Builds a platform-specific full library path given an ld path and lib name.
+ // Returns true if the buffer contains a full path to an existing file, false
+ // otherwise. If pathname is empty, checks the current directory.
+ static bool dll_locate_lib(char* buffer, size_t size, const char* pathname, const char* fname);

Might be worth mentioning that "fname" is the unadorned library name, e.g.
"verify" for libverify.so or verify.dll.

Would the following alternative be valid: one could make dll_locate_lib take
the real file name, and let the caller use dll_build_name() to build the
library name first before handing it to dll_locate_lib(). In that case,
dll_locate_lib() could be renamed to a generic "find_file_in_path" because it
would work for any kind of file. As an added bonus, there would be no need to
create a temporary array in dll_build_name/dll_locate_lib, and no need to call
free() so no cleanup-related control flow changes in these functions.

=====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html

+ int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);

int -> size_t (does that even compile without warning?)

+ // Check current working directory.
+ const char* p = get_current_directory(buffer, buflen);
+ if (p != NULL &&
+     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
+   strcat(buffer, "\\");
+   strcat(buffer, fullfname);
+   retval = file_exists(buffer);

Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
will make scanners (e.g. coverity) happy. One could then avoid the length
calculation and rely on jio_snprintf truncation:

  const char* p = get_current_directory(buffer, buflen);
  if (p != NULL) {
    const size_t end = strlen(p);
    if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
      retval = file_exists(buffer);
    }
  }

--

Not your change, but: why does the code in os::dll_locate_lib() even
differentiate between a PATH containing no os::path_separator() and a path
containing os::path_separator()? Would the former not be just a PATH with only
one directory and hence need no special treatment?

=====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html

Could os::dll_locate_lib be consolidated between windows and unix? The
implementation seems to be almost identical.

====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html

+ // not found - try library path

Proposal: "not found - try OS default library path"

Find some comments inline:
>
>
> > Especially if the path is empty, it just returns 'true'.
> > Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
> > in well known unix/windows locations. This is intended in the two cases where dll_build_name
> > is called with an empty path.
> > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > > > before, we would call os::dll_build_name() with an empty string for the > path > > which, for relative paths, would result in feeding that path unexpanded > to > > dlopen(), which would use whatever the OS does in those cases (LIBPATH, > > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily > > include searching the current directory. > Right. With changed dll_biuld_name it's again exactly as before. > > > With your change, we now use java.library.path, which is not necessarily > the > > same? > You are right, I oversaw that java.library.path can be overwritten. > Initially, > it's set to the right thing. > > > (BTW, I think the old comments in thread.cpp and jniExport.cpp were > wrong:"// > > Try the local directory" - if "local" means "current", this is not what > did > > happen). > Right, I tried to adapt them, did I miss one? > > > I added a second variant of dll_build_name without the path > argument that adds the path > > from system property java.lang.path and use that in these two > cases. > > I changed the original function to actually check file > availability in all cases, > > and to check . if the path is empty. > > I think that may be a bit confusing. We would then have three options: > > > > - call os::dll_build_name with a real ";;.." PATH and get a file > name > > resolved from that path > > - call os::dll_build_name with "" for the PATH and get OS dll resolution > No, in that case, as I called file_exists(), it would only work if the dll > is in the > current working directory. But I changed this now, anyways. > > > - call your new overloaded version of os::dll_build_name(), which uses - > > Djava.library.path. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > dllBuildName/webrev.01/ > dllBuildName/webrev.01/> > > > > Best regards, > > Goetz. 
> >
> > Kind Regards, Thomas
>
> Best Regards, Thomas

From goetz.lindenmaier at sap.com Thu Aug 17 16:03:13 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 17 Aug 2017 16:03:13 +0000
Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com>
 <2151e4417d4d4bf5b9e368db68c75db2@sap.com>
Message-ID: <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com>

Hi Thomas,

I adapted the comments in os.hpp.

If I moved the call to dll_build_name out of dll_locate_lib,
I would have to change the code in all the places where it is called.
That does not seem useful to me.

Fixed the type to size_t.

One could merge posix/windows by putting the check for ':'
into a WINDOWS_ONLY(), I guess. The check for \ could be
done in posix as well, if using file_separator().

> Not your change, but: why does the code in os::dll_locate_lib() even
> differentiate between a PATH containing no os::path_separator()
> and a path containing os::path_separator()?
I assume this was done to avoid all the allocations and copying of the path.

Also adapted the comment in jvmtiExport.cpp.

New webrev:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
incremental diff:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
(fixed indentation on windows)

Best regards,
  Goetz.

From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
Sent: Thursday, August 17, 2017 3:48 PM
To: Lindenmaier, Goetz
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.

Hi Goetz,

On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:

Hi Thomas,

I reworked the whole thing.

First, there is dll_build_name. It just does <name> -> lib<name>.so.

Second, I renamed the legacy dll_build_name to dll_locate_lib.
I merged all the unix variants to one in os_posix.
I removed the buffer overflow check at the top.
It's too restrictive because the path argument
can contain several paths. I added the overflow
checks into the single cases.

Also, I first assemble the pure name using the new, simple
dll_build_name. This is for reuse and readability.

In case of an empty directory, I use get_current_directory
to complete the path as indicated by the original documentation
where it was called with "".
Dll_locate_lib now always returns a name with a full path if
the file exists.

Also, on windows, I think I fixed a bug by reversing the order
of checks. A path list ending in ':' or '\' would not have
been recognized.

On Bsd, I removed JNI_LIB_* because that is already defined
in jvm_bsd.h.

New webrev:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/

Best regards,
  Goetz.

I like this better than before. Remarks:

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html

+ // Builds the platform-specific name of a library.
+ // Returns false on __buffer overflow__.

Hopefully not! :D
How about: "Returns false on truncation" instead.

+ // Builds a platform-specific full library path given an ld path and lib name.
+ // Returns true if the buffer contains a full path to an existing file, false
+ // otherwise. If pathname is empty, checks the current directory.
+ static bool dll_locate_lib(char* buffer, size_t size,
+                            const char* pathname, const char* fname);

Might be worth mentioning that "fname" is the unadorned library name, e.g.
"verify" for libverify.so or verify.dll.

Would the following alternative be valid:

one could make dll_locate_lib take the real file name, and let the caller use
dll_build_name() to build the library name first before handing it to
dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a
generic "find_file_in_path" because it would work for any kind of file.
As an added bonus, there would be no need to create a temporary array in
dll_build_name/dll_locate_lib, and no need to call free(), so no
cleanup-related control flow changes in these functions.

=====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html

+ int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);

int -> size_t (does that even compile without warning?)

+ // Check current working directory.
+ const char* p = get_current_directory(buffer, buflen);
+ if (p != NULL &&
+     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
+   strcat(buffer, "\\");
+   strcat(buffer, fullfname);
+   retval = file_exists(buffer);

Small nit: I'd use jio_snprintf instead of strcat. It is functionally identical
but will make scanners (e.g. Coverity) happy. One could then avoid the length
calculation and rely on jio_snprintf truncation:

    const char* p = get_current_directory(buffer, buflen);
    if (p != NULL) {
      const size_t end = strlen(p);
      // Append "\<fullfname>" behind the directory; -1 signals truncation.
      if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
        retval = file_exists(buffer);
      }
    }

--

Not your change, but: why does the code in os::dll_locate_lib() even
differentiate between a PATH containing no os::path_separator() and a path
containing os::path_separator()?

Would the former not be just a PATH with only one directory and hence need
no special treatment?

=====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html

Could os::dll_locate_lib be consolidated between windows and unix? The
implementations seem to be almost identical.

====

http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html

+ // not found - try library path

Proposal: "not found - try OS default library path"

Find some comments inline:

> Especially if the path is empty, it just returns 'true'.
> Dll_build_name is usually used before calling dll_load.
If dll_load does not get a full path it searches > in well known unix/windows locations. This is intended in the two cases where dll_build_name > is called with an empty path. > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > before, we would call os::dll_build_name() with an empty string for the path > which, for relative paths, would result in feeding that path unexpanded to > dlopen(), which would use whatever the OS does in those cases (LIBPATH, > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily > include searching the current directory. Right. With changed dll_biuld_name it's again exactly as before. > With your change, we now use java.library.path, which is not necessarily the > same? You are right, I oversaw that java.library.path can be overwritten. Initially, it's set to the right thing. > (BTW, I think the old comments in thread.cpp and jniExport.cpp were wrong:"// > Try the local directory" - if "local" means "current", this is not what did > happen). Right, I tried to adapt them, did I miss one? > I added a second variant of dll_build_name without the path argument that adds the path > from system property java.lang.path and use that in these two cases. > I changed the original function to actually check file availability in all cases, > and to check . if the path is empty. > I think that may be a bit confusing. We would then have three options: > > - call os::dll_build_name with a real ";;.." PATH and get a file name > resolved from that path > - call os::dll_build_name with "" for the PATH and get OS dll resolution No, in that case, as I called file_exists(), it would only work if the dll is in the current working directory. But I changed this now, anyways. > - call your new overloaded version of os::dll_build_name(), which uses - > Djava.library.path. > > Please review this change. I please need a sponsor. 
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
>
> Best regards,
>   Goetz.
>
> Kind Regards, Thomas

Best Regards, Thomas

From thomas.stuefe at gmail.com Thu Aug 17 17:54:02 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 17 Aug 2017 19:54:02 +0200
Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com>
References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com>
 <2151e4417d4d4bf5b9e368db68c75db2@sap.com>
 <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com>
Message-ID:

Hi Goetz,

On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:

> Hi Thomas,
>
> I adapted the comments in os.hpp.
>
> If I move the call to dll_build_name out of dll_locate_lib
> I have to do a lot of coding in all the places where it is called.
> That seems not useful to me.
>
> Fixed the type to size_t.
>
> One could merge posix/windows if putting the check for ':'
> into a WINDOWS_ONLY() I guess. The check for \ could be
> done in posix as well, if using file_separator().
>
> > Not your change, but: why does the code in os::dll_locate_lib() even
> > differentiate between a PATH containing no os::path_separator()
> > and a path containing os::path_separator()?
>
> I assume this was done to avoid all the allocations and copying of the
> path.
>
> Also adapted the comment in jvmtiExport.cpp.
>
> New webrev:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
> incremental diff:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
> (fixed indentation on windows)
>
> Best regards,
>   Goetz.
>

Comments in os.hpp seem unchanged? But it looks fine otherwise. I do not need another webrev.
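
[Illustrative note, not part of the webrev] The jio_snprintf suggestion quoted
in this thread can be sketched in portable C++ roughly as below. This is only a
sketch under stated assumptions: std::snprintf stands in for HotSpot's
jio_snprintf (the quoted snippet relies on jio_snprintf returning -1 on
truncation, whereas std::snprintf returns the length it would have written),
and append_file_name is a hypothetical helper name, not a function from the
webrev. The point it shows is that the write must start at buffer + end and
that truncation is detected from the return value instead of a manual length
calculation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: appends "<sep><name>" to the NUL-terminated string
// already stored in buffer (capacity buflen, including the terminator).
// Returns false and leaves the existing content untouched if it does not fit.
static bool append_file_name(char* buffer, size_t buflen, char sep, const char* name) {
  assert(buffer != NULL && name != NULL);
  const size_t end = strlen(buffer);   // assumes buffer is NUL-terminated
  if (end >= buflen) {
    return false;
  }
  // Write behind the existing content, never at the start of the buffer.
  const int n = snprintf(buffer + end, buflen - end, "%c%s", sep, name);
  if (n < 0 || (size_t)n >= buflen - end) {
    buffer[end] = '\0';                // undo the partial (truncated) write
    return false;
  }
  return true;
}
```

A caller would fill the buffer with the current directory first and only call
something like file_exists() on the buffer if the helper returned true.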
Thanks, Thomas > > > > > *From:* Thomas St?fe [mailto:thomas.stuefe at gmail.com] > *Sent:* Thursday, August 17, 2017 3:48 PM > *To:* Lindenmaier, Goetz > *Cc:* hotspot-runtime-dev at openjdk.java.net > *Subject:* Re: RFR(M): 8186072: dll_build_name returns true even if file > is missing. > > > > Hi Goetz, > > > > > > > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz < > goetz.lindenmaier at sap.com> wrote: > > Hi Thomas, > > I reworked the whole thing. > > First, there is dll_build_name. It just does -> lib.so. > > Second, I renamed the legacy dll_build_name to dll_locate_lib. > > I merged all the unix variants to one in os_posix. > > I removed the buffer overflow check at the top. > It's too restrictive because the path argument > can contain several paths. I added the overflow > checks into the single cases. > > Also, I first assemble the pure name using the new, simple > dll_build_name. This is for reuse and readability. > > In case of an empty directory, I use get_current_directory > to complete the path as indicated by the original documentation > where it was called with "". > Dll_locate_lib now always returns a name with a full path if > the file exists. > > Also, on windows, I think I fixed a bug by reversing the order > of checks. A path list ending in ':' or '\' would not have > been recognized. > > On Bsd, I removed JNI_LIB_* because that already is defined > in jvm_bsh.h > > New webrev: > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/ > > Best regards, > Goetz. > > > > I like this better than before. Remarks: > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/ > share/vm/runtime/os.hpp.udiff.html > > > > + // Builds the platform-specific name of a library. > > + // Returns false on __buffer overflow__. > > > > Hopefully not! :D > > How about: "Returns false no truncation" instead. > > > > > > + // Builds a platform-specific full library path given an ld path and > lib name. 
> > + // Returns true if the buffer contains a full path to an existing file, > false > > + // otherwise. If pathname is empty, checks the current directory. > > + static bool dll_locate_lib(char* buffer, size_t size, > > const char* pathname, const char* > fname); > > > > Might be worth mentioning that "fname" is the unadorned library name, e.g. > "verify" for libverify.so or verify.dll. > > > > Would the following alternative be valid: > > > > one could make dll_locate_lib take the real file name, and let caller use > dll_build_name() to build the libary name first before handing it to > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a > generic "find_file_in_path" because it would work for any kind of file. > > > > As an added bonus, there would be no need to create a temporary array in > dll_build_name/dll_locate_lib, and no need to call free() so no > cleanup-related control flow changes in these functions. > > > > ===== > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html > > > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + > strlen(JNI_LIB_SUFFIX); > > > > int -> size_t (does that even compile without warning?) > > > > + // Check current working directory. > > + const char* p = get_current_directory(buffer, buflen); > > + if (p != NULL && > > + strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) { > > + strcat(buffer, "\\"); > > + strcat(buffer, fullfname); > > + retval = file_exists(buffer); > > > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical > but will make scanners (e.g. coverity) happy. 
One could then avoid the > length calculation and rely on jio_snprintf truncation: > > > > const char* p = get_current_directory(buffer, buflen); > > if (p != NULL) { > > const size_t end = strlen(p); > > if (jio_snprintf(end, buflen - end, "\\%s", fullname) != -1) { > > retval = file_exists(buffer); > > } > > } > > > > -- > > > > Not your change, but: why does the code in os::dll_locate_lib() even > differentiate between a PATH containing no os::path_separator() and a path > containing os::path_separator()? > > > > Would the former not be just a PATH with only one directory and hence need > no special treatment? > > > > ===== > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > > > > Could os::dll_locate_lib be consolidated between windows and unix? Seems > to be the implementation is almost identical. > > > > ==== > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/ > share/vm/prims/jvmtiExport.cpp.udiff.html > > > > + // not found - try library path > > > > Proposal: "not found - try OS default library path" > > > > > > Find some comments inline: > > > > Especially if the path is empty, it just returns 'true'. > > Dll_build_name is usually used before calling dll_load. If > dll_load does not get a full path it searches > > in well known unix/windows locations. This is intended in the two > cases where dll_build_name > > is called with an empty path. > > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > > > before, we would call os::dll_build_name() with an empty string for the > path > > which, for relative paths, would result in feeding that path unexpanded > to > > dlopen(), which would use whatever the OS does in those cases (LIBPATH, > > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily > > include searching the current directory. > Right. With changed dll_biuld_name it's again exactly as before. 
> > > With your change, we now use java.library.path, which is not necessarily > the > > same? > You are right, I oversaw that java.library.path can be overwritten. > Initially, > it's set to the right thing. > > > (BTW, I think the old comments in thread.cpp and jniExport.cpp were > wrong:"// > > Try the local directory" - if "local" means "current", this is not what > did > > happen). > Right, I tried to adapt them, did I miss one? > > > I added a second variant of dll_build_name without the path > argument that adds the path > > from system property java.lang.path and use that in these two > cases. > > I changed the original function to actually check file > availability in all cases, > > and to check . if the path is empty. > > I think that may be a bit confusing. We would then have three options: > > > > - call os::dll_build_name with a real ";;.." PATH and get a file > name > > resolved from that path > > - call os::dll_build_name with "" for the PATH and get OS dll resolution > No, in that case, as I called file_exists(), it would only work if the dll > is in the > current working directory. But I changed this now, anyways. > > > - call your new overloaded version of os::dll_build_name(), which uses - > > Djava.library.path. > > > > Please review this change. I please need a sponsor. > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > dllBuildName/webrev.01/ > dllBuildName/webrev.01/> > > > > > Best regards, > > Goetz. 
> >
> > Kind Regards, Thomas
> >
> > Best Regards, Thomas
>

From bob.vandette at oracle.com Thu Aug 17 19:47:02 2017
From: bob.vandette at oracle.com (Bob Vandette)
Date: Thu, 17 Aug 2017 15:47:02 -0400
Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM
In-Reply-To: <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com>
References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com>
 <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com>
 <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com>
 <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com>
Message-ID: <39369944-5A6C-4FA4-830F-0899DA0045AF@oracle.com>

> On Aug 17, 2017, at 12:36 AM, David Holmes wrote:
>
> On 17/08/2017 1:29 PM, Bob Vandette wrote:
>> I saw that but wasn't sure it needed the added flexibility since it's probably ok that initial sizes are 50% or less.
>
> I'd go for consistency.

I'm going to wait until the CSR decision is finalized. I may be adding a new flag which won't be a float value.

>
> Also now you will need to guard against values < 1, I think.

It looks like there's a range option to our flags now. I agree we need the range to be 1 to MAX_DBL.

>
> There may be an option checking test that will need updating as well.

I looked at these and they don't get impacted by the float change. If I need to add a new flag, then I'll have to add a test or 2.

Bob.

>
> Cheers,
> David
>
>> Bob.
>>> On Aug 16, 2017, at 5:04 PM, David Holmes wrote:
>>>
>>> Hi Bob,
>>>
>>>> On 17/08/2017 3:32 AM, Bob Vandette wrote:
>>>> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM
>>>> to be used by the Heap. This just changes two int variables to doubles.
>>>> RFE:
>>>> https://bugs.openjdk.java.net/browse/JDK-8186248
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~bobv/8186248
>>>
>>> Wouldn't you also want/need to change the type of InitialRAMFraction?
>>>
>>> Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10.
>>>
>>> Thanks,
>>> David
>>>
>>>> Bob.

From mikhailo.seledtsov at oracle.com Thu Aug 17 20:40:37 2017
From: mikhailo.seledtsov at oracle.com (mikhailo)
Date: Thu, 17 Aug 2017 13:40:37 -0700
Subject: RFR(XXS): 8186308: [TESTBUG] quarantine jvmti/LoadAgentDcmdTest.java test until underlying issue is fixed
Message-ID: <360a5b69-0d5c-f6b0-f993-cf3f674c5357@oracle.com>

Please review this one-liner that puts the offending test on the problem list
until the root cause is resolved.

    Webrev: http://cr.openjdk.java.net/~mseledtsov/8186308.01/
    Testing: Ran the affected test

Thank you,
Misha

From george.triantafillou at oracle.com Thu Aug 17 21:08:24 2017
From: george.triantafillou at oracle.com (George Triantafillou)
Date: Thu, 17 Aug 2017 17:08:24 -0400
Subject: RFR(XXS): 8186308: [TESTBUG] quarantine jvmti/LoadAgentDcmdTest.java test until underlying issue is fixed
In-Reply-To: <360a5b69-0d5c-f6b0-f993-cf3f674c5357@oracle.com>
References: <360a5b69-0d5c-f6b0-f993-cf3f674c5357@oracle.com>
Message-ID: <19c179e1-f715-4bc3-1b98-5b4d816006b1@oracle.com>

Hi Misha,

Looks good.

-George

On 8/17/2017 4:40 PM, mikhailo wrote:
> Please review this one-liner that puts offending test on a problem
> list until the root cause is resolved.
>
>     Webrev: http://cr.openjdk.java.net/~mseledtsov/8186308.01/
>     Testing: Ran the affected test
>
>
> Thank you,
> Misha
>

From thomas.stuefe at gmail.com Fri Aug 18 07:23:50 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 18 Aug 2017 09:23:50 +0200
Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
Message-ID:

Dear all,

may I please have a review for this change:

Issue: https://bugs.openjdk.java.net/browse/JDK-8186349
Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/

This is part of ongoing work to make error reporting (especially
callstacks) on Windows more reliable.

At first I did a rather large patch, see:
https://bugs.openjdk.java.net/browse/JDK-8185712 and
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html .
But after discussing this patch with Ioi, I saw that it is better split up
into multiple parts for easier reviewing. So this is the first split-up patch.

--

This patch centralizes handling of the dbghelp.dll (loading the library,
resolving function pointers and synchronizing access).

This solves the problem that accesses to functions exported from the
dbghelp.dll need to be synchronized, as stated by MSDN. We somehow
never really cared. I guess it never caused visible trouble, because most of
the time (not always) the functions are accessed from VMError::report(), and
chances of parallel access from other non-error-reporting threads are slim.
Even if it were to crash, secondary error handling would step in and write
an "Error occurred during error reporting" or "Another thread had an error
too" message and we would probably just shrug it off.

But as this whole effort is about increasing the chance of useful callstacks
in hs-err files, I'd like to fix this.

In addition to the fix, I think this is also a nice cleanup and removes
duplicate code.

Notes:

1) Robustness: We may or may not find a dbghelp.dll on the target system.
If we find it, it may be old or new (it is not tightly coupled with the OS,
may be part of other installation packages, may exist multiple times etc).
We should handle older versions of the dbghelp dll gracefully and hide all
that complexity from the caller.

2) The new DbgHelpLoader class does not export any state indicating whether
or not it successfully loaded, and, if it loaded, which functions are actually
available. That was a deliberate decision; there is no need for the caller
to know this. The caller should invoke the DbgHelpLoader functions as if they
were the equivalent OS functions and handle returned errors. DbgHelpLoader
should never crash or assert; missing functions should behave like failing
functions.

3) However, I added a one-liner to the hs-err file indicating the state of
the dbghelp dll: version info, which functions were missing etc. This may
help with understanding weird or missing callstacks.

4) I removed the implementation for shutdown (WindowsDecoder::shutdown). I
think there is no valid reason to ever shut down the decoder. For one, we may
crash right at the end, and it would still be nice to have callstacks. And
then, why spend cycles shutting down the decoder when we could just let it
end with the process?

5) This code gets used during error reporting. So, to avoid circular crashes
and VM initialization dependencies, no VM infrastructure may be used. To
synchronize, this code therefore uses raw windows CriticalSection objects.

--

The next step will be revamping the handling of the Symbol APIs. This will
involve removing the WindowsDecoder class, which introduces other errors and
really makes no sense if the underlying dbghelp layer does its own
synchronization.

Thanks for reviewing!
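
[Illustrative note, not part of the webrev] The design described in notes 2
and 5 could be sketched roughly as below. This is a portable sketch under
stated assumptions, not the code from the patch: std::mutex stands in for the
raw Windows CriticalSection, and LoaderSketch, symbol_lookup_fn,
fake_symbol_lookup and initialize(bool) are hypothetical names that replace
the real library loading and export resolution. What it shows is the calling
convention: every call is serialized, and a missing export simply behaves
like a failing call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>

// Hypothetical signature standing in for one function exported by the library.
typedef bool (*symbol_lookup_fn)(const void* addr, char* out, size_t outlen);

// Hypothetical stand-in for an export that was found and resolved.
static bool fake_symbol_lookup(const void* addr, char* out, size_t outlen) {
  assert(out != NULL && outlen > 0);
  return std::snprintf(out, outlen, "symbol@%p", addr) > 0;
}

class LoaderSketch {
  std::mutex _lock;          // stand-in for a raw Windows CriticalSection
  symbol_lookup_fn _lookup;  // stays NULL if the library or export is missing

 public:
  LoaderSketch() : _lookup(NULL) {}

  // Stand-in for loading the library and resolving its exports.
  void initialize(bool export_available) {
    std::lock_guard<std::mutex> guard(_lock);
    _lookup = export_available ? fake_symbol_lookup : NULL;
  }

  // Called as if it were the OS function. Access is serialized, and a
  // missing export is reported as an ordinary failure, never as a crash.
  bool symbol_lookup(const void* addr, char* out, size_t outlen) {
    if (out == NULL || outlen == 0) {
      return false;
    }
    std::lock_guard<std::mutex> guard(_lock);
    if (_lookup == NULL) {
      return false;
    }
    return _lookup(addr, out, outlen);
  }
};
```

With this shape the caller never has to ask whether the library loaded; it
only checks the return value of each call, as it would for the OS function.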
Kind Regards, Thomas

From robbin.ehn at oracle.com Mon Aug 21 08:39:18 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 21 Aug 2017 10:39:18 +0200
Subject: RFR: Parallelize safepoint cleanup
In-Reply-To:
References: <61d80e98-275f-b2b8-4ac7-6d5d03b047de@redhat.com>
 <0e44cc90-b384-0820-93d6-a70d22c501c3@oracle.com>
 <20E06CEC-38CA-41AE-99DB-17EF22A3C5CC@oracle.com>
 <58f2278e-b95c-4ec2-4f7d-9fefa3a281e4@redhat.com>
 <623c0dbf-9210-7c63-3546-4314c7d47f85@redhat.com>
 <29521e46-a5e8-5ff0-23a2-22eeee145389@oracle.com>
 <4445a727-060b-70f9-c8db-e9f70faae3d5@redhat.com>
 <37755fec-05b9-8d2c-7eb9-8849393c7485@oracle.com>
 <57cddde0-60e6-366e-489e-f6f9534e3ed9@redhat.com>
 <6fa761bc-8feb-74e6-9a54-8a65ab81203b@oracle.com>
 <5af9855a-652e-64f0-af83-e8f5962247ca@oracle.com>
 <07a5bf0a-02fa-7a8c-35be-813f5207cb0c@oracle.com>
Message-ID: <50073958-c2af-a054-b614-e4015e6b9315@oracle.com>

Hi David,

On 08/02/2017 03:38 AM, David Holmes wrote:
> Catching up after my vacation ...

Same

>
> It isn't clear to me that the change of _stack_traversal_mark from long to jlong is suitable. Should this really be 64-bit on a 32-bit system? And given it is set from the
> traversal_count which is still a plain long, this change just seems wrong to me.

The jlong was just because of our API for OrderAccess.

> If anything acquire/release semantics should have been added to the _state variable though that would also not have had any
> bearing on the storestore that was removed - AFAICS.

I can't recall my train of thought; looking at it now, I agree with you.

I have some other stuff that requires my immediate attention, but I'll send out patches addressing the issues you raised, thanks!

/Robbin

>
> Cheers,
> David
>
>
> On 20/07/2017 8:53 PM, Roman Kennke wrote:
>> Hi all,
>>
>> Robbin found some more missing includes in jprt testing (thanks!!)
>>
>> Differential:
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/
>>
>> Full:
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/
>>
>>
>> Am I breaking the record for most webrev revisions? :-P
>>
>> According to Robbin, builds are now all clean.
>>
>> Can I get final reviews and then a sponsor?
>>
>> Thanks,
>> Roman
>>
>> On 16.07.2017 at 10:25, Robbin Ehn wrote:
>>> Hi Roman,
>>>
>>> On 2017-07-12 15:32, Roman Kennke wrote:
>>>> Hi Robbin and all,
>>>>
>>>> I fixed the 32bit failures by using jlong in all relevant places:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
>>>>
>>>>
>>>> then Robbin found another problem. SafepointCleanupTest started to fail,
>>>> because "mark nmethods" is no longer printed. This made me think that
>>>> we're not measuring the conflated (and possibly parallelized)
>>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
>>>> "safepoint cleanup tasks" which measures the total duration of safepoint
>>>> cleanup. We can't reasonably measure a possibly parallel and conflated
>>>> pass standalone, but we can measure the total and, by subtracting all the
>>>> other subphases, get an idea how long deflation and nmethod marking take.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>>>>
>>>>
>>>> The full webrev is now:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>>>>
>>>>
>>>> Hope that's all ;-)
>>>
>>> With this changeset something always pops up.
>>>
>>> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
>>> >>> >>> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ >>> >>> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS >>> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE >>> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions >>> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 >>> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 >>> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN >>> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef >>> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS >>> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 >>> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 >>> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED >>> -DINCLUDE_AOT >>> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm >>> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(num_cleanup_workers); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: >>> error: variable has incomplete type 'StrongRootsScope' >>> StrongRootsScope srs(1); >>> ^ >>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: >>> note: forward declaration of 'StrongRootsScope' >>> class StrongRootsScope; >>> ^ >>> 2 errors generated. 
>>> make[3]: *** >>> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] >>> Error 1 >>> make[3]: *** Waiting for unfinished jobs.... >>> make[2]: *** [hotspot-server-libs] Error 2 >>> >>> Send me the new webrev and I'll test it before the 16th round of >>> review :) >>> >>> /Robbin >>> >>>> >>>> Roman >>>> >>>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn: >>>>> Hi, unfortunately the push failed on 32-bit. >>>>> >>>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty) >>>>> >>>>> I do not have anytime to look at this, so here is the error. >>>>> >>>>> /Robbin >>>>> >>>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make' >>>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'long int nmethod::stack_traversal_mark()': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> error: call of overloaded 'load_acquire(volatile long int*)' is >>>>> ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: static jint OrderAccess::load_acquire(const volatile jint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile jint* {aka const volatile int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: static juint OrderAccess::load_acquire(const volatile juint*) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'const volatile juint* {aka const volatile unsigned int*}' >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In >>>>> member function 'void nmethod::set_stack_traversal_mark(long int)': >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> error: call of overloaded 'release_store(volatile long int*, long >>>>> int&)' is ambiguous >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105: >>>>> >>>>> note: candidates are: >>>>> In file included from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28, >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29, >>>>> >>>>> from >>>>> 
/opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33, >>>>> >>>>> from >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28: >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile jint*, jint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile jint* {aka volatile int*}' >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: static void OrderAccess::release_store(volatile juint*, juint) >>>>> >>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17: >>>>> >>>>> note: no known conversion for argument 1 from 'volatile long int*' >>>>> to 'volatile juint* {aka volatile unsigned int*}' >>>>> >>>>> On 2017-07-10 20:50, Robbin Ehn wrote: >>>>>> I'll start a push now. >>>>>> >>>>>> /Robbin >>>>>> >>>>>> On 2017-07-10 12:38, Roman Kennke wrote: >>>>>>> Ok, so I guess I need a sponsor for this now: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>> >>>>>>> >>>>>>> Roman >>>>>>> >>>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov: >>>>>>>> >>>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn >>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hi Roman, >>>>>>>>> >>>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote: >>>>>>>>>> Hi Robbin, >>>>>>>>>>> >>>>>>>>>>> Far down -> >>>>>>>>>>> >>>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not happy about this change: >>>>>>>>>>>>> >>>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() { >>>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp >>>>>>>>>>>>> NMethodSweeper::mark_active_nmethods(). >>>>>>>>>>>>> + // TODO: Is this really needed? 
>>>>>>>>>>>>> + OrderAccess::storestore(); >>>>>>>>>>>>> + } >>>>>>>>>>>>> >>>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be >>>>>>>>>>>>> consistent >>>>>>>>>>>>> with an OrderAccess::storestore() that's not properly >>>>>>>>>>>>> documented >>>>>>>>>>>>> which is only increasing the technical debt. >>>>>>>>>>>>> >>>>>>>>>>>>> So a couple of things above don't make sense to me: >>>>>>>>>>>>> >>>>>>>>>>>>>> - sweeper thread runs outside safepoint >>>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case >>>>>>>>>>>>>> that >>>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the >>>>>>>>>>>>>> sweeper) >>>>>>>>>>>>>> is holding still. >>>>>>>>>>>>> >>>>>>>>>>>>> and: >>>>>>>>>>>>> >>>>>>>>>>>>>> There should be no need for a storestore() (at least in >>>>>>>>>>>>>> sweeper.cpp... >>>>>>>>>>>> >>>>>>>>>>>> Either one or the other are running. Either the VMThread is >>>>>>>>>>>> marking >>>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running >>>>>>>>>>>> (outside >>>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed >>>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no >>>>>>>>>>>> storestore() >>>>>>>>>>>> should be necessary. >>>>>>>>>>>> >>>>>>>>>>>> From Igor's comment I can see how it happened though: >>>>>>>>>>>> Apparently >>>>>>>>>>>> there >>>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent >>>>>>>>>>>> with >>>>>>>>>>>> compiler threads, as far as I understand). And there's a call to >>>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is >>>>>>>>>>>> required >>>>>>>>>>>> (as per Igor's explanation). 
So the logic probably was: we have >>>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's >>>>>>>>>>>> also put >>>>>>>>>>>> a storestore() in the other places that call >>>>>>>>>>>> mark_as_seen_on_stack(), >>>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're >>>>>>>>>>>> discussing. (why the storestore() hasn't been put right into >>>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one >>>>>>>>>>>> storestore() >>>>>>>>>>>> really was necessary, the other looks like it has been put there >>>>>>>>>>>> 'for >>>>>>>>>>>> consistency' or just conservatively. But it shouldn't be >>>>>>>>>>>> necessary in >>>>>>>>>>>> the safepoint cleanup code that we're discussing. >>>>>>>>>>>> >>>>>>>>>>>> So what should we do? Remove the storestore() for good? >>>>>>>>>>>> Refactor the >>>>>>>>>>>> code so that both paths at least call the storestore() in the >>>>>>>>>>>> same >>>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and >>>>>>>>>>>> call >>>>>>>>>>>> storestore() in the dtor as proposed?) >>>>>>>>>>> >>>>>>>>>>> I took a quick look, maybe I'm missing some stuff but: >>>>>>>>>>> >>>>>>>>>>> So there is a slight optimization when not running sweeper to >>>>>>>>>>> skip >>>>>>>>>>> compiler barrier/fence in stw. >>>>>>>>>>> >>>>>>>>>>> Don't think that matter, so I propose something like: >>>>>>>>>>> - long stack_traversal_mark() { return >>>>>>>>>>> _stack_traversal_mark; } >>>>>>>>>>> - void set_stack_traversal_mark(long l) { >>>>>>>>>>> _stack_traversal_mark = l; } >>>>>>>>>>> + long stack_traversal_mark() { return >>>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); } >>>>>>>>>>> + void set_stack_traversal_mark(long l) { >>>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); } >>>>>>>>>>> >>>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking >>>>>>>>>>> that >>>>>>>>>>> it is concurrent accessed. >>>>>>>>>>> And remove both storestore. 
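The accessor change proposed above (an acquiring load on read, a releasing store on write, and no separate storestore() barriers) can be sketched outside HotSpot as follows. This is only a minimal illustration that assumes std::atomic as a stand-in for OrderAccess::load_acquire/release_store; the class name NMethodSketch is hypothetical and is not the real nmethod. It uses a fixed 64-bit type for the mark, in line with the remark elsewhere in the thread that the field should be jlong rather than long.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for nmethod: only the field discussed in the thread.
class NMethodSketch {
  // Fixed-width 64-bit mark, so the accessors behave the same on 32-bit
  // and 64-bit targets (plain 'long' differs in width between them).
  std::atomic<std::int64_t> _stack_traversal_mark{0};

 public:
  // Acquire load: reads made by this thread after the load cannot be
  // reordered before it.
  std::int64_t stack_traversal_mark() const {
    return _stack_traversal_mark.load(std::memory_order_acquire);
  }

  // Release store: writes made by this thread before the store become
  // visible to any thread that observes the new mark with an acquire load.
  void set_stack_traversal_mark(std::int64_t l) {
    _stack_traversal_mark.store(l, std::memory_order_release);
  }
};
```

With the ordering attached to the accessors themselves, callers do not need to remember to issue a barrier after marking.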
>>>>>>>>>>> >>>>>>>>>>> "Also neither of these state variables are volatile in >>>>>>>>>>> nmethod, so >>>>>>>>>>> even the compiler may reorder the stores" >>>>>>>>>>> Fortunately at least _state is volatile now. >>>>>>>>>>> >>>>>>>>>>> I think _state also should use la/rs semantics instead, but >>>>>>>>>>> that's >>>>>>>>>>> another story. >>>>>>>>>> Like this? >>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, exactly, I like this! >>>>>>>>> Dan? Igor ? Tobias? >>>>>>>>> >>>>>>>> >>>>>>>> That seems correct. >>>>>>>> >>>>>>>> igor >>>>>>>> >>>>>>>>> Thanks Roman! >>>>>>>>> >>>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow >>>>>>>>> this >>>>>>>>> thread/changeset to the end! >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>>> Roman >>>>>>>> >>>>>>> >>>> >> From ioi.lam at oracle.com Mon Aug 21 18:30:52 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 21 Aug 2017 11:30:52 -0700 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: Message-ID: Hi Thomas, The changes look good. I have a few style nit picks: CritSect is currently only allocated statically (once), but I wonder if it makes sense, for completeness, to add a destructor that calls DeleteCriticalSection, just in case someone may use it dynamically in the future? Also, there's repetition of this pattern without explanation: 131 CritSectLocker lck(&g_cs); 132 if (initialize_if_needed()) { I wonder if it's better to do this, so we can put the comments at a place that's specific to DbgHelper (and also remove one level of nested "if"): DbgHelperLocker lock; if (lock.is_initialized() && g_pfn_SymGetSearchPath != NULL) { ..... } // Functions from dbghelp.dll are not threadsafe; calls to them must be synchronized. class DbgHelperLocker ..... { .... 
bool is_initialized() { return initialize_if_needed(); } } Also, can DbgHelpLoader be renamed to WindowsDbgHelp, since it doesn't do just loading the DLL but also handles the invocation? 57 #define FOR_ALL_FUNCTIONS(DO) \ 58 DO(ImagehlpApiVersion) \ 59 DO(SymSetOptions) \ 60 DO(SymInitialize) \ 61 DO(SymGetSymFromAddr64) \ 62 DO(UnDecorateSymbolName) \ 63 DO(SymSetSearchPath) \ 64 DO(SymGetSearchPath) \ 65 DO(StackWalk64) \ 66 DO(SymFunctionTableAccess64) \ 67 DO(SymGetModuleBase64) \ 68 DO(MiniDumpWriteDump) \ 69 DO(SymGetLineFromAddr64) Are the xxx64 versions of the functions also available on 32-bit windows? Have you tested with win32 build? Maybe add a comment that these functions will not be called on 32-bit windows: 79 pfn_StackWalk64 _pfnStackWalk64; 80 pfn_SymFunctionTableAccess64 _pfnSymFunctionTableAccess64; 81 pfn_SymGetModuleBase64 _pfnSymGetModuleBase64; and put an assert in the corresponding wrapper function? Thanks - Ioi On 8/18/17 12:23 AM, Thomas St?fe wrote: > Dear all, > > may I please have a review for this change: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186349 > > Webrev: > http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/ > > > This is a part of an ongoing work I do to make error reporting > (especially callstacks) on Windows more reliable. > > At first I did a rather large patch, see: > https://bugs.openjdk.java.net/browse/JDK-8185712 > and > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html > . But after discussing this patch with Ioi, I saw that this patch is > better split up into multiple parts for easier reviewing. > > So this is the first split up patch. > > -- > > This patch here centralizes handling of the dbghelp.dll (loading the > library, resolving function pointers and synchronizing access). > > Which solves the problem that accesses to functions exported from the > dbghelp.dll need to be synchronized, as stated by the MSDN. 
We somehow > never really cared. I guess it never caused visible trouble, because > most of the time (not always) the functions are accessed from > VMError::report(), and chances of parallel access from other > non-error-reporting threads are slim. Even if it were to crash, > secondary error handling would step in and write an "Error occurred > during error reporting" or "Another thread had an error too" message > and we would probably just shrug it off. > > But as this whole effort is about increasing the chance of useful > callstacks in hs-err files, I'd like to fix this. > > In addition to the fix, I think this is also a nice cleanup and > removes duplicate code. > > Notes: > > 1) Robustness: We may or may not find a dbghelp.dll on the target > system. If we find it, it may be old or new (it is not tightly coupled > with the OS, may be part of other installation packages, may exist > multiple times etc). We should handle older versions of the dbghelp > dll gracefully and hide all that complexity from the caller. > > 2) The new DbgHelpLoader class does not export any state indicating > whether or not it successfully loaded, and if it loaded which > functions are actually available. That was a deliberate decision, > there is no need for the caller to know this. Caller should invoke the > DbgHelpLoader functions as if they were the equivalent OS functions > and handle return errors. DbgHelpLoader should never crash or assert; > missing functions should behave like failing functions. > > 3) However, I added a one liner to the hs-err file indicating the > state of the dbghelp dll - version info, what functions were missing > etc. This may help understanding weird or missing callstacks. > > 4) I removed the implementation for shutdown > (WindowsDecoder::shutdown). I think there is no valid reason to ever > shutdown the decoder. For one, we may crash right at the end, and > still it would be nice to have callstacks. 
And then, why spend cycles > shutting down the decoder when we could just let it end with the process? > > 5) This code gets used during error reporting. So no VM infrastructure > must be used to avoid circular crashes and VM initialization > dependencies. So, to synchronize, this code uses raw windows > CriticalSection objects. > > -- > > Next step will be revamping handling of the Symbol APIs. This will > involve removing the WindowsDecoder class, which introduces other > errors and really makes no sense if the underlying dbghelp layer does > its own synchronization. > > > Thanks for reviewing! > > Kind Regards, Thomas From mikhailo.seledtsov at oracle.com Mon Aug 21 21:13:05 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 21 Aug 2017 14:13:05 -0700 Subject: RFR(XXS): 8186542: [TESTBUG] Add jvmti/LoadAgentDcmdTest.java to problem list until underlying issue is resolved Message-ID: <599B4CE1.2070708@oracle.com> Please review this one-liner, placing the test on a problem list. JBS: https://bugs.openjdk.java.net/browse/JDK-8186542 Webrev: http://cr.openjdk.java.net/~mseledtsov/8186542.00/ Testing: Ran the affected test The reason why the test is placed on a problem list: JDK-8186540 - [TESTBUG] serviceability/dcmd/jvmti/LoadAgentDcmdTest.java failed to clean up files in agentvm mode Thank you, Misha From serguei.spitsyn at oracle.com Mon Aug 21 23:34:48 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 21 Aug 2017 16:34:48 -0700 Subject: RFR(XXS): 8186542: [TESTBUG] Add jvmti/LoadAgentDcmdTest.java to problem list until underlying issue is resolved In-Reply-To: <599B4CE1.2070708@oracle.com> References: <599B4CE1.2070708@oracle.com> Message-ID: Looks good. Thanks, Serguei On 8/21/17 14:13, Mikhailo Seledtsov wrote: > Please review this one-liner, placing the test on a problem list. 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8186542 > Webrev: http://cr.openjdk.java.net/~mseledtsov/8186542.00/ > Testing: > Ran the affected test > > The reason why the test is placed on a problem list: > JDK-8186540 - [TESTBUG] > serviceability/dcmd/jvmti/LoadAgentDcmdTest.java failed to clean up > files in agentvm mode > > > Thank you, > Misha From mikhailo.seledtsov at oracle.com Tue Aug 22 02:36:56 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 21 Aug 2017 19:36:56 -0700 Subject: RFR(XXS): 8186542: [TESTBUG] Add jvmti/LoadAgentDcmdTest.java to problem list until underlying issue is resolved In-Reply-To: References: <599B4CE1.2070708@oracle.com> Message-ID: <599B98C8.5000009@oracle.com> Thank you for review. I will use a trivial change rule for this change. Misha On 8/21/17, 4:34 PM, serguei.spitsyn at oracle.com wrote: > Looks good. > > Thanks, > Serguei > > > On 8/21/17 14:13, Mikhailo Seledtsov wrote: >> Please review this one-liner, placing the test on a problem list. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8186542 >> Webrev: http://cr.openjdk.java.net/~mseledtsov/8186542.00/ >> Testing: >> Ran the affected test >> >> The reason why the test is placed on a problem list: >> JDK-8186540 - [TESTBUG] >> serviceability/dcmd/jvmti/LoadAgentDcmdTest.java failed to clean up >> files in agentvm mode >> >> >> Thank you, >> Misha > From richard.reingruber at sap.com Tue Aug 22 07:44:07 2017 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 22 Aug 2017 07:44:07 +0000 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: Message-ID: Hi Thomas, thanks for the refactoring work! I had a look at your changes, but please note that I'm not a reviewer. ### dbghelp_loader.cpp: Should globalDefinitions.hpp be included? Little inconsistency: opening curly braces of method bodies should be on the same line as the end of the parameter list, I guess. 
294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : "")); Format string is incomplete. 196 BOOL DbgHelpLoader::stackWalk64(DWORD MachineType, 197 HANDLE hProcess, 198 HANDLE hThread, 199 LPSTACKFRAME64 StackFrame, 200 PVOID ContextRecord) 201 { 202 CritSectLocker lck(&g_cs); 203 if (initialize_if_needed()) { 204 if (g_pfn_StackWalk64 != NULL) { 205 return g_pfn_StackWalk64(MachineType, hProcess, hThread, StackFrame, 206 ContextRecord, 207 NULL, // ReadMemoryRoutine 208 g_pfn_SymFunctionTableAccess64, // FunctionTableAccessRoutine, 209 g_pfn_SymGetModuleBase64, // GetModuleBaseRoutine 210 NULL // TranslateAddressRoutine 211 ); 212 } 213 } 214 return FALSE; 215 } Lines 208, 209: is it ok to pass NULL? ### windows_decoder.cpp The following line was deleted without replacement. Are the options not needed? SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS | SYMOPT_EXACT_SYMBOLS); Cheers, Richard. -----Original Message----- From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas St?fe Sent: Freitag, 18. August 2017 09:24 To: hotspot-runtime-dev at openjdk.java.net Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code Dear all, may I please have a review for this change: Issue: https://bugs.openjdk.java.net/browse/JDK-8186349 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ 8186349-centralize-dbghelp-handling/webrev.00/webrev/ This is a part of an ongoing work I do to make error reporting (especially callstacks) on Windows more reliable. At first I did a rather large patch, see: https://bugs.openjdk.java.net/ browse/JDK-8185712 and http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html . But after discussing this patch with Ioi, I saw that this patch is better split up into multiple parts for easier reviewing. So this is the first split up patch. 
-- This patch here centralizes handling of the dbghelp.dll (loading the library, resolving function pointers and synchronizing access). Which solves the problem that accesses to functions exported from the dbghelp.dll need to be synchronized, as stated by the MSDN. We somehow never really cared. I guess it never caused visible trouble, because most of the time (not always) the functions are accessed from VMError::report(), and chances of parallel access from other non-error-reporting threads are slim. Even if it were to crash, secondary error handling would step in and write an "Error occurred during error reporting" or "Another thread had an error too" message and we would probably just shrug it off. But as this whole effort is about increasing the chance of useful callstacks in hs-err files, I'd like to fix this. In addition to the fix, I think this is also a nice cleanup and removes duplicate code. Notes: 1) Robustness: We may or may not find a dbghelp.dll on the target system. If we find it, it may be old or new (it is not tightly coupled with the OS, may be part of other installation packages, may exist multiple times etc). We should handle older versions of the dbghelp dll gracefully and hide all that complexity from the caller. 2) The new DbgHelpLoader class does not export any state indicating whether or not it successfully loaded, and if it loaded which functions are actually available. That was a deliberate decision, there is no need for the caller to know this. Caller should invoke the DbgHelpLoader functions as if they were the equivalent OS functions and handle return errors. DbgHelpLoader should never crash or assert; missing functions should behave like failing functions. 3) However, I added a one liner to the hs-err file indicating the state of the dbghelp dll - version info, what functions were missing etc. This may help understanding weird or missing callstacks. 4) I removed the implementation for shutdown (WindowsDecoder::shutdown). 
I think there is no valid reason to ever shutdown the decoder. For one, we may crash right at the end, and still it would be nice to have callstacks. And then, why spend cycles shutting down the decoder when we could just let it end with the process? 5) This code gets used during error reporting. So no VM infrastructure must be used to avoid circular crashes and VM initialization dependencies. So, to synchronize, this code uses raw windows CriticalSection objects. -- Next step will be revamping handling of the Symbol APIs. This will involve removing the WindowsDecoder class, which introduces other errors and really makes no sense if the underlying dbghelp layer does its own synchronization. Thanks for reviewing! Kind Regards, Thomas From thomas.stuefe at gmail.com Tue Aug 22 09:50:50 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 22 Aug 2017 11:50:50 +0200 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: Message-ID: Hi Ioi, Thanks for the review! Please find my answers inline. On Mon, Aug 21, 2017 at 8:30 PM, Ioi Lam wrote: > Hi Thomas, > > The changes look good. I have a few style nit picks: > > CritSect is currently only allocated statically (once), but I wonder if it > makes sense, for completeness, to add a destructor that calls > DeleteCriticalSection, just in case someone may use it dynamically in the > future? > I do not want that, because it would introduce a window of opportunity where this code would not work anymore during shutdown. Static C++ objects are destructed in no guaranteed order across translation units, and errors may still happen at that point. I would like this code to be able to function right to the end and not stumble over a de-initialized critical section. But I see your point about introducing something that looks like a general-purpose class with its own header and then not following the design through. 
I removed this header and moved the critical section handling inside the DbgHelpLoader implementation. > > Also, there's repetition of this pattern without explanation: > > 131 CritSectLocker lck(&g_cs); > 132 if (initialize_if_needed()) { > > I wonder if it's better to do this, so we can put the comments at a place > that's specific to DbgHelper (and also remove one level of nested "if"): > > DbgHelperLocker lock; > if (lock.is_initialized() && g_pfn_SymGetSearchPath != NULL) { > ..... > } > > // Functions from dbghelp.dll are not threadsafe; calls to them must > be synchronized. > class DbgHelperLocker ..... { > .... > bool is_initialized() { > return initialize_if_needed(); > } > } > > You are right, that is a bit ugly. I did what you suggested. I added an "EntryGuard" RAII object which takes care of locking and initializing on demand. > Also, can DbgHelpLoader be renamed to WindowsDbgHelp, since it doesn't do > just loading the DLL but also handles the invocation? > Okay. DbgHelpLoader->WindowsDbgHelp, and "dbghelp_loader.cpp/hpp"->"windbghelp.cpp/hpp" too. Sorry, that means no incremental webrev, too much hassle due to the renaming :) > > 57 #define FOR_ALL_FUNCTIONS(DO) \ > 58 DO(ImagehlpApiVersion) \ > 59 DO(SymSetOptions) \ > 60 DO(SymInitialize) \ > 61 DO(SymGetSymFromAddr64) \ > 62 DO(UnDecorateSymbolName) \ > 63 DO(SymSetSearchPath) \ > 64 DO(SymGetSearchPath) \ > 65 DO(StackWalk64) \ > 66 DO(SymFunctionTableAccess64) \ > 67 DO(SymGetModuleBase64) \ > 68 DO(MiniDumpWriteDump) \ > 69 DO(SymGetLineFromAddr64) > > Are the xxx64 versions of the functions also available on 32-bit windows? > Have you tested with win32 build? > > Yes and yes. In fact, in our port, we have had a function similar to os::platform_print_native_stack() - which walks the stack with native APIs only, like StackWalk64 - for many years, for all CPUs (x86, x64, ia64). There is no technical reason not to implement os::platform_print_native_stack() for 32bit too. 
But Oracle seems to prefer using VMError::print_native_stack() (the generic stack walker using the Frame class) wherever possible to get mixed callstacks. In our port, we print callstacks using both methods. > Maybe add a comment that these functions will not be called on 32-bit > windows: > > 79 pfn_StackWalk64 _pfnStackWalk64; > 80 pfn_SymFunctionTableAccess64 _pfnSymFunctionTableAccess64; > 81 pfn_SymGetModuleBase64 _pfnSymGetModuleBase64; > > and put an assert in the corresponding wrapper function? > No reason to, these functions work fine in 32bit. > > Thanks > - Ioi > Thanks! I'll post an updated webrev once I have worked in Richard's change requests. Kind Regards, Thomas > > > On 8/18/17 12:23 AM, Thomas Stüfe wrote: > > Dear all, > > may I please have a review for this change: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8186349 > Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/ > > This is a part of an ongoing work I do to make error reporting (especially > callstacks) on Windows more reliable. > > At first I did a rather large patch, see: https://bugs.openjdk.java.net/browse/JDK-8185712 and http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html . But after > discussing this patch with Ioi, I saw that this patch is better split up > into multiple parts for easier reviewing. > > So this is the first split up patch. > > -- > > This patch here centralizes handling of the dbghelp.dll (loading the > library, resolving function pointers and synchronizing access). > > Which solves the problem that accesses to functions exported from the > dbghelp.dll need to be synchronized, as stated by the MSDN. We somehow > never really cared. I guess it never caused visible trouble, because most > of the time (not always) the functions are accessed from VMError::report(), > and chances of parallel access from other non-error-reporting threads are > slim. 
Even if it were to crash, secondary error handling would step in and > write an "Error occurred during error reporting" or "Another thread had an > error too" message and we would probably just shrug it off. > > But as this whole effort is about increasing the chance of useful > callstacks in hs-err files, I'd like to fix this. > > In addition to the fix, I think this is also a nice cleanup and removes > duplicate code. > > Notes: > > 1) Robustness: We may or may not find a dbghelp.dll on the target system. > If we find it, it may be old or new (it is not tightly coupled with the OS, > may be part of other installation packages, may exist multiple times etc). > We should handle older versions of the dbghelp dll gracefully and hide all > that complexity from the caller. > > 2) The new DbgHelpLoader class does not export any state indicating > whether or not it successfully loaded, and if it loaded which functions are > actually available. That was a deliberate decision, there is no need for > the caller to know this. Caller should invoke the DbgHelpLoader functions > as if they were the equivalent OS functions and handle return errors. > DbgHelpLoader should never crash or assert; missing functions should behave > like failing functions. > > 3) However, I added a one liner to the hs-err file indicating the state of > the dbghelp dll - version info, what functions were missing etc. This may > help understanding weird or missing callstacks. > > 4) I removed the implementation for shutdown (WindowsDecoder::shutdown). I > think there is no valid reason to ever shutdown the decoder. For one, we > may crash right at the end, and still it would be nice to have callstacks. > And then, why spend cycles shutting down the decoder when we could just let > it end with the process? > > 5) This code gets used during error reporting. So no VM infrastructure > must be used to avoid circular crashes and VM initialization dependencies. 
> So, to synchronize, this code uses raw windows CriticalSection objects. > > -- > > Next step will be revamping handling of the Symbol APIs. This will involve > removing the WindowsDecoder class, which introduces other errors and really > makes no sense if the underlying dbghelp layer does its own synchronization. > > > Thanks for reviewing! > > Kind Regards, Thomas > > > From robbin.ehn at oracle.com Tue Aug 22 11:44:11 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 22 Aug 2017 13:44:11 +0200 Subject: RFR (CSR): 8180929: Deprecate -XX:+/-MonitorInUseLists option In-Reply-To: <3f0d67fa-aea9-7b9d-f2fb-61f2c43f6665@redhat.com> References: <3f0d67fa-aea9-7b9d-f2fb-61f2c43f6665@redhat.com> Message-ID: <8ebaf974-2bb8-66e0-0daf-019a3c7ce69c@oracle.com> Thanks for doing this! /Robbin On 07/24/2017 10:16 PM, Roman Kennke wrote: > I am not really sure how to do this. I was asked by Joe Darcy to get > somebody from Hotspot team to review this, so I thought I post it here: > > https://bugs.openjdk.java.net/browse/JDK-8180929 > > Roman > > From thomas.stuefe at gmail.com Tue Aug 22 12:33:08 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 22 Aug 2017 14:33:08 +0200 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: Message-ID: Hi Richard, thank you for the review! Please find my remarks inline. On Tue, Aug 22, 2017 at 9:44 AM, Reingruber, Richard < richard.reingruber at sap.com> wrote: > Hi Thomas, > > thanks for the refactoring work! > > I had a look at your changes, but please note that I'm not a reviewer. > > ### dbghelp_loader.cpp: > > Should globalDefinitions.hpp be included? > Currently I do not need it, so I'd rather not. > > Little inconsistency: opening curly braces of method bodies should be on > the same line as the end of the parameter list, I guess. > > 294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : "")); > Are you sure? Sorry, I cannot spot an error here. 
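For readers following the format-string question: assuming line 294 sits inside a macro body where functionname is a macro parameter, "%s" #functionname is a complete format string, because # stringizes the argument and the two adjacent string literals are then concatenated. The sketch below illustrates that mechanism only; the two-entry list, the helper name list_missing_functions and the use of snprintf in place of st->print are hypothetical and not taken from the patch.

```cpp
#include <cstdio>
#include <string>

// Hypothetical two-entry list, shaped like the FOR_ALL_FUNCTIONS(DO)
// X-macro quoted in the review.
#define FOR_SOME_FUNCTIONS(DO) \
  DO(StackWalk64) \
  DO(SymGetModuleBase64)

// Builds a comma-separated list of the names in FOR_SOME_FUNCTIONS.
std::string list_missing_functions() {
  std::string out;
  char buf[64];
  int num_missing = 0;

  // #functionname turns the macro argument into a string literal, and
  // adjacent literals are concatenated, so for StackWalk64 the format
  // string seen by snprintf is "%sStackWalk64". The single %s consumes
  // the separator argument.
#define PRINT_MISSING(functionname) \
  std::snprintf(buf, sizeof(buf), "%s" #functionname, \
                (num_missing > 0) ? ", " : ""); \
  out += buf; \
  num_missing++;

  FOR_SOME_FUNCTIONS(PRINT_MISSING)
#undef PRINT_MISSING

  return out;
}
```

Here list_missing_functions() yields "StackWalk64, SymGetModuleBase64": the first entry gets an empty separator, every later one gets ", ".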
>
> Format string is incomplete.
>
> 196 BOOL DbgHelpLoader::stackWalk64(DWORD MachineType,
> 197                                 HANDLE hProcess,
> 198                                 HANDLE hThread,
> 199                                 LPSTACKFRAME64 StackFrame,
> 200                                 PVOID ContextRecord)
> 201 {
> 202   CritSectLocker lck(&g_cs);
> 203   if (initialize_if_needed()) {
> 204     if (g_pfn_StackWalk64 != NULL) {
> 205       return g_pfn_StackWalk64(MachineType, hProcess, hThread, StackFrame,
> 206                                ContextRecord,
> 207                                NULL, // ReadMemoryRoutine
> 208                                g_pfn_SymFunctionTableAccess64, // FunctionTableAccessRoutine,
> 209                                g_pfn_SymGetModuleBase64, // GetModuleBaseRoutine
> 210                                NULL // TranslateAddressRoutine
> 211                                );
> 212     }
> 213   }
> 214   return FALSE;
> 215 }
>
> Lines 208, 209: is it ok to pass NULL?
>
>

Good question. Documentation says parameters are required. I tested calling
this with NULL for both functions and stack walking worked just fine. I
leave it as it is, because I think at worst we risk StackWalk64 failing, and
at best we get a callstack nevertheless.

> ### windows_decoder.cpp
>
> The following line was deleted without replacement. Are the options not
> needed?
>
> SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS |
> SYMOPT_EXACT_SYMBOLS);
>
>

Good catch! Will fix.

>
> Cheers, Richard.
>
>

Thanks! Thomas

From thomas.stuefe at gmail.com Tue Aug 22 12:48:57 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 22 Aug 2017 14:48:57 +0200
Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
In-Reply-To:
References:
Message-ID:

Hi all,

please see new Webrev:

http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01/webrev/

I worked in proposed changes by Ioi and Richard. Sorry, because on file name
changed, I did not do an incremental webrev. Here are the changes:

- Renamed DbgHelpLoader to WindowsDbgHelp and renamed the files too.
- I moved the critical section code to the implementation of WindowsDbgHelp.
I discarded the general "CritSectLocker" object in favour of an object
specialized for this case, see the EntryGuard class.
- I readded the SymSetOptions call I accidentally discarded.
- In decoder_windows.cpp, I set the value for _can_decode_in_vm to true. I
plan to remove this flag in one of the next changes (see also JDK-8144855)

As Richard is no full reviewer, I'll need a second Reviewer and a sponsor.

Thanks!

Kind Regards, Thomas

From thomas.stuefe at gmail.com Tue Aug 22 13:05:13 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 22 Aug 2017 15:05:13 +0200
Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
In-Reply-To:
References:
Message-ID:

p.s. I built on x86 and x64. Ran gtests and part of the jtreg tests
(hotspot/runtime/ErrorReporting) for both platforms. Two jtreg tests failed,
but errors have nothing to do with my change - some java cross-module-access
issue. Do these tests get run regularly?

I also tried to build without precompiled headers to see if any includes
were missing, but ran into other missing includes first, that seems to be
rotted a bit.

..Thomas

From goetz.lindenmaier at sap.com Tue Aug 22 13:48:22 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 22 Aug 2017 13:48:22 +0000
Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> <2151e4417d4d4bf5b9e368db68c75db2@sap.com> <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com>
Message-ID: <2d9ee3fc3c1b4d19ae1a41e100b6ac5c@sap.com>

Hi,

could I please get a second review?
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04

I had to update the webrev because of a problem on windows.
@Thomas I had edited os.hpp, but not saved :(

Best regards,
Goetz.

PS: Didn't double-check the webrev as cr server is slow.

> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Thursday, 17 August 2017 19:54
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is
> missing.
>
> Hi Goetz,
>
> On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
>
> > Hi Thomas,
> >
> > I adapted the comments in os.hpp.
> >
> > If I move the call to dll_build_name out of dll_locate_lib
> > I have to do a lot of coding in all the places where it is called.
> > That seems not useful to me.
> >
> > Fixed the type to size_t.
> >
> > One could merge posix/windows if putting the check for ':'
> > into a WINDOWS_ONLY() I guess. The check for \ could be
> > done in posix as well, if using file_separator().
> >
> > * Not your change, but: why does the code in os::dll_locate_lib() even
> > * differentiate between a PATH containing no os::path_separator()
> > * and a path containing os::path_separator()?
> > I assume this was done to avoid all the allocations and copying of the
> > path.
> >
> > Also adapted the comment in jvmtiExport.cpp.
> >
> > New webrev:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
> > incremental diff:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
> > (fixed indentation on windows)
> >
> > Best regards,
> > Goetz.
> >
>
> Comments in os.hpp seem unchanged ?
>
> But looks fine otherwise. I do not need another webrev.
>
> Thanks, Thomas
>
> >
> > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > Sent: Thursday, August 17, 2017 3:48 PM
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file
> > is missing.
> >
> > Hi Goetz,
> >
> > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
> >
> > Hi Thomas,
> >
> > I reworked the whole thing.
> >
> > First, there is dll_build_name. It just does <name> -> lib<name>.so.
> >
> > Second, I renamed the legacy dll_build_name to dll_locate_lib.
> >
> > I merged all the unix variants to one in os_posix.
> >
> > I removed the buffer overflow check at the top.
> > It's too restrictive because the path argument
> > can contain several paths. I added the overflow
> > checks into the single cases.
> >
> > Also, I first assemble the pure name using the new, simple
> > dll_build_name. This is for reuse and readability.
> >
> > In case of an empty directory, I use get_current_directory
> > to complete the path as indicated by the original documentation
> > where it was called with "".
> > Dll_locate_lib now always returns a name with a full path if
> > the file exists.
> >
> > Also, on windows, I think I fixed a bug by reversing the order
> > of checks. A path list ending in ':' or '\' would not have
> > been recognized.
> >
> > On Bsd, I removed JNI_LIB_* because that already is defined
> > in jvm_bsd.h
> >
> > New webrev:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
> >
> > Best regards,
> > Goetz.
> >
> > I like this better than before. Remarks:
> >
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
> >
> > + // Builds the platform-specific name of a library.
> > + // Returns false on __buffer overflow__.
> >
> > Hopefully not! :D
> >
> > How about: "Returns false on truncation" instead.
> >
> > + // Builds a platform-specific full library path given an ld path and lib name.
> > + // Returns true if the buffer contains a full path to an existing file, false
> > + // otherwise. If pathname is empty, checks the current directory.
> > + static bool dll_locate_lib(char* buffer, size_t size,
> > +                            const char* pathname, const char* fname);
> >
> > Might be worth mentioning that "fname" is the unadorned library
> > name, e.g. "verify" for libverify.so or verify.dll.
> >
> > Would the following alternative be valid:
> >
> > one could make dll_locate_lib take the real file name, and let caller
> > use dll_build_name() to build the library name first before handing it to
> > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic
> > "find_file_in_path" because it would work for any kind of file.
> >
> > As an added bonus, there would be no need to create a temporary
> > array in dll_build_name/dll_locate_lib, and no need to call free() so no
> > cleanup-related control flow changes in these functions.
> >
> > =====
> >
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
> >
> > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
> >
> > int -> size_t (does that even compile without warning?)
> >
> > + // Check current working directory.
> > + const char* p = get_current_directory(buffer, buflen);
> > + if (p != NULL &&
> > +     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
> > +   strcat(buffer, "\\");
> > +   strcat(buffer, fullfname);
> > +   retval = file_exists(buffer);
> >
> > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
> > will make scanners (e.g. coverity) happy. One could then avoid the length
> > calculation and rely on jio_snprintf truncation:
> >
> > const char* p = get_current_directory(buffer, buflen);
> > if (p != NULL) {
> >   const size_t end = strlen(p);
> >   if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
> >     retval = file_exists(buffer);
> >   }
> > }
> >
> > --
> >
> > Not your change, but: why does the code in os::dll_locate_lib() even
> > differentiate between a PATH containing no os::path_separator() and a path
> > containing os::path_separator()?
> >
> > Would the former not be just a PATH with only one directory and hence
> > need no special treatment?
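
The append-and-check idea in the jio_snprintf remark can be sketched in portable C++. This is only an illustration under stated assumptions: `std::snprintf` stands in for HotSpot's `jio_snprintf`, and the helper name `append_file_name` is hypothetical. The quoted snippet tests for -1 on truncation, whereas `std::snprintf` reports the length it would have needed, so the check differs. In both cases the write must start at the end of the existing string, that is at `buffer + end`.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: appends "<sep><fname>" to the NUL-terminated string
// already stored in buffer (capacity buflen, including the terminator).
// Returns false and leaves the original string intact if the result does
// not fit.
bool append_file_name(char* buffer, size_t buflen, char sep, const char* fname) {
  const size_t end = std::strlen(buffer);
  if (end >= buflen) {
    return false;
  }
  // std::snprintf returns the number of characters it wanted to write,
  // excluding the terminator; a value >= the remaining space means truncation.
  const int n = std::snprintf(buffer + end, buflen - end, "%c%s", sep, fname);
  if (n < 0 || static_cast<size_t>(n) >= buflen - end) {
    buffer[end] = '\0';  // undo the partial write
    return false;
  }
  return true;
}
```

A caller would invoke the file-existence check only when this helper returns true, which removes the manual length arithmetic of the strcat variant.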
> >
> > =====
> >
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html
> >
> > Could os::dll_locate_lib be consolidated between windows and unix?
> > Seems to be the implementation is almost identical.
> >
> > ====
> >
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
> >
> > + // not found - try library path
> >
> > Proposal: "not found - try OS default library path"
> >
> > Find some comments inline:
> >
> > > > Especially if the path is empty, it just returns 'true'.
> > > > Dll_build_name is usually used before calling dll_load. If
> > > > dll_load does not get a full path it searches
> > > > in well known unix/windows locations. This is intended in
> > > > the two cases where dll_build_name
> > > > is called with an empty path.
> > >
> > > So, for both cases (thread.cpp, jvmtiExport.cpp),
> > >
> > > before, we would call os::dll_build_name() with an empty string for the path
> > > which, for relative paths, would result in feeding that path unexpanded to
> > > dlopen(), which would use whatever the OS does in those cases (LIBPATH,
> > > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
> > > include searching the current directory.
> > Right. With changed dll_build_name it's again exactly as before.
> >
> > > With your change, we now use java.library.path, which is not necessarily the
> > > same?
> > You are right, I overlooked that java.library.path can be overwritten. Initially,
> > it's set to the right thing.
> >
> > > (BTW, I think the old comments in thread.cpp and jvmtiExport.cpp were wrong: "//
> > > Try the local directory" - if "local" means "current", this is not what did
> > > happen).
> > Right, I tried to adapt them, did I miss one?
> >
> > > > I added a second variant of dll_build_name without the
> > > > path argument that adds the path
> > > > from system property java.library.path and use that in these
> > > > two cases.
> > > > I changed the original function to actually check file
> > > > availability in all cases,
> > > > and to check . if the path is empty.
> > > I think that may be a bit confusing. We would then have three options:
> > >
> > > - call os::dll_build_name with a real ";;.." PATH and get a file name
> > > resolved from that path
> > > - call os::dll_build_name with "" for the PATH and get OS dll resolution
> > No, in that case, as I called file_exists(), it would only work if
> > the dll is in the current working directory. But I changed this now, anyways.
> >
> > > - call your new overloaded version of os::dll_build_name(), which uses
> > > -Djava.library.path.
> > >
> > > > Please review this change. I please need a sponsor.
> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
> > > >
> > > > Best regards,
> > > > Goetz.
> > >
> > > Kind Regards, Thomas
> >
> > Best Regards, Thomas
> >
>

From goetz.lindenmaier at sap.com Tue Aug 22 14:33:09 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 22 Aug 2017 14:33:09 +0000
Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <2d9ee3fc3c1b4d19ae1a41e100b6ac5c@sap.com>
References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> <2151e4417d4d4bf5b9e368db68c75db2@sap.com> <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com> <2d9ee3fc3c1b4d19ae1a41e100b6ac5c@sap.com>
Message-ID: <0ab5c522c5eb456bb8d9d764840c9718@sap.com>

I mistyped the path to webrev, this should work:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04

Sorry,
Goetz

> -----Original Message-----
> From: Lindenmaier, Goetz
> Sent: Tuesday, 22.
August 2017 15:48 > To: 'Thomas St?fe' > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is > missing. > > Hi, > > could I please get a second review? > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04 > > I had to update the webrev because of a problem on windows. > @Thomas I had edited os.hpp, but not saved :( > > Best regards, > Goetz. > > PS: Didn't double-check the webrev as cr server is slow. > > > -----Original Message----- > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > > Sent: Donnerstag, 17. August 2017 19:54 > > To: Lindenmaier, Goetz > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is > > missing. > > > > Hi Goetz, > > > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz > > > wrote: > > > > > > Hi Thomas, > > > > > > > > I adapted the comments in os.hpp. > > > > > > > > If I move the call to dll_build_name out of dll_locate_lib > > > > I have to do a lot of coding in all the places where it is called. > > > > That seems not useful to me. > > > > > > > > Fixed the type to size_t. > > > > > > > > One could merge posix/windows if putting the check for ?:? > > > > into a WINDOWS_ONLY() I guess. The check for \ could be > > > > done in posix as well, if using file_seperator(). > > > > > > > > * Not your change, but: why does the code in os::dll_locate_lib() even > > > > * differentiate between a PATH containing no os::path_separator() > > > > * and a path containing os::path_separator()? > > > > I assume this was done to avoid all the allocations and copying of the > > path. > > > > > > > > Also adapted the comment in jvmtiExport.cpp. 
> > > > > > > > New webrev: > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > dllBuildName/webrev.03/ > dllBuildName/webrev.03/> > > > > incremental diff: > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > dllBuildName/webrev.03/diffs-incremental.patch > > > dllBuildName/webrev.03/diffs-incremental.patch> > > > > (fixed indentation on windows) > > > > > > > > Best regards, > > > > Goetz. > > > > > > > > > > > > > > Comments in os.hpp seem unchanged ? > > > > But looks fine otherwise. I do not need another webrev. > > > > Thanks, Thomas > > > > > > > > > > > > > > > > > > > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com > > ] > > Sent: Thursday, August 17, 2017 3:48 PM > > To: Lindenmaier, Goetz > > > > Cc: hotspot-runtime-dev at openjdk.java.net > dev at openjdk.java.net> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file > > is missing. > > > > > > > > Hi Goetz, > > > > > > > > > > > > > > > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz > > > wrote: > > > > Hi Thomas, > > > > I reworked the whole thing. > > > > First, there is dll_build_name. It just does -> > > lib.so. > > > > Second, I renamed the legacy dll_build_name to dll_locate_lib. > > > > I merged all the unix variants to one in os_posix. > > > > I removed the buffer overflow check at the top. > > It's too restrictive because the path argument > > can contain several paths. I added the overflow > > checks into the single cases. > > > > Also, I first assemble the pure name using the new, simple > > dll_build_name. This is for reuse and readability. > > > > In case of an empty directory, I use get_current_directory > > to complete the path as indicated by the original > > documentation > > where it was called with "". > > Dll_locate_lib now always returns a name with a full path if > > the file exists. > > > > Also, on windows, I think I fixed a bug by reversing the order > > of checks. 
A path list ending in ':' or '\' would not have > > been recognized. > > > > On Bsd, I removed JNI_LIB_* because that already is defined > > in jvm_bsd.h > > > > New webrev: > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/ > > > > Best regards, > > Goetz. > > > > > > > > I like this better than before. Remarks: > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html > > > > > > > > + // Builds the platform-specific name of a library. > > > > + // Returns false on __buffer overflow__. > > > > > > > > Hopefully not! :D > > > > How about: "Returns false on truncation" instead. > > > > > > > > > > > > + // Builds a platform-specific full library path given an ld path and lib > > name. > > > > + // Returns true if the buffer contains a full path to an existing file, > > false > > > > + // otherwise. If pathname is empty, checks the current directory. > > > > + static bool dll_locate_lib(char* buffer, size_t size, > > > > const char* pathname, const char* fname); > > > > > > > > Might be worth mentioning that "fname" is the unadorned library > > name, e.g. "verify" for libverify.so or verify.dll. > > > > > > > > Would the following alternative be valid: > > > > > > > > one could make dll_locate_lib take the real file name, and let the caller > > use dll_build_name() to build the library name first before handing it to > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic > > "find_file_in_path" because it would work for any kind of file. > > > > > > > > As an added bonus, there would be no need to create a temporary > > array in dll_build_name/dll_locate_lib, and no need to call free() so no > > cleanup-related control flow changes in these functions.
> > > > > > > > ===== > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html > > > > > > > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + > > strlen(JNI_LIB_SUFFIX); > > > > > > > > int -> size_t (does that even compile without warning?) > > > > > > > > + // Check current working directory. > > > > + const char* p = get_current_directory(buffer, buflen); > > > > + if (p != NULL && > > > > + strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) { > > > > + strcat(buffer, "\\"); > > > > + strcat(buffer, fullfname); > > > > + retval = file_exists(buffer); > > > > > > > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but > > will make scanners (e.g. coverity) happy. One could then avoid the length > > calculation and rely on jio_snprintf truncation: > > > > > > > > const char* p = get_current_directory(buffer, buflen); > > > > if (p != NULL) { > > > > const size_t end = strlen(p); > > > > if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) { > > > > retval = file_exists(buffer); > > > > } > > > > } > > > > > > > > -- > > > > > > > > Not your change, but: why does the code in os::dll_locate_lib() even > > differentiate between a PATH containing no os::path_separator() and a path > > containing os::path_separator()? > > > > > > > > Would the former not be just a PATH with only one directory and hence > > need no special treatment? > > > > > > > > ===== > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > > > > > > > > Could os::dll_locate_lib be consolidated between windows and unix? > > It seems the implementation is almost identical.
> > > > > > > > ==== > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html > > > > > > > > + // not found - try library path > > > > > > > > Proposal: "not found - try OS default library path" > > > > > > > > > > > > Find some comments inline: > > > > > > > Especially if the path is empty, it just returns 'true'. > > > Dll_build_name is usually used before calling dll_load. If > > dll_load does not get a full path it searches > > > in well known unix/windows locations. This is intended in > > the two cases where dll_build_name > > > is called with an empty path. > > > > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > > > > > before, we would call os::dll_build_name() with an empty > > string for the path > > > which, for relative paths, would result in feeding that path > > unexpanded to > > > dlopen(), which would use whatever the OS does in those > > cases (LIBPATH, > > > LD_LIBRARY_PATH, PATH on windows). Note that this does > > not necessarily > > > include searching the current directory. > > Right. With the changed dll_build_name it's again exactly as > > before. > > > > > With your change, we now use java.library.path, which is not > > necessarily the > > > same? > > You are right, I overlooked that java.library.path can be > > overwritten. Initially, > > it's set to the right thing. > > > > > (BTW, I think the old comments in thread.cpp and > > jvmtiExport.cpp were wrong:"// > > > Try the local directory" - if "local" means "current", this is not > > what did > > > happen). > > Right, I tried to adapt them, did I miss one? > > > > > I added a second variant of dll_build_name without the > > path argument that adds the path > > > from system property java.library.path and use that in these > > two cases.
> > > I changed the original function to actually check file > > availability in all cases, > > > and to check . if the path is empty. > > > I think that may be a bit confusing. We would then have three > > options: > > > > > > - call os::dll_build_name with a real ";;.." PATH and > > get a file name > > > resolved from that path > > > - call os::dll_build_name with "" for the PATH and get OS dll > > resolution > > No, in that case, as I called file_exists(), it would only work if > > the dll is in the > > current working directory. But I changed this now, anyways. > > > > > - call your new overloaded version of os::dll_build_name(), > > which uses - > > > Djava.library.path. > > > > > > Please review this change. I please need a sponsor. > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > dllBuildName/webrev.01/ > > > > > > dllBuildName/webrev.01/> > > > > > > > > Best regards, > > > Goetz. > > > > > > > > > > > > > > > Kind Regards, Thomas > > > > > > > > Best Regards, Thomas > > > > > > > > > > From thomas.stuefe at gmail.com Tue Aug 22 15:30:10 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 22 Aug 2017 17:30:10 +0200 Subject: RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: <0ab5c522c5eb456bb8d9d764840c9718@sap.com> References: <77ef0c93f5b2449b90aa28d09c83fb3b@sap.com> <2151e4417d4d4bf5b9e368db68c75db2@sap.com> <59d9ca9f5f6d4eeab0679edabc66f1f6@sap.com> <2d9ee3fc3c1b4d19ae1a41e100b6ac5c@sap.com> <0ab5c522c5eb456bb8d9d764840c9718@sap.com> Message-ID: Looks good. ..Thomas On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > I mistyped the path to webrev, this should work: > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04 > > Sorry, > Goetz > > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Dienstag, 22. 
August 2017 15:48 > > To: 'Thomas St?fe' > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is > > missing. > > > > Hi, > > > > could I please get a second review? > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04 > > > > I had to update the webrev because of a problem on windows. > > @Thomas I had edited os.hpp, but not saved :( > > > > Best regards, > > Goetz. > > > > PS: Didn't double-check the webrev as cr server is slow. > > > > > -----Original Message----- > > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > > > Sent: Donnerstag, 17. August 2017 19:54 > > > To: Lindenmaier, Goetz > > > Cc: hotspot-runtime-dev at openjdk.java.net > > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file > is > > > missing. > > > > > > Hi Goetz, > > > > > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz > > > > wrote: > > > > > > > > > Hi Thomas, > > > > > > > > > > > > I adapted the comments in os.hpp. > > > > > > > > > > > > If I move the call to dll_build_name out of dll_locate_lib > > > > > > I have to do a lot of coding in all the places where it is called. > > > > > > That seems not useful to me. > > > > > > > > > > > > Fixed the type to size_t. > > > > > > > > > > > > One could merge posix/windows if putting the check for ?:? > > > > > > into a WINDOWS_ONLY() I guess. The check for \ could be > > > > > > done in posix as well, if using file_seperator(). > > > > > > > > > > > > * Not your change, but: why does the code in os::dll_locate_lib() > even > > > > > > * differentiate between a PATH containing no os::path_separator() > > > > > > * and a path containing os::path_separator()? > > > > > > I assume this was done to avoid all the allocations and copying of > the > > > path. > > > > > > > > > > > > Also adapted the comment in jvmtiExport.cpp. 
> > > > > > > > > > > > New webrev: > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.03/ goetz/wr17/8186072- > > > dllBuildName/webrev.03/> > > > > > > incremental diff: > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.03/diffs-incremental.patch > > > > > dllBuildName/webrev.03/diffs-incremental.patch> > > > > > > (fixed indentation on windows) > > > > > > > > > > > > Best regards, > > > > > > Goetz. > > > > > > > > > > > > > > > > > > > > > Comments in os.hpp seem unchanged ? > > > > > > But looks fine otherwise. I do not need another webrev. > > > > > > Thanks, Thomas > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com > > > ] > > > Sent: Thursday, August 17, 2017 3:48 PM > > > To: Lindenmaier, Goetz > > > > > > Cc: hotspot-runtime-dev at openjdk.java.net > > dev at openjdk.java.net> > > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if > file > > > is missing. > > > > > > > > > > > > Hi Goetz, > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz > > > > wrote: > > > > > > Hi Thomas, > > > > > > I reworked the whole thing. > > > > > > First, there is dll_build_name. It just does -> > > > lib.so. > > > > > > Second, I renamed the legacy dll_build_name to > dll_locate_lib. > > > > > > I merged all the unix variants to one in os_posix. > > > > > > I removed the buffer overflow check at the top. > > > It's too restrictive because the path argument > > > can contain several paths. I added the overflow > > > checks into the single cases. > > > > > > Also, I first assemble the pure name using the new, simple > > > dll_build_name. This is for reuse and readability. > > > > > > In case of an empty directory, I use get_current_directory > > > to complete the path as indicated by the original > > > documentation > > > where it was called with "". 
> > > Dll_locate_lib now always returns a name with a full path > if > > > the file exists. > > > > > > Also, on windows, I think I fixed a bug by reversing the > order > > > of checks. A path list ending in ':' or '\' would not have > > > been recognized. > > > > > > On Bsd, I removed JNI_LIB_* because that already is defined > > > in jvm_bsh.h > > > > > > New webrev: > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.02/ goetz/wr17/8186072- > > > dllBuildName/webrev.02/> > > > > > > Best regards, > > > Goetz. > > > > > > > > > > > > I like this better than before. Remarks: > > > > > > > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html > > > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> > > > > > > > > > > > > + // Builds the platform-specific name of a library. > > > > > > + // Returns false on __buffer overflow__. > > > > > > > > > > > > Hopefully not! :D > > > > > > How about: "Returns false no truncation" instead. > > > > > > > > > > > > > > > > > > + // Builds a platform-specific full library path given an ld > path and lib > > > name. > > > > > > + // Returns true if the buffer contains a full path to an > existing file, > > > false > > > > > > + // otherwise. If pathname is empty, checks the current > directory. > > > > > > + static bool dll_locate_lib(char* buffer, size_t size, > > > > > > const char* pathname, > const char* fname); > > > > > > > > > > > > Might be worth mentioning that "fname" is the unadorned library > > > name, e.g. "verify" for libverify.so or verify.dll. > > > > > > > > > > > > Would the following alternative be valid: > > > > > > > > > > > > one could make dll_locate_lib take the real file name, and let > caller > > > use dll_build_name() to build the libary name first before handing it > to > > > dll_locate_lib(). 
In that case, dll_locate_lib() could be renamed to a > generic > > > "find_file_in_path" because it would work for any kind of file. > > > > > > > > > > > > As an added bonus, there would be no need to create a temporary > > > array in dll_build_name/dll_locate_lib, and no need to call free() so > no > > > cleanup-related control flow changes in these functions. > > > > > > > > > > > > ===== > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html > > > > > > > > > > > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + > > > strlen(JNI_LIB_SUFFIX); > > > > > > > > > > > > int -> size_t (does that even compile without warning?) > > > > > > > > > > > > + // Check current working directory. > > > > > > + const char* p = get_current_directory(buffer, buflen); > > > > > > + if (p != NULL && > > > > > > + strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) { > > > > > > + strcat(buffer, "\\"); > > > > > > + strcat(buffer, fullfname); > > > > > > + retval = file_exists(buffer); > > > > > > > > > > > > Small nit: I'd use jio_snprintf instead of strcat. Functionally > identical but > > > will make scanners (e.g. coverity) happy. One could then avoid the > length > > > calculation and rely on jio_snprintf truncation: > > > > > > > > > > > > const char* p = get_current_directory(buffer, buflen); > > > > > > if (p != NULL) { > > > > > > const size_t end = strlen(p); > > > > > > if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) { > > > > > > retval = file_exists(buffer); > > > > > > } > > > > > > } > > > > > > > > > > > > -- > > > > > > > > > > > > Not your change, but: why does the code in os::dll_locate_lib() > even > > > differentiate between a PATH containing no os::path_separator() and a > path > > > containing os::path_separator()?
> > > > > > > > > > > > Would the former not be just a PATH with only one directory and > hence > > > need no special treatment? > > > > > > > > > > > > ===== > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > > > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> > > > > > > > > > > > > Could os::dll_locate_lib be consolidated between windows and unix? > > > Seems to be the implementation is almost identical. > > > > > > > > > > > > ==== > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html > > > > > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> > > > > > > > > > > > > + // not found - try library path > > > > > > > > > > > > Proposal: "not found - try OS default library path" > > > > > > > > > > > > > > > > > > Find some comments inline: > > > > > > > > > > Especially if the path is empty, it just returns > 'true'. > > > > Dll_build_name is usually used before calling > dll_load. If > > > dll_load does not get a full path it searches > > > > in well known unix/windows locations. This is > intended in > > > the two cases where dll_build_name > > > > is called with an empty path. > > > > > > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > > > > > > > before, we would call os::dll_build_name() with an empty > > > string for the path > > > > which, for relative paths, would result in feeding that > path > > > unexpanded to > > > > dlopen(), which would use whatever the OS does in those > > > cases (LIBPATH, > > > > LD_LIBRARY_PATH, PATH on windows). Note that this does > > > not necessarily > > > > include searching the current directory. > > > Right. With changed dll_biuld_name it's again exactly as > > > before. > > > > > > > With your change, we now use java.library.path, which is > not > > > necessarily the > > > > same? 
> > > You are right, I oversaw that java.library.path can be > > > overwritten. Initially, > > > it's set to the right thing. > > > > > > > (BTW, I think the old comments in thread.cpp and > > > jniExport.cpp were wrong:"// > > > > Try the local directory" - if "local" means "current", > this is not > > > what did > > > > happen). > > > Right, I tried to adapt them, did I miss one? > > > > > > > I added a second variant of dll_build_name without > the > > > path argument that adds the path > > > > from system property java.lang.path and use that > in these > > > two cases. > > > > I changed the original function to actually check > file > > > availability in all cases, > > > > and to check . if the path is empty. > > > > I think that may be a bit confusing. We would then have > three > > > options: > > > > > > > > - call os::dll_build_name with a real ";;.." > PATH and > > > get a file name > > > > resolved from that path > > > > - call os::dll_build_name with "" for the PATH and get > OS dll > > > resolution > > > No, in that case, as I called file_exists(), it would only > work if > > > the dll is in the > > > current working directory. But I changed this now, anyways. > > > > > > > - call your new overloaded version of > os::dll_build_name(), > > > which uses - > > > > Djava.library.path. > > > > > > > > Please review this change. I please need a sponsor. > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > dllBuildName/webrev.01/ > > > > > > > > > dllBuildName/webrev.01/> > > > > > > > > > > > Best regards, > > > > Goetz. 
> > > > > > > > > > > > > > > > > > > > Kind Regards, Thomas > > > > > > > > > > > > Best Regards, Thomas > > > > > > > > > > > > > > > > > From bob.vandette at oracle.com Tue Aug 22 18:22:07 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 22 Aug 2017 14:22:07 -0400 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> Message-ID: Please review the updated webrev which adds new VM flags to allow for the manual selection of Heap size based on a percentage of total available memory. This version deprecates the existing fractional options and allows the new % based flags to override. http://cr.openjdk.java.net/~bobv/8186248/webrev.01/ Bob. > On Aug 17, 2017, at 12:36 AM, David Holmes wrote: > > On 17/08/2017 1:29 PM, Bob Vandette wrote: >> I saw that but wasn't sure it needed the added flexibility since its probably ok that initial sizes are 50% or less. > > I'd go for consistency. > > Also now you will need to guard against values < 1, I think. > > There may be an option checking test that will need updating as well. > > Cheers, > David > >> Bob. >>> On Aug 16, 2017, at 5:04 PM, David Holmes wrote: >>> >>> Hi Bob, >>> >>>> On 17/08/2017 3:32 AM, Bob Vandette wrote: >>>> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM >>>> to be used by the Heap. This just changes two int variables to doubles. >>>> RFE: >>>> https://bugs.openjdk.java.net/browse/JDK-8186248 >>>> Webrev: >>>> http://cr.openjdk.java.net/~bobv/8186248 >>> >>> Wouldn't you also want/need to change the type of InitialRAMFraction? >>> >>> Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10. 
>>> >>> Thanks, >>> David >>> >>>> Bob. From martin.doerr at sap.com Tue Aug 22 19:03:38 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 22 Aug 2017 19:03:38 +0000 Subject: RFR(S): 8186611: s390: Add missing compiler barriers and fix assembler Message-ID: <3d8e37bd6c2b489eb665309b793b6281@sap.com> Hi, please review my small s390 fix. As discovered by Kim Barrett, the current inline assembly implementation of the atomics lacks "memory" in the clobber lists. This may lead to undesired compiler optimizations. We have never observed an issue, but it should be fixed. Another bug is in the assembler encoding of the mvh* instructions which take an Address parameter. In addition, I've improved an assertion which prevents the compiler from generating multiple accesses. Webrev: http://cr.openjdk.java.net/~mdoerr/8186611_s390_fixes/webrev.00/ Thanks and best regards, Martin From Derek.White at cavium.com Tue Aug 22 19:50:29 2017 From: Derek.White at cavium.com (White, Derek) Date: Tue, 22 Aug 2017 19:50:29 +0000 Subject: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> Message-ID: Hi Bob, Do you want to add the old flags to the special_jvm_flags list, to get "deprecated" warnings? Or will that come later after CCC-like approval?
- Derek > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- > bounces at openjdk.java.net] On Behalf Of Bob Vandette > Sent: Tuesday, August 22, 2017 2:22 PM > To: hotspot-runtime-dev at openjdk.java.net runtime dev at openjdk.java.net>; hotspot-gc-dev at openjdk.java.net > Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available > RAM > > Please review the updated webrev which adds new VM flags to allow for the > manual selection of Heap size based on a percentage of total available > memory. > > This version deprecates the existing fractional options and allows the new % > based flags to override. > > http://cr.openjdk.java.net/~bobv/8186248/webrev.01/ > > > Bob. > > > > On Aug 17, 2017, at 12:36 AM, David Holmes > wrote: > > > > On 17/08/2017 1:29 PM, Bob Vandette wrote: > >> I saw that but wasn't sure it needed the added flexibility since its probably > ok that initial sizes are 50% or less. > > > > I'd go for consistency. > > > > Also now you will need to guard against values < 1, I think. > > > > There may be an option checking test that will need updating as well. > > > > Cheers, > > David > > > >> Bob. > >>> On Aug 16, 2017, at 5:04 PM, David Holmes > wrote: > >>> > >>> Hi Bob, > >>> > >>>> On 17/08/2017 3:32 AM, Bob Vandette wrote: > >>>> Please review this simple two line fix which allows more > >>>> flexibility in selecting the % of system RAM to be used by the Heap. This > just changes two int variables to doubles. > >>>> RFE: > >>>> https://bugs.openjdk.java.net/browse/JDK-8186248 > >>>> > >>>> Webrev: > >>>> http://cr.openjdk.java.net/~bobv/8186248 > >>>> > >>> > >>> Wouldn't you also want/need to change the type of InitialRAMFraction? > >>> > >>> Note: jdk10/hs is currently closed to changes as we prepare to push up > to jdk10/jdk10. > >>> > >>> Thanks, > >>> David > >>> > >>>> Bob. 
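The difference between the fraction-based and the percentage-based sizing discussed in this thread can be shown with a minimal sketch. It is not taken from the webrev: the function names and signatures are hypothetical, the real flag names and range checks are whatever the webrev defines, and the guard on the divisor only mirrors David's remark about values < 1.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; these are not HotSpot functions.

// Fraction style: the heap limit is physical memory divided by an integer,
// so only 1/1, 1/2, 1/3, ... of RAM can be expressed (nothing between
// 50% and 100%).
bool heap_limit_from_fraction(std::uint64_t phys_mem, std::uint64_t fraction,
                              std::uint64_t* limit) {
  if (fraction < 1) {
    return false;  // a divisor below 1 would be meaningless here
  }
  *limit = phys_mem / fraction;
  return true;
}

// Percentage style: any share of RAM in (0, 100] can be expressed.
bool heap_limit_from_percentage(std::uint64_t phys_mem, double percentage,
                                std::uint64_t* limit) {
  // The condition is written so that NaN is rejected as well.
  if (!(percentage > 0.0 && percentage <= 100.0)) {
    return false;
  }
  // Double arithmetic is precise enough for realistic RAM sizes.
  *limit = static_cast<std::uint64_t>(
      static_cast<double>(phys_mem) * percentage / 100.0);
  return true;
}
```

With 8 GiB of RAM, a fraction of 4 gives 2 GiB, while a 6 GiB limit (75%) can only be requested through the percentage variant.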
From bob.vandette at oracle.com Tue Aug 22 20:05:34 2017 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 22 Aug 2017 16:05:34 -0400 Subject: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> Message-ID: > On Aug 22, 2017, at 3:50 PM, White, Derek wrote: > > Hi Bob, > > Do you want to add the old flags to the special_jvm_flags list, to get "deprecated" warnings? Or will that come later after CCC-like approval? > Let's see what others in the VM runtime team have to say about deprecation. I added Deprecation to the old flag descriptions but there are some hotspot jtreg tests that use the old flags and I don't think we want warnings to come out while running these tests. If I cause the warnings, then I'll need to change all uses of these old flags in these tests as well. I was expecting to do that in the next release. Perhaps I can use deprecated_in(11). Bob. > - Derek > >> -----Original Message----- >> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >> bounces at openjdk.java.net] On Behalf Of Bob Vandette >> Sent: Tuesday, August 22, 2017 2:22 PM >> To: hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net >> Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available >> RAM >> >> Please review the updated webrev which adds new VM flags to allow for the >> manual selection of Heap size based on a percentage of total available >> memory. >> >> This version deprecates the existing fractional options and allows the new % >> based flags to override. >> >> http://cr.openjdk.java.net/~bobv/8186248/webrev.01/ >> >> >> Bob.
>> >> >>> On Aug 17, 2017, at 12:36 AM, David Holmes >> wrote: >>> >>> On 17/08/2017 1:29 PM, Bob Vandette wrote: >>>> I saw that but wasn't sure it needed the added flexibility since its probably >> ok that initial sizes are 50% or less. >>> >>> I'd go for consistency. >>> >>> Also now you will need to guard against values < 1, I think. >>> >>> There may be an option checking test that will need updating as well. >>> >>> Cheers, >>> David >>> >>>> Bob. >>>>> On Aug 16, 2017, at 5:04 PM, David Holmes >> wrote: >>>>> >>>>> Hi Bob, >>>>> >>>>>> On 17/08/2017 3:32 AM, Bob Vandette wrote: >>>>>> Please review this simple two line fix which allows more >>>>>> flexibility in selecting the % of system RAM to be used by the Heap. This >> just changes two int variables to doubles. >>>>>> RFE: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8186248 >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~bobv/8186248 >>>>>> >>>>> >>>>> Wouldn't you also want/need to change the type of InitialRAMFraction? >>>>> >>>>> Note: jdk10/hs is currently closed to changes as we prepare to push up >> to jdk10/jdk10. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Bob. > From nick.chadwick at nichesolutions.co.uk Tue Aug 22 22:52:18 2017 From: nick.chadwick at nichesolutions.co.uk (Nick Chadwick) Date: Tue, 22 Aug 2017 15:52:18 -0700 (MST) Subject: Strange STW pauses Message-ID: <1503442338456-312324.post@n7.nabble.com> Hi, My Java application is suffering from long (5s+) STW pauses. Latency is a priority for the application, so I need to get to the bottom of what's going on. 
With all the usual JVM options on, I see the following in the logs:

[deflating idle monitors, 0.0000102 secs]
[updating inline caches, 0.0000149 secs]
[compilation policy safepoint handler, 0.0000005 secs]
[mark nmethods, 0.0000549 secs]
[purging class loader data graph, 0.0000002 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
915.696: no vm operation [ 24 1 1 ] [ 0 0 7303 0 0 ] 1
Total time for which application threads were stopped: 0.0002608 seconds, Stopping threads took: 0.0001250 seconds

[deflating idle monitors, 0.0000110 secs]
[updating inline caches, 0.0000706 secs]
[compilation policy safepoint handler, 0.0000006 secs]
[mark nmethods, 0.0000576 secs]
[purging class loader data graph, 0.0000003 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
1051.564: no vm operation [ 24 0 1 ] [ 0 0 1652 0 0 ] 0
Total time for which application threads were stopped: 0.0002285 seconds, Stopping threads took: 0.0000414 seconds

[deflating idle monitors, 0.0000135 secs]
[updating inline caches, 0.0000083 secs]
[compilation policy safepoint handler, 0.0000005 secs]
[mark nmethods, 0.0000894 secs]
[purging class loader data graph, 0.0000002 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
1056.223: no vm operation [ 24 0 0 ] [ 0 0 0 5704 0 ] 0
Total time for which application threads were stopped: 5.7042454 seconds, Stopping threads took: 0.0000270 seconds

[deflating idle monitors, 0.0000116 secs]
[updating inline caches, 0.0000093 secs]
[compilation policy safepoint handler, 0.0000005 secs]
[mark nmethods, 0.0000540 secs]
[purging class loader data graph, 0.0000001 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
1756.475: no vm operation [ 24 0 0 ] [ 0 0 0 4402 0 ] 0
Total time for which application threads were stopped:
4.4023833 seconds, Stopping threads took: 0.0000268 seconds I've done some reading up, here and elsewhere, on "guaranteed safepoints" etc, but am stumped as to what it's spending its time doing. Please can anyone give me a clue? Thanks, Nick -- View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324.html Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. From david.holmes at oracle.com Wed Aug 23 02:00:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Aug 2017 12:00:09 +1000 Subject: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> Message-ID: <81573f22-bf8d-10cf-17f9-d98512bc46b0@oracle.com> On 23/08/2017 6:05 AM, Bob Vandette wrote: > >> On Aug 22, 2017, at 3:50 PM, White, Derek wrote: >> >> Hi Bob, >> >> Do you want to add the old flags to the special_jvm_flags list, to get "deprecated" warnings? Or will that come later after CCC-like approval? >> > > Let's see what others in the VM runtime team have to say about deprecation. If the new flags are intended as replacements for the old flags, then the old flags should be deprecated when the new flags are introduced. I would suggest tests be updated to use the new flags at the same time, to avoid the deprecation warnings. David > I added Deprecation to the old flag descriptions but there are some hotspot > jtreg tests that use the old flags and I don't think we want warnings to come out > while running these tests. > > If I cause the warnings, then I'll need to change all uses of these old flags in these > tests as well. I was expecting to do that in the next release. Perhaps I can > use deprecated_in(11). > > Bob.
> > >> - Derek >> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev- >>> bounces at openjdk.java.net] On Behalf Of Bob Vandette >>> Sent: Tuesday, August 22, 2017 2:22 PM >>> To: hotspot-runtime-dev at openjdk.java.net runtime >> dev at openjdk.java.net>; hotspot-gc-dev at openjdk.java.net >>> Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available >>> RAM >>> >>> Please review the updated webrev which adds new VM flags to allow for the >>> manual selection of Heap size based on a percentage of total available >>> memory. >>> >>> This version deprecates the existing fractional options and allows the new % >>> based flags to override. >>> >>> http://cr.openjdk.java.net/~bobv/8186248/webrev.01/ >>> >>> >>> Bob. >>> >>> >>>> On Aug 17, 2017, at 12:36 AM, David Holmes >>> wrote: >>>> >>>> On 17/08/2017 1:29 PM, Bob Vandette wrote: >>>>> I saw that but wasn't sure it needed the added flexibility since its probably >>> ok that initial sizes are 50% or less. >>>> >>>> I'd go for consistency. >>>> >>>> Also now you will need to guard against values < 1, I think. >>>> >>>> There may be an option checking test that will need updating as well. >>>> >>>> Cheers, >>>> David >>>> >>>>> Bob. >>>>>> On Aug 16, 2017, at 5:04 PM, David Holmes >>> wrote: >>>>>> >>>>>> Hi Bob, >>>>>> >>>>>>> On 17/08/2017 3:32 AM, Bob Vandette wrote: >>>>>>> Please review this simple two line fix which allows more >>>>>>> flexibility in selecting the % of system RAM to be used by the Heap. This >>> just changes two int variables to doubles. >>>>>>> RFE: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8186248 >>>>>>> >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~bobv/8186248 >>>>>>> >>>>>> >>>>>> Wouldn't you also want/need to change the type of InitialRAMFraction? >>>>>> >>>>>> Note: jdk10/hs is currently closed to changes as we prepare to push up >>> to jdk10/jdk10. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Bob. 
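The quoted proposal lets the new % based flags override the deprecated fractional ones. One way such a precedence rule could look is sketched here, purely as an illustration: the function and its parameters are hypothetical, nothing in it is taken from the webrev, and how HotSpot actually tracks whether a flag was set on the command line is not shown.

```cpp
// Hypothetical resolution of the effective RAM percentage; not HotSpot code.
// percentage_set / fraction_set stand for "the user passed this flag".
double effective_ram_percentage(bool percentage_set, double percentage,
                                bool fraction_set, unsigned long fraction,
                                double default_percentage) {
  if (percentage_set) {
    return percentage;  // the new-style flag always wins
  }
  if (fraction_set && fraction >= 1) {
    // Map the legacy divisor onto a percentage, e.g. 4 -> 25%.
    return 100.0 / static_cast<double>(fraction);
  }
  return default_percentage;  // neither flag usable: keep the default
}
```

With this shape the rest of the code only ever reads a percentage, and the legacy value is consulted in exactly one place.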
>> > From david.holmes at oracle.com Wed Aug 23 02:26:03 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Aug 2017 12:26:03 +1000 Subject: RFR: 8186248 - Allow more flexibility in selecting Heap % of available RAM In-Reply-To: References: <8E4747DA-9645-4E9A-ADF5-D012790797C5@oracle.com> <981b2de3-7713-8a1f-22a7-b7cc05c49ada@oracle.com> <7C05B931-43BB-4703-9AAD-5C23FA419DC1@oracle.com> <5d8c1673-de57-5504-e6d7-ff92080b0692@oracle.com> Message-ID: <904a6f90-aac2-5e5f-d944-7c1f8ea93dd7@oracle.com> Hi Bob, On 23/08/2017 4:22 AM, Bob Vandette wrote: > Please review the updated webrev which adds new VM flags to allow for the manual > selection of Heap size based on a percentage of total available memory. > > This version deprecates the existing fractional options and allows the new % based flags > to override. > > http://cr.openjdk.java.net/~bobv/8186248/webrev.01/ Deprecation of flags is done by adding them to the table of deprecated/obsoleted/expired flags. I would expect this change to do a conversion from the existing flags to the new flags, and then all remaining code should only refer to the new flags i.e. the deprecated flags should only exist in the globals.hpp file, the deprecation table, and the place where their values are (potentially) assigned to the new flags. Thanks, David ----- > Bob. > > >> On Aug 17, 2017, at 12:36 AM, David Holmes wrote: >> >> On 17/08/2017 1:29 PM, Bob Vandette wrote: >>> I saw that but wasn't sure it needed the added flexibility since its probably ok that initial sizes are 50% or less. >> >> I'd go for consistency. >> >> Also now you will need to guard against values < 1, I think. >> >> There may be an option checking test that will need updating as well. >> >> Cheers, >> David >> >>> Bob. 
>>>> On Aug 16, 2017, at 5:04 PM, David Holmes wrote: >>>> >>>> Hi Bob, >>>> >>>>> On 17/08/2017 3:32 AM, Bob Vandette wrote: >>>>> Please review this simple two line fix which allows more flexibility in selecting the % of system RAM >>>>> to be used by the Heap. This just changes two int variables to doubles. >>>>> RFE: >>>>> https://bugs.openjdk.java.net/browse/JDK-8186248 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~bobv/8186248 >>>> >>>> Wouldn't you also want/need to change the type of InitialRAMFraction? >>>> >>>> Note: jdk10/hs is currently closed to changes as we prepare to push up to jdk10/jdk10. >>>> >>>> Thanks, >>>> David >>>> >>>>> Bob. > From david.holmes at oracle.com Wed Aug 23 02:30:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Aug 2017 12:30:22 +1000 Subject: Strange STW pauses In-Reply-To: <1503442338456-312324.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> Message-ID: <60522c79-93bf-7eef-7c0d-1604fcd7ce6a@oracle.com> Hi Nick, AFAICS the "no vm operation" safepoints are not consuming the time. What do the actual vm operation safepoints show? What do GC logs show? Cheers, David On 23/08/2017 8:52 AM, Nick Chadwick wrote: > Hi, > > My Java application is suffering from long (5s+) STW pauses. Latency is a > priority for the application, so I need to get to the bottom of what's going > on. 
> > With all the usual JVM options on, I see the following in the logs: > > [deflating idle monitors, 0.0000102 secs] > [updating inline caches, 0.0000149 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000549 secs] > [purging class loader data graph, 0.0000002 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 915.696: no vm operation [ 24 1 > 1 ] [ 0 0 7303 0 0 ] 1 > Total time for which application threads were stopped: 0.0002608 seconds, > Stopping threads took: 0.0001250 seconds > > [deflating idle monitors, 0.0000110 secs] > [updating inline caches, 0.0000706 secs] > [compilation policy safepoint handler, 0.0000006 secs] > [mark nmethods, 0.0000576 secs] > [purging class loader data graph, 0.0000003 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1051.564: no vm operation [ 24 0 > 1 ] [ 0 0 1652 0 0 ] 0 > Total time for which application threads were stopped: 0.0002285 seconds, > Stopping threads took: 0.0000414 seconds > > [deflating idle monitors, 0.0000135 secs] > [updating inline caches, 0.0000083 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000894 secs] > [purging class loader data graph, 0.0000002 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1056.223: no vm operation [ 24 0 > 0 ] [ 0 0 0 5704 0 ] 0 > Total time for which application threads were stopped: 5.7042454 seconds, > Stopping threads took: 0.0000270 seconds > > [deflating idle monitors, 0.0000116 secs] > [updating inline caches, 0.0000093 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000540 secs] > [purging class loader data graph, 0.0000001 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1756.475: no vm 
operation [ 24 0 > 0 ] [ 0 0 0 4402 0 ] 0 > Total time for which application threads were stopped: 4.4023833 seconds, > Stopping threads took: 0.0000268 seconds > > I've done some reading up, here and elsewhere, on "guaranteed safepoints" > etc, but am stumped as to what it's spending its time doing. > > Please can anyone give me a clue? > > Thanks, > Nick > > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324.html > Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. > From kirk at kodewerk.com Wed Aug 23 05:53:35 2017 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Wed, 23 Aug 2017 07:53:35 +0200 Subject: Strange STW pauses In-Reply-To: <1503442338456-312324.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> Message-ID: Hi Nick, Is this a regular occurrence? The pauses can come from many sources including the OS. They are generally a result of a build-up of maintenance work recycling a non-shareable resource and thus if you can understand what that resource is, you can understand how your application is putting undue pressure on it and possibly make adjustments. Common sources of long pauses are garbage collection in the JVM and page reclamation at the OS level. GC logs can speak to the GC issue whereas you'll need the OS performance counters to help you understand paging behavior. But do be careful as more often than not, the garbage collector takes the blame for the page reclamation activity. Do collect the GC logs with PrintGCDetails and PrintGCApplicationStoppedTime turned on and post them here or send them to me off list if you like. Kind regards, Kirk > On Aug 23, 2017, at 12:52 AM, Nick Chadwick wrote: > > Hi, > > My Java application is suffering from long (5s+) STW pauses. Latency is a > priority for the application, so I need to get to the bottom of what's going > on.
> > With all the usual JVM options on, I see the following in the logs: > > [deflating idle monitors, 0.0000102 secs] > [updating inline caches, 0.0000149 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000549 secs] > [purging class loader data graph, 0.0000002 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 915.696: no vm operation [ 24 1 > 1 ] [ 0 0 7303 0 0 ] 1 > Total time for which application threads were stopped: 0.0002608 seconds, > Stopping threads took: 0.0001250 seconds > > [deflating idle monitors, 0.0000110 secs] > [updating inline caches, 0.0000706 secs] > [compilation policy safepoint handler, 0.0000006 secs] > [mark nmethods, 0.0000576 secs] > [purging class loader data graph, 0.0000003 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1051.564: no vm operation [ 24 0 > 1 ] [ 0 0 1652 0 0 ] 0 > Total time for which application threads were stopped: 0.0002285 seconds, > Stopping threads took: 0.0000414 seconds > > [deflating idle monitors, 0.0000135 secs] > [updating inline caches, 0.0000083 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000894 secs] > [purging class loader data graph, 0.0000002 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1056.223: no vm operation [ 24 0 > 0 ] [ 0 0 0 5704 0 ] 0 > Total time for which application threads were stopped: 5.7042454 seconds, > Stopping threads took: 0.0000270 seconds > > [deflating idle monitors, 0.0000116 secs] > [updating inline caches, 0.0000093 secs] > [compilation policy safepoint handler, 0.0000005 secs] > [mark nmethods, 0.0000540 secs] > [purging class loader data graph, 0.0000001 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1756.475: no vm 
operation [ 24 0 > 0 ] [ 0 0 0 4402 0 ] 0 > Total time for which application threads were stopped: 4.4023833 seconds, > Stopping threads took: 0.0000268 seconds > > I've done some reading up, here and elsewhere, on "guaranteed safepoints" > etc, but am stumped as to what it's spending its time doing. > > Please can anyone give me a clue? > > Thanks, > Nick > > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324.html > Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. From goetz.lindenmaier at sap.com Wed Aug 23 08:09:58 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 23 Aug 2017 08:09:58 +0000 Subject: RFR(S): 8186611: s390: Add missing compiler barriers and fix assembler In-Reply-To: <3d8e37bd6c2b489eb665309b793b6281@sap.com> References: <3d8e37bd6c2b489eb665309b793b6281@sap.com> Message-ID: <62221db731954c9e80bd09655958142e@sap.com> Hi Martin, thanks for these hard-to-spot fixes! Reviewed. Best regards, Goetz. > -----Original Message----- > From: Doerr, Martin > Sent: Tuesday, 22 August 2017 21:04 > To: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz > > Cc: Kim Barrett > Subject: RFR(S): 8186611: s390: Add missing compiler barriers and fix > assembler > > Hi, > > > > please review my small s390 fix. > > As discovered by Kim Barrett, the current inline assembly implementation of the > atomics lacks "memory" in the clobber lists. This may lead to undesired > compiler optimizations. We have never observed an issue, but it should be > fixed. > > Another bug is in the assembler encoding of the mvh* instructions which take > an Address parameter. > > In addition, I've improved an assertion which prevents the compiler from > generating multiple accesses.
> > > > Webrev: > > http://cr.openjdk.java.net/~mdoerr/8186611_s390_fixes/webrev.00/ > > > > Thanks and best regards, > > Martin > > From nick.chadwick at nichesolutions.co.uk Wed Aug 23 08:48:14 2017 From: nick.chadwick at nichesolutions.co.uk (Nick Chadwick) Date: Wed, 23 Aug 2017 01:48:14 -0700 (MST) Subject: Strange STW pauses In-Reply-To: <1503442338456-312324.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> Message-ID: <1503478094090-312352.post@n7.nabble.com> Thanks both for your replies. Yes, it is a recurring issue - it doesn't go away after the JVM has warmed up. I am able to trigger the pauses quite easily and regularly with minimal load. I am already logging GC in addition, but there are no GCs close to the pauses that I am seeing. As I understand it, the log entries above are explicitly not garbage collections - they would show something other than "no vm operation" under vmop (e.g. CollectForMetadataAllocation, ParallelGCFailedAllocation, etc) - but guaranteed safepoints. I understand I can tweak the GuaranteedSafepointInterval, but that doesn't feel like the solution to me, and would probably just make the pauses longer, albeit less often. If you do think it's still worth me posting some GC logs here as well, I can, but it definitely feels like it is not GC-related. (I am running with a large heap, and see only a few, short pauses for GC) Although I agree it looks like the vmop itself is not taking up the time, I am seeing stop-the-world pauses that correlate very closely with the total time spent under sync/cleanup in these specific pauses. Grateful for any further insight. -- View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312352.html Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. 
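A note on reading the safepoint statistics rows quoted in this thread: each row carries five time columns (spin, block, sync, cleanup, vmop), and the 5704 in the cleanup column alongside "5.7042454 seconds" suggests they are milliseconds. The sketch below pulls those columns out of a single row so the dominant phase can be spotted mechanically. It is only an illustration: the class name SafepointStatRow and its methods are hypothetical, and the regular expression assumes the row layout exactly as it appears in the log above (after mail line-wrapping is undone), not an official format definition.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a JDK class): parses one row of the safepoint
// statistics table as it is laid out in the log quoted in this thread.
final class SafepointStatRow {
    // Assumed layout of a row:
    //   <timestamp>: <vmop name> [ total initially_running wait_to_block ]
    //                [ spin block sync cleanup vmop ] page_trap_count
    private static final Pattern ROW = Pattern.compile(
        "^\\s*(\\d+\\.\\d+):\\s+(.+?)\\s+"
        + "\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s+"
        + "\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s+"
        + "(\\d+)\\s*$");

    final double timestampSec;
    final String vmop;
    final long spinMs, blockMs, syncMs, cleanupMs, vmopMs;

    private SafepointStatRow(Matcher m) {
        timestampSec = Double.parseDouble(m.group(1));
        vmop = m.group(2);
        // Groups 3..5 are the thread counts; groups 6..10 are the time columns.
        spinMs = Long.parseLong(m.group(6));
        blockMs = Long.parseLong(m.group(7));
        syncMs = Long.parseLong(m.group(8));
        cleanupMs = Long.parseLong(m.group(9));
        vmopMs = Long.parseLong(m.group(10));
    }

    // Returns an empty Optional for any line that is not a statistics row.
    static Optional<SafepointStatRow> parse(String line) {
        Matcher m = ROW.matcher(line);
        return m.matches() ? Optional.of(new SafepointStatRow(m)) : Optional.empty();
    }

    // Name of the time column with the largest value (first one wins on ties).
    String dominantPhase() {
        String[] names = {"spin", "block", "sync", "cleanup", "vmop"};
        long[] values = {spinMs, blockMs, syncMs, cleanupMs, vmopMs};
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return names[best];
    }
}
```

Feeding it the row stamped 1056.223 from the log would report "cleanup" as the dominant phase, while the row stamped 915.696 would report "sync"; lines such as "Total time for which application threads were stopped" are simply skipped.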
From kirk.pepperdine at gmail.com Wed Aug 23 08:50:41 2017 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Wed, 23 Aug 2017 10:50:41 +0200 Subject: Strange STW pauses In-Reply-To: <1503478094090-312352.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> <1503478094090-312352.post@n7.nabble.com> Message-ID: <20C6A72E-1836-4DCD-8AE5-19D632C7D5FB@gmail.com> Hi Nick, I think it would be worth posting the GC log as long as it was collected with the options that I recommended. Even if this isn't GC related, there is likely some information in the log that might yield a clue. Kind regards, Kirk Pepperdine > On Aug 23, 2017, at 10:48 AM, Nick Chadwick wrote: > > Thanks both for your replies. > > Yes, it is a recurring issue - it doesn't go away after the JVM has warmed > up. I am able to trigger the pauses quite easily and regularly with minimal > load. > > I am already logging GC in addition, but there are no GCs close to the > pauses that I am seeing. As I understand it, the log entries above are > explicitly not garbage collections - they would show something other than > "no vm operation" under vmop (e.g. CollectForMetadataAllocation, > ParallelGCFailedAllocation, etc) - but guaranteed safepoints. I understand I > can tweak the GuaranteedSafepointInterval, but that doesn't feel like the > solution to me, and would probably just make the pauses longer, albeit less > often. > > If you do think it's still worth me posting some GC logs here as well, I > can, but it definitely feels like it is not GC-related. (I am running with a > large heap, and see only a few, short pauses for GC) > > Although I agree it looks like the vmop itself is not taking up the time, I > am seeing stop-the-world pauses that correlate very closely with the total > time spent under sync/cleanup in these specific pauses. > > Grateful for any further insight.
> > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312352.html > Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. From aph at redhat.com Wed Aug 23 08:50:59 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 23 Aug 2017 09:50:59 +0100 Subject: Strange STW pauses In-Reply-To: <1503478094090-312352.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> <1503478094090-312352.post@n7.nabble.com> Message-ID: On 23/08/17 09:48, Nick Chadwick wrote: > Grateful for any further insight. Your mailer mangled the little bit of the log you posted. Some more log would help, attached or uploaded somewhere so that it doesn't get mangled. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From david.holmes at oracle.com Wed Aug 23 12:33:35 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Aug 2017 22:33:35 +1000 Subject: Strange STW pauses In-Reply-To: <1503478094090-312352.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> <1503478094090-312352.post@n7.nabble.com> Message-ID: <2bbe063f-9844-8c16-eaf2-cc58f4a8f52b@oracle.com> On 23/08/2017 6:48 PM, Nick Chadwick wrote: > Thanks both for your replies. > > Yes, it is a recurring issue - it doesn't go away after the JVM has warmed > up. I am able to trigger the pauses quite easily and regularly with minimal > load. > > I am already logging GC in addition, but there are no GCs close to the > pauses that I am seeing. As I understand it, the log entries above are > explicitly not garbage collections - they would show something other than > "no vm operation" under vmop (e.g. CollectForMetadataAllocation, > ParallelGCFailedAllocation, etc) - but guaranteed safepoints. 
I understand I > can tweak the GuaranteedSafepointInterval, but that doesn't feel like the > solution to me, and would probably just make the pauses longer, albeit less > often. > > If you do think it's still worth me posting some GC logs here as well, I > can, but it definitely feels like it is not GC-related. (I am running with a > large heap, and see only a few, short pauses for GC) > > Although I agree it looks like the vmop itself is not taking up the time, I > am seeing stop-the-world pauses that correlate very closely with the total > time spent under sync/cleanup in these specific pauses. I did not glean that from the log fragments you posted. The cleaning times seemed very short, as was the time to reach the safepoint. But the formatting was problematic so perhaps I misread something. If there are long delays in reaching a safepoint then it is likely a C2 issue where safepoint checks have been elided from lengthy code fragments. Try running with -XX:+UseCountedLoopSafepoints David > Grateful for any further insight. > > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312352.html > Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. > From claes.redestad at oracle.com Wed Aug 23 12:32:35 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 23 Aug 2017 14:32:35 +0200 Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled In-Reply-To: References: Message-ID: Hi, this patch was never pushed due to an unfortunate interaction around cancelling of events. Markus Grönlund has helped resolve these, while maintaining the speedup: http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/ Thanks!
/Claes On 04/25/2017 03:21 PM, Claes Redestad wrote: > Hi, > > this patch removes calling Ticks::now when EventClassLoad isn't > enabled, which > has an effect on class loading performance: > > http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/ > > When tracing isn't enabled trace/tracing.hpp has dummy implementations > which > are easily optimized away by a compiler, which I've verified happens > on linux > OpenJDK builds with tracing disabled. > > On builds with tracing enabled then the changes means the call to get > the time > only happen if the event is enabled, which achieves the sought after > startup > optimization. > > Thanks! > > /Claes From nick.chadwick at nichesolutions.co.uk Wed Aug 23 12:40:00 2017 From: nick.chadwick at nichesolutions.co.uk (Nick Chadwick) Date: Wed, 23 Aug 2017 05:40:00 -0700 (MST) Subject: Strange STW pauses In-Reply-To: <1503442338456-312324.post@n7.nabble.com> References: <1503442338456-312324.post@n7.nabble.com> Message-ID: <1503492000787-312371.post@n7.nabble.com> All, I'm in the process of collecting some additional logs, and will post a link to them when I've got them. Thanks, Nick -- View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312371.html Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. From nick.chadwick at nichesolutions.co.uk Wed Aug 23 12:49:47 2017 From: nick.chadwick at nichesolutions.co.uk (Nick Chadwick) Date: Wed, 23 Aug 2017 05:49:47 -0700 (MST) Subject: Strange STW pauses In-Reply-To: <2bbe063f-9844-8c16-eaf2-cc58f4a8f52b@oracle.com> References: <1503442338456-312324.post@n7.nabble.com> <1503478094090-312352.post@n7.nabble.com> <2bbe063f-9844-8c16-eaf2-cc58f4a8f52b@oracle.com> Message-ID: <1503492587860-312374.post@n7.nabble.com> David Holmes wrote > On 23/08/2017 6:48 PM, Nick Chadwick wrote: >> Thanks both for your replies. >> >> Yes, it is a recurring issue - it doesn't go away after the JVM has >> warmed >> up. 
I am able to trigger the pauses quite easily and regularly with >> minimal >> load. >> >> I am already logging GC in addition, but there are no GCs close to the >> pauses that I am seeing. As I understand it, the log entries above are >> explicitly not garbage collections - they would show something other than >> "no vm operation" under vmop (e.g. CollectForMetadataAllocation, >> ParallelGCFailedAllocation, etc) - but guaranteed safepoints. I >> understand I >> can tweak the GuaranteedSafepointInterval, but that doesn't feel like the >> solution to me, and would probably just make the pauses longer, albeit >> less >> often. >> >> If you do think it's still worth me posting some GC logs here as well, I >> can, but it definitely feels like it is not GC-related. (I am running >> with a >> large heap, and see only a few, short pauses for GC) >> >> Although I agree it looks like the vmop itself is not taking up the time, >> I >> am seeing stop-the-world pauses that correlate very closely with the >> total >> time spent under sync/cleanup in these specific pauses. > > I did not glean that from the log fragments you posted. The cleaning > times seemed very short, as was the time to reach the safepoint. But the > formatting was problematic so perhaps I misread something ?? > > If there are long delays in reaching a safepoint then it is likely a C2 > issue where safepoint checks have been elided from lengthy code > fragments. Try running with -XX:+UseCountedLoopSafepoints > > David > >> Grateful for any further insight. >> >> >> >> -- >> View this message in context: >> http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312352.html >> Sent from the OpenJDK Hotspot Runtime System mailing list archive at >> Nabble.com. >> Admittedly, the snippets in my original post are a bit confusing - the first two pauses and the second two pauses are notably different in the way they're logged, albeit they all caused STW pauses of multiple seconds. 
If you look at the last two, the "total time for which application threads were stopped" (5.7s and 4.4s respectively) closely matches the "cleanup" time - and the pauses I observed. That seems logical to me. The first two, though, show hardly any "total time for which application threads were stopped", but substantially longer numbers in "sync" time. The pauses I observed were approximately equal to that sync time. Anyway, I'll revert when I've got some more logs to share, and will look at adding UseCountedLoopSafepoints into the mix. I think I can guess what that does :) I should state that there are no valid situations where threads running application code should be taking several seconds to sync to a safepoint - there are no long loops, as fast response time is a key requirement of the application. Nick -- View this message in context: http://openjdk.5641.n7.nabble.com/Strange-STW-pauses-tp312324p312374.html Sent from the OpenJDK Hotspot Runtime System mailing list archive at Nabble.com. From martin.doerr at sap.com Wed Aug 23 14:06:45 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 23 Aug 2017 14:06:45 +0000 Subject: RFR(S): 8186611: s390: Add missing compiler barriers and fix assembler In-Reply-To: <62221db731954c9e80bd09655958142e@sap.com> References: <3d8e37bd6c2b489eb665309b793b6281@sap.com> <62221db731954c9e80bd09655958142e@sap.com> Message-ID: Thanks for the review. Pushed. Best regards, Martin -----Original Message----- From: Lindenmaier, Goetz Sent: Wednesday, 23 August 2017 10:10 To: Doerr, Martin ; hotspot-runtime-dev at openjdk.java.net Cc: Kim Barrett Subject: RE: RFR(S): 8186611: s390: Add missing compiler barriers and fix assembler Hi Martin, thanks for these hard-to-spot fixes! Reviewed. Best regards, Goetz. > -----Original Message----- > From: Doerr, Martin > Sent: Tuesday, 22
August 2017 21:04 > To: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz > > Cc: Kim Barrett > Subject: RFR(S): 8186611: s390: Add missing compiler barriers and fix > assembler > > Hi, > > > > please review my small s390 fix. > > As discovered by Kim Barrett, the current inline assembly implementation of the > atomics lacks "memory" in the clobber lists. This may lead to undesired > compiler optimizations. We have never observed an issue, but it should be > fixed. > > Another bug is in the assembler encoding of the mvh* instructions which take > an Address parameter. > > In addition, I've improved an assertion which prevents the compiler from > generating multiple accesses. > > > > Webrev: > > http://cr.openjdk.java.net/~mdoerr/8186611_s390_fixes/webrev.00/ > > > > Thanks and best regards, > > Martin > > From goetz.lindenmaier at sap.com Wed Aug 23 14:12:31 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 23 Aug 2017 14:12:31 +0000 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: Message-ID: <795c6f054c0c471bb1bcf95639219e1a@sap.com> Hi Thomas, nice cleanup, thanks. You could mention in the change summary or in the bug that you added printing the decoder state into the hs_err file. This is fine, but from the description I did not expect a functional change. Some minor remarks: decoder_windows.cpp: Copyright needs to be updated. Please check other files. Include decoder_windows.hpp not sorted alphabetically. decoder_windows.hpp: Superfluous 'private' specification. There are two of them now. (I think usually there is one space before private/public, but this wasn't the case before, either, in this file.) os_windows_x86.cpp Why don't you add the two includes of unwind_windows_x86.hpp and windbghelp.hpp into the normal list at the proper alphabetic position? If there is a good reason for this, please put this into a comment. decoder.cpp Copyright wrong.
You guard the #include by _WINDOWS and the code by _WIN32. This is inconsistent. Best regards, Goetz. > -----Original Message----- > From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas Stüfe > Sent: Tuesday, 22 August 2017 15:05 > To: Reingruber, Richard ; Ioi Lam > ; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(s): 8186349: [windows] Centralize dbghelp handling code > > p.s. > > I built on x86 and x64. Ran gtests and part of the jtreg tests > (hotspot/runtime/ErrorReporting) for both platforms. > > Two jtreg tests failed, but errors have nothing to do with my change - some > java cross-module-access issue. Do these tests get run regularly? > > I also tried to build without precompiled headers to see if any includes > were missing, but ran into other missing includes first, that seems to be > rotted a bit. > > ..Thomas > > On Tue, Aug 22, 2017 at 2:48 PM, Thomas Stüfe > wrote: > > > Hi all, > > > > please see new Webrev: > > > > http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01/webrev/ > > > > I worked in proposed changes by Ioi and Richard. Sorry, because on file > > name changed, I did not do an incremental webrev. Here are the changes: > > > > - Renamed DbgHelpLoader to WindowsDbgHelp and renamed the files too. > > - I moved the critical section code to the implementation of > > WindowsDbgHelp. I discarded the general "CritSectLocker" object in favour > > of an object specialized for this case, see the EntryGuard class. > > - I re-added the SymSetOptions call I accidentally discarded. > > - In decoder_windows.cpp, I set the value for _can_decode_in_vm to true. I > > plan to remove this flag in one of the next changes (see also JDK-8144855) > > > > > > As Richard is no full reviewer, I'll need a second Reviewer and a sponsor. > > > > Thanks!
> > > > Kind Regards, Thomas > > > > > > On Tue, Aug 22, 2017 at 2:33 PM, Thomas Stüfe > > wrote: > > > >> Hi Richard, > >> > >> thank you for the review! Please find my remarks inline. > >> > >> On Tue, Aug 22, 2017 at 9:44 AM, Reingruber, Richard < > >> richard.reingruber at sap.com> wrote: > >> > >>> Hi Thomas, > >>> > >>> thanks for the refactoring work! > >>> > >>> I had a look at your changes, but please note that I'm not a reviewer. > >>> > >>> ### dbghelp_loader.cpp: > >>> > >>> Should globalDefinitions.hpp be included? > >>> > >> > >> Currently I do not need it, so I'd rather not. > >> > >> > >>> > >>> Little inconsistency: opening curly braces of method bodies should be on > >>> the same line as the end of the parameter list, I guess. > >>> > >>> 294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : "")); > >>> > >> > >> Are you sure? Sorry, I cannot spot an error here. > >> > >> > >>> > >>> Format string is incomplete. > >>> > >>> 196 BOOL DbgHelpLoader::stackWalk64(DWORD MachineType, > >>> 197 HANDLE hProcess, > >>> 198 HANDLE hThread, > >>> 199 LPSTACKFRAME64 StackFrame, > >>> 200 PVOID ContextRecord) > >>> 201 { > >>> 202 CritSectLocker lck(&g_cs); > >>> 203 if (initialize_if_needed()) { > >>> 204 if (g_pfn_StackWalk64 != NULL) { > >>> 205 return g_pfn_StackWalk64(MachineType, hProcess, hThread, > >>> StackFrame, > >>> 206 ContextRecord, > >>> 207 NULL, // ReadMemoryRoutine > >>> 208 g_pfn_SymFunctionTableAccess64, > >>> // FunctionTableAccessRoutine, > >>> 209 g_pfn_SymGetModuleBase64, // > >>> GetModuleBaseRoutine > >>> 210 NULL // TranslateAddressRoutine > >>> 211 ); > >>> 212 } > >>> 213 } > >>> 214 return FALSE; > >>> 215 } > >>> > >>> Lines 208, 209: is it ok to pass NULL? > >>> > >>> > >> Good question. Documentation says parameters are required. I tested > >> calling this with NULL for both functions and stack walking worked just > >> fine.
I leave it as it is, because I think at worst we risk StackWalk64 > >> failing, and at best we get a callstack nevertheless. > >> > >> > >>> ### windows_decoder.cpp > >>> > >>> The following line was deleted without replacement. Are the options not > >>> needed? > >>> > >>> SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS | > >>> SYMOPT_EXACT_SYMBOLS); > >>> > >>> > >> Good catch! Will fix. > >> > >> > >>> > >>> Cheers, Richard. > >>> > >>> > >> Thanks! > >> > >> Thomas > >> > >> > >>> -----Original Message----- > >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas Stüfe > >>> Sent: Friday, 18 August 2017 09:24 > >>> To: hotspot-runtime-dev at openjdk.java.net > >>> Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code > >>> > >>> Dear all, > >>> > >>> may I please have a review for this change: > >>> > >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8186349 > >>> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/ > >>> > >>> This is a part of an ongoing work I do to make error reporting > >>> (especially > >>> callstacks) on Windows more reliable. > >>> > >>> At first I did a rather large patch, see: https://bugs.openjdk.java.net/browse/JDK-8185712 and > >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html > >>> . But after discussing this patch with Ioi, I saw that this patch is > >>> better > >>> split up into multiple parts for easier reviewing. > >>> > >>> So this is the first split up patch. > >>> > >>> -- > >>> > >>> This patch here centralizes handling of the dbghelp.dll (loading the > >>> library, resolving function pointers and synchronizing access). > >>> > >>> Which solves the problem that accesses to functions exported from the > >>> dbghelp.dll need to be synchronized, as stated by the MSDN. We somehow > >>> never really cared.
I guess it never caused visible trouble, because most > >>> of the time (not always) the functions are accessed from > >>> VMError::report(), > >>> and chances of parallel access from other non-error-reporting threads are > >>> slim. Even if it were to crash, secondary error handling would step in > >>> and > >>> write an "Error occurred during error reporting" or "Another thread had > >>> an > >>> error too" message and we would probably just shrug it off. > >>> > >>> But as this whole effort is about increasing the chance of useful > >>> callstacks in hs-err files, I'd like to fix this. > >>> > >>> In addition to the fix, I think this is also a nice cleanup and removes > >>> duplicate code. > >>> > >>> Notes: > >>> > >>> 1) Robustness: We may or may not find a dbghelp.dll on the target system. > >>> If we find it, it may be old or new (it is not tightly coupled with the > >>> OS, > >>> may be part of other installation packages, may exist multiple times > >>> etc). > >>> We should handle older versions of the dbghelp dll gracefully and hide > >>> all > >>> that complexity from the caller. > >>> > >>> 2) The new DbgHelpLoader class does not export any state indicating > >>> whether > >>> or not it successfully loaded, and if it loaded which functions are > >>> actually available. That was a deliberate decision, there is no need for > >>> the caller to know this. Caller should invoke the DbgHelpLoader functions > >>> as if they were the equivalent OS functions and handle return errors. > >>> DbgHelpLoader should never crash or assert; missing functions should > >>> behave > >>> like failing functions. > >>> > >>> 3) However, I added a one liner to the hs-err file indicating the state > >>> of > >>> the dbghelp dll - version info, what functions were missing etc. This may > >>> help understanding weird or missing callstacks. > >>> > >>> 4) I removed the implementation for shutdown > (WindowsDecoder::shutdown). 
> >>> I think there is no valid reason to ever shutdown the decoder. For one,
> >>> we may crash right at the end, and still it would be nice to have
> >>> callstacks. And then, why spend cycles shutting down the decoder when
> >>> we could just let it end with the process?
> >>>
> >>> 5) This code gets used during error reporting. So no VM infrastructure
> >>> must be used to avoid circular crashes and VM initialization
> >>> dependencies. So, to synchronize, this code uses raw windows
> >>> CriticalSection objects.
> >>>
> >>> --
> >>>
> >>> Next step will be revamping handling of the Symbol APIs. This will
> >>> involve removing the WindowsDecoder class, which introduces other
> >>> errors and really makes no sense if the underlying dbghelp layer does
> >>> its own synchronization.
> >>>
> >>>
> >>> Thanks for reviewing!
> >>>
> >>> Kind Regards, Thomas
> >>>
> >>
> >>
> >

From ioi.lam at oracle.com Wed Aug 23 14:26:34 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 23 Aug 2017 07:26:34 -0700
Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
In-Reply-To:
References:
Message-ID: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com>

Hi Thomas, thanks for addressing my concerns. I can sponsor the change
after you get OK from the other reviewers.

Thanks

- Ioi

On 8/22/17 5:48 AM, Thomas Stüfe wrote:
> Hi all,
>
> please see new Webrev:
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01/webrev/
>
>
> I worked in proposed changes by Ioi and Richard. Sorry, because on
> file name changed, I did not do an incremental webrev. Here are the
> changes:
>
> - Renamed DbgHelpLoader to WindowsDbgHelp and renamed the files too.
> - I moved the critical section code to the implementation of
>   WindowsDbgHelp. I discarded the general "CritSectLocker" object in
>   favour of an object specialized for this case, see the EntryGuard class.
> - I readded the SymSetOptions call I accidentally discarded.
> - In decoder_windows.cpp, I set the value for _can_decode_in_vm to
>   true. I plan to remove this flag in one of the next changes (see
>   also JDK-8144855)
>
>
> As Richard is no full reviewer, I'll need a second Reviewer and a sponsor.
>
> Thanks!
>
> Kind Regards, Thomas
>
>
> On Tue, Aug 22, 2017 at 2:33 PM, Thomas Stüfe wrote:
>
> Hi Richard,
>
> thank you for the review! Please find my remarks inline.
>
> On Tue, Aug 22, 2017 at 9:44 AM, Reingruber, Richard wrote:
>
> Hi Thomas,
>
> thanks for the refactoring work!
>
> I had a look at your changes, but please note that I'm not a reviewer.
>
> ### dbghelp_loader.cpp:
>
> Should globalDefinitions.hpp be included?
>
>
> Currently I do not need it, so I'd rather not.
>
>
> Little inconsistency: opening curly braces of method bodies should be on
> the same line as the end of the parameter list, I guess.
>
> 294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : ""));
>
>
> Are you sure? Sorry, I cannot spot an error here.
>
>
> Format string is incomplete.
>
> 196    BOOL DbgHelpLoader::stackWalk64(DWORD MachineType,
> 197                                    HANDLE hProcess,
> 198                                    HANDLE hThread,
> 199                                    LPSTACKFRAME64 StackFrame,
> 200                                    PVOID ContextRecord)
> 201    {
> 202      CritSectLocker lck(&g_cs);
> 203      if (initialize_if_needed()) {
> 204        if (g_pfn_StackWalk64 != NULL) {
> 205          return g_pfn_StackWalk64(MachineType, hProcess, hThread, StackFrame,
> 206                                   ContextRecord,
> 207                                   NULL, // ReadMemoryRoutine
> 208                                   g_pfn_SymFunctionTableAccess64, // FunctionTableAccessRoutine,
> 209                                   g_pfn_SymGetModuleBase64, // GetModuleBaseRoutine
> 210                                   NULL // TranslateAddressRoutine
> 211                                   );
> 212        }
> 213      }
> 214      return FALSE;
> 215    }
>
> Lines 208, 209: is it ok to pass NULL?
>
>
> Good question. Documentation says parameters are required. I tested
> calling this with NULL for both functions and stack walking worked just
> fine. I leave it as it is, because I think at worst we risk StackWalk64
> failing, and at best we get a callstack nevertheless.
>
> ### windows_decoder.cpp
>
> The following line was deleted without replacement. Are the options not
> needed?
>
> SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS | SYMOPT_EXACT_SYMBOLS);
>
>
> Good catch! Will fix.
>
>
> Cheers, Richard.
>
>
> Thanks!
>
> Thomas
>
> -----Original Message-----
> From: hotspot-runtime-dev
> [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas Stüfe
> Sent: Freitag, 18. August 2017 09:24
> To: hotspot-runtime-dev at openjdk.java.net
> Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
>
> Dear all,
>
> may I please have a review for this change:
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8186349
> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/
>
> This is a part of an ongoing work I do to make error reporting (especially
> callstacks) on Windows more reliable.
>
> At first I did a rather large patch, see:
> https://bugs.openjdk.java.net/browse/JDK-8185712 and
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html
> . But after discussing this patch with Ioi, I saw that this patch is better
> split up into multiple parts for easier reviewing.
>
> So this is the first split up patch.
>
> --
>
> This patch here centralizes handling of the dbghelp.dll (loading the
> library, resolving function pointers and synchronizing access).
>
> Which solves the problem that accesses to functions exported from the
> dbghelp.dll need to be synchronized, as stated by the MSDN. We somehow
> never really cared.
> I guess it never caused visible trouble, because most of the time (not
> always) the functions are accessed from VMError::report(), and chances of
> parallel access from other non-error-reporting threads are slim. Even if it
> were to crash, secondary error handling would step in and write an "Error
> occurred during error reporting" or "Another thread had an error too"
> message and we would probably just shrug it off.
>
> But as this whole effort is about increasing the chance of useful
> callstacks in hs-err files, I'd like to fix this.
>
> In addition to the fix, I think this is also a nice cleanup and removes
> duplicate code.
>
> Notes:
>
> 1) Robustness: We may or may not find a dbghelp.dll on the target system.
> If we find it, it may be old or new (it is not tightly coupled with the OS,
> may be part of other installation packages, may exist multiple times etc).
> We should handle older versions of the dbghelp dll gracefully and hide all
> that complexity from the caller.
>
> 2) The new DbgHelpLoader class does not export any state indicating whether
> or not it successfully loaded, and if it loaded which functions are
> actually available. That was a deliberate decision, there is no need for
> the caller to know this. Caller should invoke the DbgHelpLoader functions
> as if they were the equivalent OS functions and handle return errors.
> DbgHelpLoader should never crash or assert; missing functions should behave
> like failing functions.
>
> 3) However, I added a one liner to the hs-err file indicating the state of
> the dbghelp dll - version info, what functions were missing etc. This may
> help understanding weird or missing callstacks.
>
> 4) I removed the implementation for shutdown (WindowsDecoder::shutdown). I
> think there is no valid reason to ever shutdown the decoder. For one, we
> may crash right at the end, and still it would be nice to have callstacks.
> And then, why spend cycles shutting down the decoder when we could just let
> it end with the process?
>
> 5) This code gets used during error reporting. So no VM infrastructure must
> be used to avoid circular crashes and VM initialization dependencies. So,
> to synchronize, this code uses raw windows CriticalSection objects.
>
> --
>
> Next step will be revamping handling of the Symbol APIs. This will involve
> removing the WindowsDecoder class, which introduces other errors and really
> makes no sense if the underlying dbghelp layer does its own synchronization.
>
>
> Thanks for reviewing!
>
> Kind Regards, Thomas
>
>

From ioi.lam at oracle.com Wed Aug 23 18:11:59 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 23 Aug 2017 11:11:59 -0700
Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled
In-Reply-To:
References:
Message-ID:

Hi Claes,

The changes look good. Reviewed.

Thanks

- Ioi

On 8/23/17 5:32 AM, Claes Redestad wrote:
> Hi,
>
> this patch was never pushed due to an unfortunate interaction around
> cancelling of events. Markus Grönlund has helped resolve these, while
> maintaining the speedup:
>
> http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/
>
> Thanks!
>
> /Claes
>
> On 04/25/2017 03:21 PM, Claes Redestad wrote:
>> Hi,
>>
>> this patch removes calling Ticks::now when EventClassLoad isn't enabled,
>> which has an effect on class loading performance:
>>
>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/
>>
>> When tracing isn't enabled trace/tracing.hpp has dummy implementations
>> which are easily optimized away by a compiler, which I've verified
>> happens on linux OpenJDK builds with tracing disabled.
>>
>> On builds with tracing enabled then the changes means the call to get
>> the time only happen if the event is enabled, which achieves the sought
>> after startup optimization.
>>
>> Thanks!
>>
>> /Claes
>

From claes.redestad at oracle.com Wed Aug 23 18:20:34 2017
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 23 Aug 2017 20:20:34 +0200
Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled
In-Reply-To:
References:
Message-ID: <170b10e1-2bdb-8576-be3a-c9d86cfd2d38@oracle.com>

Thanks, Ioi!

/Claes

On 2017-08-23 20:11, Ioi Lam wrote:
> Hi Claes,
>
> The changes look good. Reviewed.
>
> Thanks
>
> - Ioi
>
>
> On 8/23/17 5:32 AM, Claes Redestad wrote:
>> Hi,
>>
>> this patch was never pushed due to an unfortunate interaction around
>> cancelling of events. Markus Grönlund has helped resolve these, while
>> maintaining the speedup:
>>
>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/
>>
>> Thanks!
>>
>> /Claes
>>
>> On 04/25/2017 03:21 PM, Claes Redestad wrote:
>>> Hi,
>>>
>>> this patch removes calling Ticks::now when EventClassLoad isn't
>>> enabled, which has an effect on class loading performance:
>>>
>>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/
>>>
>>> When tracing isn't enabled trace/tracing.hpp has dummy implementations
>>> which are easily optimized away by a compiler, which I've verified
>>> happens on linux OpenJDK builds with tracing disabled.
>>>
>>> On builds with tracing enabled then the changes means the call to get
>>> the time only happen if the event is enabled, which achieves the sought
>>> after startup optimization.
>>>
>>> Thanks!
>>>
>>> /Claes
>>
>

From jiangli.zhou at Oracle.COM Wed Aug 23 23:24:53 2017
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Wed, 23 Aug 2017 16:24:53 -0700
Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken
Message-ID:

Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer.
bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. Tested with tier4-comp tests. Thanks, Jiangli From calvin.cheung at oracle.com Thu Aug 24 00:20:49 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Wed, 23 Aug 2017 17:20:49 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: References: Message-ID: <599E1BE1.1020803@oracle.com> Hi Jiangli, The change looks good to me. thanks, Calvin On 8/23/17, 4:24 PM, Jiangli Zhou wrote: > Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. > > bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 > webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ > > ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. > > Tested with tier4-comp tests. 
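Illustration of the table-size remark above, as a minimal sketch rather than anything taken from the patch: for a chained hash table whose hash spreads keys evenly, the average chain length is simply entries divided by buckets. `average_chain_length` is a hypothetical helper (not a HotSpot function), and the entry count used in the note below is an assumed figure, since the thread does not say how many objects are archived.

```cpp
#include <cassert>
#include <cstddef>

// Average number of entries per bucket in a chained hash table, assuming
// the hash function spreads keys evenly over all buckets.
// Hypothetical helper for illustration only; not part of HotSpot.
double average_chain_length(std::size_t entries, std::size_t buckets) {
  assert(buckets > 0);  // a table without buckets is not meaningful
  return static_cast<double>(entries) / static_cast<double>(buckets);
}
```

With an assumed 100000 archived objects, `average_chain_length(100000, 256)` is about 390 entries per bucket, while `average_chain_length(100000, 15889)` is a little over 6. A hash computed from the wrong address would make lookups worse still, independent of the table size.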
>
> Thanks,
> Jiangli

From ioi.lam at oracle.com Thu Aug 24 01:03:26 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 23 Aug 2017 18:03:26 -0700
Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken
In-Reply-To:
References:
Message-ID:

Hi Jiangli,

Since the table is allocated as ResourceObj::C_HEAP, it's better to delete
it afterwards to avoid memory leaks

  {
    NoSafepointVerifier nsv;

    // Cache for recording where the archived objects are copied to
    MetaspaceShared::create_archive_object_cache();

    tty->print_cr("Dumping String objects to closed archive heap region ...");
    NOT_PRODUCT(StringTable::verify());
    // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details.
    _string_regions = new GrowableArray(2);
    StringTable::write_to_archive(_string_regions);

    tty->print_cr("Dumping objects to open archive heap region ...");
    _open_archive_heap_regions = new GrowableArray(2);
    MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions);

+   MetaspaceShared::create_archive_object_cache();

  }

+ static void delete_archive_object_cache() {
+   CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;);
+ }

Thanks
- Ioi

On 8/23/17 4:24 PM, Jiangli Zhou wrote:
> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921
> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/
>
> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine.
With the fix the test finishes without timeout. > > Tested with tier4-comp tests. > > Thanks, > Jiangli From jiangli.zhou at oracle.com Thu Aug 24 01:04:57 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 23 Aug 2017 18:04:57 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <599E1BE1.1020803@oracle.com> References: <599E1BE1.1020803@oracle.com> Message-ID: <1199B127-62B5-4B8E-96D6-8BE8ED684A66@oracle.com> Thanks! Jiangli > On Aug 23, 2017, at 5:20 PM, Calvin Cheung wrote: > > Hi Jiangli, > > The change looks good to me. > > thanks, > Calvin > > On 8/23/17, 4:24 PM, Jiangli Zhou wrote: >> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >> >> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >> >> Tested with tier4-comp tests. >> >> Thanks, >> Jiangli From david.holmes at oracle.com Thu Aug 24 05:05:57 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Aug 2017 15:05:57 +1000 Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled In-Reply-To: References: Message-ID: <2941268f-2bd4-37c0-c6ef-942d897038d2@oracle.com> Hi Claes, This still seems okay to me. Reviewed. Thanks, David On 23/08/2017 10:32 PM, Claes Redestad wrote: > Hi, > > this patch was never pushed due to an unfortunate interaction around > cancelling > of events. 
> Markus Grönlund has helped resolve these, while maintaining the
> speedup:
>
> http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/
>
> Thanks!
>
> /Claes
>
> On 04/25/2017 03:21 PM, Claes Redestad wrote:
>> Hi,
>>
>> this patch removes calling Ticks::now when EventClassLoad isn't enabled,
>> which has an effect on class loading performance:
>>
>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/
>>
>> When tracing isn't enabled trace/tracing.hpp has dummy implementations
>> which are easily optimized away by a compiler, which I've verified
>> happens on linux OpenJDK builds with tracing disabled.
>>
>> On builds with tracing enabled then the changes means the call to get
>> the time only happen if the event is enabled, which achieves the sought
>> after startup optimization.
>>
>> Thanks!
>>
>> /Claes
>

From jiangli.zhou at Oracle.COM Thu Aug 24 05:27:39 2017
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Wed, 23 Aug 2017 22:27:39 -0700
Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken
In-Reply-To:
References:
Message-ID: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com>

Hi Ioi,

The table was not changed to be allocated as ResourceObj::C_HEAP. I see 'ALLOC_TYPE' only applies to the Nodes in the ResourceHashtable.

Thanks,
Jiangli

> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote:
>
> Hi Jiangli,
>
> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks
>
>   {
>     NoSafepointVerifier nsv;
>
>     // Cache for recording where the archived objects are copied to
>     MetaspaceShared::create_archive_object_cache();
>
>     tty->print_cr("Dumping String objects to closed archive heap region ...");
>     NOT_PRODUCT(StringTable::verify());
>     // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details.
> _string_regions = new GrowableArray(2); > StringTable::write_to_archive(_string_regions); > > tty->print_cr("Dumping objects to open archive heap region ..."); > _open_archive_heap_regions = new GrowableArray(2); > MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); > > + MetaspaceShared::create_archive_object_cache(); > > } > > + static void delete_archive_object_cache() { > + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); > + } > > Thanks > - Ioi > > On 8/23/17 4:24 PM, Jiangli Zhou wrote: >> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >> >> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >> >> Tested with tier4-comp tests. >> >> Thanks, >> Jiangli > From david.holmes at oracle.com Thu Aug 24 06:33:59 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Aug 2017 16:33:59 +1000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: References: Message-ID: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> Hi Adam, cc'ing hotspot runtime dev as runtime own JNI and the invocation API - and some of the problematic code resides in the VM. 
On 24/08/2017 1:54 AM, Adam Farley8 wrote: > Hi All, > > Problem: Several of Java's "c" files call exit(0) if you pass certain > command-line options to JNI_CreateJavaVM, which can terminate the C++ code > JNI users use to initialise the JVM. > > Example: If you write some C++ code that calls JNI_CreateJavaVM, and uses > the option "-agentlib:jdwp=help", Java's c files will print the needed > help output and call exit(0). > > Result: Your C++ code is terminated on this line, and a return code of 0 > is produced. > > Issues: > > Issue 1: The exit(0) prevents your code from doing anything useful after > the JNI_CreateJavaVM call. > Issue 2: The exit(0) indicates to anything monitoring your C++ code that > your code exited normally, even though it was terminated mid-way-through. > Issue 3: This return code is useless to us, as a 0 can indicate the VM > started correctly, or it can indicate the VM was terminated due to one or > more of these command-line options. > Issue 4: Of the other JNI return values (JNI_OK, JNI_ERR, etc) none of > them appear to cover this scenario. This specific case seems like a bug to me as the logic is assuming it is only ever called by a launcher which it is okay to terminate. Though to be honest the very existence of the "help" option seems to me somewhat misguided in a hosted-VM environment. That said, I see unified logging in 9 also added a terminating "help" option . More generally it must be noted that the VM itself will often abort the hosting process, upon encountering an initialization error, rather than causing JNI_CreateJavaVM to return JNI_ERR. But I think we can certainly do better with "help" options, to not (necessarily) terminate the initialization process and abort the VM. > > Proposed solutions: > > PS1: We should amend the JNI specification to include a "JNI_SILENT_EXIT" > return code, so the C++ code knows a VM was not created, but that it isn't > an error. I'm not sure a new code is needed ... 
especially if initialization of the VM continues. Though if we really want the VM to abandon initialization when it sees certain flags then that needs to be spec'd into JNI_CreateJavaVM. > PS2: We should identify a list of the command-line options that produce > this behaviour via the JNI. (not all of the "help" options are recognised > by the JNI interface. E.g. -version and -help produce a JNI_ERR and an > "Option not recognised" message) Options processed by the VM will be recognized, while options processed by the Java launcher will not be. "-version", "-X", "-help" and numerous others are launcher options. Pure VM options are -XX options, but the VM also processes some -X flags and, as a result of jigsaw, now also processes a bunch of module-related flags that are simple --foo options. If writing a custom launcher, or something that acts like one, you need to be able to process all of the command-line flags and know which ones get passed through to the JVM and which do not. If you are trying to be compatible with the OpenJDK launcher then you'll need to be prepared to handle all of its arguments. JNI itself should not be aware of any such arguments - it should simply be allowed for JNI_CreateJavaVM to "fail" when it encounters them. > PS3: We should replace these annoying exit(0) calls with code that returns > "JNI_SILENT_EXIT", so the C++ code has a chance to finish. We should certainly look at getting rid of the exit() calls. Ideally no VM related flag would be of the "report and terminate" type, as the termination aspect really belongs in the launcher. I can imagine a launcher using Dcmds to retrieve "help" strings from subsystems rather than having actual "help" options processed by the VM (somewhat similar t how -version is handled: create VM, extract version info, detach VM. exit launcher). That said it would be somewhat awkward, for example, if the launcher processed -Xlog:help, but the VM processed all other -Xlog arguments. 
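To make the launcher/VM split described above concrete, here is a minimal sketch of how an embedding process might sort options before it ever calls JNI_CreateJavaVM. Everything in it is an assumption for illustration: `OptionKind` and `classify_option` are hypothetical names, and the option lists contain only the flags named in this thread, so they are far from complete.

```cpp
#include <cassert>
#include <string>

// Where a command-line option should be handled by a process that embeds
// the JVM. Hypothetical names, used only in this sketch.
enum class OptionKind {
  Launcher,            // processed by the launcher, never passed to the VM
  ReportAndTerminate,  // the VM prints output and may call exit(0)
  PassToVm             // handed to JNI_CreateJavaVM unchanged
};

// Classify a single option. The lists are illustrative: they hold only the
// options mentioned in this thread and are not a complete inventory.
OptionKind classify_option(const std::string& opt) {
  if (opt == "-version" || opt == "-help" || opt == "-X") {
    return OptionKind::Launcher;
  }
  if (opt == "-agentlib:jdwp=help" || opt == "-Xlog:help") {
    return OptionKind::ReportAndTerminate;
  }
  return OptionKind::PassToVm;
}
```

An embedder could then refuse, or handle itself, anything classified as `ReportAndTerminate`, so that the hosting process is not exited from inside the VM. Whether such a list can be kept in sync with the VM's own argument processing is exactly the open question in this thread.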
Cheers, David > > Best Regards > > Adam Farley > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > From thomas.stuefe at gmail.com Thu Aug 24 06:59:56 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 24 Aug 2017 08:59:56 +0200 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> Message-ID: Hi David, our mails crossed :) > PS3: We should replace these annoying exit(0) calls with code that returns >> "JNI_SILENT_EXIT", so the C++ code has a chance to finish. >> > > We should certainly look at getting rid of the exit() calls. Or at least consistently route them through the exit hook if one exists. It would also be nice to indicate to the caller in whether we were able to clean up successfully (or did not do anything important yet) or whether the process is tainted. For example, to allow him to retry CreateJavaVM with different options. > Ideally no VM related flag would be of the "report and terminate" type, as > the termination aspect really belongs in the launcher. I can imagine a > launcher using Dcmds to retrieve "help" strings from subsystems rather than > having actual "help" options processed by the VM (somewhat similar t how > -version is handled: create VM, extract version info, detach VM. exit > launcher). That said it would be somewhat awkward, for example, if the > launcher processed -Xlog:help, but the VM processed all other -Xlog > arguments. > > This would be a pretty cool feature. It would be nice though if help could be retrieved without creating the VM first. 
..Thomas > Cheers, > David > > > >> Best Regards >> >> Adam Farley >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with number >> 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> >> From david.holmes at oracle.com Thu Aug 24 07:54:03 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Aug 2017 17:54:03 +1000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> Message-ID: On 24/08/2017 4:59 PM, Thomas St?fe wrote: > Hi David, > > our mails crossed :) And unfortunately we each added in a different hotspot mailing list. :) > > > > PS3: We should replace these annoying exit(0) calls with code > that returns > "JNI_SILENT_EXIT", so the C++ code has a chance to finish. > > > We should certainly look at getting rid of the exit() calls. > > > Or at least consistently route them through the exit hook if one exists. > It would also be nice to indicate to the caller in whether we were able > to clean up successfully (or did not do anything important yet) or > whether the process is tainted. For example, to allow him to retry > CreateJavaVM with different options. I think these simple cases all occur during argument processing before too much of the VM has been initialized. Even so it would probably take some effort to try and then allow a second call to JNI_CreateJavaVM to proceed due to use of static initializers and onLoad hooks. Depending on the exact context it may be better if the process loading the VM filters out these problematic flags in the first place. They really only make sense for CLI invocations. Cheers, David ----- > Ideally no VM related flag would be of the "report and terminate" > type, as the termination aspect really belongs in the launcher. 
I > can imagine a launcher using Dcmds to retrieve "help" strings from > subsystems rather than having actual "help" options processed by the > VM (somewhat similar t how -version is handled: create VM, extract > version info, detach VM. exit launcher). That said it would be > somewhat awkward, for example, if the launcher processed -Xlog:help, > but the VM processed all other -Xlog arguments. > > > This would be a pretty cool feature. It would be nice though if help > could be retrieved without creating the VM first. > > ..Thomas > > Cheers, > David > > > > Best Regards > > Adam Farley > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales > with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, > Hampshire PO6 3AU > > From thomas.stuefe at gmail.com Thu Aug 24 08:50:55 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 24 Aug 2017 10:50:55 +0200 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com> References: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com> Message-ID: Hi guys, thanks for the reviews! @Ioi: thanks for sponsoring! New Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.02/webrev/ Delta to last: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01-to-02/webrev/ Nothing exciting changed. I addressed all of Goetz and Richards concerns and added a Summary line to the change. Change built and tested (gtests only) on Windows x86, x64. Kind Regards, Thomas On Wed, Aug 23, 2017 at 4:26 PM, Ioi Lam wrote: > Hi Thomas, thanks for addressing my concerns. I can sponsor the change > after you get OK from the other reviewers. 
> > Thanks > > - Ioi > > On 8/22/17 5:48 AM, Thomas St?fe wrote: > > Hi all, > > please see new Webrev: > > http://cr.openjdk.java.net/~stuefe/webrevs/8186349- > centralize-dbghelp-handling/webrev.01/webrev/ > > I worked in proposed changes by Ioi and Richard. Sorry, because on file > name changed, I did not do an incremental webrev. Here are the changes: > > - Renamed DbgHelpLoader to WindowsDbgHelp and renamed the files too. > - I moved the critical section code to the implementation of > WindowsDbgHelp. I discarded the general "CritSectLocker" object in favour > of an object specialized for this case, see the EntryGuard class. > - I readded the SymSetOptions call I accidentally discarded. > - In decoder_windows.cpp, I set the value for _can_decode_in_vm to true. I > plan to remove this flag in one of the next changes (see also JDK-8144855) > > > As Richard is no full reviewer, I'll need a second Reviewer and a sponsor. > > Thanks! > > Kind Regards, Thomas > > > On Tue, Aug 22, 2017 at 2:33 PM, Thomas St?fe > wrote: > >> Hi Richard, >> >> thank you for the review! Please find my remarks inline. >> >> On Tue, Aug 22, 2017 at 9:44 AM, Reingruber, Richard < >> richard.reingruber at sap.com> wrote: >> >>> Hi Thomas, >>> >>> thanks for the refactoring work! >>> >>> I had a look at your changes, but please note that I'm not a reviewer. >>> >>> ### dbghelp_loader.cpp: >>> >>> Should globalDefinitions.hpp be included? >>> >> >> Currently I do not need it, so I'd rather not. >> >> >>> >>> Little inconsistency: opening curly braces of method bodies should be on >>> the same line as the end of the parameter list, I guess. >>> >>> 294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : "")); >>> >> >> Are you sure? Sorry, I cannot spot an error here. >> >> >>> >>> Format string is incomplete. 
>>> >>> 196 BOOL DbgHelpLoader::stackWalk64(DWORD MachineType, >>> 197 HANDLE hProcess, >>> 198 HANDLE hThread, >>> 199 LPSTACKFRAME64 StackFrame, >>> 200 PVOID ContextRecord) >>> 201 { >>> 202 CritSectLocker lck(&g_cs); >>> 203 if (initialize_if_needed()) { >>> 204 if (g_pfn_StackWalk64 != NULL) { >>> 205 return g_pfn_StackWalk64(MachineType, hProcess, hThread, >>> StackFrame, >>> 206 ContextRecord, >>> 207 NULL, // ReadMemoryRoutine >>> 208 g_pfn_SymFunctionTableAccess64, >>> // FunctionTableAccessRoutine, >>> 209 g_pfn_SymGetModuleBase64, // >>> GetModuleBaseRoutine >>> 210 NULL // TranslateAddressRoutine >>> 211 ); >>> 212 } >>> 213 } >>> 214 return FALSE; >>> 215 } >>> >>> Lines 208, 209: is it ok to pass NULL? >>> >>> >> Good question. Documentation says parameters are required. I tested >> calling this with NULL for both functions and stack walking worked just >> fine. I leave it as it is, because I think at worst we risk StackWalk64 >> failing, and at best we get a callstack nevertheless. >> >> >>> ### windows_decoder.cpp >>> >>> The following line was deleted without replacement. Are the options not >>> needed? >>> >>> SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS | >>> SYMOPT_EXACT_SYMBOLS); >>> >>> >> Good catch! Will fix. >> >> >>> >>> Cheers, Richard. >>> >>> >> Thanks! >> >> Thomas >> >> >>> -----Original Message----- >>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bo >>> unces at openjdk.java.net] On Behalf Of Thomas St?fe >>> Sent: Freitag, 18. 
August 2017 09:24 >>> To: hotspot-runtime-dev at openjdk.java.net >>> Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code >>> >>> Dear all, >>> >>> may I please have a review for this change: >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8186349 >>> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ >>> 8186349-centralize-dbghelp-handling/webrev.00/webrev/ >>> >>> This is a part of an ongoing work I do to make error reporting >>> (especially >>> callstacks) on Windows more reliable. >>> >>> At first I did a rather large patch, see: https://bugs.openjdk.java.net/ >>> browse/JDK-8185712 and >>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2 >>> 017-August/024286.html >>> . But after discussing this patch with Ioi, I saw that this patch is >>> better >>> split up into multiple parts for easier reviewing. >>> >>> So this is the first split up patch. >>> >>> -- >>> >>> This patch here centralizes handling of the dbghelp.dll (loading the >>> library, resolving function pointers and synchronizing access). >>> >>> Which solves the problem that accesses to functions exported from the >>> dbghelp.dll need to be synchronized, as stated by the MSDN. We somehow >>> never really cared. I guess it never caused visible trouble, because most >>> of the time (not always) the functions are accessed from >>> VMError::report(), >>> and chances of parallel access from other non-error-reporting threads are >>> slim. Even if it were to crash, secondary error handling would step in >>> and >>> write an "Error occurred during error reporting" or "Another thread had >>> an >>> error too" message and we would probably just shrug it off. >>> >>> But as this whole effort is about increasing the chance of useful >>> callstacks in hs-err files, I'd like to fix this. >>> >>> In addition to the fix, I think this is also a nice cleanup and removes >>> duplicate code. 
>>> >>> Notes: >>> >>> 1) Robustness: We may or may not find a dbghelp.dll on the target system. >>> If we find it, it may be old or new (it is not tightly coupled with the >>> OS, >>> may be part of other installation packages, may exist multiple times >>> etc). >>> We should handle older versions of the dbghelp dll gracefully and hide >>> all >>> that complexity from the caller. >>> >>> 2) The new DbgHelpLoader class does not export any state indicating >>> whether >>> or not it successfully loaded, and if it loaded which functions are >>> actually available. That was a deliberate decision, there is no need for >>> the caller to know this. Caller should invoke the DbgHelpLoader functions >>> as if they were the equivalent OS functions and handle return errors. >>> DbgHelpLoader should never crash or assert; missing functions should >>> behave >>> like failing functions. >>> >>> 3) However, I added a one liner to the hs-err file indicating the state >>> of >>> the dbghelp dll - version info, what functions were missing etc. This may >>> help understanding weird or missing callstacks. >>> >>> 4) I removed the implementation for shutdown (WindowsDecoder::shutdown). >>> I >>> think there is no valid reason to ever shutdown the decoder. For one, we >>> may crash right at the end, and still it would be nice to have >>> callstacks. >>> And then, why spend cycles shutting down the decoder when we could just >>> let >>> it end with the process? >>> >>> 5) This code gets used during error reporting. So no VM infrastructure >>> must >>> be used to avoid circular crashes and VM initialization dependencies. So, >>> to synchronize, this code uses raw windows CriticalSection objects. >>> >>> -- >>> >>> Next step will be revamping handling of the Symbol APIs. This will >>> involve >>> removing the WindowsDecoder class, which introduces other errors and >>> really >>> makes no sense if the underlying dbghelp layer does its own >>> synchronization. 
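The wrapper design described in the notes above (serialize every call, hide whether the library or a given function is available, let a missing function behave like a failing one) can be sketched in portable C++. This is a minimal illustration under stated assumptions, not the code from the webrev: std::mutex stands in for the raw Windows CriticalSection, a plain function pointer handed to the constructor stands in for an entry point resolved from dbghelp.dll, and the names OptionalLibrary, stack_walk_fn and fake_stack_walk are hypothetical.

```cpp
#include <mutex>

// Signature of the stand-in library function.
typedef bool (*stack_walk_fn)(int frame_count);

// Stand-in for a function exported by the optional library.
bool fake_stack_walk(int frame_count) { return frame_count > 0; }

// Hypothetical wrapper. Callers never learn whether the library loaded;
// they just call stack_walk() and handle a false return value.
class OptionalLibrary {
 public:
  // 'resolved' simulates the result of symbol resolution; pass nullptr to
  // simulate a missing library or a missing export.
  explicit OptionalLibrary(stack_walk_fn resolved)
      : _resolved(resolved), _pfn(nullptr), _initialized(false) {}

  bool stack_walk(int frame_count) {
    std::lock_guard<std::mutex> guard(_lock);  // serialize all access
    if (!_initialized) {                       // lazy one-time initialization
      _pfn = _resolved;
      _initialized = true;
    }
    if (_pfn != nullptr) {
      return _pfn(frame_count);
    }
    return false;  // a missing function behaves like a failing function
  }

 private:
  std::mutex _lock;  // stand-in for a CRITICAL_SECTION
  stack_walk_fn _resolved;
  stack_walk_fn _pfn;
  bool _initialized;
};
```

A real implementation for error reporting would avoid anything that depends on VM or runtime initialization, which is why the thread talks about raw CriticalSection objects rather than a library mutex.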
>>> >>> >>> Thanks for reviewing! >>> >>> Kind Regards, Thomas >>> >> >> > >
From goetz.lindenmaier at sap.com Thu Aug 24 09:14:16 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 24 Aug 2017 09:14:16 +0000 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: References: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com> Message-ID: <4cf94c01dd734b50955292c68e509459@sap.com> Looks perfect now, Reviewed. Best, Goetz.
From thomas.stuefe at gmail.com Thu Aug 24 09:15:04 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 24 Aug 2017 11:15:04 +0200 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code In-Reply-To: <4cf94c01dd734b50955292c68e509459@sap.com> References: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com> <4cf94c01dd734b50955292c68e509459@sap.com> Message-ID: Thanks Goetz! @Ioi: We are good now for reviews. Thanks for sponsoring! ..Thomas On Thu, Aug 24, 2017 at 11:14 AM, Lindenmaier, Goetz < goetz.lindenmaier at sap.com> wrote: > Looks perfect now, Reviewed. > > Best, > Goetz.
From Alan.Bateman at oracle.com Thu Aug 24 11:55:02 2017 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 24 Aug 2017 12:55:02 +0100 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> Message-ID: <15ed8720-4d13-ae95-dfbe-dd0e3d5acfd6@oracle.com> On 24/08/2017 07:33, David Holmes wrote: > Hi Adam, > > cc'ing hotspot runtime dev as runtime own JNI and the invocation API - > and some of the problematic code resides in the VM. Yeah, the hotspot mailing list would be a better place to discuss this as there are several issues here and several places where HotSpot aborts the process when initialization fails. It's a long standing issue (going back 15+ years) that I think is partly because it's not easy to release all resources and cleanup before CreateJavaVM returns with an error.
> > This specific case seems like a bug to me as the logic is assuming it > is only ever called by a launcher which it is okay to terminate. > Though to be honest the very existence of the "help" option seems to > me somewhat misguided in a hosted-VM environment. That said, I see > unified logging in 9 also added a terminating "help" option. The agent "help" option case is tricky and would likely need an update to the JVM TI spec and the Agent_OnLoad return value. > > Options processed by the VM will be recognized, while options > processed by the Java launcher will not be. "-version", "-X", "-help" > and numerous others are launcher options. Pure VM options are -XX > options, but the VM also processes some -X flags and, as a result of > jigsaw, now also processes a bunch of module-related flags that are > simple --foo options. Right, because these options need to be passed to CreateJavaVM as they are used when initializing the VM. Using system properties would just repeat the issues of the past (e.g. java.class.path) and require documenting a slew of system properties (which is complicated with repeating options). -Alan From martin.doerr at sap.com Thu Aug 24 12:24:30 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 24 Aug 2017 12:24:30 +0000 Subject: RFR(XS): 8186734: AIX build broken after 8186166: Generalize Atomic::cmpxchg with templates Message-ID: <37beebcbe54d4500a747b41f31bd11e4@sap.com> Hi, the AIX build is broken because of an undefined symbol: STATIC_CAST Fix: http://cr.openjdk.java.net/~mdoerr/8186734_fix_aix_build/webrev.00/ Please review.
Best regards, Martin From goetz.lindenmaier at sap.com Thu Aug 24 12:30:19 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 24 Aug 2017 12:30:19 +0000 Subject: RFR(XS): 8186734: AIX build broken after 8186166: Generalize Atomic::cmpxchg with templates In-Reply-To: <37beebcbe54d4500a747b41f31bd11e4@sap.com> References: <37beebcbe54d4500a747b41f31bd11e4@sap.com> Message-ID: <6013d5893bfd40c0befed6d7b3626789@sap.com> Hi Martin, thanks for fixing this. Reviewed. Best regards, Goetz. > -----Original Message----- > From: Doerr, Martin > Sent: Thursday, 24 August 2017 14:25 > To: hotspot-runtime-dev at openjdk.java.net; Lindenmaier, Goetz; Volker Simonis (volker.simonis at gmail.com) > Subject: RFR(XS): 8186734: AIX build broken after 8186166: Generalize Atomic::cmpxchg with templates > > Hi, > > the AIX build is broken because of undefined symbol: STATIC_CAST > > Fix: > http://cr.openjdk.java.net/~mdoerr/8186734_fix_aix_build/webrev.00/ > > Please review. > > Best regards, > Martin > > From goetz.lindenmaier at sap.com Thu Aug 24 14:19:55 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 24 Aug 2017 14:19:55 +0000 Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing. Message-ID: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> Hi, I need a second review and a sponsor, please: http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04 To update my description of the change to the status after Thomas' review: dll_build_name builds the proper path to a library given a list of paths separated by path_separator and a library name. It adds in the platform-specific endings etc. It is documented to return whether the file exists, but only does so if a path_separator exists in the path. In particular, if the path is empty, it just returns "true" without checking. dll_build_name is usually used before calling dll_load.
If dll_load does not get a full path it searches in well known unix/windows locations. This is intended in the two cases where dll_build_name is called with an empty path. I renamed dll_build_name to dll_locate_lib and changed its behavior to always return a full path to the lib, inserting the current working directory if no path is given. For the use case where "" was actually passed to the function, I added a new function (reusing the old function name) dll_build_name that just adds the system-dependent prefix and suffix to the name. I merged all unix implementations to the posix os branch. Best regards, Goetz. > -----Original Message----- > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > Sent: Tuesday, 22 August 2017 17:30 > To: Lindenmaier, Goetz > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing. > > Looks good. > > ..Thomas > > On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote: > > I mistyped the path to webrev, this should work: > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04 > > Sorry, > Goetz > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Tuesday, 22 August 2017 15:48 > > To: 'Thomas Stüfe' > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing. > > > > Hi, > > > > could I please get a second review? > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04 > > > > I had to update the webrev because of a problem on windows. > > @Thomas I had edited os.hpp, but not saved :( > > > > Best regards, > > Goetz. > > > > PS: Didn't double-check the webrev as cr server is slow. > > > > > -----Original Message----- > > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > > > Sent: Thursday, 17.
August 2017 19:54 > > > To: Lindenmaier, Goetz > > > Cc: hotspot-runtime-dev at openjdk.java.net > > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing. > > > > > > Hi Goetz, > > > > > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote: > > > > > > Hi Thomas, > > > > > > I adapted the comments in os.hpp. > > > > > > If I move the call to dll_build_name out of dll_locate_lib I have to do a lot of coding in all the places where it is called. That does not seem useful to me. > > > > > > Fixed the type to size_t. > > > > > > One could merge posix/windows if putting the check for ':' into a WINDOWS_ONLY() I guess. The check for \ could be done in posix as well, if using file_separator(). > > > > > > * Not your change, but: why does the code in os::dll_locate_lib() even differentiate between a PATH containing no os::path_separator() and a path containing os::path_separator()? > > > > > > I assume this was done to avoid all the allocations and copying of the path. > > > > > > Also adapted the comment in jvmtiExport.cpp. > > > > > > New webrev: > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/ > > > incremental diff: > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch > > > (fixed indentation on windows) > > > > > > Best regards, > > > Goetz. > > > > > > Comments in os.hpp seem unchanged? > > > > > > But looks fine otherwise. I do not need another webrev.
> > > > > > Thanks, Thomas > > > > > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] > > > Sent: Thursday, August 17, 2017 3:48 PM > > > To: Lindenmaier, Goetz > > > Cc: hotspot-runtime-dev at openjdk.java.net > > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing. > > > > > > Hi Goetz, > > > > > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote: > > > > > > Hi Thomas, > > > > > > I reworked the whole thing. > > > > > > First, there is dll_build_name. It just does <name> -> lib<name>.so. > > > > > > Second, I renamed the legacy dll_build_name to dll_locate_lib. > > > > > > I merged all the unix variants to one in os_posix. > > > > > > I removed the buffer overflow check at the top. It's too restrictive because the path argument can contain several paths. I added the overflow checks into the single cases. > > > > > > Also, I first assemble the pure name using the new, simple dll_build_name. This is for reuse and readability. > > > > > > In case of an empty directory, I use get_current_directory to complete the path as indicated by the original documentation where it was called with "". Dll_locate_lib now always returns a name with a full path if the file exists. > > > > > > Also, on windows, I think I fixed a bug by reversing the order of checks. A path list ending in ':' or '\' would not have been recognized. > > > > > > On Bsd, I removed JNI_LIB_* because that already is defined in jvm_bsd.h > > > > > > New webrev: > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/ > > > > > > Best regards, > > > Goetz. > > > > > > I like this better than before.
> > > Remarks:
> > >
> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
> > >
> > > +  // Builds the platform-specific name of a library.
> > > +  // Returns false on __buffer overflow__.
> > >
> > > Hopefully not! :D
> > >
> > > How about: "Returns false on truncation" instead.
> > >
> > > +  // Builds a platform-specific full library path given an ld path and lib name.
> > > +  // Returns true if the buffer contains a full path to an existing file, false
> > > +  // otherwise. If pathname is empty, checks the current directory.
> > > +  static bool dll_locate_lib(char* buffer, size_t size,
> > >                               const char* pathname, const char* fname);
> > >
> > > Might be worth mentioning that "fname" is the unadorned library
> > > name, e.g. "verify" for libverify.so or verify.dll.
> > >
> > > Would the following alternative be valid:
> > >
> > > One could make dll_locate_lib take the real file name, and let the caller
> > > use dll_build_name() to build the library name first before handing it to
> > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic
> > > "find_file_in_path" because it would work for any kind of file.
> > >
> > > As an added bonus, there would be no need to create a temporary
> > > array in dll_build_name/dll_locate_lib, and no need to call free(), so no
> > > cleanup-related control flow changes in these functions.
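[Editor's sketch] The "find_file_in_path" alternative proposed above could look roughly like the following. This is only an illustration under stated assumptions, not the code from the webrev: the function signature, the `exists_fn` callback (standing in for a file-existence check), and the choice to skip empty path elements are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical existence check supplied by the caller (illustration only).
typedef bool (*exists_fn)(const char* path);

// Hypothetical generic lookup: tries "<dir><dir_sep><filename>" for every
// directory in pathlist (elements separated by list_sep) and stops at the
// first candidate for which exists() returns true. Candidates that do not
// fit into buffer are skipped rather than truncated.
static bool find_file_in_path(char* buffer, std::size_t buflen,
                              const char* pathlist, char list_sep, char dir_sep,
                              const char* filename, exists_fn exists) {
  const char* start = pathlist;
  while (true) {
    const char* stop = std::strchr(start, list_sep);
    const std::size_t dirlen =
        (stop != nullptr) ? static_cast<std::size_t>(stop - start) : std::strlen(start);
    if (dirlen > 0) {  // assumption: empty path elements are ignored
      const int n = std::snprintf(buffer, buflen, "%.*s%c%s",
                                  static_cast<int>(dirlen), start, dir_sep, filename);
      if (n >= 0 && static_cast<std::size_t>(n) < buflen && exists(buffer)) {
        return true;
      }
    }
    if (stop == nullptr) break;
    start = stop + 1;
  }
  if (buflen > 0) buffer[0] = '\0';  // leave an empty string when nothing was found
  return false;
}
```

With this shape the caller builds the decorated library name first and passes it in as `filename`, so the lookup itself needs no temporary allocation.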
> > >
> > > =====
> > >
> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
> > >
> > > +  int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
> > >
> > > int -> size_t (does that even compile without warning?)
> > >
> > > +    // Check current working directory.
> > > +    const char* p = get_current_directory(buffer, buflen);
> > > +    if (p != NULL &&
> > > +        strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
> > > +      strcat(buffer, "\\");
> > > +      strcat(buffer, fullfname);
> > > +      retval = file_exists(buffer);
> > >
> > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical, but
> > > it will make scanners (e.g. Coverity) happy. One could then avoid the length
> > > calculation and rely on jio_snprintf truncation:
> > >
> > > const char* p = get_current_directory(buffer, buflen);
> > > if (p != NULL) {
> > >   const size_t end = strlen(p);
> > >   if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
> > >     retval = file_exists(buffer);
> > >   }
> > > }
> > >
> > > --
> > >
> > > Not your change, but: why does the code in os::dll_locate_lib() even
> > > differentiate between a PATH containing no os::path_separator() and a path
> > > containing os::path_separator()?
> > >
> > > Would the former not be just a PATH with only one directory and hence
> > > need no special treatment?
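[Editor's sketch] The "format into the remaining space and treat truncation as failure" idea from the nit above can be tried outside the VM with the standard `snprintf`. Note the difference in convention: the snippet in the mail compares the `jio_snprintf` result against -1, whereas standard `snprintf` returns the length it would have written, so truncation is detected by comparing that length with the space left. The helper name `append_path_component` is hypothetical and not part of HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: appends "<sep><name>" to the NUL-terminated string
// already stored in buffer. Returns false, and leaves the original contents
// intact, if the result would not fit into buflen bytes.
static bool append_path_component(char* buffer, std::size_t buflen,
                                  char sep, const char* name) {
  const std::size_t end = std::strlen(buffer);
  if (end >= buflen) {
    return false;  // no room, not even for the terminator
  }
  const std::size_t room = buflen - end;
  const int n = std::snprintf(buffer + end, room, "%c%s", sep, name);
  if (n < 0 || static_cast<std::size_t>(n) >= room) {
    buffer[end] = '\0';  // undo the partial write
    return false;
  }
  return true;
}
```

A caller would fill `buffer` with the current directory, call the helper with the decorated library name, and only then check whether the file exists.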
> > > > > > > > > > > > ===== > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > > > > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> > > > > > > > > > > > > Could os::dll_locate_lib be consolidated between windows and > unix? > > > Seems to be the implementation is almost identical. > > > > > > > > > > > > ==== > > > > > > > > > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html > > > > > > > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> > > > > > > > > > > > > + // not found - try library path > > > > > > > > > > > > Proposal: "not found - try OS default library path" > > > > > > > > > > > > > > > > > > Find some comments inline: > > > > > > > > > > Especially if the path is empty, it just returns 'true'. > > > > Dll_build_name is usually used before calling dll_load. > If > > > dll_load does not get a full path it searches > > > > in well known unix/windows locations. This is intended > in > > > the two cases where dll_build_name > > > > is called with an empty path. > > > > > > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > > > > > > > > before, we would call os::dll_build_name() with an empty > > > string for the path > > > > which, for relative paths, would result in feeding that path > > > unexpanded to > > > > dlopen(), which would use whatever the OS does in those > > > cases (LIBPATH, > > > > LD_LIBRARY_PATH, PATH on windows). Note that this does > > > not necessarily > > > > include searching the current directory. > > > Right. With changed dll_biuld_name it's again exactly as > > > before. > > > > > > > With your change, we now use java.library.path, which is > not > > > necessarily the > > > > same? > > > You are right, I oversaw that java.library.path can be > > > overwritten. Initially, > > > it's set to the right thing. 
> > > > > > > (BTW, I think the old comments in thread.cpp and > > > jniExport.cpp were wrong:"// > > > > Try the local directory" - if "local" means "current", this is > not > > > what did > > > > happen). > > > Right, I tried to adapt them, did I miss one? > > > > > > > I added a second variant of dll_build_name without the > > > path argument that adds the path > > > > from system property java.lang.path and use that in > these > > > two cases. > > > > I changed the original function to actually check file > > > availability in all cases, > > > > and to check . if the path is empty. > > > > I think that may be a bit confusing. We would then have > three > > > options: > > > > > > > > - call os::dll_build_name with a real ";;.." PATH > and > > > get a file name > > > > resolved from that path > > > > - call os::dll_build_name with "" for the PATH and get OS > dll > > > resolution > > > No, in that case, as I called file_exists(), it would only work if > > > the dll is in the > > > current working directory. But I changed this now, anyways. > > > > > > > - call your new overloaded version of os::dll_build_name(), > > > which uses - > > > > Djava.library.path. > > > > > > > > Please review this change. I please need a sponsor. > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > > > dllBuildName/webrev.01/ > > > > > > > > > > > dllBuildName/webrev.01/> > > > > > > > > > > > Best regards, > > > > Goetz. 
> > >
> > > Kind Regards, Thomas
> > >
> > > Best Regards, Thomas
> > >

From ioi.lam at oracle.com Thu Aug 24 15:37:54 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 24 Aug 2017 08:37:54 -0700
Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
In-Reply-To:
References: <1df8ec2a-e1f0-52f8-7e26-04b0d770f9cd@oracle.com> <4cf94c01dd734b50955292c68e509459@sap.com>
Message-ID: <462941fe-a12f-d383-f54a-3caa7647dee7@oracle.com>

OK, I will run a bunch of windows 32/64 tests in our test environment to
make sure things are OK, and then I'll push.

Thanks
- Ioi

On 8/24/17 2:15 AM, Thomas Stüfe wrote:
> Thanks Goetz!
>
> @Ioi: We are good now for reviews. Thanks for sponsoring!
>
> ..Thomas
>
> On Thu, Aug 24, 2017 at 11:14 AM, Lindenmaier, Goetz wrote:
>
>     Looks perfect now, Reviewed.
>
>     Best,
>       Goetz.
>
>     > -----Original Message-----
>     > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>     > Sent: Thursday, 24 August 2017 10:51
>     > To: Ioi Lam; Reingruber, Richard; Lindenmaier, Goetz
>     > Cc: hotspot-runtime-dev at openjdk.java.net
>     > Subject: Re: RFR(s): 8186349: [windows] Centralize dbghelp handling code
>     >
>     > Hi guys,
>     >
>     > thanks for the reviews!
>     > @Ioi: thanks for sponsoring!
>     >
>     > New Webrev:
>     > http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.02/webrev/
>     >
>     > Delta to last:
>     > http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01-to-02/webrev/
>     >
>     > Nothing exciting changed. I addressed all of Goetz and Richards concerns and
>     > added a Summary line to the change.
>     >
>     > Change built and tested (gtests only) on Windows x86, x64.
>     >
>     > Kind Regards, Thomas
>     >
>     > On Wed, Aug 23, 2017 at 4:26 PM, Ioi Lam wrote:
>     >
>     >     Hi Thomas, thanks for addressing my concerns. I can sponsor the
>     >     change after you get OK from the other reviewers.
>     >
>     >     Thanks
>     >
>     >     - Ioi
>     >
>     >     On 8/22/17 5:48 AM, Thomas Stüfe wrote:
>     >
>     >         Hi all,
>     >
>     >         please see new Webrev:
>     >
>     >         http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.01/webrev/
>     >
>     >         I worked in proposed changes by Ioi and Richard. Sorry,
>     >         because on file name changed, I did not do an incremental
>     >         webrev. Here are the changes:
>     >
>     >         - Renamed DbgHelpLoader to WindowsDbgHelp and renamed the files too.
>     >         - I moved the critical section code to the implementation of
>     >           WindowsDbgHelp. I discarded the general "CritSectLocker" object
>     >           in favour of an object specialized for this case, see the
>     >           EntryGuard class.
>     >         - I readded the SymSetOptions call I accidentally discarded.
>     >         - In decoder_windows.cpp, I set the value for _can_decode_in_vm
>     >           to true. I plan to remove this flag in one of the next changes
>     >           (see also JDK-8144855)
>     >
>     >         As Richard is no full reviewer, I'll need a second Reviewer and
>     >         a sponsor.
>     >
>     >         Thanks!
>     >
>     >         Kind Regards, Thomas
>     >
>     >         On Tue, Aug 22, 2017 at 2:33 PM, Thomas Stüfe wrote:
>     >
>     >             Hi Richard,
>     >
>     >             thank you for the review! Please find my remarks inline.
>     >
>     >             On Tue, Aug 22, 2017 at 9:44 AM, Reingruber, Richard wrote:
>     >
>     >                 Hi Thomas,
>     >
>     >                 thanks for the refactoring work!
>     >
>     >                 I had a look at your changes, but please note
>     >                 that I'm not a reviewer.
>     >
>     >                 ### dbghelp_loader.cpp:
>     >
>     >                 Should globalDefinitions.hpp be included?
>     >
>     >             Currently I do not need it, so I'd rather not.
>     >
>     >                 Little inconsistency: opening curly braces of method bodies
>     >                 should be on the same line as the end of the parameter list,
>     >                 I guess.
>     >
>     >                 294: st->print("%s" #functionname, ((num_missing > 0) ? ", " : ""));
>     >
>     >             Are you sure? Sorry, I cannot spot an error here.
>     >
>     >                 Format string is incomplete.
>     >
>     >                 196    BOOL DbgHelpLoader::stackWalk64(DWORD MachineType,
>     >                 197                                    HANDLE hProcess,
>     >                 198                                    HANDLE hThread,
>     >                 199                                    LPSTACKFRAME64 StackFrame,
>     >                 200                                    PVOID ContextRecord)
>     >                 201    {
>     >                 202      CritSectLocker lck(&g_cs);
>     >                 203      if (initialize_if_needed()) {
>     >                 204        if (g_pfn_StackWalk64 != NULL) {
>     >                 205          return g_pfn_StackWalk64(MachineType, hProcess, hThread, StackFrame,
>     >                 206                                   ContextRecord,
>     >                 207                                   NULL, // ReadMemoryRoutine
>     >                 208                                   g_pfn_SymFunctionTableAccess64, // FunctionTableAccessRoutine,
>     >                 209                                   g_pfn_SymGetModuleBase64, // GetModuleBaseRoutine
>     >                 210                                   NULL // TranslateAddressRoutine
>     >                 211                                   );
>     >                 212        }
>     >                 213      }
>     >                 214      return FALSE;
>     >                 215    }
>     >
>     >                 Lines 208, 209: is it ok to pass NULL?
>     >
>     >             Good question. Documentation says parameters are required.
>     >             I tested calling this with NULL for both functions and stack
>     >             walking worked just fine. I leave it as it is, because I think
>     >             at worst we risk StackWalk64 failing, and at best we get a
>     >             callstack nevertheless.
>     >
>     >                 ### windows_decoder.cpp
>     >
>     >                 The following line was deleted without replacement.
>     >                 Are the options not needed?
>     >
>     >                 SymSetOptions(SYMOPT_UNDNAME | SYMOPT_DEFERRED_LOADS | SYMOPT_EXACT_SYMBOLS);
>     >
>     >             Good catch! Will fix.
>     >
>     >                 Cheers, Richard.
>     >
>     >             Thanks!
>     >
>     >             Thomas
>     >
>     >                 -----Original Message-----
>     >                 From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Thomas Stüfe
>     >                 Sent: Friday, 18 August 2017 09:24
>     >                 To: hotspot-runtime-dev at openjdk.java.net
>     >                 Subject: RFR(s): 8186349: [windows] Centralize dbghelp handling code
>     >
>     >                 Dear all,
>     >
>     >                 may I please have a review for this change:
>     >
>     >                 Issue: https://bugs.openjdk.java.net/browse/JDK-8186349
>     >
>     >                 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186349-centralize-dbghelp-handling/webrev.00/webrev/
>     >
>     >                 This is a part of an ongoing work I do to make error
>     >                 reporting (especially callstacks) on Windows more reliable.
>     >
>     >                 At first I did a rather large patch, see:
>     >                 https://bugs.openjdk.java.net/browse/JDK-8185712 and
>     >                 http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-August/024286.html .
>     >                 But after discussing this patch with Ioi, I saw that this
>     >                 patch is better split up into multiple parts for easier
>     >                 reviewing.
>     >
>     >                 So this is the first split up patch.
>     >
>     >                 --
>     >
>     >                 This patch here centralizes handling of the dbghelp.dll
>     >                 (loading the library, resolving function pointers and
>     >                 synchronizing access).
>     >
>     >                 Which solves the problem that accesses to functions exported
>     >                 from the dbghelp.dll need to be synchronized, as stated by
>     >                 the MSDN. We somehow never really cared. I guess it never
>     >                 caused visible trouble, because most of the time (not always)
>     >                 the functions are accessed from VMError::report(), and chances
>     >                 of parallel access from other non-error-reporting threads are
>     >                 slim. Even if it were to crash, secondary error handling would
>     >                 step in and write an "Error occurred during error reporting"
>     >                 or "Another thread had an error too" message and we would
>     >                 probably just shrug it off.
>     >
>     >                 But as this whole effort is about increasing the chance of
>     >                 useful callstacks in hs-err files, I'd like to fix this.
>     >
>     >                 In addition to the fix, I think this is also a nice cleanup
>     >                 and removes duplicate code.
>     >
>     >                 Notes:
>     >
>     >                 1) Robustness: We may or may not find a dbghelp.dll on the
>     >                 target system. If we find it, it may be old or new (it is not
>     >                 tightly coupled with the OS, may be part of other installation
>     >                 packages, may exist multiple times etc). We should handle
>     >                 older versions of the dbghelp dll gracefully and hide all
>     >                 that complexity from the caller.
>     >
>     >                 2) The new DbgHelpLoader class does not export any state
>     >                 indicating whether or not it successfully loaded, and if it
>     >                 loaded which functions are actually available. That was a
>     >                 deliberate decision, there is no need for the caller to know
>     >                 this. Caller should invoke the DbgHelpLoader functions as if
>     >                 they were the equivalent OS functions and handle return
>     >                 errors. DbgHelpLoader should never crash or assert; missing
>     >                 functions should behave like failing functions.
>     >
>     >                 3) However, I added a one liner to the hs-err file indicating
>     >                 the state of the dbghelp dll - version info, what functions
>     >                 were missing etc. This may help understanding weird or
>     >                 missing callstacks.
>     >
>     >                 4) I removed the implementation for shutdown
>     >                 (WindowsDecoder::shutdown). I think there is no valid reason
>     >                 to ever shutdown the decoder. For one, we may crash right at
>     >                 the end, and still it would be nice to have callstacks. And
>     >                 then, why spend cycles shutting down the decoder when we
>     >                 could just let it end with the process?
>     >
>     >                 5) This code gets used during error reporting. So no VM
>     >                 infrastructure must be used to avoid circular crashes and VM
>     >                 initialization dependencies. So, to synchronize, this code
>     >                 uses raw windows CriticalSection objects.
>     >
>     >                 --
>     >
>     >                 Next step will be revamping handling of the Symbol APIs. This
>     >                 will involve removing the WindowsDecoder class, which
>     >                 introduces other errors and really makes no sense if the
>     >                 underlying dbghelp layer does its own synchronization.
>     >
>     >                 Thanks for reviewing!
>     >
>     >                 Kind Regards, Thomas
>     >
>

From claes.redestad at oracle.com Thu Aug 24 15:38:10 2017
From: claes.redestad at oracle.com (Claes Redestad)
Date: Thu, 24 Aug 2017 17:38:10 +0200
Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled
In-Reply-To:
References:
Message-ID: <9fb7cb6b-a9cc-e426-cc12-f1341675cf03@oracle.com>

Thanks Ioi and David for reviewing, sadly there were some issues with
building the previous patch without the closed sources, which have been
corrected here:

http://cr.openjdk.java.net/~redestad/8179040/hotspot.03/

Thanks!

/Claes

On 2017-08-23 20:11, Ioi Lam wrote:
> Hi Claes,
>
> The changes look good.
> Reviewed.
>
> Thanks
>
> - Ioi
>
>
> On 8/23/17 5:32 AM, Claes Redestad wrote:
>> Hi,
>>
>> this patch was never pushed due to an unfortunate interaction around
>> cancelling of events. Markus Grönlund has helped resolve these, while
>> maintaining the speedup:
>>
>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/
>>
>> Thanks!
>>
>> /Claes
>>
>> On 04/25/2017 03:21 PM, Claes Redestad wrote:
>>> Hi,
>>>
>>> this patch removes calling Ticks::now when EventClassLoad isn't
>>> enabled, which has an effect on class loading performance:
>>>
>>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/
>>>
>>> When tracing isn't enabled trace/tracing.hpp has dummy
>>> implementations which are easily optimized away by a compiler, which
>>> I've verified happens on linux OpenJDK builds with tracing disabled.
>>>
>>> On builds with tracing enabled, the change means the call to get the
>>> time only happens if the event is enabled, which achieves the sought
>>> after startup optimization.
>>>
>>> Thanks!
>>>
>>> /Claes
>>
>

From ioi.lam at oracle.com Thu Aug 24 17:22:55 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 24 Aug 2017 10:22:55 -0700
Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken
In-Reply-To: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com>
References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com>
Message-ID: <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com>

Hi Jiangli,

The Nodes need to be deallocated in the ResourceHashtable destructor.

  ~ResourceHashtable() {
    if (ALLOC_TYPE == C_HEAP) {
      Node* const* bucket = _table;
      while (bucket < &_table[SIZE]) {
        Node* node = *bucket;
        while (node != NULL) {
          Node* cur = node;
          node = node->_next;
          delete cur;
        }
        ++bucket;
      }
    }
  }

The problem with ResourceHashtable is that by default ALLOC_TYPE =
ResourceObj::RESOURCE_AREA, but if your call path looks like this:

  ResourceHashtable<...> table;
  ...
  {
    ResourceMark rm;
    ...
    {
      table.put(....);
    }
  }

The Node in the table will end up being invalid.

In your case, the code path between the allocation of the
ResourceHashtable and the call to MetaspaceShared::archive_heap_object
covers a few files. There's currently no ResourceMark in between.
However, in the future, someone could potentially put in a ResourceMark
and cause erratic failures.

So, since you're fixing the hashtable code, I think it will be a good
idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing
that, it's a good idea to do the proper clean up by invoking the
~ResourceHashtable() destructor via the delete operator.

Thanks

- Ioi

On 8/23/17 10:27 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable.
>
> Thanks,
> Jiangli
>
>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote:
>>
>> Hi Jiangli,
>>
>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks
>>
>>   {
>>     NoSafepointVerifier nsv;
>>
>>     // Cache for recording where the archived objects are copied to
>>     MetaspaceShared::create_archive_object_cache();
>>
>>     tty->print_cr("Dumping String objects to closed archive heap region ...");
>>     NOT_PRODUCT(StringTable::verify());
>>     // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details.
>>     _string_regions = new GrowableArray(2);
>>     StringTable::write_to_archive(_string_regions);
>>
>>     tty->print_cr("Dumping objects to open archive heap region ...");
>>     _open_archive_heap_regions = new GrowableArray(2);
>>     MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions);
>>
>> +   MetaspaceShared::create_archive_object_cache();
>>
>>   }
>>
>> + static void delete_archive_object_cache() {
>> +   CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;);
>> + }
>>
>> Thanks
>> - Ioi
>>
>> On 8/23/17 4:24 PM, Jiangli Zhou wrote:
>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921
>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/
>>>
>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout.
>>>
>>> Tested with tier4-comp tests.
>>>
>>> Thanks,
>>> Jiangli

From coleen.phillimore at oracle.com Thu Aug 24 18:28:50 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 24 Aug 2017 14:28:50 -0400
Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp
Message-ID: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com>

Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since
it's accessed outside the SystemDictionary_lock

Ran parallel class loading tests that we have as well as tier1 tests.
See bug for details.

open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8164207

Thanks,
Coleen

From zgu at redhat.com Thu Aug 24 18:29:47 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 24 Aug 2017 14:29:47 -0400
Subject: RFR(XXS) 8186748: NMT: memTracker::record_virtual_memory_reserve_and_commit() does not tag the memory as committed
Message-ID:

Please review this one line change to fix the missing tag of a committed
memory region.

The bug causes NMT to report "Shared class space" as a reserved, but not
committed, region.

bug: https://bugs.openjdk.java.net/browse/JDK-8186748
Webrev: http://cr.openjdk.java.net/~zgu/8186748/webrev.00/

Before:

-        Shared class space (reserved=16904KB, committed=0KB)
                            (mmap: reserved=16904KB, committed=0KB)

After:

-        Shared class space (reserved=16832KB, committed=16832KB)
                            (mmap: reserved=16832KB, committed=16832KB)

Thanks,

-Zhengyu

From shade at redhat.com Thu Aug 24 18:32:53 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 24 Aug 2017 20:32:53 +0200
Subject: RFR(XXS) 8186748: NMT: memTracker::record_virtual_memory_reserve_and_commit() does not tag the memory as committed
In-Reply-To:
References:
Message-ID: <7830841d-fb0e-32c8-406b-f80539e60e22@redhat.com>

On 08/24/2017 08:29 PM, Zhengyu Gu wrote:
> bug: https://bugs.openjdk.java.net/browse/JDK-8186748
> Webrev: http://cr.openjdk.java.net/~zgu/8186748/webrev.00/

Oh. Looks good to me.

Thanks,
-Aleksey

From coleen.phillimore at oracle.com Thu Aug 24 18:50:08 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 24 Aug 2017 14:50:08 -0400
Subject: RFR(XXS) 8186748: NMT: memTracker::record_virtual_memory_reserve_and_commit() does not tag the memory as committed
In-Reply-To:
References:
Message-ID:

This looks good. I assume that you ran the NMT tests on it. I'll
sponsor it.
Coleen

On 8/24/17 2:29 PM, Zhengyu Gu wrote:
> Please review this one line change to fix the missing tag of a committed
> memory region.
>
> The bug causes NMT to report "Shared class space" as a reserved, but
> not committed, region.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8186748
> Webrev: http://cr.openjdk.java.net/~zgu/8186748/webrev.00/
>
> Before:
>
> -        Shared class space (reserved=16904KB, committed=0KB)
>                             (mmap: reserved=16904KB, committed=0KB)
>
> After:
>
> -        Shared class space (reserved=16832KB, committed=16832KB)
>                             (mmap: reserved=16832KB, committed=16832KB)
>
> Thanks,
>
> -Zhengyu

From zgu at redhat.com Thu Aug 24 19:04:57 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 24 Aug 2017 15:04:57 -0400
Subject: RFR(XXS) 8186748: NMT: memTracker::record_virtual_memory_reserve_and_commit() does not tag the memory as committed
In-Reply-To:
References:
Message-ID:

On 08/24/2017 02:50 PM, coleen.phillimore at oracle.com wrote:
>
> This looks good. I assume that you ran the NMT tests on it. I'll
> sponsor it.
> Coleen

Yes, I ran jtreg NMT tests.

Thanks so much for the review and sponsorship, again!

-Zhengyu

>
> On 8/24/17 2:29 PM, Zhengyu Gu wrote:
>> Please review this one line change to fix the missing tag of a committed
>> memory region.
>>
>> The bug causes NMT to report "Shared class space" as a reserved, but
>> not committed, region.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8186748
>> Webrev: http://cr.openjdk.java.net/~zgu/8186748/webrev.00/
>>
>> Before:
>>
>> -        Shared class space (reserved=16904KB, committed=0KB)
>>                             (mmap: reserved=16904KB, committed=0KB)
>>
>> After:
>>
>> -        Shared class space (reserved=16832KB, committed=16832KB)
>>                             (mmap: reserved=16832KB, committed=16832KB)
>>
>> Thanks,
>>
>> -Zhengyu
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8186748.patch
Type: text/x-patch
Size: 1246 bytes
Desc: not available
URL:

From coleen.phillimore at oracle.com Thu Aug 24 19:35:37 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 24 Aug 2017 15:35:37 -0400
Subject: RFR(XXS) 8186748: NMT: memTracker::record_virtual_memory_reserve_and_commit() does not tag the memory as committed
In-Reply-To:
References:
Message-ID:

You are welcome. I don't think this change needs a 24 hour wait.
Coleen

On 8/24/17 3:04 PM, Zhengyu Gu wrote:
> On 08/24/2017 02:50 PM, coleen.phillimore at oracle.com wrote:
>>
>> This looks good. I assume that you ran the NMT tests on it. I'll
>> sponsor it.
>> Coleen
>
> Yes, I ran jtreg NMT tests.
>
> Thanks so much for the review and sponsorship, again!
>
> -Zhengyu
>
>>
>> On 8/24/17 2:29 PM, Zhengyu Gu wrote:
>>> Please review this one line change to fix the missing tag of a committed
>>> memory region.
>>>
>>> The bug causes NMT to report "Shared class space" as a reserved, but
>>> not committed, region.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186748
>>> Webrev: http://cr.openjdk.java.net/~zgu/8186748/webrev.00/
>>>
>>> Before:
>>>
>>> -        Shared class space (reserved=16904KB, committed=0KB)
>>>                             (mmap: reserved=16904KB, committed=0KB)
>>>
>>> After:
>>>
>>> -        Shared class space (reserved=16832KB, committed=16832KB)
>>>                             (mmap: reserved=16832KB, committed=16832KB)
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>

From jiangli.zhou at oracle.com Thu Aug 24 19:53:55 2017
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Thu, 24 Aug 2017 12:53:55 -0700
Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken
In-Reply-To: <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com>
References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com>
Message-ID:

Hi Ioi,

ResourceObj only allows delete for C_HEAP objects, we need to allocate
_archive_object_cache as C_HEAP object also.
Otherwise, we would hit the following assert. void ResourceObj::operator delete(void* p) { assert(((ResourceObj *)p)->allocated_on_C_heap(), "delete only allowed for C_HEAP objects?); Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ Thanks, Jiangli > On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: > > Hi Jiangli, > > The Nodes need to be deallocated in the ResourceHashtable destructor. > > ~ResourceHashtable() { > if (ALLOC_TYPE == C_HEAP) { > Node* const* bucket = _table; > while (bucket < &_table[SIZE]) { > Node* node = *bucket; > while (node != NULL) { > Node* cur = node; > node = node->_next; > delete cur; > } > ++bucket; > } > } > } > > > The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: > > > ResourceHashtable<...> table; > ... > { > ResourceMark rm; > ... > { > table.put(....); > } > } > > The Node in the table will end up being invalid. > > In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. > > So, since your're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator. > > > Thanks > > - Ioi > > > > On 8/23/17 10:27 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> The table was not changed to be allocated as ResourceObj::C_HEAP. I see ?ALLOC_TYPE? only applies to the Nodes in the ResourceHashtable. 
>> >> Thanks, >> Jiangli >> >>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>> >>> Hi Jiangli, >>> >>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>> >>> { >>> NoSafepointVerifier nsv; >>> >>> // Cache for recording where the archived objects are copied to >>> MetaspaceShared::create_archive_object_cache(); >>> >>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>> NOT_PRODUCT(StringTable::verify()); >>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>> _string_regions = new GrowableArray(2); >>> StringTable::write_to_archive(_string_regions); >>> >>> tty->print_cr("Dumping objects to open archive heap region ..."); >>> _open_archive_heap_regions = new GrowableArray(2); >>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>> >>> + MetaspaceShared::create_archive_object_cache(); >>> >>> } >>> >>> + static void delete_archive_object_cache() { >>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>> + } >>> >>> Thanks >>> - Ioi >>> >>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>> >>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >>>> >>>> Tested with tier4-comp tests. 
>>>> >>>> Thanks, >>>> Jiangli > From zgu at redhat.com Thu Aug 24 20:07:44 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 24 Aug 2017 16:07:44 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> Message-ID: <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> Hi Coleen, There are two instances probably overlooked? dictionary.cpp #103 and #124 for (ProtectionDomainEntry* current = _pd_set; => for (ProtectionDomainEntry* current = pd_set(); Thanks, -Zhengyu On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: > Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since > it's accessed outside the SystemDictionary_lock > > Ran parallel class loading tests that we have as well as tier1 tests. > See bug for details. > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > Thanks, > Coleen > From coleen.phillimore at oracle.com Thu Aug 24 20:31:50 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 24 Aug 2017 16:31:50 -0400 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> Message-ID: <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> Hi, I'm glad that you changed this to CHEAP allocated. + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); This should probably be mtClass and not mtInternal. The other question I had was that I expected you to use obj->hash() since the objects probably should (?) have a hashcode installed when in the archive. 
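For readers following the allocation discussion in this thread, here is a minimal standalone sketch of the pattern being asked for: a chained hash table that lives on the C heap, hashes on the key's address, and frees every node from its destructor when the owner calls delete. It is not HotSpot code. SimpleTable, Node, g_live_nodes and live_nodes_after_round_trip are hypothetical names, and plain new/delete stands in for HotSpot's ResourceObj::C_HEAP allocation; only the 15889 bucket count is taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a C-heap-allocated hashtable; not HotSpot code.
static int g_live_nodes = 0;  // nodes currently allocated, tracked for the demo only

template <std::size_t SIZE>
class SimpleTable {
  struct Node {
    const void* key;
    int value;
    Node* next;
    Node(const void* k, int v, Node* n) : key(k), value(v), next(n) { ++g_live_nodes; }
    ~Node() { --g_live_nodes; }
  };
  Node* _table[SIZE];

  // Hash on the key's address (the pointer value itself), dropping
  // low bits that are often zero because of alignment.
  static std::size_t hash(const void* key) {
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(key) >> 3) % SIZE;
  }

 public:
  SimpleTable() { for (std::size_t i = 0; i < SIZE; ++i) _table[i] = nullptr; }
  SimpleTable(const SimpleTable&) = delete;
  SimpleTable& operator=(const SimpleTable&) = delete;

  // Walk every bucket and free every node, in the spirit of the
  // destructor quoted in the thread.
  ~SimpleTable() {
    for (std::size_t i = 0; i < SIZE; ++i) {
      Node* node = _table[i];
      while (node != nullptr) {
        Node* cur = node;
        node = node->next;
        delete cur;
      }
    }
  }

  // Insert a new key or overwrite the value of an existing one.
  void put(const void* key, int value) {
    std::size_t idx = hash(key);
    for (Node* n = _table[idx]; n != nullptr; n = n->next) {
      if (n->key == key) { n->value = value; return; }
    }
    _table[idx] = new Node(key, value, _table[idx]);
  }

  // Return a pointer to the stored value, or nullptr if the key is absent.
  const int* get(const void* key) const {
    for (Node* n = _table[hash(key)]; n != nullptr; n = n->next) {
      if (n->key == key) return &n->value;
    }
    return nullptr;
  }
};

// Creates a table on the heap, fills it, deletes it, and reports how many
// nodes are still allocated afterwards (-1 if a lookup unexpectedly failed).
inline int live_nodes_after_round_trip() {
  static int objs[1000];
  auto* cache = new SimpleTable<15889>();
  for (int i = 0; i < 1000; ++i) cache->put(&objs[i], i);
  const int* hit = cache->get(&objs[42]);
  bool found = (hit != nullptr) && (*hit == 42);
  delete cache;     // runs ~SimpleTable(), which frees all nodes
  cache = nullptr;  // avoid keeping a dangling pointer around
  return found ? g_live_nodes : -1;
}
```

With this sketch, live_nodes_after_round_trip() should report that no nodes remain once the table has been deleted, which is the cleanup the destructor walk is meant to guarantee. If the nodes instead came from a scoped arena that was released while the table was still in use, the table would be left pointing at freed memory, which is the ResourceMark hazard described in the thread.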
Thanks, Coleen On 8/24/17 3:53 PM, Jiangli Zhou wrote: > Hi Ioi, > > ResourceObj only allows delete for C_HEAP objects, we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert. > > void ResourceObj::operator delete(void* p) { > assert(((ResourceObj *)p)->allocated_on_C_heap(), > "delete only allowed for C_HEAP objects"); > > Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. > http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ > > Thanks, > Jiangli > >> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >> >> Hi Jiangli, >> >> The Nodes need to be deallocated in the ResourceHashtable destructor. >> >> ~ResourceHashtable() { >> if (ALLOC_TYPE == C_HEAP) { >> Node* const* bucket = _table; >> while (bucket < &_table[SIZE]) { >> Node* node = *bucket; >> while (node != NULL) { >> Node* cur = node; >> node = node->_next; >> delete cur; >> } >> ++bucket; >> } >> } >> } >> >> >> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >> >> >> ResourceHashtable<...> table; >> ... >> { >> ResourceMark rm; >> ... >> { >> table.put(....); >> } >> } >> >> The Node in the table will end up being invalid. >> >> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >> >> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator.
>> >> >> Thanks >> >> - Ioi >> >> >> >> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>> >>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable. >>> >>> Thanks, >>> Jiangli >>> >>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>> >>>> Hi Jiangli, >>>> >>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>> >>>> { >>>> NoSafepointVerifier nsv; >>>> >>>> // Cache for recording where the archived objects are copied to >>>> MetaspaceShared::create_archive_object_cache(); >>>> >>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>> NOT_PRODUCT(StringTable::verify()); >>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>> _string_regions = new GrowableArray(2); >>>> StringTable::write_to_archive(_string_regions); >>>> >>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>> _open_archive_heap_regions = new GrowableArray(2); >>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>> >>>> + MetaspaceShared::create_archive_object_cache(); >>>> >>>> } >>>> >>>> + static void delete_archive_object_cache() { >>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>> + } >>>> >>>> Thanks >>>> - Ioi >>>> >>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>> >>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address.
The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >>>>> >>>>> Tested with tier4-comp tests. >>>>> >>>>> Thanks, >>>>> Jiangli From ioi.lam at oracle.com Thu Aug 24 20:54:04 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 24 Aug 2017 13:54:04 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> Message-ID: <7624db4d-6b9b-8d82-170f-2e8a47f0af99@oracle.com> Thanks Jiangli. Looks good. - Ioi On 8/24/17 12:53 PM, Jiangli Zhou wrote: > Hi Ioi, > > ResourceObj only allows delete for C_HEAP objects, we need to > allocate _archive_object_cache as C_HEAP object also. Otherwise, we > would hit the following assert. > > void ResourceObj::operator delete(void* p) { > assert(((ResourceObj *)p)->allocated_on_C_heap(), > "delete only allowed for C_HEAP objects"); > > Here is the updated webrev that allocates/deallocates the > _archive_object_cache table and nodes as C_HEAP objects. > http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ > > > Thanks, > Jiangli > >> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >> >> Hi Jiangli, >> >> The Nodes need to be deallocated in the ResourceHashtable destructor. >> >> ~ResourceHashtable() { >> if (ALLOC_TYPE == C_HEAP) { >> Node* const* bucket = _table; >> while (bucket < &_table[SIZE]) { >> Node* node = *bucket; >> while (node != NULL) { >> Node* cur = node; >> node = node->_next; >> delete cur; >> } >> ++bucket; >> } >> } >> }
>> >> >> The problem with ResourceHashtable is that by default ALLOC_TYPE = >> ResourceObj::RESOURCE_AREA, but if your call path looks like this: >> >> >> ResourceHashtable<...> table; >> ... >> { >> ResourceMark rm; >> ... >> { >> table.put(....); >> } >> } >> >> The Node in the table will end up being invalid. >> >> In your case, the code path between the allocation of the >> ResourceHashtable and the call to >> MetaspaceShared::archive_heap_object covers a few files. There's >> currently no ResourceMark in between. However, in the future, someone >> could potentially put in a ResourceMark and cause erratic failures. >> >> So, since you're fixing the hashtable code, I think it will be a >> good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, >> when doing that, it's a good idea to do the proper clean up by >> invoking the ~ResourceHashtable() destructor via the delete operator. >> >> >> Thanks >> >> - Ioi >> >> >> >> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>> >>> The table was not changed to be allocated as ResourceObj::C_HEAP. I >>> see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable. >>> >>> Thanks, >>> Jiangli >>> >>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>> >>>> Hi Jiangli, >>>> >>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to >>>> delete it afterwards to avoid memory leaks >>>> >>>> { >>>> NoSafepointVerifier nsv; >>>> >>>> // Cache for recording where the archived objects are copied to >>>> MetaspaceShared::create_archive_object_cache(); >>>> >>>> tty->print_cr("Dumping String objects to closed archive heap >>>> region ..."); >>>> NOT_PRODUCT(StringTable::verify()); >>>> // The string space has maximum two regions. See >>>> FileMapInfo::write_archive_heap_regions() for details.
>>>> _string_regions = new GrowableArray(2); >>>> StringTable::write_to_archive(_string_regions); >>>> >>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>> _open_archive_heap_regions = new GrowableArray(2); >>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>> >>>> + MetaspaceShared::create_archive_object_cache(); >>>> >>>> } >>>> >>>> + static void delete_archive_object_cache() { >>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; >>>> _archive_object_cache = NULL;); >>>> + } >>>> >>>> Thanks >>>> - Ioi >>>> >>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>> Please review the following webrev that fixes the >>>>> ArchivedObjectCache obj_hash() issue. The patch was from Ioi >>>>> (thanks!). I will count myself as a reviewer. >>>>> >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>> >>>>> >>>>> ArchivedObjectCache obj_hash() computes hash using incorrect >>>>> address. The fix is to use the correct oop address. The default >>>>> ResourceHashtable size is 256, which is too small when large >>>>> number of objects are archived. The table is now changed to use a >>>>> much larger (15889) size. The ArchivedObjectCache issue was >>>>> noticed when one test times out on slower linux arm64 machine. >>>>> With the fix the test finishes without timeout. >>>>> >>>>> Tested with tier4-comp tests.
>>>>> >>>>> Thanks, >>>>> Jiangli >> > From coleen.phillimore at oracle.com Thu Aug 24 20:54:46 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 24 Aug 2017 16:54:46 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> Message-ID: <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> On 8/24/17 4:07 PM, Zhengyu Gu wrote: > Hi Coleen, > > There are two instances probably overlooked? > > dictionary.cpp #103 and #124 > > for (ProtectionDomainEntry* current = _pd_set; > => > for (ProtectionDomainEntry* current = pd_set(); > > Oh yeah, you're right. That's embarrassing. I'll fix and retest. Thank you!! Coleen > Thanks, > > -Zhengyu > > On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >> since it's accessed outside the SystemDictionary_lock >> >> Ran parallel class loading tests that we have as well as tier1 tests. >> See bug for details.
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> Thanks, >> Coleen >> From cthalinger at twitter.com Thu Aug 24 21:00:39 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Thu, 24 Aug 2017 11:00:39 -1000 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> Message-ID: > On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: > > > > On 8/24/17 4:07 PM, Zhengyu Gu wrote: >> Hi Coleen, >> >> There are two instances probably overlooked? >> >> dictionary.cpp #103 and #124 >> >> for (ProtectionDomainEntry* current = _pd_set; >> => >> for (ProtectionDomainEntry* current = pd_set(); >> >> > > Oh yeah, you're right. That's embarrassing. I'll fix and retest. Which also shows that there is a potential for future mistakes. Can we isolate the field better so it's only accessible via setter and getter? > > Thank you!! > Coleen > >> Thanks, >> >> -Zhengyu >> >> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since it's accessed outside the SystemDictionary_lock >>> >>> Ran parallel class loading tests that we have as well as tier1 tests. See bug for details.
>>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>> >>> Thanks, >>> Coleen >>> > From coleen.phillimore at oracle.com Thu Aug 24 21:11:50 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 24 Aug 2017 17:11:50 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> Message-ID: <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> On 8/24/17 5:00 PM, Christian Thalinger wrote: >> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >> >> >> >> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>> Hi Coleen, >>> >>> There are two instances probably overlooked? >>> >>> dictionary.cpp #103 and #124 >>> >>> for (ProtectionDomainEntry* current = _pd_set; >>> => >>> for (ProtectionDomainEntry* current = pd_set(); >>> >>> >> Oh yeah, you're right. That's embarrassing. I'll fix and retest. > Which also shows that there is a potential for future mistakes. Can we isolate the field better so it's only accessible via setter and getter? Yes, great idea. Coleen >> Thank you!! >> Coleen >> >>> Thanks, >>> >>> -Zhengyu >>> >>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since it's accessed outside the SystemDictionary_lock >>>> >>>> Ran parallel class loading tests that we have as well as tier1 tests. See bug for details.
>>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>> >>>> Thanks, >>>> Coleen >>>> From jiangli.zhou at Oracle.COM Thu Aug 24 23:32:45 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Thu, 24 Aug 2017 16:32:45 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> Message-ID: <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> Hi Coleen, Thanks for reviewing this! > On Aug 24, 2017, at 1:31 PM, coleen.phillimore at oracle.com wrote: > > > Hi, > > I'm glad that you changed this to CHEAP allocated. > > + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); > > > This should probably be mtClass and not mtInternal. Ok. I'll use mtClass. > > The other question I had was that I expected you to use obj->hash() since the objects probably should (?) have a hashcode installed when in the archive. Do you mean identity_hash()? That should also work for this use case. Initially I wanted to use a simple hash code, so I went with a computation using the object address. Yes, we compute the identity_hash for archived objects at dump time. I updated the webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.02/ Thanks, Jiangli > > Thanks, > Coleen > > > On 8/24/17 3:53 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> ResourceObj only allows delete for C_HEAP objects, we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert.
>> >> void ResourceObj::operator delete(void* p) { >> assert(((ResourceObj *)p)->allocated_on_C_heap(), >> "delete only allowed for C_HEAP objects"); >> >> Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. >> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >> >> Thanks, >> Jiangli >> >>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>> >>> Hi Jiangli, >>> >>> The Nodes need to be deallocated in the ResourceHashtable destructor. >>> >>> ~ResourceHashtable() { >>> if (ALLOC_TYPE == C_HEAP) { >>> Node* const* bucket = _table; >>> while (bucket < &_table[SIZE]) { >>> Node* node = *bucket; >>> while (node != NULL) { >>> Node* cur = node; >>> node = node->_next; >>> delete cur; >>> } >>> ++bucket; >>> } >>> } >>> } >>> >>> >>> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>> >>> >>> ResourceHashtable<...> table; >>> ... >>> { >>> ResourceMark rm; >>> ... >>> { >>> table.put(....); >>> } >>> } >>> >>> The Node in the table will end up being invalid. >>> >>> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >>> >>> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator. >>> >>> >>> Thanks >>> >>> - Ioi >>> >>> >>> >>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>> Hi Ioi, >>>> >>>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable.
>>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>>> >>>>> Hi Jiangli, >>>>> >>>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>>> >>>>> { >>>>> NoSafepointVerifier nsv; >>>>> >>>>> // Cache for recording where the archived objects are copied to >>>>> MetaspaceShared::create_archive_object_cache(); >>>>> >>>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>>> NOT_PRODUCT(StringTable::verify()); >>>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>>> _string_regions = new GrowableArray(2); >>>>> StringTable::write_to_archive(_string_regions); >>>>> >>>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>> >>>>> + MetaspaceShared::create_archive_object_cache(); >>>>> >>>>> } >>>>> >>>>> + static void delete_archive_object_cache() { >>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>>> + } >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>>>> >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>> >>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. 
With the fix the test finishes without timeout. >>>>>> >>>>>> Tested with tier4-comp tests. >>>>>> >>>>>> Thanks, >>>>>> Jiangli > From jiangli.zhou at oracle.com Thu Aug 24 23:35:13 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 24 Aug 2017 16:35:13 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <7624db4d-6b9b-8d82-170f-2e8a47f0af99@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <7624db4d-6b9b-8d82-170f-2e8a47f0af99@oracle.com> Message-ID: <2E9076D6-CC1E-43F2-9D59-C026D5C1D465@oracle.com> Thanks, Ioi! I made additional changes after Coleen's suggestion. Please see the other email. I'm rerunning the test. Thanks, Jiangli > On Aug 24, 2017, at 1:54 PM, Ioi Lam wrote: > > Thanks Jiangli. Looks good. > > - Ioi > > On 8/24/17 12:53 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> ResourceObj only allows delete for C_HEAP objects, we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert. >> >> void ResourceObj::operator delete(void* p) { >> assert(((ResourceObj *)p)->allocated_on_C_heap(), >> "delete only allowed for C_HEAP objects"); >> >> Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. >> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >> >> Thanks, >> Jiangli >> >>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>> >>> Hi Jiangli, >>> >>> The Nodes need to be deallocated in the ResourceHashtable destructor.
>>> >>> ~ResourceHashtable() { >>> if (ALLOC_TYPE == C_HEAP) { >>> Node* const* bucket = _table; >>> while (bucket < &_table[SIZE]) { >>> Node* node = *bucket; >>> while (node != NULL) { >>> Node* cur = node; >>> node = node->_next; >>> delete cur; >>> } >>> ++bucket; >>> } >>> } >>> } >>> >>> >>> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>> >>> >>> ResourceHashtable<...> table; >>> ... >>> { >>> ResourceMark rm; >>> ... >>> { >>> table.put(....); >>> } >>> } >>> >>> The Node in the table will end up being invalid. >>> >>> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >>> >>> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator. >>> >>> >>> Thanks >>> >>> - Ioi >>> >>> >>> >>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>> Hi Ioi, >>>> >>>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable.
>>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam > wrote: >>>>> >>>>> Hi Jiangli, >>>>> >>>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>>> >>>>> { >>>>> NoSafepointVerifier nsv; >>>>> >>>>> // Cache for recording where the archived objects are copied to >>>>> MetaspaceShared::create_archive_object_cache(); >>>>> >>>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>>> NOT_PRODUCT(StringTable::verify()); >>>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>>> _string_regions = new GrowableArray(2); >>>>> StringTable::write_to_archive(_string_regions); >>>>> >>>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>> >>>>> + MetaspaceShared::create_archive_object_cache(); >>>>> >>>>> } >>>>> >>>>> + static void delete_archive_object_cache() { >>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>>> + } >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>>>> >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>> >>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. 
With the fix the test finishes without timeout. >>>>>> >>>>>> Tested with tier4-comp tests. >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>> >> > From david.holmes at oracle.com Fri Aug 25 02:10:21 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Aug 2017 12:10:21 +1000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> <15ed8720-4d13-ae95-dfbe-dd0e3d5acfd6@oracle.com> Message-ID: <3528158a-fda7-fda9-7126-d51fba0d3f28@oracle.com> On 24/08/2017 11:30 PM, Adam Farley8 wrote: > Hi Alan, David, and Tom, > > First, thanks again for your efforts on this. As a new guy to OpenJDK > contributions, it means a lot to see so much progress on this so > quickly. :) All I see is discussion :) Progress would be something else entirely. > > >On 24/08/2017 07:33, David Holmes wrote: > >> Hi Adam, > >> > >> cc'ing hotspot runtime dev as runtime own JNI and the invocation API - > >> and some of the problematic code resides in the VM. > >Yeah, the hotspot mailing list would be a better place to discuss this > >as there are several issues here and several places where HotSpot aborts > >the process when initialization fails. It's a long standing issue (going > >back 15+ years) that I think is partly because it's not easy to release > >all resources and cleanup before CreateJavaVM returns with an error. > > > > According to the JNI spec, it is not possible (yet) to create a second VM > in the same thread as the first. > > There is also a bug (dup'd against another bug I don't have the access for) > which states that even a successful VM creation+destruction won't permit > a second VM to be created. > > https://bugs.openjdk.java.net/browse/JDK-4712793 > > Both of these seem to imply that making a new VM after a failed VM-creation > (in the same thread) is unsupported behaviour. 
> > So is it important to release all resources and cleanup, given that we > won't > be trying to create a new VM in this thread? By "important" I mean "more > important than exiting with a return code and allowing the user's code > to finish". Okay, so if there is no intention of attempting to reload the jvm again, I'm unclear what the purpose of the hosting process actually is. To me it is either a custom launcher - in which case the exit calls are "harmless" (and atexit handlers could be used if the process has its own clean up) - or it's something multi-purpose, part of which is to launch a VM. In the latter case, given the inability to reload a VM, and assuming the process does not want its Java launching powers to be removed, then the only real option is to filter out the problematic arguments and either ignore them or exec a separate process to handle them. > >> > >> This specific case seems like a bug to me as the logic is assuming it > >> is only ever called by a launcher which it is okay to terminate. > >> Though to be honest the very existence of the "help" option seems to > >> me somewhat misguided in a hosted-VM environment. That said, I see > >> unified logging in 9 also added a terminating "help" option. > >The agent "help" option case is tricky and would likely need an update > >to the JVM TI spec and the Agent_OnLoad return value. > > > > To clarify, the agent "help" option is only an example of this problem. > There are 19 locations both within and without hotspot that call exit(0) > directly, plus more places where exit is passed a variable that can be > 0 (e.g. the aforementioned agent "help", which calls the forceExit function > with an argument of 0, which calls exit(arg) in turn). > > I understand that your comment was intended as an effort to effect a fix > for this specific instance of the problem.
I wanted to make sure we kept > sight of the wider problem, as ideally we'd come up with an ideal solution > that could be applied to all cases. The fact that there are numerous potential process termination points in the VM and JDK native code is something we just have to live with. I'm only considering these kinds of "report and terminate" flags to be the problem cases that should be handled better. > My thought on this was a unique return code that tells the user's code > that the VM is not in a usable state, but that no error has occurred. This > should be a negative code (so the usual x<0 check will prevent the user's > code from using the VM), but it shouldn't be one of the existing JNI codes; > all of which seem to indicate either: > > a) The VM is fine and can be used (0). > or > b) The VM is not fine because an error occurred (-1 to -6). > > Ideally we need a c) The VM is not fine, but no error has occurred. It's somewhat debatable how to classify the case where you ask the VM to load and then perform a one-off action that effectively succeeds but leaves the VM unusable. Again ideally, to me, the VM would never do that - such actions would occur as part of VM initialization, the VM would be usable, but the launcher would do the termination because that is how the flag is specified. But that is non-trivial to untangle. David > Or is there another solution to the exit(0) problem? Other than putting > a copy of the rest of your code on the exit hook, I mean. > > > > >> > >> Options processed by the VM will be recognized, while options > >> processed by the Java launcher will not be. "-version", "-X", "-help" > >> and numerous others are launcher options. Pure VM options are -XX > >> options, but the VM also processes some -X flags and, as a result of > >> jigsaw, now also processes a bunch of module-related flags that are > >> simple --foo options. > >Right because these options need to be passed to CreateJavaVM as they are > >used when initializing the VM.
Using system properties would just repeat > >the issues of the past (e.g. java.class.path) and require documenting a slew > >of system properties (which is complicated at repeating options). > > > >-Alan > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Fri Aug 25 02:12:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Aug 2017 12:12:24 +1000 Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled In-Reply-To: <9fb7cb6b-a9cc-e426-cc12-f1341675cf03@oracle.com> References: <9fb7cb6b-a9cc-e426-cc12-f1341675cf03@oracle.com> Message-ID: <50254b27-b250-05e3-a916-692739c66b75@oracle.com> On 25/08/2017 1:38 AM, Claes Redestad wrote: > Thanks Ioi and David for reviewing, > > sadly there were some issues with building the previous patch without the > closed > sources which have been corrected here: > > http://cr.openjdk.java.net/~redestad/8179040/hotspot.03/ Incremental webrev possible? Thanks, David > Thanks! > > /Claes > > On 2017-08-23 20:11, Ioi Lam wrote: >> Hi Claes, >> >> The changes look good. Reviewed. >> >> Thanks >> >> - Ioi >> >> >> On 8/23/17 5:32 AM, Claes Redestad wrote: >>> Hi, >>> >>> this patch was never pushed due to an unfortunate interaction around >>> cancelling >>> of events. Markus Grönlund has helped resolve these, while >>> maintaining the >>> speedup: >>> >>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.02/ >>> >>> Thanks!
>>> >>> /Claes >>> >>> On 04/25/2017 03:21 PM, Claes Redestad wrote: >>>> Hi, >>>> >>>> this patch removes calling Ticks::now when EventClassLoad isn't >>>> enabled, which >>>> has an effect on class loading performance: >>>> >>>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.01/ >>>> >>>> When tracing isn't enabled trace/tracing.hpp has dummy >>>> implementations which >>>> are easily optimized away by a compiler, which I've verified happens >>>> on linux >>>> OpenJDK builds with tracing disabled. >>>> >>>> On builds with tracing enabled then the changes means the call to >>>> get the time >>>> only happen if the event is enabled, which achieves the sought after >>>> startup >>>> optimization. >>>> >>>> Thanks! >>>> >>>> /Claes >>> >> > From claes.redestad at oracle.com Fri Aug 25 08:06:14 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Fri, 25 Aug 2017 10:06:14 +0200 Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled In-Reply-To: <50254b27-b250-05e3-a916-692739c66b75@oracle.com> References: <9fb7cb6b-a9cc-e426-cc12-f1341675cf03@oracle.com> <50254b27-b250-05e3-a916-692739c66b75@oracle.com> Message-ID: On 2017-08-25 04:12, David Holmes wrote: >> >> http://cr.openjdk.java.net/~redestad/8179040/hotspot.03/ > > Incremental webrev possible? 
Of course: http://cr.openjdk.java.net/~redestad/8179040/hotspot.inc_02_03/ /Claes From coleen.phillimore at oracle.com Fri Aug 25 12:49:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 25 Aug 2017 08:49:19 -0400 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> Message-ID: <00a6292b-dc1d-a8bc-97c9-b9cbfa2c17e1@oracle.com> On 8/24/17 7:32 PM, Jiangli Zhou wrote: > Hi Coleen, > > Thanks for reviewing this! > >> On Aug 24, 2017, at 1:31 PM, coleen.phillimore at oracle.com wrote: >> >> >> Hi, >> >> I'm glad that you changed this to C_HEAP allocated. >> >> + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); >> >> >> This should probably be mtClass and not mtInternal. > Ok. I'll use mtClass. > >> The other question I had was that I expected you to use obj->hash() since the objects probably should (?) have a hashcode installed when in the archive. > Do you mean identity_hash()? That should also work for this use case. Initially I wanted to use a simple hash code so went with computation using object address. Yes, we compute the identity_hash for archived objects at dump time. I updated the webrev: The only thing to worry about is if identity_hash() can go to a safepoint here. But don't the strings already have an identity hash installed in the markOop? thanks, Coleen > > http://cr.openjdk.java.net/~jiangli/8186706/webrev.02/ > > Thanks, > Jiangli > >> Thanks, >> Coleen >> >> >> On 8/24/17 3:53 PM, Jiangli Zhou wrote: >>> Hi Ioi, >>> >>> ResourceObj only allows delete for C_HEAP objects, we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert.
>>> >>> void ResourceObj::operator delete(void* p) { >>> assert(((ResourceObj *)p)->allocated_on_C_heap(), >>> "delete only allowed for C_HEAP objects"); >>> >>> Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. >>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >>> >>> Thanks, >>> Jiangli >>> >>>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>>> >>>> Hi Jiangli, >>>> >>>> The Nodes need to be deallocated in the ResourceHashtable destructor. >>>> >>>> ~ResourceHashtable() { >>>> if (ALLOC_TYPE == C_HEAP) { >>>> Node* const* bucket = _table; >>>> while (bucket < &_table[SIZE]) { >>>> Node* node = *bucket; >>>> while (node != NULL) { >>>> Node* cur = node; >>>> node = node->_next; >>>> delete cur; >>>> } >>>> ++bucket; >>>> } >>>> } >>>> } >>>> >>>> >>>> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>>> >>>> >>>> ResourceHashtable<...> table; >>>> ... >>>> { >>>> ResourceMark rm; >>>> ... >>>> { >>>> table.put(....); >>>> } >>>> } >>>> >>>> The Node in the table will end up being invalid. >>>> >>>> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >>>> >>>> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator. >>>> >>>> >>>> Thanks >>>> >>>> - Ioi >>>> >>>> >>>> >>>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>>> Hi Ioi, >>>>> >>>>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE"
only applies to the Nodes in the ResourceHashtable. >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>>>> >>>>>> Hi Jiangli, >>>>>> >>>>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>>>> >>>>>> { >>>>>> NoSafepointVerifier nsv; >>>>>> >>>>>> // Cache for recording where the archived objects are copied to >>>>>> MetaspaceShared::create_archive_object_cache(); >>>>>> >>>>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>>>> NOT_PRODUCT(StringTable::verify()); >>>>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>>>> _string_regions = new GrowableArray(2); >>>>>> StringTable::write_to_archive(_string_regions); >>>>>> >>>>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>>> >>>>>> + MetaspaceShared::create_archive_object_cache(); >>>>>> >>>>>> } >>>>>> >>>>>> + static void delete_archive_object_cache() { >>>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>>>> + } >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>>>>> >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>>> >>>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. 
The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >>>>>>> >>>>>>> Tested with tier4-comp tests. >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli From coleen.phillimore at oracle.com Fri Aug 25 13:26:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 25 Aug 2017 09:26:31 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> Message-ID: Thank you Zhengyu for noticing this change was wrong, and Christian for the idea. New webrev: open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8164207 I reran parallel class loading tests and jck testing is in progress, but order access requires inspection. Thanks, Coleen On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: > > > On 8/24/17 5:00 PM, Christian Thalinger wrote: >>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> >>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>> Hi Coleen, >>>> >>>> There are two instances probably overlooked? >>>> >>>> dictionary.cpp #103 and #124 >>>> >>>> for (ProtectionDomainEntry* current = _pd_set; >>>> => >>>> for (ProtectionDomainEntry* current = pd_set(); >>>> >>>> >>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >> Which also shows that there is a potential for future mistakes. Can >> we isolate the field better so it's only accessible via setter and >> getter? > > Yes, great idea. > Coleen > >>> Thank you!!
>>> Coleen >>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>> since it's accessed outside the SystemDictionary_lock >>>>> >>>>> Ran parallel class loading tests that we have as well as tier1 >>>>> tests. See bug for details. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>> >>>>> Thanks, >>>>> Coleen >>>>> > From zgu at redhat.com Fri Aug 25 14:55:23 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 25 Aug 2017 10:55:23 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> Message-ID: Looks good to me. -Zhengyu On 08/25/2017 09:26 AM, coleen.phillimore at oracle.com wrote: > > Thank you Zhengyu for noticing this change was wrong, and Christian for > the idea. New webrev: > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > I reran parallel class loading tests and jck testing is in progress, but > order access requires inspection. > > Thanks, > Coleen > > > On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> >>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>> Hi Coleen, >>>>> >>>>> There are two instances probably overlooked? >>>>> >>>>> dictionary.cpp #103 and #124 >>>>> >>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>> => >>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>> >>>>> >>>> Oh yeah, you're right. 
That's embarrassing. I'll fix and retest. >>> Which also shows that there is a potential for future mistakes. Can >>> we isolate the field better so it's only accessible via setter and >>> getter? >> >> Yes, great idea. >> Coleen >> >>>> Thank you!! >>>> Coleen >>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>>> since it's accessed outside the SystemDictionary_lock >>>>>> >>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>> tests. See bug for details. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >> > From jiangli.zhou at oracle.com Fri Aug 25 18:08:08 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 25 Aug 2017 11:08:08 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <00a6292b-dc1d-a8bc-97c9-b9cbfa2c17e1@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> <00a6292b-dc1d-a8bc-97c9-b9cbfa2c17e1@oracle.com> Message-ID: <7F676480-0B75-472A-B43A-705C779EF7F5@oracle.com> > On Aug 25, 2017, at 5:49 AM, coleen.phillimore at oracle.com wrote: > > > > On 8/24/17 7:32 PM, Jiangli Zhou wrote: >> Hi Coleen, >> >> Thanks for reviewing this! >> >>> On Aug 24, 2017, at 1:31 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> Hi, >>> >>> I'm glad that you changed this to C_HEAP allocated. >>> >>> + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); >>> >>> >>> This should probably be mtClass and not mtInternal. >> Ok. I'll use mtClass.
>> >>> The other question I had was that I expected you to use obj->hash() since the objects probably should (?) have a hashcode installed when in the archive. >> Do you mean identity_hash()? That should also work for this use case. Initially I wanted to use a simple hash code so went with computation using object address. Yes, we compute the identity_hash for archived objects at dump time. I updated the webrev: > > The only thing to worry about is if identity_hash() can go to a safepoint here. But don't the strings already have an identity hash installed in the markOop? Not all strings have identity hash installed yet. During object archiving, we compute identity hash for all archived objects right before we copy the objects. The change causes the identity hash to be computed slightly earlier, but still during the object archiving. Object archiving is guarded by NoSafepointVerifier. Thanks, Jiangli > > thanks, > Coleen > >> >> http://cr.openjdk.java.net/~jiangli/8186706/webrev.02/ >> >> Thanks, >> Jiangli >> >>> Thanks, >>> Coleen >>> >>> >>> On 8/24/17 3:53 PM, Jiangli Zhou wrote: >>>> Hi Ioi, >>>> >>>> ResourceObj only allows delete for C_HEAP objects, we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert. >>>> >>>> void ResourceObj::operator delete(void* p) { >>>> assert(((ResourceObj *)p)->allocated_on_C_heap(), >>>> "delete only allowed for C_HEAP objects"); >>>> >>>> Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. >>>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>>>> >>>>> Hi Jiangli, >>>>> >>>>> The Nodes need to be deallocated in the ResourceHashtable destructor.
>>>>> >>>>> ~ResourceHashtable() { >>>>> if (ALLOC_TYPE == C_HEAP) { >>>>> Node* const* bucket = _table; >>>>> while (bucket < &_table[SIZE]) { >>>>> Node* node = *bucket; >>>>> while (node != NULL) { >>>>> Node* cur = node; >>>>> node = node->_next; >>>>> delete cur; >>>>> } >>>>> ++bucket; >>>>> } >>>>> } >>>>> } >>>>> >>>>> >>>>> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>>>> >>>>> >>>>> ResourceHashtable<...> table; >>>>> ... >>>>> { >>>>> ResourceMark rm; >>>>> ... >>>>> { >>>>> table.put(....); >>>>> } >>>>> } >>>>> >>>>> The Node in the table will end up being invalid. >>>>> >>>>> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >>>>> >>>>> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator. >>>>> >>>>> >>>>> Thanks >>>>> >>>>> - Ioi >>>>> >>>>> >>>>> >>>>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>>>> Hi Ioi, >>>>>> >>>>>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable.
>>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>>>>> >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>>>>> >>>>>>> { >>>>>>> NoSafepointVerifier nsv; >>>>>>> >>>>>>> // Cache for recording where the archived objects are copied to >>>>>>> MetaspaceShared::create_archive_object_cache(); >>>>>>> >>>>>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>>>>> NOT_PRODUCT(StringTable::verify()); >>>>>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>>>>> _string_regions = new GrowableArray(2); >>>>>>> StringTable::write_to_archive(_string_regions); >>>>>>> >>>>>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>>>> >>>>>>> + MetaspaceShared::create_archive_object_cache(); >>>>>>> >>>>>>> } >>>>>>> >>>>>>> + static void delete_archive_object_cache() { >>>>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>>>>> + } >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer. >>>>>>>> >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>>>> >>>>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. 
The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >>>>>>>> >>>>>>>> Tested with tier4-comp tests. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli From ioi.lam at oracle.com Fri Aug 25 18:19:57 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 25 Aug 2017 11:19:57 -0700 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control Message-ID: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> Hi, please review this very small change. The corresponding CSR has been approved. https://bugs.openjdk.java.net/browse/JDK-8186778 Since JDK-8072061 (Automatically determine optimal sizes for the CDS regions) is integrated, the following 4 options are no longer necessary, and they are no longer used by the VM anymore. Hence, these options should be deprecated: SharedReadWriteSize SharedReadOnlySize SharedMiscDataSize SharedMiscCodeSize hotspot$ hg diff diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 2017 +0000 +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 2017 -0700 @@ -379,6 +379,10 @@ static SpecialFlag const special_jvm_flags[] = { // -------------- Deprecated Flags -------------- // --- Non-alias flags - sorted by obsolete_in then expired_in: { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, { "UseConcMarkSweepGC", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() }, { "MonitorInUseLists", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, + { "SharedMiscCodeSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, + { "SharedMiscDataSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, + { "SharedReadOnlySize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, +
{ "SharedReadWriteSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, // --- Deprecated alias flags (see also aliased_jvm_flags) - sorted by obsolete_in then expired_in: { "DefaultMaxRAMFraction", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, From harold.seigel at oracle.com Fri Aug 25 18:28:02 2017 From: harold.seigel at oracle.com (harold seigel) Date: Fri, 25 Aug 2017 14:28:02 -0400 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control In-Reply-To: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> References: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> Message-ID: <1148d9ef-e50e-ef7e-6aea-a728c1554d17@oracle.com> Looks good. Thanks, Harold On 8/25/2017 2:19 PM, Ioi Lam wrote: > Hi, please review this very small change. The corresponding CSR has > been approved. > > https://bugs.openjdk.java.net/browse/JDK-8186778 > > Since JDK-8072061 (Automatically determine optimal sizes for the CDS > regions) is integrated, > the following 4 options are no longer necessary, and they are no > longer used by the VM > anymore. Hence, these options should be deprecated: > > SharedReadWriteSize > SharedReadOnlySize > SharedMiscDataSize > SharedMiscCodeSize > > hotspot$ hg diff > diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp > --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 2017 > +0000 > +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 2017 > -0700 > @@ -379,6 +379,10 @@ > static SpecialFlag const special_jvm_flags[] = { > // -------------- Deprecated Flags -------------- > // --- Non-alias flags - sorted by obsolete_in then expired_in: > { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), > JDK_Version::undefined(), JDK_Version::undefined() }, > { "UseConcMarkSweepGC", JDK_Version::jdk(9), > JDK_Version::undefined(), JDK_Version::undefined() }, >
{ "MonitorInUseLists", > JDK_Version::jdk(10),JDK_Version::undefined(), > JDK_Version::undefined() }, > + { "SharedMiscCodeSize", > JDK_Version::jdk(10),JDK_Version::undefined(), > JDK_Version::undefined() }, > + { "SharedMiscDataSize", > JDK_Version::jdk(10),JDK_Version::undefined(), > JDK_Version::undefined() }, > + { "SharedReadOnlySize", > JDK_Version::jdk(10),JDK_Version::undefined(), > JDK_Version::undefined() }, > + { "SharedReadWriteSize", > JDK_Version::jdk(10),JDK_Version::undefined(), > JDK_Version::undefined() }, > > // --- Deprecated alias flags (see also aliased_jvm_flags) - sorted > by obsolete_in then expired_in: > { "DefaultMaxRAMFraction", JDK_Version::jdk(8), > JDK_Version::undefined(), JDK_Version::undefined() }, > From coleen.phillimore at oracle.com Fri Aug 25 18:53:40 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 25 Aug 2017 14:53:40 -0400 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <7F676480-0B75-472A-B43A-705C779EF7F5@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> <00a6292b-dc1d-a8bc-97c9-b9cbfa2c17e1@oracle.com> <7F676480-0B75-472A-B43A-705C779EF7F5@oracle.com> Message-ID: <597d67e4-3f0e-13ca-51d1-f96d71f82377@oracle.com> On 8/25/17 2:08 PM, Jiangli Zhou wrote: > >> On Aug 25, 2017, at 5:49 AM, coleen.phillimore at oracle.com >> wrote: >> >> >> >> On 8/24/17 7:32 PM, Jiangli Zhou wrote: >>> Hi Coleen, >>> >>> Thanks for reviewing this! >>> >>>> On Aug 24, 2017, at 1:31 PM, coleen.phillimore at oracle.com >>>> wrote: >>>> >>>> >>>> Hi, >>>> >>>> I'm glad that you changed this to C_HEAP allocated. >>>> >>>> + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new >>>> (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); >>>> >>>> >>>> This should probably be mtClass and not mtInternal.
>>> Ok. I'll use mtClass. >>> >>>> The other question I had was that I expected you to use obj->hash() >>>> since the objects probably should (?) have a hashcode installed >>>> when in the archive. >>> Do you mean identity_hash()? That should also work for this use >>> case. Initially I wanted to use a simple hash code so went with >>> computation using object address. Yes, we compute the identity_hash >>> for archived objects at dump time. I updated the webrev: >> >> The only thing to worry about is if identity_hash() can go to a >> safepoint here. But don't the strings already have an identity hash >> installed in the markOop? > > Not all strings have identity hash installed yet. During object > archiving, we compute identity hash for all archived objects right > before we copy the objects. The change causes the identity hash to be > computed slightly earlier, but still during the object archiving. > Object archiving is guarded by NoSafepointVerifier. Okay, I keep not remembering why it's safe to call identity_hash() but this doesn't change the situation, since you're calling it under the same NSV. One thing that might remind me is if you add something like: static unsigned obj_hash(oop const& p) { - unsigned hash = (unsigned)((uintptr_t)&p); - return hash ^ (hash >> LogMinObjAlignment); + assert(p->mark()->has_bias_pattern(), "this object should never have been locked"); // so identity_hash won't safepoint + unsigned hash = (unsigned)p->identity_hash(); + return hash; } The change looks good, especially since you're already adding the hash code. Thanks, Coleen > Thanks, > Jiangli > >> >> thanks, >> Coleen >> >>> >>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.02/ >>> >>> >>> Thanks, >>> Jiangli >>> >>>> Thanks, >>>> Coleen >>>> >>>> >>>> On 8/24/17 3:53 PM, Jiangli Zhou wrote: >>>>> Hi Ioi, >>>>> >>>>> ResourceObj only allows delete for C_HEAP objects, we need to >>>>> allocate _archive_object_cache as C_HEAP object also.
Otherwise, >>>>> we would hit the following assert. >>>>> >>>>> void ResourceObj::operator delete(void* p) { >>>>> assert(((ResourceObj *)p)->allocated_on_C_heap(), >>>>> "delete only allowed for C_HEAP objects"); >>>>> >>>>> Here is the updated webrev that allocates/deallocates the >>>>> _archive_object_cache table and nodes as C_HEAP objects. >>>>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >>>>> >>>>> Thanks, >>>>> Jiangli >>>>> >>>>>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>>>>> >>>>>> Hi Jiangli, >>>>>> >>>>>> The Nodes need to be deallocated in the ResourceHashtable destructor. >>>>>> >>>>>> ~ResourceHashtable() { >>>>>> if (ALLOC_TYPE == C_HEAP) { >>>>>> Node* const* bucket = _table; >>>>>> while (bucket < &_table[SIZE]) { >>>>>> Node* node = *bucket; >>>>>> while (node != NULL) { >>>>>> Node* cur = node; >>>>>> node = node->_next; >>>>>> delete cur; >>>>>> } >>>>>> ++bucket; >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> >>>>>> The problem with ResourceHashtable is that by default ALLOC_TYPE >>>>>> = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>>>>> >>>>>> >>>>>> ResourceHashtable<...> table; >>>>>> ... >>>>>> { >>>>>> ResourceMark rm; >>>>>> ... >>>>>> { >>>>>> table.put(....); >>>>>> } >>>>>> } >>>>>> >>>>>> The Node in the table will end up being invalid. >>>>>> >>>>>> In your case, the code path between the allocation of the >>>>>> ResourceHashtable and the call to >>>>>> MetaspaceShared::archive_heap_object covers a few files. There's >>>>>> currently no ResourceMark in between. However, in the future, >>>>>> someone could potentially put in a ResourceMark and cause erratic >>>>>> failures. >>>>>> >>>>>> So, since you're fixing the hashtable code, I think it will be a >>>>>> good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP.
>>>>>> However, when doing that, it's a good idea to do the proper clean >>>>>> up by invoking the ~ResourceHashtable() destructor via the delete >>>>>> operator. >>>>>> >>>>>> >>>>>> Thanks >>>>>> >>>>>> - Ioi >>>>>> >>>>>> >>>>>> >>>>>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>>>>> Hi Ioi, >>>>>>> >>>>>>> The table was not changed to be allocated as >>>>>>> ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the >>>>>>> Nodes in the ResourceHashtable. >>>>>>> >>>>>>> Thanks, >>>>>>> Jiangli >>>>>>> >>>>>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>>>>>> >>>>>>>> Hi Jiangli, >>>>>>>> >>>>>>>> Since the table is allocated as ResourceObj::C_HEAP, it's >>>>>>>> better to delete it afterwards to avoid memory leaks >>>>>>>> >>>>>>>> { >>>>>>>> NoSafepointVerifier nsv; >>>>>>>> >>>>>>>> // Cache for recording where the archived objects are copied to >>>>>>>> MetaspaceShared::create_archive_object_cache(); >>>>>>>> >>>>>>>> tty->print_cr("Dumping String objects to closed archive >>>>>>>> heap region ..."); >>>>>>>> NOT_PRODUCT(StringTable::verify()); >>>>>>>> // The string space has maximum two regions. See >>>>>>>> FileMapInfo::write_archive_heap_regions() for details.
>>>>>>>> _string_regions = new GrowableArray(2); >>>>>>>> StringTable::write_to_archive(_string_regions); >>>>>>>> >>>>>>>> tty->print_cr("Dumping objects to open archive heap region >>>>>>>> ..."); >>>>>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>>>>> >>>>>>>> + MetaspaceShared::create_archive_object_cache(); >>>>>>>> >>>>>>>> } >>>>>>>> >>>>>>>> + static void delete_archive_object_cache() { >>>>>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; >>>>>>>> _archive_object_cache = NULL;); >>>>>>>> + } >>>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>>>>> Please review the following webrev that fixes the >>>>>>>>> ArchivedObjectCache obj_hash() issue. The patch was from Ioi >>>>>>>>> (thanks!). I will count myself as a reviewer. >>>>>>>>> >>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>>>>> >>>>>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect >>>>>>>>> address. The fix is to use the correct oop address. The >>>>>>>>> default ResourceHashtable size is 256, which is too small when >>>>>>>>> large number of objects are archived. The table is now changed >>>>>>>>> to use a much larger (15889) size. The ArchivedObjectCache >>>>>>>>> issue was noticed when one test times out on slower linux >>>>>>>>> arm64 machine. With the fix the test finishes without timeout. >>>>>>>>> >>>>>>>>> Tested with tier4-comp tests.
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Jiangli > From jiangli.zhou at oracle.com Fri Aug 25 19:11:42 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 25 Aug 2017 12:11:42 -0700 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control In-Reply-To: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> References: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> Message-ID: Hi Ioi, The change looks good. I noticed the deprecated options (not just the ones you are adding) are still kept in globals.hpp. Do you know why we didn't remove them from globals.hpp? Thanks, Jiangli > On Aug 25, 2017, at 11:19 AM, Ioi Lam wrote: > > Hi, please review this very small change. The corresponding CSR has been approved. > > https://bugs.openjdk.java.net/browse/JDK-8186778 > > Since JDK-8072061 (Automatically determine optimal sizes for the CDS regions) is integrated, > the following 4 options are no longer necessary, and they are no longer used by the VM > anymore. Hence, these options should be deprecated: > > SharedReadWriteSize > SharedReadOnlySize > SharedMiscDataSize > SharedMiscCodeSize > > hotspot$ hg diff > diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp > --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 2017 +0000 > +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 2017 -0700 > @@ -379,6 +379,10 @@ > static SpecialFlag const special_jvm_flags[] = { > // -------------- Deprecated Flags -------------- > // --- Non-alias flags - sorted by obsolete_in then expired_in: > { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, > { "UseConcMarkSweepGC", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() }, > { "MonitorInUseLists", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, > + { "SharedMiscCodeSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, > + { "SharedMiscDataSize",
JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, > + { "SharedReadOnlySize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, > + { "SharedReadWriteSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, > > // --- Deprecated alias flags (see also aliased_jvm_flags) - sorted by obsolete_in then expired_in: > { "DefaultMaxRAMFraction", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, > From jiangli.zhou at oracle.com Fri Aug 25 19:12:32 2017 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 25 Aug 2017 12:12:32 -0700 Subject: RFR: 8186706: ArchivedObjectCache obj_hash() is broken In-Reply-To: <597d67e4-3f0e-13ca-51d1-f96d71f82377@oracle.com> References: <75518830-8EDD-44CF-83BC-2417C31494D4@oracle.com> <65a19b2e-bf60-0cf3-6a83-ed587815353a@oracle.com> <97fcc27b-fe6f-1836-2bf1-9dddd9dc0f2f@oracle.com> <990E9A29-BEE0-4E16-8A35-5F8EFE67655F@oracle.com> <00a6292b-dc1d-a8bc-97c9-b9cbfa2c17e1@oracle.com> <7F676480-0B75-472A-B43A-705C779EF7F5@oracle.com> <597d67e4-3f0e-13ca-51d1-f96d71f82377@oracle.com> Message-ID: > On Aug 25, 2017, at 11:53 AM, coleen.phillimore at oracle.com wrote: > > > > On 8/25/17 2:08 PM, Jiangli Zhou wrote: >> >>> On Aug 25, 2017, at 5:49 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> >>> On 8/24/17 7:32 PM, Jiangli Zhou wrote: >>>> Hi Coleen, >>>> >>>> Thanks for reviewing this! >>>> >>>>> On Aug 24, 2017, at 1:31 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> Hi, >>>>> >>>>> I'm glad that you changed this to CHEAP allocated. >>>>> >>>>> + CDS_JAVA_HEAP_ONLY(_archive_object_cache = new (ResourceObj::C_HEAP, mtInternal)ArchivedObjectCache();); >>>>> >>>>> >>>>> This should probably be mtClass and not mtInternal. >>>> Ok. I'll use mtClass. >>>> >>>>> The other question I had was that I expected you to use obj->hash() since the objects probably should (?) have a hashcode installed when in the archive.
>>>> Do you mean identity_hash()? That should also work for this use case. Initially I wanted to use a simple hash code so went with computation using object address. Yes, we compute the identity_hash for archived objects at dump time. I updated the webrev: >>> >>> The only thing to worry about is if identity_hash() can go to a safepoint here. But don't the strings already have an identity hash installed in the markOop? >> >> Not all strings have identity hash installed yet. During object archiving, we compute identity hash for all archived objects right before we copy the objects. The change causes the identity hash to be computed slightly earlier, but still during the object archiving. Object archiving is guarded by NoSafepointVerifier. > > > Okay, I keep not remembering why it's safe to call identity_hash() but this doesn't change the situation, since you're calling it under the same NSV. > > One thing that might remind me is if you add something like: > > static unsigned obj_hash(oop const& p) { > - unsigned hash = (unsigned)((uintptr_t)&p); > - return hash ^ (hash >> LogMinObjAlignment); > + assert(p->mark()->has_bias_pattern, "this object should never have been locked"); // so identity_hash won't safepoint > + unsigned hash = (unsigned)p->identity_hash(); > + return hash; > } > > The change looks good, especially since you're already adding the hash code. I'll add the assert and double check with my tests. Thanks! Jiangli > > Thanks, > Coleen > >> Thanks, >> Jiangli >> >>> >>> thanks, >>> Coleen >>> >>>> >>>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.02/ >>>> >>>> Thanks, >>>> Jiangli >>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> >>>>> On 8/24/17 3:53 PM, Jiangli Zhou wrote: >>>>>> Hi Ioi, >>>>>> >>>>>> ResourceObj only allows delete for C_HEAP objects, so we need to allocate _archive_object_cache as C_HEAP object also. Otherwise, we would hit the following assert.
>>>>>> >>>>>> void ResourceObj::operator delete(void* p) { >>>>>> assert(((ResourceObj *)p)->allocated_on_C_heap(), >>>>>> "delete only allowed for C_HEAP objects"); >>>>>> >>>>>> Here is the updated webrev that allocates/deallocates the _archive_object_cache table and nodes as C_HEAP objects. >>>>>> http://cr.openjdk.java.net/~jiangli/8186706/webrev.01/ >>>>>> >>>>>> Thanks, >>>>>> Jiangli >>>>>> >>>>>>> On Aug 24, 2017, at 10:22 AM, Ioi Lam wrote: >>>>>>> >>>>>>> Hi Jiangli, >>>>>>> >>>>>>> The Nodes need to be deallocated in the ResourceHashtable destructor. >>>>>>> >>>>>>> ~ResourceHashtable() { >>>>>>> if (ALLOC_TYPE == C_HEAP) { >>>>>>> Node* const* bucket = _table; >>>>>>> while (bucket < &_table[SIZE]) { >>>>>>> Node* node = *bucket; >>>>>>> while (node != NULL) { >>>>>>> Node* cur = node; >>>>>>> node = node->_next; >>>>>>> delete cur; >>>>>>> } >>>>>>> ++bucket; >>>>>>> } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> >>>>>>> The problem with ResourceHashtable is that by default ALLOC_TYPE = ResourceObj::RESOURCE_AREA, but if your call path looks like this: >>>>>>> >>>>>>> >>>>>>> ResourceHashtable<...> table; >>>>>>> ... >>>>>>> { >>>>>>> ResourceMark rm; >>>>>>> ... >>>>>>> { >>>>>>> table.put(....); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> The Node in the table will end up being invalid. >>>>>>> >>>>>>> In your case, the code path between the allocation of the ResourceHashtable and the call to MetaspaceShared::archive_heap_object covers a few files. There's currently no ResourceMark in between. However, in the future, someone could potentially put in a ResourceMark and cause erratic failures. >>>>>>> >>>>>>> So, since you're fixing the hashtable code, I think it will be a good idea to change the ALLOC_TYPE = ResourceObj::C_HEAP. However, when doing that, it's a good idea to do the proper clean up by invoking the ~ResourceHashtable() destructor via the delete operator.
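The hazard Ioi describes can be reproduced outside the VM. The following is only a minimal sketch under stated assumptions, not HotSpot code: `Arena`, `ArenaMark`, `Table`, `live_heap_nodes`, and `arena_node_is_recycled_after_mark` are hypothetical stand-ins invented here for the resource area, ResourceMark, and ResourceHashtable. It shows that a node allocated from the arena under a mark has its storage handed out again once the mark ends, while heap-allocated nodes live until the table's destructor frees them.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a thread's resource area: a fixed-size bump allocator.
class Arena {
  alignas(std::max_align_t) unsigned char _buf[1024];
  std::size_t _top = 0;
 public:
  void* allocate(std::size_t bytes) {
    const std::size_t align = alignof(std::max_align_t);
    bytes = (bytes + align - 1) / align * align;  // keep every block aligned
    assert(_top + bytes <= sizeof(_buf) && "arena exhausted");
    void* p = _buf + _top;
    _top += bytes;
    return p;
  }
  std::size_t top() const { return _top; }
  void rollback(std::size_t saved) { _top = saved; }
};

// Hypothetical stand-in for ResourceMark: on scope exit, everything allocated
// since the mark was taken is handed back to the arena.
class ArenaMark {
  Arena& _arena;
  std::size_t _saved;
 public:
  explicit ArenaMark(Arena& a) : _arena(a), _saved(a.top()) {}
  ~ArenaMark() { _arena.rollback(_saved); }
};

enum AllocType { RESOURCE_AREA, C_HEAP };

struct Node { int key; int value; Node* next; };

static int live_heap_nodes = 0;  // counts C_HEAP nodes that are not yet freed

// Simplified single-bucket table; only the allocation policy matters here.
template <AllocType ALLOC_TYPE>
class Table {
  Node* _head = nullptr;
  Arena* _arena;
 public:
  explicit Table(Arena* arena = nullptr) : _arena(arena) {}
  Table(const Table&) = delete;
  Table& operator=(const Table&) = delete;
  ~Table() {
    if (ALLOC_TYPE == C_HEAP) {  // arena nodes are never freed one by one
      while (_head != nullptr) {
        Node* cur = _head;
        _head = _head->next;
        delete cur;
        --live_heap_nodes;
      }
    }
  }
  void put(int key, int value) {
    if (ALLOC_TYPE == C_HEAP) {
      _head = new Node{key, value, _head};
      ++live_heap_nodes;
    } else {
      _head = new (_arena->allocate(sizeof(Node))) Node{key, value, _head};
    }
  }
  const Node* head() const { return _head; }
};

// Returns true if the storage of a node put() under a mark is handed out again
// after the mark ends, i.e. the table is left holding a dangling pointer.
bool arena_node_is_recycled_after_mark() {
  Arena arena;
  Table<RESOURCE_AREA> table(&arena);
  const void* node_address = nullptr;
  {
    ArenaMark mark(arena);
    table.put(1, 42);
    node_address = table.head();
  }  // mark ends: the node's storage is released while the table still points at it
  return arena.allocate(sizeof(Node)) == node_address;
}
```

The sketch deliberately compares addresses instead of reading through the stale pointer, so it demonstrates the reuse without invoking undefined behavior. With `Table<C_HEAP>`, nodes are unaffected by any mark, which is why the destructor has to walk the list and delete them.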
>>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 8/23/17 10:27 PM, Jiangli Zhou wrote: >>>>>>>> Hi Ioi, >>>>>>>> >>>>>>>> The table was not changed to be allocated as ResourceObj::C_HEAP. I see "ALLOC_TYPE" only applies to the Nodes in the ResourceHashtable. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jiangli >>>>>>>> >>>>>>>>> On Aug 23, 2017, at 6:03 PM, Ioi Lam wrote: >>>>>>>>> >>>>>>>>> Hi Jiangli, >>>>>>>>> >>>>>>>>> Since the table is allocated as ResourceObj::C_HEAP, it's better to delete it afterwards to avoid memory leaks >>>>>>>>> >>>>>>>>> { >>>>>>>>> NoSafepointVerifier nsv; >>>>>>>>> >>>>>>>>> // Cache for recording where the archived objects are copied to >>>>>>>>> MetaspaceShared::create_archive_object_cache(); >>>>>>>>> >>>>>>>>> tty->print_cr("Dumping String objects to closed archive heap region ..."); >>>>>>>>> NOT_PRODUCT(StringTable::verify()); >>>>>>>>> // The string space has maximum two regions. See FileMapInfo::write_archive_heap_regions() for details. >>>>>>>>> _string_regions = new GrowableArray(2); >>>>>>>>> StringTable::write_to_archive(_string_regions); >>>>>>>>> >>>>>>>>> tty->print_cr("Dumping objects to open archive heap region ..."); >>>>>>>>> _open_archive_heap_regions = new GrowableArray(2); >>>>>>>>> MetaspaceShared::dump_open_archive_heap_objects(_open_archive_heap_regions); >>>>>>>>> >>>>>>>>> + MetaspaceShared::create_archive_object_cache(); >>>>>>>>> >>>>>>>>> } >>>>>>>>> >>>>>>>>> + static void delete_archive_object_cache() { >>>>>>>>> + CDS_JAVA_HEAP_ONLY(delete _archive_object_cache; _archive_object_cache = NULL;); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>> On 8/23/17 4:24 PM, Jiangli Zhou wrote: >>>>>>>>>> Please review the following webrev that fixes the ArchivedObjectCache obj_hash() issue. The patch was from Ioi (thanks!). I will count myself as a reviewer.
>>>>>>>>>> >>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186706?filter=14921 >>>>>>>>>> webrev: http://cr.openjdk.java.net/~jiangli/8186706/webrev.00/ >>>>>>>>>> >>>>>>>>>> ArchivedObjectCache obj_hash() computes hash using incorrect address. The fix is to use the correct oop address. The default ResourceHashtable size is 256, which is too small when large number of objects are archived. The table is now changed to use a much larger (15889) size. The ArchivedObjectCache issue was noticed when one test times out on slower linux arm64 machine. With the fix the test finishes without timeout. >>>>>>>>>> >>>>>>>>>> Tested with tier4-comp tests. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jiangli >> > From ioi.lam at oracle.com Fri Aug 25 20:02:56 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Fri, 25 Aug 2017 13:02:56 -0700 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control In-Reply-To: References: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> Message-ID: <4EBA2D00-F20A-4E0A-8576-74F67EFAD4BB@oracle.com> If I understand the comments above my changes in arguments.cpp correctly, there are 3 stages of removing options - deprecated, obsolete and removed. I am at stage 1 now, so the options are kept in globals.hpp. Thanks Ioi > On Aug 25, 2017, at 12:11 PM, Jiangli Zhou wrote: > > Hi Ioi, > > The change looks good. I noticed the deprecated options (not just the ones you are adding) are still kept in globals.hpp. Do you know why we didn't remove them from globals.hpp? > > Thanks, > Jiangli > >> On Aug 25, 2017, at 11:19 AM, Ioi Lam wrote: >> >> Hi, please review this very small change. The corresponding CSR has been approved. >> >> https://bugs.openjdk.java.net/browse/JDK-8186778 >> >> Since JDK-8072061 (Automatically determine optimal sizes for the CDS regions) is integrated, >> the following 4 options are no longer necessary, and they are no longer used by the VM >> anymore.
Hence, these options should be deprecated: >> >> SharedReadWriteSize >> SharedReadOnlySize >> SharedMiscDataSize >> SharedMiscCodeSize >> >> hotspot$ hg diff >> diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp >> --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 2017 +0000 >> +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 2017 -0700 >> @@ -379,6 +379,10 @@ >> static SpecialFlag const special_jvm_flags[] = { >> // -------------- Deprecated Flags -------------- >> // --- Non-alias flags - sorted by obsolete_in then expired_in: >> { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, >> { "UseConcMarkSweepGC", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() }, >> { "MonitorInUseLists", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >> + { "SharedMiscCodeSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >> + { "SharedMiscDataSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >> + { "SharedReadOnlySize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >> + { "SharedReadWriteSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >> >> // --- Deprecated alias flags (see also aliased_jvm_flags) - sorted by obsolete_in then expired_in: >> { "DefaultMaxRAMFraction", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, >> > > > > > From cthalinger at twitter.com Fri Aug 25 21:16:08 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Fri, 25 Aug 2017 11:16:08 -1000 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> Message-ID: 
<12AB6CD9-9EF1-47AA-8BB3-6E45BE802B28@twitter.com> > On Aug 25, 2017, at 3:26 AM, coleen.phillimore at oracle.com wrote: > > > Thank you Zhengyu for noticing this change was wrong, and Christian for the idea. New webrev: > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 Looks good. > > I reran parallel class loading tests and jck testing is in progress, but order access requires inspection. > > Thanks, > Coleen > > > On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> >>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>> Hi Coleen, >>>>> >>>>> There are two instances probably overlooked? >>>>> >>>>> dictionary.cpp #103 and #124 >>>>> >>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>> => >>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>> >>>>> >>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>> Which also shows that there is a potential for future mistakes. Can we isolate the field better so it's only accessible via setter and getter? >> >> Yes, great idea. >> Coleen >> >>>> Thank you!! >>>> Coleen >>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since it's accessed outside the SystemDictionary_lock >>>>>> >>>>>> Ran parallel class loading tests that we have as well as tier1 tests. See bug for details.
>>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >> > From coleen.phillimore at oracle.com Fri Aug 25 21:19:20 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 25 Aug 2017 17:19:20 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> Message-ID: <8887d8cb-3510-e0d6-5f7e-296ce4856a52@oracle.com> Thanks, Zhengyu. Coleen On 8/25/17 10:55 AM, Zhengyu Gu wrote: > Looks good to me. > > -Zhengyu > > On 08/25/2017 09:26 AM, coleen.phillimore at oracle.com wrote: >> >> Thank you Zhengyu for noticing this change was wrong, and Christian >> for the idea. New webrev: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> I reran parallel class loading tests and jck testing is in progress, >> but order access requires inspection. >> >> Thanks, >> Coleen >> >> >> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> >>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> There are two instances probably overlooked? >>>>>> >>>>>> dictionary.cpp #103 and #124 >>>>>> >>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>> => >>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>> >>>>>> >>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>>> Which also shows that there is a potential for future mistakes.
Can >>>> we isolate the field better so it's only accessible via setter and >>>> getter? >>> >>> Yes, great idea. >>> Coleen >>> >>>>> Thank you!! >>>>> Coleen >>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>>>> since it's accessed outside the SystemDictionary_lock >>>>>>> >>>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>>> tests. See bug for details. >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>> >> From coleen.phillimore at oracle.com Fri Aug 25 21:19:31 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 25 Aug 2017 17:19:31 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <12AB6CD9-9EF1-47AA-8BB3-6E45BE802B28@twitter.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <12AB6CD9-9EF1-47AA-8BB3-6E45BE802B28@twitter.com> Message-ID: Thanks Christian! Coleen On 8/25/17 5:16 PM, Christian Thalinger wrote: > >> On Aug 25, 2017, at 3:26 AM, coleen.phillimore at oracle.com >> wrote: >> >> >> Thank you Zhengyu for noticing this change was wrong, and Christian >> for the idea. New webrev: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >> >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > Looks good. > >> >> I reran parallel class loading tests and jck testing is in progress, >> but order access requires inspection.
>> >> Thanks, >> Coleen >> >> >> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com >> wrote: >>> >>> >>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com >>>>> wrote: >>>>> >>>>> >>>>> >>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> There are two instances probably overlooked? >>>>>> >>>>>> dictionary.cpp #103 and #124 >>>>>> >>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>> => >>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>> >>>>>> >>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>>> Which also shows that there is a potential for future mistakes. Can >>>> we isolate the field better so it's only accessible via setter and >>>> getter? >>> >>> Yes, great idea. >>> Coleen >>> >>>>> Thank you!! >>>>> Coleen >>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com >>>>>> wrote: >>>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>>>> since it's accessed outside the SystemDictionary_lock >>>>>>> >>>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>>> tests. See bug for details. >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>> >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>> >> > From calvin.cheung at oracle.com Sat Aug 26 00:09:28 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Fri, 25 Aug 2017 17:09:28 -0700 Subject: RFR(L): 8172218: Use Java class loaders for creating the AppCDS archive Message-ID: <59A0BC38.6090004@oracle.com> Please review this RFE change for jdk10. bug: https://bugs.openjdk.java.net/browse/JDK-8172218 webrev: http://cr.openjdk.java.net/~ccheung/8172218/webrev.00/ I've put the fix description in the comment section of the bug report.
Testing: Running test on all hotspot components (hotspot_compiler,hotspot_gc,hotspot_runtime,hotspot_serviceability,hotspot_misc). Will run tier1 to tier4 and selected JCK tests. thanks, Calvin From david.holmes at oracle.com Sun Aug 27 21:27:06 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Aug 2017 07:27:06 +1000 Subject: [10] RFR: 8179040: Avoid Ticks::now calls when EventClassLoad is not enabled In-Reply-To: References: <9fb7cb6b-a9cc-e426-cc12-f1341675cf03@oracle.com> <50254b27-b250-05e3-a916-692739c66b75@oracle.com> Message-ID: On 25/08/2017 6:06 PM, Claes Redestad wrote: > On 2017-08-25 04:12, David Holmes wrote: >>> >>> http://cr.openjdk.java.net/~redestad/8179040/hotspot.03/ >> >> Incremental webrev possible? > > Of course: http://cr.openjdk.java.net/~redestad/8179040/hotspot.inc_02_03/ Thanks. I'm curious, in src/share/vm/trace/traceEvent.hpp, why do we now include "trace/traceTime.hpp" outside the "#if INCLUDE_TRACE" guard? David > /Claes From david.holmes at oracle.com Sun Aug 27 22:07:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Aug 2017 08:07:24 +1000 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control In-Reply-To: <4EBA2D00-F20A-4E0A-8576-74F67EFAD4BB@oracle.com> References: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> <4EBA2D00-F20A-4E0A-8576-74F67EFAD4BB@oracle.com> Message-ID: On 26/08/2017 6:02 AM, Ioi Lam wrote: > If I understand the comments above my changes in arguments.cpp correctly, there are 3 stages of removing options - deprecated, obsolete and removed. I am at stage 1 now, so the options are kept in globals.hpp. Except you haven't actually deprecated these options; you have made them obsolete, as they are now ignored by the VM. If they were only deprecated then the VM would still use them. I think you will need to redo this and the CSR as an obsoletion request.
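The distinction David draws can be sketched as a small classifier. This is only an illustration under stated assumptions: `FlagLifecycle`, `status_in`, and `vm_honors_value` are hypothetical names, releases are modelled as plain integers with 0 standing in for `JDK_Version::undefined()`, and the two sample entries are made up to contrast the posted patch with an obsoletion; none of this is the real `SpecialFlag` handling in arguments.cpp.

```cpp
#include <cassert>

// Hypothetical simplification of a flag's lifecycle entry.
// Releases are plain ints; 0 means "undefined" (stage not scheduled).
struct FlagLifecycle {
  const char* name;
  int deprecated_in;  // first release that warns but still honors the value
  int obsolete_in;    // first release that accepts the option but ignores it
  int expired_in;     // first release that no longer recognizes the option
};

enum FlagStatus { ACTIVE, DEPRECATED, OBSOLETE, EXPIRED };

// The latest stage that has been reached wins.
FlagStatus status_in(const FlagLifecycle& f, int release) {
  if (f.expired_in != 0 && release >= f.expired_in) return EXPIRED;
  if (f.obsolete_in != 0 && release >= f.obsolete_in) return OBSOLETE;
  if (f.deprecated_in != 0 && release >= f.deprecated_in) return DEPRECATED;
  return ACTIVE;
}

// A deprecated option still takes effect; an obsolete one is silently dropped.
bool vm_honors_value(FlagStatus s) {
  return s == ACTIVE || s == DEPRECATED;
}

// Illustrative entries only: the first mirrors the shape of the posted patch,
// the second the obsoletion that the review asks for.
const FlagLifecycle as_posted         = {"SharedReadOnlySize", 10, 0, 0};
const FlagLifecycle as_review_request = {"SharedReadOnlySize", 0, 10, 0};
```

Under this model, `status_in(as_posted, 10)` is `DEPRECATED`, which would imply the VM still uses the value, whereas the options are in fact ignored, matching `OBSOLETE`.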
Thanks, David > Thanks > Ioi > >> On Aug 25, 2017, at 12:11 PM, Jiangli Zhou wrote: >> >> Hi Ioi, >> >> The change looks good. I noticed the deprecated options (not just the ones you are adding) are still kept in globals.hpp. Do you know why we didn't remove them from globals.hpp? >> >> Thanks, >> Jiangli >> >>> On Aug 25, 2017, at 11:19 AM, Ioi Lam wrote: >>> >>> Hi, please review this very small change. The corresponding CSR has been approved. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8186778 >>> >>> Since JDK-8072061 (Automatically determine optimal sizes for the CDS regions) is integrated, >>> the following 4 options are no longer necessary, and they are no longer used by the VM >>> anymore. Hence, these options should be deprecated: >>> >>> SharedReadWriteSize >>> SharedReadOnlySize >>> SharedMiscDataSize >>> SharedMiscCodeSize >>> >>> hotspot$ hg diff >>> diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp >>> --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 2017 +0000 >>> +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 2017 -0700 >>> @@ -379,6 +379,10 @@ >>> static SpecialFlag const special_jvm_flags[] = { >>> // -------------- Deprecated Flags -------------- >>> // --- Non-alias flags - sorted by obsolete_in then expired_in: >>> { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, >>> { "UseConcMarkSweepGC", JDK_Version::jdk(9), JDK_Version::undefined(), JDK_Version::undefined() }, >>> { "MonitorInUseLists", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >>> + { "SharedMiscCodeSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >>> + { "SharedMiscDataSize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >>> + { "SharedReadOnlySize", JDK_Version::jdk(10),JDK_Version::undefined(), JDK_Version::undefined() }, >>> + { "SharedReadWriteSize", JDK_Version::jdk(10),JDK_Version::undefined(),
JDK_Version::undefined() }, >>> >>> // --- Deprecated alias flags (see also aliased_jvm_flags) - sorted by obsolete_in then expired_in: >>> { "DefaultMaxRAMFraction", JDK_Version::jdk(8), JDK_Version::undefined(), JDK_Version::undefined() }, >>> >> >> >> >> >> > From david.holmes at oracle.com Mon Aug 28 04:25:54 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Aug 2017 14:25:54 +1000 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> Message-ID: <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> Hi Coleen, On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: > > Thank you Zhengyu for noticing this change was wrong, and Christian for > the idea. New webrev: > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 The idea of a load-acquire accessor and release_store-setter is fine in principle, but it seems to me that we now use these everywhere, even if we may not need them because there is no concurrent/lock-free access. Overall I find it very difficult to determine what the concurrent access patterns are for a Dictionary versus a DictionaryEntry, and which paths are in fact lock and/or safepoint free, and may be racing with locked or safepointed code. That aside I don't understand why you added a level of indirection with the ProtectionDomainSet class? Also we have been trying to include release/acquire in the names of such accessors so that it is clear when we are relying on memory ordering properties, i.e. pd_set_acquire and release_set_pd_set. Thanks, David > I reran parallel class loading tests and jck testing is in progress, but > order access requires inspection.
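The accessor pattern under review, with the naming David suggests, can be sketched as follows. This is a minimal illustration under stated assumptions, not the webrev: `PdNode` and `EntrySketch` are hypothetical stand-ins for ProtectionDomainEntry and DictionaryEntry, `std::atomic` is used in place of HotSpot's OrderAccess primitives, and writers are assumed to be serialized by an external lock that is not shown.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a protection domain list node.
struct PdNode {
  int id;
  PdNode* next;
};

// Hypothetical stand-in for a dictionary entry. The list head is private,
// so every read and write has to go through the ordered accessors.
class EntrySketch {
  std::atomic<PdNode*> _pd_set{nullptr};

 public:
  EntrySketch() = default;
  EntrySketch(const EntrySketch&) = delete;
  EntrySketch& operator=(const EntrySketch&) = delete;

  ~EntrySketch() {
    PdNode* cur = pd_set_acquire();
    while (cur != nullptr) {
      PdNode* next = cur->next;
      delete cur;
      cur = next;
    }
  }

  // Lock-free readers use this; the acquire pairs with the release below,
  // so a reader that sees the new head also sees the node's initialized fields.
  PdNode* pd_set_acquire() const {
    return _pd_set.load(std::memory_order_acquire);
  }

  // Publishes a fully constructed node as the new head.
  void release_set_pd_set(PdNode* head) {
    _pd_set.store(head, std::memory_order_release);
  }

  // Writer path; assumed to run with an external lock held (not modelled here).
  void add_protection_domain(int id) {
    PdNode* node = new PdNode{id, pd_set_acquire()};
    release_set_pd_set(node);
  }

  // Reader path; may run without the lock.
  bool contains_protection_domain(int id) const {
    for (PdNode* cur = pd_set_acquire(); cur != nullptr; cur = cur->next) {
      if (cur->id == id) return true;
    }
    return false;
  }
};
```

Keeping the field private is what prevents a loop from reading `_pd_set` directly, as happened at the two overlooked sites, and putting acquire/release in the accessor names makes the ordering dependency visible at each call site.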
> > Thanks, > Coleen > > > On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> >>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>> Hi Coleen, >>>>> >>>>> There are two instances probably overlooked? >>>>> >>>>> dictionary.cpp #103 and #124 >>>>> >>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>> => >>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>> >>>>> >>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>> Which also shows that there is a potential for future mistakes. Can >>> we isolate the field better so it's only accessible via setter and >>> getter? >> >> Yes, great idea. >> Coleen >> >>>> Thank you!! >>>> Coleen >>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>>> since it's accessed outside the SystemDictionary_lock >>>>>> >>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>> tests. See bug for details. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >> > From david.holmes at oracle.com Mon Aug 28 05:37:56 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Aug 2017 15:37:56 +1000 Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> Message-ID: <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> Hi Goetz, On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: > Hi, > > I need a second review and a sponsor, please: > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04 > > To update my description of the change to the status after Thomas' review: > > dll_build_name builds the proper path to a library given a list of paths separated by > path_separator and a library name. It adds in the platform specific endings etc. > It is documented to return whether the file exists, but only does so if a path_separator > exists in the path. > In particular, if the path is empty, it just returns "true" without checking. > > Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches > in well known unix/windows locations. This is intended in the two cases where dll_build_name > is called with an empty path. > > I renamed dll_build_name to dll_locate_lib and changed its behavior to always return > a full path to the lib, inserting the current working directory if no path is given. > For the use case where "" was actually passed to the function, I added a new function > (reusing the old function name) dll_build_name that just adds system dependent prefix and suffix > to the name. > I merged all unix implementations to the posix os branch. I started to look at this and have applied the patch to run through some basic testing. The overall approach seems reasonable. But it is hard to track all the details - in particular whether there were any subtle differences across the "posix" systems? I'm wondering what, if any, significant differences exist between the Windows and POSIX versions? I would hope the platform differences could easily be hidden behind macros (for path separator, library suffix etc).
Then perhaps this could just go in shared code (os.hpp, os.cpp)? That aside, in the Windows code shouldn't the hardwired .dll strings actually be JNI_LIB_SUFFIX? Thanks, David > Best regards, > Goetz. > > > >> -----Original Message----- >> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] >> Sent: Tuesday, 22 August 2017 17:30 >> To: Lindenmaier, Goetz >> Cc: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is >> missing. >> >> Looks good. >> >> ..Thomas >> >> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote: >> >> >> I mistyped the path to webrev, this should work: >> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04 >> >> Sorry, >> Goetz >> >> >> >> > -----Original Message----- >> > From: Lindenmaier, Goetz >> > Sent: Tuesday, 22 August 2017 15:48 >> > To: 'Thomas Stüfe' >> > Cc: hotspot-runtime-dev at openjdk.java.net >> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if >> file is >> > missing. >> > >> > Hi, >> > >> > could I please get a second review? >> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04 >> > >> > I had to update the webrev because of a problem on windows. >> > @Thomas I had edited os.hpp, but not saved :( >> > >> > Best regards, >> > Goetz. >> > >> > PS: Didn't double-check the webrev as cr server is slow. >> > >> > > -----Original Message----- >> > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com] >> > > Sent: Thursday, 17 August 2017 19:54 >> > > To: Lindenmaier, Goetz >> > > Cc: hotspot-runtime-dev at openjdk.java.net >> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if >> file is >> > > missing.
>> > >
>> > > Hi Goetz,
>> > >
>> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
>> > >
>> > >     Hi Thomas,
>> > >
>> > >     I adapted the comments in os.hpp.
>> > >
>> > >     If I move the call to dll_build_name out of dll_locate_lib
>> > >     I have to do a lot of coding in all the places where it is called.
>> > >     That does not seem useful to me.
>> > >
>> > >     Fixed the type to size_t.
>> > >
>> > >     One could merge posix/windows by putting the check for ':'
>> > >     into a WINDOWS_ONLY() I guess. The check for \ could be
>> > >     done in posix as well, if using file_separator().
>> > >
>> > >     * Not your change, but: why does the code in os::dll_locate_lib() even
>> > >     * differentiate between a PATH containing no os::path_separator()
>> > >     * and a path containing os::path_separator()?
>> > >
>> > >     I assume this was done to avoid all the allocations and copying of the path.
>> > >
>> > >     Also adapted the comment in jvmtiExport.cpp.
>> > >
>> > >     New webrev:
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
>> > >     incremental diff:
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
>> > >     (fixed indentation on windows)
>> > >
>> > >     Best regards,
>> > >     Goetz.
>> > >
>> > > Comments in os.hpp seem unchanged?
>> > >
>> > > But looks fine otherwise. I do not need another webrev.
>> > >
>> > > Thanks, Thomas
>> > >
>> > >     From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>> > >     Sent: Thursday, August 17, 2017 3:48 PM
>> > >     To: Lindenmaier, Goetz
>> > >     Cc: hotspot-runtime-dev at openjdk.java.net
>> > >     Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>> > >
>> > >     Hi Goetz,
>> > >
>> > >     On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
>> > >
>> > >         Hi Thomas,
>> > >
>> > >         I reworked the whole thing.
>> > >
>> > >         First, there is dll_build_name. It just does -> lib.so.
>> > >
>> > >         Second, I renamed the legacy dll_build_name to dll_locate_lib.
>> > >
>> > >         I merged all the unix variants to one in os_posix.
>> > >
>> > >         I removed the buffer overflow check at the top.
>> > >         It's too restrictive because the path argument
>> > >         can contain several paths. I added the overflow
>> > >         checks into the single cases.
>> > >
>> > >         Also, I first assemble the pure name using the new, simple
>> > >         dll_build_name. This is for reuse and readability.
>> > >
>> > >         In case of an empty directory, I use get_current_directory
>> > >         to complete the path as indicated by the original documentation
>> > >         where it was called with "".
>> > >         dll_locate_lib now always returns a name with a full path if
>> > >         the file exists.
>> > >
>> > >         Also, on windows, I think I fixed a bug by reversing the order
>> > >         of checks. A path list ending in ':' or '\' would not have
>> > >         been recognized.
>> > >
>> > >         On Bsd, I removed JNI_LIB_* because that already is defined
>> > >         in jvm_bsd.h
>> > >
>> > >         New webrev:
>> > >         http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>> > >
>> > >         Best regards,
>> > >         Goetz.
>> > >
>> > >     I like this better than before. Remarks:
>> > >
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
>> > >
>> > >     + // Builds the platform-specific name of a library.
>> > >     + // Returns false on __buffer overflow__.
>> > >
>> > >     Hopefully not! :D
>> > >     How about: "Returns false on truncation" instead.
>> > >
>> > >     + // Builds a platform-specific full library path given an ld path and lib name.
>> > >     + // Returns true if the buffer contains a full path to an existing file, false
>> > >     + // otherwise. If pathname is empty, checks the current directory.
>> > >     + static bool dll_locate_lib(char* buffer, size_t size,
>> > >     +                            const char* pathname, const char* fname);
>> > >
>> > >     Might be worth mentioning that "fname" is the unadorned library name, e.g. "verify" for libverify.so or verify.dll.
>> > >
>> > >     Would the following alternative be valid:
>> > >
>> > >     one could make dll_locate_lib take the real file name, and let the caller use dll_build_name() to build the library name first before handing it to dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic "find_file_in_path" because it would work for any kind of file.
>> > >
>> > >     As an added bonus, there would be no need to create a temporary array in dll_build_name/dll_locate_lib, and no need to call free(), so no cleanup-related control flow changes in these functions.
>> > >
>> > >     =====
>> > >
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
>> > >
>> > >     + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
>> > >
>> > >     int -> size_t (does that even compile without warning?)
>> > >
>> > >     + // Check current working directory.
>> > >     + const char* p = get_current_directory(buffer, buflen);
>> > >     + if (p != NULL &&
>> > >     +     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
>> > >     +   strcat(buffer, "\\");
>> > >     +   strcat(buffer, fullfname);
>> > >     +   retval = file_exists(buffer);
>> > >
>> > >     Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but will make scanners (e.g. coverity) happy. One could then avoid the length calculation and rely on jio_snprintf truncation:
>> > >
>> > >     const char* p = get_current_directory(buffer, buflen);
>> > >     if (p != NULL) {
>> > >       const size_t end = strlen(p);
>> > >       // append "\<fullfname>" behind the directory already in buffer
>> > >       if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
>> > >         retval = file_exists(buffer);
>> > >       }
>> > >     }
>> > >
>> > >     --
>> > >
>> > >     Not your change, but: why does the code in os::dll_locate_lib() even differentiate between a PATH containing no os::path_separator() and a path containing os::path_separator()?
>> > >
>> > >     Would the former not be just a PATH with only one directory and hence need no special treatment?
>> > >
>> > >     =====
>> > >
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html
>> > >
>> > >     Could os::dll_locate_lib be consolidated between windows and unix? It seems the implementation is almost identical.
>> > >
>> > >     ====
>> > >
>> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
>> > >
>> > >     + // not found - try library path
>> > >
>> > >     Proposal: "not found - try OS default library path"
>> > >
>> > >     Find some comments inline:
>> > >
>> > >     > Especially if the path is empty, it just returns 'true'.
>> > >     > Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
>> > >     > in well known unix/windows locations. This is intended in the two cases where dll_build_name
>> > >     > is called with an empty path.
>> > >     >
>> > >     > So, for both cases (thread.cpp, jvmtiExport.cpp),
>> > >     >
>> > >     > before, we would call os::dll_build_name() with an empty string for the path
>> > >     > which, for relative paths, would result in feeding that path unexpanded to
>> > >     > dlopen(), which would use whatever the OS does in those cases (LIBPATH,
>> > >     > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
>> > >     > include searching the current directory.
>> > >     Right. With the changed dll_build_name it's again exactly as before.
>> > >
>> > >     > With your change, we now use java.library.path, which is not necessarily the
>> > >     > same?
>> > >     You are right, I overlooked that java.library.path can be overwritten. Initially,
>> > >     it's set to the right thing.
>> > >
>> > >     > (BTW, I think the old comments in thread.cpp and jvmtiExport.cpp were wrong: "//
>> > >     > Try the local directory" - if "local" means "current", this is not what did
>> > >     > happen).
>> > >     Right, I tried to adapt them, did I miss one?
>> > >
>> > >     > I added a second variant of dll_build_name without the path argument that adds the path
>> > >     > from system property java.library.path and use that in these two cases.
>> > >     > I changed the original function to actually check file availability in all cases,
>> > >     > and to check . if the path is empty.
>> > >     > I think that may be a bit confusing. We would then have three options:
>> > >     >
>> > >     > - call os::dll_build_name with a real ";;.." PATH and get a file name
>> > >     >   resolved from that path
>> > >     > - call os::dll_build_name with "" for the PATH and get OS dll resolution
>> > >     No, in that case, as I called file_exists(), it would only work if the dll is in the
>> > >     current working directory. But I changed this now, anyways.
>> > >
>> > >     > - call your new overloaded version of os::dll_build_name(), which uses
>> > >     >   -Djava.library.path.
>> > >     >
>> > >     > Please review this change. I please need a sponsor.
>> > >     > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
>> > >     >
>> > >     > Best regards,
>> > >     > Goetz.
>> > >     >
>> > >     > Kind Regards, Thomas
>> > >
>> > >     Best Regards, Thomas

From goetz.lindenmaier at sap.com Mon Aug 28 09:32:31 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 28 Aug 2017 09:32:31 +0000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com>
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com>
Message-ID: <9425a555a3484a9cbb1f1034b382f610@sap.com>

Hi David,

> I started to look at this and have applied the patch to run through some
> basic testing.
I ran it with our nightly tests. That's jck tests, hotspot jtreg, jdk jtreg, spec benchmarks, and some SAP applications. All that on windows_x86_64, linux_x86_64, linux_ppc64, linux_ppc64le, linux_s390, aix_ppc64, solaris_sparc, mac. No issues.

> The overall approach seems reasonable. But it is hard to
> track all the details - in particular whether there were any subtle
> differences across the "posix" systems
I only spotted syntactic differences on the posix systems, like using JNI_LIB_SUFFIX or just ".so".

> I'm wondering what, if any, significant differences exist between the
> Windows and POSIX versions? I would hope the platform differences could
> easily be hidden behind macros (for path separator, library suffix etc).
> Then perhaps this could just go in shared code (os.hpp, os.cpp)?
The differences on windows are
 - a check to avoid double file separators ("bin\\java").
   This could easily also be done on posix. It may even be more correct, because
   MAX_PATH_LEN might be reached for a path that actually should fit.
 - a check to avoid a file separator after a drive letter ("C:\").
   This would have to be protected by WINDOWS_ONLY().
 - different file_exists() implementations. os::stat is available on windows, too,
   but GetFileAttributes is much less complex.
   I could make file_exists an os member, or just use os::stat instead.

> That aside, in the Windows code shouldn't the hardwired .dll strings
> actually be JNI_LIB_SUFFIX?
I guess it does not matter as it's defined on the same level (jvm_windows.h); I just didn't want to change existing code.

Do you want me to merge it to os.cpp? I'm also fine with that.

Best regards,
  Goetz.
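The platform differences listed in the reply above (no doubled file separator, no separator after a Windows drive colon, and a stat-based existence check) can be sketched in standalone C++. This is a minimal illustration under stated assumptions, not HotSpot code: build_lib_name, join_dir_and_file, file_exists, kLibPrefix, kLibSuffix and kFileSep are hypothetical names, and std::snprintf and stat() stand in for jio_snprintf and os::stat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sys/stat.h>

// Hypothetical stand-ins for JNI_LIB_PREFIX, JNI_LIB_SUFFIX and
// os::file_separator(); the values are illustrative only.
#ifdef _WIN32
static const char* const kLibPrefix = "";
static const char* const kLibSuffix = ".dll";
static const char* const kFileSep   = "\\";
#else
static const char* const kLibPrefix = "lib";
static const char* const kLibSuffix = ".so";
static const char* const kFileSep   = "/";
#endif

// Writes "<prefix><name><suffix>" into buf. Returns false on truncation.
bool build_lib_name(char* buf, size_t buflen, const char* name) {
  assert(buf != nullptr && buflen > 0);
  const int n = std::snprintf(buf, buflen, "%s%s%s", kLibPrefix, name, kLibSuffix);
  return n >= 0 && static_cast<size_t>(n) < buflen;
}

// Writes "<dir><sep><file>" into buf. The separator is skipped when dir
// already ends with one (or, on Windows only, with a drive colon).
// Returns false on truncation or when dir is empty.
bool join_dir_and_file(char* buf, size_t buflen, const char* dir, const char* file) {
  assert(buf != nullptr && buflen > 0);
  const size_t dlen = std::strlen(dir);
  if (dlen == 0) return false;
  const char last = dir[dlen - 1];
  bool ends_with_sep = (last == kFileSep[0]);
#ifdef _WIN32
  ends_with_sep = ends_with_sep || last == ':';
#endif
  const int n = std::snprintf(buf, buflen, "%s%s%s",
                              dir, ends_with_sep ? "" : kFileSep, file);
  return n >= 0 && static_cast<size_t>(n) < buflen;
}

// Existence check via stat(), which exists on POSIX and on Windows.
bool file_exists(const char* path) {
  struct stat st;
  return stat(path, &st) == 0;
}
```

A caller would first build the adorned name (for example "verify" becomes "libverify.so" with the non-Windows values above), then join it with each candidate directory and probe the result with file_exists. Only the drive-colon test needs a platform guard; the rest is shared.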
From goetz.lindenmaier at sap.com Mon Aug 28 10:10:19 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 28 Aug 2017 10:10:19 +0000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com>
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com>
Message-ID: <726e7be2591f481d8d0ee50af6486f75@sap.com>

Hi,

these are the changes needed to make the windows dll_locate_lib universally applicable.
I also merged the three similar jio_snprintf calls into one method. I do some gymnastics to avoid another buffer of MAX_PATH_LEN at the first call to conc_path_file_and_check.

I'll test this tonight.

Best regards,
  Goetz.

diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp
--- a/src/os/windows/vm/os_windows.cpp  Thu Aug 17 17:26:02 2017 +0200
+++ b/src/os/windows/vm/os_windows.cpp  Mon Aug 28 12:02:26 2017 +0200
@@ -1205,6 +1205,17 @@
   return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES;
 }
 
+bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
+                              const char* pname, char lastchar, const char* fname) {
+  const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ? "" : os::file_separator();
+  int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", pname, filesep, fname);
+  if (ret != -1) {
+    struct stat statbuf;
+    return os::stat(buffer, &statbuf) == 0;
+  }
+  return false;
+}
+
 bool os::dll_locate_lib(char *buffer, size_t buflen,
                         const char* pname, const char* fname) {
   bool retval = false;
@@ -1220,11 +1231,8 @@
     if (p != NULL) {
       const size_t plen = strlen(buffer);
       const char lastchar = buffer[plen - 1];
-      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
-      int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, fullfname);
-      if (ret != -1) {
-        retval = file_exists(buffer);
-      }
+      retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - plen,
+                                        "", lastchar, fullfname);
     }
   } else if (strchr(pname, *os::path_separator()) != NULL) {
     int n;
@@ -1238,12 +1246,8 @@
         continue; // skip the empty path values
       }
       const char lastchar = path[plen - 1];
-      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
-      int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, fullfname);
-      if (ret != -1 && file_exists(buffer)) {
-        retval = true;
-        break;
-      }
+      retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, fullfname);
+      if (retval) break;
     }
     // release the storage
     for (int i = 0; i < n; i++) {
@@ -1255,11 +1259,7 @@
       }
     }
   } else {
     const char lastchar = pname[pnamelen-1];
-    char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
-    int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, fullfname);
-    if (ret != -1) {
-      retval = file_exists(buffer);
-    }
+    retval = conc_path_file_and_check(buffer, buffer, buflen, pname, lastchar, fullfname);
   }
 }

> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Monday, 28 August 2017 07:38
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>
> Hi Goetz,
>
> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > I please need a second review and a sponsor:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> >
> > To update my description of the change to the status after Thomas' review:
> >
> > dll_build_name builds the proper path to a library given a list of paths separated by
> > path_separator and a library name. It adds in the platform specific endings etc.
> > It is documented to return whether the file exists, but only does so if a path_separator
> > exists in the path.
> > Especially if the path is empty, it just returns 'true' without checking.
> >
> > Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
> > in well known unix/windows locations. This is intended in the two cases where dll_build_name
> > is called with an empty path.
> >
> > I renamed dll_build_name to dll_locate_lib and changed its behavior to always return
> > a full path to the lib, inserting current working directory if no path is given.
> > For the use case where "" was actually passed to the function, I added a new function
> > (reusing the old function name) dll_build_name that just adds system dependent prefix and suffix
> > to the name.
> > I merged all unix implementations to the posix os branch.
>
> I started to look at this and have applied the patch to run through some
> basic testing. The overall approach seems reasonable. But it is hard to
> track all the details - in particular whether there were any subtle
> differences across the "posix" systems?
>
> I'm wondering what, if any, significant differences exist between the
> Windows and POSIX versions? I would hope the platform differences could
> easily be hidden behind macros (for path separator, library suffix etc).
> Then perhaps this could just go in shared code (os.hpp, os.cpp)?
>
> That aside, in the Windows code shouldn't the hardwired .dll strings
> actually be JNI_LIB_SUFFIX?
>
> Thanks,
> David
>
> > Best regards,
> >   Goetz.
> >
> >> -----Original Message-----
> >> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> >> Sent: Tuesday, 22 August 2017 17:30
> >> To: Lindenmaier, Goetz
> >> Cc: hotspot-runtime-dev at openjdk.java.net
> >> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >>
> >> Looks good.
> >>
> >> ..Thomas
> >>
> >> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
> >>
> >>     I mistyped the path to webrev, this should work:
> >>     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> >>
> >>     Sorry,
> >>     Goetz
> >>
> >> > -----Original Message-----
> >> > From: Lindenmaier, Goetz
> >> > Sent: Tuesday, 22 August 2017 15:48
> >> > To: 'Thomas Stüfe'
> >> > Cc: hotspot-runtime-dev at openjdk.java.net
> >> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >> >
> >> > Hi,
> >> >
> >> > could I please get a second review?
> >> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
> >> >
> >> > I had to update the webrev because of a problem on windows.
> >> > @Thomas I had edited os.hpp, but not saved :(
> >> >
> >> > Best regards,
> >> >   Goetz.
> >> >
> >> > PS: Didn't double-check the webrev as cr server is slow.
> >> >
> >> > > -----Original Message-----
> >> > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> >> > > Sent: Thursday, 17 August 2017 19:54
> >> > > To: Lindenmaier, Goetz
> >> > > Cc: hotspot-runtime-dev at openjdk.java.net
> >> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >> > >
> >> > > Hi Goetz,
> >> > >
> >> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
> >> > >
> >> > >     Hi Thomas,
> >> > >
> >> > >     I adapted the comments in os.hpp.
> >> > >
> >> > >     If I move the call to dll_build_name out of dll_locate_lib
> >> > >     I have to do a lot of coding in all the places where it is called.
> >> > >     That seems not useful to me.
> >> > >
> >> > >     Fixed the type to size_t.
> >> > >
> >> > >     One could merge posix/windows if putting the check for ':'
> >> > >     into a WINDOWS_ONLY() I guess. The check for \ could be
> >> > >     done in posix as well, if using file_separator().
> >> > >
> >> > >     * Not your change, but: why does the code in os::dll_locate_lib() even
> >> > >     * differentiate between a PATH containing no os::path_separator()
> >> > >     * and a path containing os::path_separator()?
> >> > >
> >> > >     I assume this was done to avoid all the allocations and copying of the path.
> >> > >
> >> > >     Also adapted the comment in jvmtiExport.cpp.
> >> > >
> >> > >     New webrev:
> >> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
> >> > >     incremental diff:
> >> > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
> >> > >     (fixed indentation on windows)
> >> > >
> >> > >     Best regards,
> >> > >     Goetz.
> >> > >
> >> > > Comments in os.hpp seem unchanged?
> >> > >
> >> > > But looks fine otherwise. I do not need another webrev.
> >> > >
> >> > > Thanks, Thomas
> >> > >
> >> > >     From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> >> > >     Sent: Thursday, August 17, 2017 3:48 PM
> >> > >     To: Lindenmaier, Goetz
> >> > >     Cc: hotspot-runtime-dev at openjdk.java.net
> >> > >     Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >> > > > >> > > > >> > > > >> > > Hi Goetz, > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz > >> > > >> > >> > > wrote: > >> > > > >> > > Hi Thomas, > >> > > > >> > > I reworked the whole thing. > >> > > > >> > > First, there is dll_build_name. It just does -> > >> > > lib.so. > >> > > > >> > > Second, I renamed the legacy dll_build_name to > >> dll_locate_lib. > >> > > > >> > > I merged all the unix variants to one in os_posix. > >> > > > >> > > I removed the buffer overflow check at the top. > >> > > It's too restrictive because the path argument > >> > > can contain several paths. I added the overflow > >> > > checks into the single cases. > >> > > > >> > > Also, I first assemble the pure name using the new, simple > >> > > dll_build_name. This is for reuse and readability. > >> > > > >> > > In case of an empty directory, I use get_current_directory > >> > > to complete the path as indicated by the original > >> > > documentation > >> > > where it was called with "". > >> > > Dll_locate_lib now always returns a name with a full path if > >> > > the file exists. > >> > > > >> > > Also, on windows, I think I fixed a bug by reversing the > order > >> > > of checks. A path list ending in ':' or '\' would not have > >> > > been recognized. > >> > > > >> > > On Bsd, I removed JNI_LIB_* because that already is > defined > >> > > in jvm_bsh.h > >> > > > >> > > New webrev: > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > dllBuildName/webrev.02/ > >> >> > >> > > dllBuildName/webrev.02/> > >> > > > >> > > Best regards, > >> > > Goetz. > >> > > > >> > > > >> > > > >> > > I like this better than before. 
Remarks: > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html > >> > > >> > >> > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> > >> > > > >> > > > >> > > > >> > > + // Builds the platform-specific name of a library. > >> > > > >> > > + // Returns false on __buffer overflow__. > >> > > > >> > > > >> > > > >> > > Hopefully not! :D > >> > > > >> > > How about: "Returns false no truncation" instead. > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > + // Builds a platform-specific full library path given an ld path > >> and lib > >> > > name. > >> > > > >> > > + // Returns true if the buffer contains a full path to an existing > >> file, > >> > > false > >> > > > >> > > + // otherwise. If pathname is empty, checks the current > >> directory. > >> > > > >> > > + static bool dll_locate_lib(char* buffer, size_t size, > >> > > > >> > > const char* pathname, const char* > >> fname); > >> > > > >> > > > >> > > > >> > > Might be worth mentioning that "fname" is the unadorned > library > >> > > name, e.g. "verify" for libverify.so or verify.dll. > >> > > > >> > > > >> > > > >> > > Would the following alternative be valid: > >> > > > >> > > > >> > > > >> > > one could make dll_locate_lib take the real file name, and let > >> caller > >> > > use dll_build_name() to build the libary name first before handing > it > >> to > >> > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to > a > >> generic > >> > > "find_file_in_path" because it would work for any kind of file. > >> > > > >> > > > >> > > > >> > > As an added bonus, there would be no need to create a > >> temporary > >> > > array in dll_build_name/dll_locate_lib, and no need to call free() > so > >> no > >> > > cleanup-related control flow changes in these functions. 
> >> > > > >> > > > >> > > > >> > > ===== > >> > > > >> > > > >> > > > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > > >> > dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html > >> > > >> > >> > > > >> > dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html> > >> > > > >> > > > >> > > > >> > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + > >> > > strlen(JNI_LIB_SUFFIX); > >> > > > >> > > > >> > > > >> > > int -> size_t (does that even compile without warning?) > >> > > > >> > > > >> > > > >> > > + // Check current working directory. > >> > > > >> > > + const char* p = get_current_directory(buffer, buflen); > >> > > > >> > > + if (p != NULL && > >> > > > >> > > + strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) { > >> > > > >> > > + strcat(buffer, "\\"); > >> > > > >> > > + strcat(buffer, fullfname); > >> > > > >> > > + retval = file_exists(buffer); > >> > > > >> > > > >> > > > >> > > Small nit: I'd use jio_snprintf instead of strcat. Functionally > >> identical but > >> > > will make scanners (e.g. coverity) happy. One could then avoid > the > >> length > >> > > calculation and rely on jio_snprintf truncation: > >> > > > >> > > > >> > > > >> > > const char* p = get_current_directory(buffer, buflen); > >> > > > >> > > if (p != NULL) { > >> > > > >> > > const size_t end = strlen(p); > >> > > > >> > > if (jio_snprintf(end, buflen - end, "\\%s", fullname) != -1) { > >> > > > >> > > retval = file_exists(buffer); > >> > > > >> > > } > >> > > > >> > > } > >> > > > >> > > > >> > > > >> > > -- > >> > > > >> > > > >> > > > >> > > Not your change, but: why does the code in os::dll_locate_lib() > >> even > >> > > differentiate between a PATH containing no os::path_separator() > >> and a path > >> > > containing os::path_separator()? > >> > > > >> > > > >> > > > >> > > Would the former not be just a PATH with only one directory > and > >> hence > >> > > need no special treatment? 
> >> > > > >> > > > >> > > > >> > > ===== > >> > > > >> > > > >> > > > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > >> > > >> > >> > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> > >> > > > >> > > > >> > > > >> > > Could os::dll_locate_lib be consolidated between windows and > >> unix? > >> > > Seems to be the implementation is almost identical. > >> > > > >> > > > >> > > > >> > > ==== > >> > > > >> > > > >> > > > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > > >> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html > >> > > >> > >> > > > >> > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> > >> > > > >> > > > >> > > > >> > > + // not found - try library path > >> > > > >> > > > >> > > > >> > > Proposal: "not found - try OS default library path" > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > Find some comments inline: > >> > > > >> > > > >> > > > Especially if the path is empty, it just returns 'true'. > >> > > > Dll_build_name is usually used before calling dll_load. > >> If > >> > > dll_load does not get a full path it searches > >> > > > in well known unix/windows locations. This is intended > >> in > >> > > the two cases where dll_build_name > >> > > > is called with an empty path. > >> > > > > >> > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > >> > > > > >> > > > before, we would call os::dll_build_name() with an empty > >> > > string for the path > >> > > > which, for relative paths, would result in feeding that path > >> > > unexpanded to > >> > > > dlopen(), which would use whatever the OS does in those > >> > > cases (LIBPATH, > >> > > > LD_LIBRARY_PATH, PATH on windows). Note that this > does > >> > > not necessarily > >> > > > include searching the current directory. > >> > > Right. With changed dll_biuld_name it's again exactly as > >> > > before. 
> >> > > > >> > > > With your change, we now use java.library.path, which is > >> not > >> > > necessarily the > >> > > > same? > >> > > You are right, I oversaw that java.library.path can be > >> > > overwritten. Initially, > >> > > it's set to the right thing. > >> > > > >> > > > (BTW, I think the old comments in thread.cpp and > >> > > jniExport.cpp were wrong:"// > >> > > > Try the local directory" - if "local" means "current", this is > >> not > >> > > what did > >> > > > happen). > >> > > Right, I tried to adapt them, did I miss one? > >> > > > >> > > > I added a second variant of dll_build_name without > the > >> > > path argument that adds the path > >> > > > from system property java.lang.path and use that in > >> these > >> > > two cases. > >> > > > I changed the original function to actually check file > >> > > availability in all cases, > >> > > > and to check . if the path is empty. > >> > > > I think that may be a bit confusing. We would then have > >> three > >> > > options: > >> > > > > >> > > > - call os::dll_build_name with a real ";;.." PATH > >> and > >> > > get a file name > >> > > > resolved from that path > >> > > > - call os::dll_build_name with "" for the PATH and get OS > >> dll > >> > > resolution > >> > > No, in that case, as I called file_exists(), it would only work if > >> > > the dll is in the > >> > > current working directory. But I changed this now, anyways. > >> > > > >> > > > - call your new overloaded version of > os::dll_build_name(), > >> > > which uses - > >> > > > Djava.library.path. > >> > > > > >> > > > Please review this change. I please need a sponsor. > >> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> > >> > > >> > > >> > > > dllBuildName/webrev.01/ > >> > > >> > >> > > >> > > >> > > > dllBuildName/webrev.01/> > >> > > > >> > > > > >> > > > Best regards, > >> > > > Goetz. 
> >> > > > > >> > > > > >> > > > > >> > > > > >> > > > Kind Regards, Thomas > >> > > > >> > > > >> > > > >> > > Best Regards, Thomas > >> > > > >> > > > >> > > > >> > > > >> > > > >> > >> > >> > > From coleen.phillimore at oracle.com Mon Aug 28 12:07:27 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 28 Aug 2017 08:07:27 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> Message-ID: On 8/28/17 12:25 AM, David Holmes wrote: > Hi Coleen, > > On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >> >> Thank you Zhengyu for noticing this change was wrong, and Christian >> for the idea. New webrev: >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > The idea of a load-acquire accessor and release_store-setter is fine > in principle, but it seems to me that we now use these everywhere, > even if we may not need them because there is no concurrent/lock-free > access. Overall I find it very difficult to determine what the > concurrent access patterns are for a Dictionary versus a > DictionaryEntry, and which paths are in fact lock and/or safepoint > free, and may be racing with locked or safepointed code. That's exactly the point of making them accessors. So one doesn't have to visit each individual call site and spend time answering the question for each case. And probably getting it wrong. The performance delta for these accesses is minimal since it's only getting the head of the list, not each element. 
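The accessor idea discussed above can be sketched in standalone C++. This is only a minimal illustration, assuming std::atomic in place of HotSpot's own ordered-access primitives; the names Node, Entry, head_acquire and release_set_head are hypothetical stand-ins and are not taken from the webrev. It also assumes that writers are serialized by an external lock, which is not shown.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical list node, standing in for a protection domain entry.
struct Node {
  int   value;
  Node* next;
};

// Hypothetical owner of the list. The head pointer is private, so every
// read and write has to go through the ordered accessors below.
class Entry {
 private:
  std::atomic<Node*> _head{nullptr};

 public:
  // Load-acquire: a reader that observes a new head also observes the
  // fields of the node that were written before it was published.
  Node* head_acquire() const {
    return _head.load(std::memory_order_acquire);
  }

  // Store-release: publishes a fully initialized node to lock-free readers.
  void release_set_head(Node* n) {
    _head.store(n, std::memory_order_release);
  }

  // Prepends a node. Assumption: callers hold an external lock, so only
  // one writer runs at a time.
  void add(Node* n) {
    n->next = head_acquire();
    release_set_head(n);
  }

  // Lock-free lookup. Only the head is read with acquire semantics; nodes
  // are not modified after publication, so the ordering cost is paid once
  // per traversal rather than once per element.
  bool contains(int v) const {
    for (Node* cur = head_acquire(); cur != nullptr; cur = cur->next) {
      if (cur->value == v) return true;
    }
    return false;
  }
};
```

Because the field is private, a loop that reads the raw pointer directly (the kind of oversight mentioned later in this thread) would not compile in this sketch, which is the isolation being argued for.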
Then it's also future-proof so that if a lock is removed, then we don't miss one of the accessors at a later time. Note that observing bugs caused by this is very difficult to do, and can only be done by inspection. That's why I erred on the side of safety and consistency. > > That aside I don't understand why you added a level of indirection > with the ProtectionDomainSet class? Only the code has a level of indirection, not the access. That is to avoid what I said above. See Christian's and Zhengyu's comments. > > Also we have been trying to include release/acquire in the names of > such accessors so that it is clear when we are relying on memory > ordering properties, i.e. pd_set_acquire and release_set_pd_set > I will change the names of these functions. thanks, Coleen > Thanks, > David > > >> I reran parallel class loading tests and jck testing is in progress, >> but order access requires inspection. >> >> Thanks, >> Coleen >> >> >> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> >>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> There are two instances probably overlooked? >>>>>> >>>>>> dictionary.cpp #103 and #124 >>>>>> >>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>> => >>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>> >>>>>> >>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>>> Which also shows that there is a potential for future mistakes. Can >>>> we isolate the field better so it's only accessible via setter and >>>> getter? >>> >>> Yes, great idea. >>> Coleen >>> >>>>> Thank you!! 
>>>>> Coleen >>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set >>>>>>> since it's accessed outside the SystemDictionary_lock >>>>>>> >>>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>>> tests. See bug for details. >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>> >> From ioi.lam at oracle.com Mon Aug 28 15:31:43 2017 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 28 Aug 2017 08:31:43 -0700 Subject: RFR(XXS) JDK-8186778 Deprecate VM options for shared region size control In-Reply-To: References: <96ea8bbf-2821-1494-9f98-5c4fa7b19e1f@oracle.com> <4EBA2D00-F20A-4E0A-8576-74F67EFAD4BB@oracle.com> Message-ID: On 8/27/17 3:07 PM, David Holmes wrote: > On 26/08/2017 6:02 AM, Ioi Lam wrote: >> If I understand the comments above my changes in arguments.cpp >> correctly, there are 3 stages of removing options - deprecated, >> obsolete and removed. I am at stage 1 now, so the options are kept in >> globals.hpp > > Except you haven't actually deprecated these options; you have made > them obsolete as they are now ignored by the VM. If they were only > deprecated then the VM would still use them. > > I think you will need to redo this and the CSR as an obsoletion request. > OK, I'll do that. Thanks - Ioi > Thanks, > David > >> Thanks >> Ioi >> >>> On Aug 25, 2017, at 12:11 PM, Jiangli Zhou >>> wrote: >>> >>> Hi Ioi, >>> >>> The change looks good. I noticed the deprecated options (not just >>> the ones you are adding) are still kept in globals.hpp. Do you know >>> why we didn't remove them from globals.hpp? >>> >>> Thanks, >>> Jiangli >>> >>>> On Aug 25, 2017, at 11:19 AM, Ioi Lam wrote: >>>> >>>> Hi, please review this very small change. 
The corresponding CSR has >>>> been approved. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8186778 >>>> >>>> Since JDK-8072061 (Automatically determine optimal sizes for the >>>> CDS regions) is integrated, >>>> the following 4 options are no longer necessary, and they are no >>>> longer used by the VM >>>> anymore. Hence, these options should be deprecated: >>>> >>>> SharedReadWriteSize >>>> SharedReadOnlySize >>>> SharedMiscDataSize >>>> SharedMiscCodeSize >>>> >>>> hotspot$ hg diff >>>> diff -r 3a8e59bdaaac src/share/vm/runtime/arguments.cpp >>>> --- a/src/share/vm/runtime/arguments.cpp Thu Aug 24 14:00:04 >>>> 2017 +0000 >>>> +++ b/src/share/vm/runtime/arguments.cpp Fri Aug 25 11:16:29 >>>> 2017 -0700 >>>> @@ -379,6 +379,10 @@ >>>> static SpecialFlag const special_jvm_flags[] = { >>>> // -------------- Deprecated Flags -------------- >>>> // --- Non-alias flags - sorted by obsolete_in then expired_in: >>>> { "MaxGCMinorPauseMillis", JDK_Version::jdk(8), >>>> JDK_Version::undefined(), JDK_Version::undefined() }, >>>> { "UseConcMarkSweepGC", JDK_Version::jdk(9), >>>> JDK_Version::undefined(), JDK_Version::undefined() }, >>>> { "MonitorInUseLists", >>>> JDK_Version::jdk(10),JDK_Version::undefined(), >>>> JDK_Version::undefined() }, >>>> + { "SharedMiscCodeSize", >>>> JDK_Version::jdk(10),JDK_Version::undefined(), >>>> JDK_Version::undefined() }, >>>> + { "SharedMiscDataSize", >>>> JDK_Version::jdk(10),JDK_Version::undefined(), >>>> JDK_Version::undefined() }, >>>> + { "SharedReadOnlySize", >>>> JDK_Version::jdk(10),JDK_Version::undefined(), >>>> JDK_Version::undefined() }, >>>> + { "SharedReadWriteSize", >>>> JDK_Version::jdk(10),JDK_Version::undefined(), >>>> JDK_Version::undefined() }, >>>> >>>> // --- Deprecated alias flags (see also aliased_jvm_flags) - >>>> sorted by obsolete_in then expired_in: >>>> 
{ "DefaultMaxRAMFraction", JDK_Version::jdk(8), >>>> JDK_Version::undefined(), JDK_Version::undefined() }, >>>> >>> >>> >>> >>> >>> >> From calvin.cheung at oracle.com Mon Aug 28 17:34:11 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Mon, 28 Aug 2017 10:34:11 -0700 Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive Message-ID: <59A45413.70800@oracle.com> Hi, This is a re-post of a previous RFR for 8172218 using the correct bug id. bug: https://bugs.openjdk.java.net/browse/JDK-8186842 webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/ Please refer to the comment section of the bug for a description of the change. Tests executed so far: JPRT hs-tier2 through hs-tier4 hs-tier5 (linux-x64) thanks, Calvin From coleen.phillimore at oracle.com Mon Aug 28 19:38:20 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 28 Aug 2017 15:38:20 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> Message-ID: <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> Here is the third webrev with the names of pd_set and set_pd_set renamed to pd_set_acquire and release_set_pd_set. thanks, Coleen On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: > > > On 8/28/17 12:25 AM, David Holmes wrote: >> Hi Coleen, >> >> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>> >>> Thank you Zhengyu for noticing this change was wrong, and Christian >>> for the idea. 
New webrev: >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> The idea of a load-acquire accessor and release_store-setter is fine >> in principle, but it seems to me that we now use these everywhere, >> even if we may not need them because there is no concurrent/lock-free >> access. Overall I find it very difficult to determine what the >> concurrent access patterns are for a Dictionary versus a >> DictionaryEntry, and which paths are in fact lock and/or safepoint >> free, and may be racing with locked or safepointed code. > > That's exactly the point of making them accessors. So one doesn't > have to visit each individual call site and spend time answering the > question for each case. And probably getting it wrong. The > performance delta for these accesses is minimal since it's only > getting the head of the list, not each element. > > Then it's also future-proof so that if a lock is removed, then we > don't miss one of the accessors at a later time. Note that observing > bugs caused by this is very difficult to do, and can only be done by > inspection. That's why I erred on the side of safety and consistency. >> >> That aside I don't understand why you added a level of indirection >> with the ProtectionDomainSet class? > > Only the code has a level of indirection, not the access. That is to > avoid what I said above. See Christian's and Zhengyu's comments. >> >> Also we have been trying to include release/acquire in the names of >> such accessors so that it is clear when we are relying on memory >> ordering properties, i.e. pd_set_acquire and release_set_pd_set >> > > I will change the names of these functions. > > thanks, > Coleen >> Thanks, >> David >> >> >>> I reran parallel class loading tests and jck testing is in progress, >>> but order access requires inspection. 
>>> >>> Thanks, >>> Coleen >>> >>> >>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> There are two instances probably overlooked? >>>>>>> >>>>>>> dictionary.cpp #103 and #124 >>>>>>> >>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>> => >>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>> >>>>>>> >>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>>>> Which also shows that there is a potential for future mistakes. >>>>> Can we isolate the field better so it's only accessible via setter >>>>> and getter? >>>> >>>> Yes, great idea. >>>> Coleen >>>> >>>>>> Thank you!! >>>>>> Coleen >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -Zhengyu >>>>>>> >>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: Use load_acquire for accessing >>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>> SystemDictionary_lock >>>>>>>> >>>>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>>>> tests. See bug for details. 
>>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>> >>> > From coleen.phillimore at oracle.com Mon Aug 28 19:39:16 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 28 Aug 2017 15:39:16 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> Message-ID: On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: > > Here is the third webrev with the names of pd_set and set_pd_set > renamed to pd_set_acquire and release_set_pd_set. open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > thanks, > Coleen > > On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 8/28/17 12:25 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Thank you Zhengyu for noticing this change was wrong, and Christian >>>> for the idea. New webrev: >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>> >>> The idea of a load-acquire accessor and release_store-setter is fine >>> in principle, but it seems to me that we now use these everywhere, >>> even if we may not need them because there is no >>> concurrent/lock-free access. 
Overall I find it very difficult to >>> determine what the concurrent access patterns are for a Dictionary >>> versus a DictionaryEntry, and which paths are in fact lock and/or >>> safepoint free, and may be racing with locked or safepointed code. >> >> That's exactly the point of making them accessors. So one doesn't >> have to visit each individual call site and spend time answering the >> question for each case. And probably getting it wrong. The >> performance delta for these accesses is minimal since it's only >> getting the head of the list, not each element. >> >> Then it's also future-proof so that if a lock is removed, then we >> don't miss one of the accessors at a later time. Note that >> observing bugs caused by this is very difficult to do, and can only >> be done by inspection. That's why I erred on the side of safety and >> consistency. >>> >>> That aside I don't understand why you added a level of indirection >>> with the ProtectionDomainSet class? >> >> Only the code has a level of indirection, not the access. That is to >> avoid what I said above. See Christian's and Zhengyu's comments. >>> >>> Also we have been trying to include release/acquire in the names of >>> such accessors so that it is clear when we are relying on memory >>> ordering properties, i.e. pd_set_acquire and release_set_pd_set >>> >> >> I will change the names of these functions. >> >> thanks, >> Coleen >>> Thanks, >>> David >>> >>> >>>> I reran parallel class loading tests and jck testing is in >>>> progress, but order access requires inspection. >>>> >>>> Thanks, >>>> Coleen >>>> >>>> >>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> There are two instances probably overlooked? 
>>>>>>>> >>>>>>>> dictionary.cpp #103 and #124 >>>>>>>> >>>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>>> => >>>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>>> >>>>>>>> >>>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest. >>>>>> Which also shows that there is a potential for future mistakes. >>>>>> Can we isolate the field better so it's only accessible via >>>>>> setter and getter? >>>>> >>>>> Yes, great idea. >>>>> Coleen >>>>> >>>>>>> Thank you!! >>>>>>> Coleen >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -Zhengyu >>>>>>>> >>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>> SystemDictionary_lock >>>>>>>>> >>>>>>>>> Ran parallel class loading tests that we have as well as tier1 >>>>>>>>> tests. See bug for details. >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>> >>>> >> > From zgu at redhat.com Mon Aug 28 21:19:30 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 28 Aug 2017 17:19:30 -0400 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary Message-ID: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> This enhancement allows NMT to report class metadata information. So far NMT has had no visibility into metaspace, which has become an obstacle to estimating the real memory cost of classes. When estimating the cost, we usually assume that class metadata occupies the whole committed space, which yields a higher number than the actual one. The patch uses existing metaspace APIs and reports the counters in the NMT *Class* summary section. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8186770
Webrev: http://cr.openjdk.java.net/~zgu/8186770/webrev.00/index.html

Test:

  hotspot_tier1_runtime (fastdebug and release)

Sample outputs:

Class summary:

-  Class (reserved=1071790KB, committed=24750KB)
         (classes #3078)
         (malloc=686KB #7122)
         (mmap: reserved=1071104KB, committed=24064KB)
         (  Metadata:   )
         (    reserved=22528KB, committed=21504KB)
         (    capacity=21327KB, used=20654KB)
         (    free chunks=113KB)
         (    available=0KB)
         (  Class space:)
         (    reserved=1048576KB, committed=2560KB)
         (    capacity=2525KB, used=2268KB)
         (    free chunks=0KB)
         (    available=35KB)

Class summary diff:

-  Class (reserved=1074075KB +2290KB, committed=27291KB +2546KB)
         (classes #3198 +122)
         (malloc=923KB +242KB #8418 +1463)
         (mmap: reserved=1073152KB +2048KB, committed=26368KB +2304KB)
         (  Metadata:   )
         (    reserved=24576KB +2048KB, committed=23296KB +1792KB)
         (    capacity=23071KB +1808KB, used=22368KB +1794KB)
         (    free chunks=49KB -64KB)
         (    available=0KB)
         (  Class space:)
         (    reserved=1048576KB, committed=3072KB +512KB)
         (    capacity=2843KB +318KB, used=2400KB +133KB)
         (    free chunks=141KB +141KB)
         (    available=88KB +53KB)

Thanks,

-Zhengyu

From goetz.lindenmaier at sap.com Tue Aug 29 06:18:42 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 29 Aug 2017 06:18:42 +0000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <726e7be2591f481d8d0ee50af6486f75@sap.com>
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com>
Message-ID:

Hi,

this is a webrev with merged windows and posix implementations:
http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/

Best regards,
  Goetz

> -----Original Message-----
> From: Lindenmaier, Goetz
> Sent: Montag, 28. August 2017 12:10
> To: 'David Holmes'
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>
> Hi,
>
> these are the changes needed to make the windows dll_locate_lib
> universally applicable. I also merged the three similar jio_snprintf
> calls into one method.
> I do some gymnastics to avoid another buffer of MAX_PATH_LEN
> at the first call to conc_path_file_and_check.
> I'll test this tonight.
>
> Best regards,
>   Goetz.
>
> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp
> --- a/src/os/windows/vm/os_windows.cpp  Thu Aug 17 17:26:02 2017 +0200
> +++ b/src/os/windows/vm/os_windows.cpp  Mon Aug 28 12:02:26 2017 +0200
> @@ -1205,6 +1205,17 @@
>    return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES;
>  }
>
> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
> +                              const char* pname, char lastchar, const char* fname) {
> +  const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ? "" : os::file_separator();
> +  int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", pname, filesep, fname);
> +  if (ret != -1) {
> +    struct stat statbuf;
> +    return os::stat(buffer, &statbuf) == 0;
> +  }
> +  return false;
> +}
> +
>  bool os::dll_locate_lib(char *buffer, size_t buflen,
>                          const char* pname, const char* fname) {
>    bool retval = false;
> @@ -1220,11 +1231,8 @@
>      if (p != NULL) {
>        const size_t plen = strlen(buffer);
>        const char lastchar = buffer[plen - 1];
> -      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> -      int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, fullfname);
> -      if (ret != -1) {
> -        retval = file_exists(buffer);
> -      }
> +      retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - plen,
> +                                        "", lastchar, fullfname);
>      }
>    } else if (strchr(pname, *os::path_separator()) != NULL) {
>      int n;
> @@ -1238,12 +1246,8 @@
>          continue; // skip the empty path values
>        }
>        const char lastchar = path[plen - 1];
> -      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> -      int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, fullfname);
> -      if (ret != -1 && file_exists(buffer)) {
> -        retval = true;
> -        break;
> -      }
> +      retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, fullfname);
> +      if (retval) break;
>      }
>      // release the storage
>      for (int i = 0; i < n; i++) {
> @@ -1255,11 +1259,7 @@
>      }
>    } else {
>      const char lastchar = pname[pnamelen-1];
> -    char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> -    int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, fullfname);
> -    if (ret != -1) {
> -      retval = file_exists(buffer);
> -    }
> +    retval = conc_path_file_and_check(buffer, buffer, buflen, pname, lastchar, fullfname);
>    }
>  }
>
>
> > -----Original Message-----
> > From: David Holmes [mailto:david.holmes at oracle.com]
> > Sent: Monday, 28 August 2017 07:38
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file
> > is missing.
> >
> > Hi Goetz,
> >
> > On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote:
> > > Hi,
> > >
> > > Could I please get a second review and a sponsor:
> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> > >
> > > To update my description of the change to the status after Thomas' review:
> > >
> > > dll_build_name builds the proper path to a library given a list of paths
> > > separated by path_separator and a library name. It adds in the platform
> > > specific endings etc.
> > > It is documented to return whether the file exists, but only does so if a
> > > path_separator exists in the path.
> > > In particular, if the path is empty, it just returns 'true' without checking.
> > >
> > > Dll_build_name is usually used before calling dll_load. If dll_load does not
> > > get a full path it searches in well known unix/windows locations. This is
> > > intended in the two cases where dll_build_name is called with an empty path.
> > >
> > > I renamed dll_build_name to dll_locate_lib and changed its behavior to
> > > always return a full path to the lib, inserting current working directory
> > > if no path is given.
> > > For the use case where "" was actually passed to the function, I added a
> > > new function (reusing the old function name) dll_build_name that just adds
> > > system dependent prefix and suffix to the name.
> > > I merged all unix implementations to the posix os branch.
> >
> > I started to look at this and have applied the patch to run through some
> > basic testing. The overall approach seems reasonable. But it is hard to
> > track all the details - in particular whether there were any subtle
> > differences across the "posix" systems?
> >
> > I'm wondering what, if any, significant differences exist between the
> > Windows and POSIX versions? I would hope the platform differences could
> > easily be hidden behind macros (for path separator, library suffix etc).
> > Then perhaps this could just go in shared code (os.hpp, os.cpp)?
> >
> > That aside, in the Windows code shouldn't the hardwired .dll strings
> > actually be JNI_LIB_SUFFIX?
> >
> > Thanks,
> > David
> >
> > > Best regards,
> > >   Goetz.
> > >
> > >> -----Original Message-----
> > >> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > >> Sent: Tuesday, 22 August 2017 17:30
> > >> To: Lindenmaier, Goetz
> > >> Cc: hotspot-runtime-dev at openjdk.java.net
> > >> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is
> > >> missing.
> > >>
> > >> Looks good.
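
The split described above (a helper that only decorates the library name, and a helper that joins a directory and a file name and checks the result) can be sketched outside HotSpot roughly as follows. This is a minimal illustration, not the code in the webrev: `file_separator`, `build_lib_name` and `join_and_check` are hypothetical names, the "lib"/".so" and ".dll" decorations are assumed for Linux and Windows only, and plain `snprintf`/`stat` stand in for `jio_snprintf`/`os::stat`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sys/stat.h>

// Hypothetical stand-in for os::file_separator(); not the HotSpot function.
static const char* file_separator() {
#ifdef _WIN32
  return "\\";
#else
  return "/";
#endif
}

// Adds the platform prefix and suffix, e.g. "verify" -> "libverify.so"
// (or "verify.dll" on Windows). Returns false on truncation.
static bool build_lib_name(char* buf, size_t buflen, const char* name) {
  assert(buf != nullptr && buflen > 0);
#ifdef _WIN32
  const int n = std::snprintf(buf, buflen, "%s.dll", name);
#else
  const int n = std::snprintf(buf, buflen, "lib%s.so", name);
#endif
  return n >= 0 && static_cast<size_t>(n) < buflen;
}

// Writes "<dir><sep><file>" into buf, adding the separator only if dir does
// not already end in one, and reports whether the resulting path exists.
static bool join_and_check(char* buf, size_t buflen,
                           const char* dir, const char* file) {
  assert(buf != nullptr && buflen > 0);
  const size_t dlen = std::strlen(dir);
  bool ends_in_sep = dlen > 0 && dir[dlen - 1] == file_separator()[0];
#ifdef _WIN32
  // A drive prefix such as "C:" must not get a separator appended.
  ends_in_sep = ends_in_sep || (dlen > 0 && dir[dlen - 1] == ':');
#endif
  const char* sep = (dlen == 0 || ends_in_sep) ? "" : file_separator();
  const int n = std::snprintf(buf, buflen, "%s%s%s", dir, sep, file);
  if (n < 0 || static_cast<size_t>(n) >= buflen) {
    return false;  // output error or truncated path
  }
  struct stat st;
  return stat(buf, &st) == 0;
}
```

A caller would first build the decorated name and then try it against each candidate directory, so the existence check lives in exactly one place.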
> > >>
> > >> ..Thomas
> > >>
> > >> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
> > >>
> > >> I mistyped the path to webrev, this should work:
> > >> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> > >>
> > >> Sorry,
> > >>   Goetz
> > >>
> > >> > -----Original Message-----
> > >> > From: Lindenmaier, Goetz
> > >> > Sent: Tuesday, 22 August 2017 15:48
> > >> > To: 'Thomas Stüfe'
> > >> > Cc: hotspot-runtime-dev at openjdk.java.net
> > >> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is
> > >> > missing.
> > >> >
> > >> > Hi,
> > >> >
> > >> > could I please get a second review?
> > >> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
> > >> >
> > >> > I had to update the webrev because of a problem on windows.
> > >> > @Thomas I had edited os.hpp, but not saved :(
> > >> >
> > >> > Best regards,
> > >> >   Goetz.
> > >> >
> > >> > PS: Didn't double-check the webrev as cr server is slow.
> > >> >
> > >> > > -----Original Message-----
> > >> > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > >> > > Sent: Thursday, 17 August 2017 19:54
> > >> > > To: Lindenmaier, Goetz
> > >> > > Cc: hotspot-runtime-dev at openjdk.java.net
> > >> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is
> > >> > > missing.
> > >> > >
> > >> > > Hi Goetz,
> > >> > >
> > >> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
> > >> > >
> > >> > > Hi Thomas,
> > >> > >
> > >> > > I adapted the comments in os.hpp.
> > >> > >
> > >> > > If I move the call to dll_build_name out of dll_locate_lib
> > >> > > I have to do a lot of coding in all the places where it is called.
> > >> > > That seems not useful to me.
> > >> > >
> > >> > > Fixed the type to size_t.
> > >> > >
> > >> > > One could merge posix/windows if putting the check for ':'
> > >> > > into a WINDOWS_ONLY() I guess. The check for \ could be
> > >> > > done in posix as well, if using file_separator().
> > >> > >
> > >> > > * Not your change, but: why does the code in os::dll_locate_lib() even
> > >> > > * differentiate between a PATH containing no os::path_separator()
> > >> > > * and a path containing os::path_separator()?
> > >> > >
> > >> > > I assume this was done to avoid all the allocations and copying of the
> > >> > > path.
> > >> > >
> > >> > > Also adapted the comment in jvmtiExport.cpp.
> > >> > >
> > >> > > New webrev:
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
> > >> > >
> > >> > > incremental diff:
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
> > >> > >
> > >> > > (fixed indentation on windows)
> > >> > >
> > >> > > Best regards,
> > >> > >   Goetz.
> > >> > >
> > >> > > Comments in os.hpp seem unchanged ?
> > >> > >
> > >> > > But looks fine otherwise. I do not need another webrev.
> > >> > >
> > >> > > Thanks, Thomas
> > >> > >
> > >> > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > >> > > Sent: Thursday, August 17, 2017 3:48 PM
> > >> > > To: Lindenmaier, Goetz
> > >> > > Cc: hotspot-runtime-dev at openjdk.java.net
> > >> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file
> > >> > > is missing.
> > >> > >
> > >> > > Hi Goetz,
> > >> > >
> > >> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
> > >> > >
> > >> > > Hi Thomas,
> > >> > >
> > >> > > I reworked the whole thing.
> > >> > >
> > >> > > First, there is dll_build_name. It just does <name> -> lib<name>.so.
> > >> > >
> > >> > > Second, I renamed the legacy dll_build_name to dll_locate_lib.
> > >> > >
> > >> > > I merged all the unix variants to one in os_posix.
> > >> > >
> > >> > > I removed the buffer overflow check at the top.
> > >> > > It's too restrictive because the path argument
> > >> > > can contain several paths. I added the overflow
> > >> > > checks into the single cases.
> > >> > >
> > >> > > Also, I first assemble the pure name using the new, simple
> > >> > > dll_build_name. This is for reuse and readability.
> > >> > >
> > >> > > In case of an empty directory, I use get_current_directory
> > >> > > to complete the path as indicated by the original documentation
> > >> > > where it was called with "".
> > >> > > Dll_locate_lib now always returns a name with a full path if
> > >> > > the file exists.
> > >> > >
> > >> > > Also, on windows, I think I fixed a bug by reversing the order
> > >> > > of checks.
> > >> > > A path list ending in ':' or '\' would not have
> > >> > > been recognized.
> > >> > >
> > >> > > On Bsd, I removed JNI_LIB_* because that already is defined
> > >> > > in jvm_bsd.h
> > >> > >
> > >> > > New webrev:
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
> > >> > >
> > >> > > Best regards,
> > >> > >   Goetz.
> > >> > >
> > >> > > I like this better than before. Remarks:
> > >> > >
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
> > >> > >
> > >> > > +  // Builds the platform-specific name of a library.
> > >> > > +  // Returns false on __buffer overflow__.
> > >> > >
> > >> > > Hopefully not! :D
> > >> > >
> > >> > > How about: "Returns false on truncation" instead.
> > >> > >
> > >> > > +  // Builds a platform-specific full library path given an ld path and lib name.
> > >> > > +  // Returns true if the buffer contains a full path to an existing file, false
> > >> > > +  // otherwise. If pathname is empty, checks the current directory.
> > >> > > +  static bool dll_locate_lib(char* buffer, size_t size,
> > >> > > +                             const char* pathname, const char* fname);
> > >> > >
> > >> > > Might be worth mentioning that "fname" is the unadorned library
> > >> > > name, e.g. "verify" for libverify.so or verify.dll.
> > >> > >
> > >> > > Would the following alternative be valid:
> > >> > >
> > >> > > one could make dll_locate_lib take the real file name, and let the caller
> > >> > > use dll_build_name() to build the library name first before handing it to
> > >> > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic
> > >> > > "find_file_in_path" because it would work for any kind of file.
> > >> > >
> > >> > > As an added bonus, there would be no need to create a temporary
> > >> > > array in dll_build_name/dll_locate_lib, and no need to call free() so no
> > >> > > cleanup-related control flow changes in these functions.
> > >> > >
> > >> > > =====
> > >> > >
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
> > >> > >
> > >> > > +  int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
> > >> > >
> > >> > > int -> size_t (does that even compile without warning?)
> > >> > >
> > >> > > +    // Check current working directory.
> > >> > > +    const char* p = get_current_directory(buffer, buflen);
> > >> > > +    if (p != NULL &&
> > >> > > +        strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
> > >> > > +      strcat(buffer, "\\");
> > >> > > +      strcat(buffer, fullfname);
> > >> > > +      retval = file_exists(buffer);
> > >> > >
> > >> > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
> > >> > > will make scanners (e.g. coverity) happy.
> > >> > > One could then avoid the length
> > >> > > calculation and rely on jio_snprintf truncation:
> > >> > >
> > >> > > const char* p = get_current_directory(buffer, buflen);
> > >> > > if (p != NULL) {
> > >> > >   const size_t end = strlen(p);
> > >> > >   if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
> > >> > >     retval = file_exists(buffer);
> > >> > >   }
> > >> > > }
> > >> > >
> > >> > > --
> > >> > >
> > >> > > Not your change, but: why does the code in os::dll_locate_lib() even
> > >> > > differentiate between a PATH containing no os::path_separator() and a path
> > >> > > containing os::path_separator()?
> > >> > >
> > >> > > Would the former not be just a PATH with only one directory and hence
> > >> > > need no special treatment?
> > >> > >
> > >> > > =====
> > >> > >
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html
> > >> > >
> > >> > > Could os::dll_locate_lib be consolidated between windows and unix?
> > >> > > The implementation seems to be almost identical.
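
To illustrate the question raised above, whether a PATH without a separator needs its own branch, here is a small sketch in which a single directory is simply a search path with one entry. It is an assumption-laden illustration rather than the HotSpot implementation: `split_path` and `file_exists` are hypothetical helpers, `find_file_in_path` borrows the generic name floated in the review, and `std::string` allocation is used freely, which is exactly the copying cost the original special case may have been trying to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#include <sys/stat.h>

// Splits a search path such as "/a:/b" at sep; empty entries are skipped.
static std::vector<std::string> split_path(const std::string& path, char sep) {
  std::vector<std::string> dirs;
  size_t start = 0;
  while (start <= path.size()) {
    size_t end = path.find(sep, start);
    if (end == std::string::npos) end = path.size();
    if (end > start) dirs.push_back(path.substr(start, end - start));
    start = end + 1;
  }
  return dirs;
}

static bool file_exists(const std::string& p) {
  struct stat st;
  return stat(p.c_str(), &st) == 0;
}

// Returns the first "<dir><file_sep><file>" that exists, or "" if none does.
// A path with no separator is handled by the same loop (one iteration).
static std::string find_file_in_path(const std::string& path, char path_sep,
                                     char file_sep, const std::string& file) {
  for (const std::string& dir : split_path(path, path_sep)) {
    std::string candidate = dir;  // never empty, see split_path
    if (candidate.back() != file_sep) candidate += file_sep;
    candidate += file;
    if (file_exists(candidate)) return candidate;
  }
  return "";
}
```

Whether the uniform loop is acceptable in the VM depends on how much the extra allocation matters at the call sites, which the thread leaves open.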
> > >> > >
> > >> > > ====
> > >> > >
> > >> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
> > >> > >
> > >> > > +  // not found - try library path
> > >> > >
> > >> > > Proposal: "not found - try OS default library path"
> > >> > >
> > >> > > Find some comments inline:
> > >> > >
> > >> > > > Especially if the path is empty, it just returns 'true'.
> > >> > > > Dll_build_name is usually used before calling dll_load. If
> > >> > > > dll_load does not get a full path it searches
> > >> > > > in well known unix/windows locations. This is intended in
> > >> > > > the two cases where dll_build_name
> > >> > > > is called with an empty path.
> > >> > > >
> > >> > > > So, for both cases (thread.cpp, jvmtiExport.cpp),
> > >> > > >
> > >> > > > before, we would call os::dll_build_name() with an empty
> > >> > > > string for the path
> > >> > > > which, for relative paths, would result in feeding that path
> > >> > > > unexpanded to
> > >> > > > dlopen(), which would use whatever the OS does in those
> > >> > > > cases (LIBPATH,
> > >> > > > LD_LIBRARY_PATH, PATH on windows). Note that this does
> > >> > > > not necessarily
> > >> > > > include searching the current directory.
> > >> > > Right. With changed dll_build_name it's again exactly as
> > >> > > before.
> > >> > >
> > >> > > > With your change, we now use java.library.path, which is not
> > >> > > > necessarily the
> > >> > > > same?
> > >> > > You are right, I overlooked that java.library.path can be
> > >> > > overwritten. Initially,
> > >> > > it's set to the right thing.
> > >> > >
> > >> > > > (BTW, I think the old comments in thread.cpp and
> > >> > > > jvmtiExport.cpp were wrong: "//
> > >> > > > Try the local directory" - if "local" means "current", this is not
> > >> > > > what did
> > >> > > > happen).
> > >> > > Right, I tried to adapt them, did I miss one?
> > >> > >
> > >> > > > I added a second variant of dll_build_name without the
> > >> > > > path argument that adds the path
> > >> > > > from system property java.library.path and use that in these
> > >> > > > two cases.
> > >> > > > I changed the original function to actually check file
> > >> > > > availability in all cases,
> > >> > > > and to check . if the path is empty.
> > >> > > > I think that may be a bit confusing. We would then have three
> > >> > > > options:
> > >> > > >
> > >> > > > - call os::dll_build_name with a real ";;.." PATH and
> > >> > > > get a file name
> > >> > > > resolved from that path
> > >> > > > - call os::dll_build_name with "" for the PATH and get OS dll
> > >> > > > resolution
> > >> > > No, in that case, as I called file_exists(), it would only work if
> > >> > > the dll is in the
> > >> > > current working directory. But I changed this now, anyways.
> > >> > >
> > >> > > > - call your new overloaded version of os::dll_build_name(),
> > >> > > > which uses -Djava.library.path.
> > >> > > >
> > >> > > > Please review this change. I please need a sponsor.
> > >> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
> > >> > > >
> > >> > > > Best regards,
> > >> > > >   Goetz.
> > >> > > >
> > >> > > > Kind Regards, Thomas
> > >> > >
> > >> > > Best Regards, Thomas
> > >> > >

From david.holmes at oracle.com  Tue Aug 29 06:28:02 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 16:28:02 +1000
Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp
In-Reply-To:
References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com>
Message-ID: <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com>

Hi Coleen,

On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote:
>
> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote:
>>
>> Here is the third webrev with the names of pd_set and set_pd_set
>> renamed to pd_set_acquire and release_set_pd_set.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8164207

This API should also be renamed:

!   ProtectionDomainEntry* pd_set() const { return _inner.pd_set_acquire(); }
!   void set_pd_set(ProtectionDomainEntry* new_head) { _inner.release_set_pd_set(new_head); }

These are the ones that need to give visibility to the fact we're
accessing things lock-free (if indeed we are).

More below ...

>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote:
>>> On 8/28/17 12:25 AM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Thank you Zhengyu for noticing this change was wrong, and Christian
>>>>> for the idea. New webrev:
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207
>>>>
>>>> The idea of a load-acquire accessor and release_store-setter is fine
>>>> in principle, but it seems to me that we now use these everywhere,
>>>> even if we may not need them because there is no
>>>> concurrent/lock-free access. Overall I find it very difficult to
>>>> determine what the concurrent access patterns are for a Dictionary
>>>> versus a DictionaryEntry, and which paths are in fact lock and/or
>>>> safepoint free, and may be racing with locked or safepointed code.
>>>
>>> That's exactly the point of making them accessors. So one doesn't
>>> have to visit each individual call site and spend time answering the
>>> question for each case. And probably getting it wrong. The
>>> performance delta for these accesses is minimal since it's only
>>> getting the head of the list, not each element.
>>>
>>> Then it's also future proof so that if a lock is removed, then we
>>> don't miss one of the accessors at a later time. Note that
>>> observing bugs caused by this is very difficult to do, and can only
>>> be done by inspection. That's why I erred on the side of safety and
>>> consistency.

Sorry, it may sound strange to say that I don't agree with "erring on
the side of safety and consistency" but I do not agree with just using
acquire/release semantics everywhere just in case! If we don't know the
lock-free paths then how can we possibly know things are correct. The
whole point of these accessors is to make it obvious where the lock-free
accesses are.

>>>>
>>>> That aside I don't understand why you added a level of indirection
>>>> with the ProtectionDomainSet class?
>>>
>>> Only the code is a level of indirection not the access. That is to
>>> avoid what I said above. See Christian's and Zhengyu's comments.
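
The accessor pattern under discussion, a list head that can only be reached through functions whose names state the memory ordering, can be sketched with standard C++ atomics. This is a hedged illustration, not the webrev: `Node`, `LockFreeHead`, `head_acquire`, `release_set_head`, `push_front` and `contains` are hypothetical names, and `std::atomic` stands in for HotSpot's `OrderAccess::load_acquire` / `release_store`. The sketch assumes writers are serialized by some lock while readers may run without it.

```cpp
#include <atomic>
#include <cassert>

struct Node {  // stand-in for ProtectionDomainEntry
  int value;
  Node* next;
};

// The head is private, so every access has to go through an accessor whose
// name makes the ordering visible at the call site.
class LockFreeHead {
  std::atomic<Node*> _head{nullptr};
 public:
  Node* head_acquire() const { return _head.load(std::memory_order_acquire); }
  void release_set_head(Node* n) { _head.store(n, std::memory_order_release); }
};

// Writer (assumed to hold a lock): fully initialise the node, then publish it.
inline void push_front(LockFreeHead& list, Node* n, int value) {
  n->value = value;
  n->next = list.head_acquire();
  list.release_set_head(n);  // release: the fields above become visible first
}

// Reader (no lock): one acquire load of the head, plain loads for the rest,
// because every node was completely written before it was published.
inline bool contains(const LockFreeHead& list, int value) {
  for (Node* cur = list.head_acquire(); cur != nullptr; cur = cur->next) {
    if (cur->value == value) return true;
  }
  return false;
}
```

Only the head needs the ordered access, which matches the remark above that the cost is limited to getting the head of the list, not each element.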

Okay - I see what you did but I would not expect to have to protect
_pd_set from direct use within its own class - anyone messing with that
class should be aware of the need to use the accessors. Though I suppose
this encapsulation is little different to defining the field as some
kind of "Atomic" type rather than a "raw" type.

Thanks,
David
-----

>>>>
>>>> Also we have been trying to include release/acquire in the names of
>>>> such accessors so that it is clear when we are relying on memory
>>>> ordering properties ie. pd_set_acquire and release_set_pd_set
>>>>
>>>
>>> I will change the names of these functions.
>>>
>>> thanks,
>>> Coleen
>>>> Thanks,
>>>> David
>>>>
>>>>> I reran parallel class loading tests and jck testing is in
>>>>> progress, but order access requires inspection.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote:
>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote:
>>>>>>>>> Hi Coleen,
>>>>>>>>>
>>>>>>>>> There are two instances probably overlooked?
>>>>>>>>>
>>>>>>>>> dictionary.cpp #103 and #124
>>>>>>>>>
>>>>>>>>>     for (ProtectionDomainEntry* current = _pd_set;
>>>>>>>>> =>
>>>>>>>>>     for (ProtectionDomainEntry* current = pd_set();
>>>>>>>>>
>>>>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and retest.
>>>>>>> Which also shows that there is a potential for future mistakes.
>>>>>>> Can we isolate the field better so it's only accessible via
>>>>>>> setter and getter?
>>>>>>
>>>>>> Yes, great idea.
>>>>>> Coleen
>>>>>>
>>>>>>>> Thank you!!
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Zhengyu
>>>>>>>>>
>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>> Summary: Use load_acquire for accessing
>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the
>>>>>>>>>> SystemDictionary_lock
>>>>>>>>>>
>>>>>>>>>> Ran parallel class loading tests that we have as well as tier1
>>>>>>>>>> tests. See bug for details.
>>>>>>>>>>
>>>>>>>>>> open webrev at
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>>

From david.holmes at oracle.com  Tue Aug 29 07:53:13 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 17:53:13 +1000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com>
Message-ID:

Hi Goetz,

On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote:
> Hi,
>
> this is a webrev with merged windows and posix implementations:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/

I like the look of this.

There are a couple of indentation nits in os.cpp:

  247 static bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
  248                                    const char* pname, char lastchar, const char* fname) {

  251   const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ?
  252     "" : os::file_separator();

Thanks,
David

> Best regards,
>    Goetz
>
>> -----Original Message-----
>> From: Lindenmaier, Goetz
>> Sent: Monday, 28 August 2017 12:10
>> To: 'David Holmes'
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if file
>> is missing.
>> >> Hi, >> >> this are the changes needed to make the windows dll_locate_lib >> universally applicable. I also merge the three similar jio_snprintf >> calls into one method. >> I do some gymnastics to avoid another buffer of MAX_PATH_LEN >> at the first call to conc_path_file_and_check. >> I'll test this tonight. >> >> Best regards, >> Goetz. >> >> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp >> --- a/src/os/windows/vm/os_windows.cpp Thu Aug 17 17:26:02 2017 >> +0200 >> +++ b/src/os/windows/vm/os_windows.cpp Mon Aug 28 12:02:26 2017 >> +0200 >> @@ -1205,6 +1205,17 @@ >> return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES; >> } >> >> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t >> printbuflen, >> + const char* pname, char lastchar, const char* fname) { >> + char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == >> os::file_seperator()[0]) ? "" : os::file_separator(); >> + int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", path, filesep, >> fullfname); >> + if (ret != -1) { >> + struct stat statbuf; >> + return os::stat(buffer, &statbuf) == 0; >> + } >> + return false; >> +} >> + >> bool os::dll_locate_lib(char *buffer, size_t buflen, >> const char* pname, const char* fname) { >> bool retval = false; >> @@ -1220,11 +1231,8 @@ >> if (p != NULL) { >> const size_t plen = strlen(buffer); >> const char lastchar = buffer[plen - 1]; >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; >> - int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, >> fullfname); >> - if (ret != -1) { >> - retval = file_exists(buffer); >> - } >> + retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - >> plen, >> + "", lastchar, fullfname); >> } >> } else if (strchr(pname, *os::path_separator()) != NULL) { >> int n; >> @@ -1238,12 +1246,8 @@ >> continue; // skip the empty path values >> } >> const char lastchar = path[plen - 1]; >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? 
"" : "\\"; >> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, >> fullfname); >> - if (ret != -1 && file_exists(buffer)) { >> - retval = true; >> - break; >> - } >> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, >> lastchar, fullfname); >> + if (retval) break; >> } >> // release the storage >> for (int i = 0; i < n; i++) { >> @@ -1255,11 +1259,7 @@ >> } >> } else { >> const char lastchar = pname[pnamelen-1]; >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; >> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, >> fullfname); >> - if (ret != -1) { >> - retval = file_exists(buffer); >> - } >> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, >> fullfname); >> } >> } >> >> >>> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Montag, 28. August 2017 07:38 >>> To: Lindenmaier, Goetz >>> Cc: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if >> file >>> is missing. >>> >>> Hi Goetz, >>> >>> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> I please need a second review and a sponsor: >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >> dllBuildName/webrev.04 >>>> >>>> To update my description of the change to the status after Thomas' >> review: >>>> >>>> dll_build_name builds the proper path to a library given a list of paths >>> separated by >>>> path_seperator and a library name. It adds in the platform specific >> endings >>> etc. >>>> It is documented to return whether the file exists, but only does so if a >>> path_seperator >>>> exists in the path. >>>> Especially if the path is empty, it just returns ?true? without checking. >>>> >>>> Dll_build_name is usually used before calling dll_load. If dll_load does >> not >>> get a full path it searches >>>> in well known unix/windows locations. 
This is intended in the two cases >>> where dll_build_name >>>> is called with an empty path. >>>> >>>> I renamed dll_build_name to dll_locate_lib and changed it's behavior to >>> always return >>>> a full path to the lib, inserting current working directory if no path is given. >>>> For the use case where "" was actually passed to the function, I added a >>> new function >>>> (reusing the old function name) dll_build_name that just adds system >>> dependent prefix and suffix >>>> to the name. >>>> I merged all unix implementations to the posix os branch. >>> >>> I started to look at this and have applied the patch to run through some >>> basic testing. The overall approach seems reasonable. But it is hard to >>> track all the details - in particular whether there were any subtle >>> differences across the "posix" systems? >>> >>> I'm wondering what, if any, significant differences exist between the >>> Windows and POSIX versions? I would hope the platform differences could >>> easily be hidden behind macros (for path separator, library suffix etc). >>> Then perhaps this could just go in shared code (os.hpp, os.cpp)? >>> >>> That aside, in the Windows code shouldn't the hardwired .dll strings >>> actually be JNI_LIB_SUFFIX? >>> >>> Thanks, >>> David >>> >>>> Best regards, >>>> Goetz. >>>> >>>> >>>> >>>>> -----Original Message----- >>>>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] >>>>> Sent: Dienstag, 22. August 2017 17:30 >>>>> To: Lindenmaier, Goetz >>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is >>>>> missing. >>>>> >>>>> Looks good. 
>>>>> >>>>> ..Thomas >>>>> >>>>> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz >>>>> > >>> wrote: >>>>> >>>>> >>>>> I mistyped the path to webrev, this should work: >>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>> dllBuildName/webrev.04 >>> >>>> dllBuildName/webrev.04> >>>>> >>>>> Sorry, >>>>> Goetz >>>>> >>>>> >>>>> >>>>> > -----Original Message----- >>>>> > From: Lindenmaier, Goetz >>>>> > Sent: Dienstag, 22. August 2017 15:48 >>>>> > To: 'Thomas St?fe' >>>> > >>>>> > Cc: hotspot-runtime-dev at openjdk.java.net >> runtime- >>>>> dev at openjdk.java.net> >>>>> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if >>>>> file is >>>>> > missing. >>>>> > >>>>> > Hi, >>>>> > >>>>> > could I please get a second review? >>>>> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName- >>>>> hs/webrev.04 >> dllBuildName- >>>>> hs/webrev.04> >>>>> > >>>>> > I had to update the webrev because of a problem on windows. >>>>> > @Thomas I had edited os.hpp, but not saved :( >>>>> > >>>>> > Best regards, >>>>> > Goetz. >>>>> > >>>>> > PS: Didn't double-check the webrev as cr server is slow. >>>>> > >>>>> > > -----Original Message----- >>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>> ] >>>>> > > Sent: Donnerstag, 17. August 2017 19:54 >>>>> > > To: Lindenmaier, Goetz >>>> > >>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>> runtime-dev at openjdk.java.net> >>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even >>> if >>>>> file is >>>>> > > missing. >>>>> > > >>>>> > > Hi Goetz, >>>>> > > >>>>> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz >>>>> > > >>>> >>> >>>> > > wrote: >>>>> > > >>>>> > > >>>>> > > Hi Thomas, >>>>> > > >>>>> > > >>>>> > > >>>>> > > I adapted the comments in os.hpp. >>>>> > > >>>>> > > >>>>> > > >>>>> > > If I move the call to dll_build_name out of dll_locate_lib >>>>> > > >>>>> > > I have to do a lot of coding in all the places where it is called. 
>>>>> > >
>>>>> > > That seems not useful to me.
>>>>> > >
>>>>> > > Fixed the type to size_t.
>>>>> > >
>>>>> > > One could merge posix/windows if putting the check for ':'
>>>>> > > into a WINDOWS_ONLY() I guess. The check for \ could be
>>>>> > > done in posix as well, if using file_separator().
>>>>> > >
>>>>> > > * Not your change, but: why does the code in os::dll_locate_lib() even
>>>>> > > * differentiate between a PATH containing no os::path_separator()
>>>>> > > * and a path containing os::path_separator()?
>>>>> > >
>>>>> > > I assume this was done to avoid all the allocations and copying of the
>>>>> > > path.
>>>>> > >
>>>>> > > Also adapted the comment in jvmtiExport.cpp.
>>>>> > >
>>>>> > > New webrev:
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
>>>>> > >
>>>>> > > incremental diff:
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
>>>>> > >
>>>>> > > (fixed indentation on windows)
>>>>> > >
>>>>> > > Best regards,
>>>>> > > Goetz.
>>>>> > >
>>>>> > > Comments in os.hpp seem unchanged ?
>>>>> > >
>>>>> > > But looks fine otherwise. I do not need another webrev.
>>>>> > > >>>>> > > Thanks, Thomas >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>> >>>>> > > >>>> > ] >>>>> > > Sent: Thursday, August 17, 2017 3:48 PM >>>>> > > To: Lindenmaier, Goetz >>>> >>>>> > > >>>> > > >>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>> runtime-dev at openjdk.java.net> >> >>>> runtime-> >>>>> > > dev at openjdk.java.net > >>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true >>> even >>>>> if file >>>>> > > is missing. >>>>> > > >>>>> > > >>>>> > > >>>>> > > Hi Goetz, >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz >>>>> > > >>>> >>> >>>> > > wrote: >>>>> > > >>>>> > > Hi Thomas, >>>>> > > >>>>> > > I reworked the whole thing. >>>>> > > >>>>> > > First, there is dll_build_name. It just does -> >>>>> > > lib.so. >>>>> > > >>>>> > > Second, I renamed the legacy dll_build_name to >>>>> dll_locate_lib. >>>>> > > >>>>> > > I merged all the unix variants to one in os_posix. >>>>> > > >>>>> > > I removed the buffer overflow check at the top. >>>>> > > It's too restrictive because the path argument >>>>> > > can contain several paths. I added the overflow >>>>> > > checks into the single cases. >>>>> > > >>>>> > > Also, I first assemble the pure name using the new, simple >>>>> > > dll_build_name. This is for reuse and readability. >>>>> > > >>>>> > > In case of an empty directory, I use get_current_directory >>>>> > > to complete the path as indicated by the original >>>>> > > documentation >>>>> > > where it was called with "". >>>>> > > Dll_locate_lib now always returns a name with a full path if >>>>> > > the file exists. >>>>> > > >>>>> > > Also, on windows, I think I fixed a bug by reversing the >>> order >>>>> > > of checks. A path list ending in ':' or '\' would not have >>>>> > > been recognized. 
>>>>> > >
>>>>> > > On Bsd, I removed JNI_LIB_* because that already is defined
>>>>> > > in jvm_bsd.h
>>>>> > >
>>>>> > > New webrev:
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>>>>> > >
>>>>> > > Best regards,
>>>>> > > Goetz.
>>>>> > >
>>>>> > > I like this better than before. Remarks:
>>>>> > >
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
>>>>> > >
>>>>> > > + // Builds the platform-specific name of a library.
>>>>> > > + // Returns false on __buffer overflow__.
>>>>> > >
>>>>> > > Hopefully not! :D
>>>>> > >
>>>>> > > How about: "Returns false on truncation" instead.
>>>>> > >
>>>>> > > + // Builds a platform-specific full library path given an ld path and lib name.
>>>>> > > + // Returns true if the buffer contains a full path to an existing file, false
>>>>> > > + // otherwise. If pathname is empty, checks the current directory.
>>>>> > > + static bool dll_locate_lib(char* buffer, size_t size,
>>>>> > > +                            const char* pathname, const char* fname);
>>>>> > >
>>>>> > > Might be worth mentioning that "fname" is the unadorned library
>>>>> > > name, e.g. "verify" for libverify.so or verify.dll.
>>>>> > >
>>>>> > > Would the following alternative be valid:
>>>>> > >
>>>>> > > one could make dll_locate_lib take the real file name, and let the caller
>>>>> > > use dll_build_name() to build the library name first before handing it to
>>>>> > > dll_locate_lib().
>>>>> > > In that case, dll_locate_lib() could be renamed to a generic
>>>>> > > "find_file_in_path" because it would work for any kind of file.
>>>>> > >
>>>>> > > As an added bonus, there would be no need to create a temporary
>>>>> > > array in dll_build_name/dll_locate_lib, and no need to call free() so no
>>>>> > > cleanup-related control flow changes in these functions.
>>>>> > >
>>>>> > > =====
>>>>> > >
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
>>>>> > >
>>>>> > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
>>>>> > >
>>>>> > > int -> size_t (does that even compile without warning?)
>>>>> > >
>>>>> > > + // Check current working directory.
>>>>> > > + const char* p = get_current_directory(buffer, buflen);
>>>>> > > + if (p != NULL &&
>>>>> > > +     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
>>>>> > > +   strcat(buffer, "\\");
>>>>> > > +   strcat(buffer, fullfname);
>>>>> > > +   retval = file_exists(buffer);
>>>>> > >
>>>>> > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
>>>>> > > will make scanners (e.g. coverity) happy.
>>>>> > > One could then avoid the length
>>>>> > > calculation and rely on jio_snprintf truncation:
>>>>> > >
>>>>> > > const char* p = get_current_directory(buffer, buflen);
>>>>> > > if (p != NULL) {
>>>>> > >   const size_t end = strlen(p);
>>>>> > >   if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
>>>>> > >     retval = file_exists(buffer);
>>>>> > >   }
>>>>> > > }
>>>>> > >
>>>>> > > --
>>>>> > >
>>>>> > > Not your change, but: why does the code in os::dll_locate_lib() even
>>>>> > > differentiate between a PATH containing no os::path_separator() and a path
>>>>> > > containing os::path_separator()?
>>>>> > >
>>>>> > > Would the former not be just a PATH with only one directory and hence
>>>>> > > need no special treatment?
>>>>> > >
>>>>> > > =====
>>>>> > >
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html
>>>>> > >
>>>>> > > Could os::dll_locate_lib be consolidated between windows and unix?
>>>>> > > It seems the implementation is almost identical.
>>>>> > >
>>>>> > > ====
>>>>> > >
>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
>>>>> > >
>>>>> > > + // not found - try library path
>>>>> > >
>>>>> > > Proposal: "not found - try OS default library path"
>>>>> > >
>>>>> > > Find some comments inline:
>>>>> > >
>>>>> > > > Especially if the path is empty, it just returns 'true'.
>>>>> > > > Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
>>>>> > > > in well known unix/windows locations. This is intended in the two cases where dll_build_name
>>>>> > > > is called with an empty path.
>>>>> > > >
>>>>> > > > So, for both cases (thread.cpp, jvmtiExport.cpp),
>>>>> > > >
>>>>> > > > before, we would call os::dll_build_name() with an empty string for the path
>>>>> > > > which, for relative paths, would result in feeding that path unexpanded to
>>>>> > > > dlopen(), which would use whatever the OS does in those cases (LIBPATH,
>>>>> > > > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
>>>>> > > > include searching the current directory.
>>>>> > > Right. With the changed dll_build_name it's again exactly as before.
>>>>> > >
>>>>> > > > With your change, we now use java.library.path, which is not necessarily the
>>>>> > > > same?
>>>>> > > You are right, I overlooked that java.library.path can be overwritten. Initially,
>>>>> > > it's set to the right thing.
>>>>> > > >>>>> > > > (BTW, I think the old comments in thread.cpp and >>>>> > > jniExport.cpp were wrong:"// >>>>> > > > Try the local directory" - if "local" means "current", this is >>>>> not >>>>> > > what did >>>>> > > > happen). >>>>> > > Right, I tried to adapt them, did I miss one? >>>>> > > >>>>> > > > I added a second variant of dll_build_name without >>> the >>>>> > > path argument that adds the path >>>>> > > > from system property java.lang.path and use that in >>>>> these >>>>> > > two cases. >>>>> > > > I changed the original function to actually check file >>>>> > > availability in all cases, >>>>> > > > and to check . if the path is empty. >>>>> > > > I think that may be a bit confusing. We would then have >>>>> three >>>>> > > options: >>>>> > > > >>>>> > > > - call os::dll_build_name with a real ";;.." PATH >>>>> and >>>>> > > get a file name >>>>> > > > resolved from that path >>>>> > > > - call os::dll_build_name with "" for the PATH and get OS >>>>> dll >>>>> > > resolution >>>>> > > No, in that case, as I called file_exists(), it would only work if >>>>> > > the dll is in the >>>>> > > current working directory. But I changed this now, anyways. >>>>> > > >>>>> > > > - call your new overloaded version of >>> os::dll_build_name(), >>>>> > > which uses - >>>>> > > > Djava.library.path. >>>>> > > > >>>>> > > > Please review this change. I please need a sponsor. >>>>> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>> >>>>> > > >>>> > >>>>> > > > dllBuildName/webrev.01/ >>>>> > > >>>> >>>>> > > >>>> > >>>>> > > > dllBuildName/webrev.01/> >>>>> > > >>>>> > > > >>>>> > > > Best regards, >>>>> > > > Goetz. 
>>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > Kind Regards, Thomas >>>>> > > >>>>> > > >>>>> > > >>>>> > > Best Regards, Thomas >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> >>>>> >>>>> >>>> From goetz.lindenmaier at sap.com Tue Aug 29 08:29:15 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 29 Aug 2017 08:29:15 +0000 Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> Message-ID: <11ed4c208b8641088b99a2066dfc992b@sap.com> Hi David, I fixed the indentation and added you as reviewer. I replaced the webrev in-place: http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/ The new code went through all our testing ... except for some ppc/s390 builds that failed because of an other change pushed to hs tonight. But that should not matter, it all passed with the jdk10/jdk10 testing. Would you mind sponsoring? Best regards, Goetz. > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Dienstag, 29. August 2017 09:53 > To: Lindenmaier, Goetz > Cc: hotspot-runtime-dev at openjdk.java.net > Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file > is missing. > > Hi Goetz, > > On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote: > > Hi, > > > > this is a webrev with merged windows and posix implementations: > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > dllBuildName/webrev.05/ > > I like the look of this. > > There are a couple of indention nits in os.cpp: > > 247 static bool conc_path_file_and_check(char *buffer, char > *printbuffer, size_t printbuflen, > 248 const char* pname, char lastchar, > const char* fname) { > > > 251 const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar > == os::file_separator()[0]) ? 
> 252 "" : os::file_separator(); > > > Thanks, > David > > > Best regards, > > Goetz > > > >> -----Original Message----- > >> From: Lindenmaier, Goetz > >> Sent: Montag, 28. August 2017 12:10 > >> To: 'David Holmes' > >> Cc: hotspot-runtime-dev at openjdk.java.net > >> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if > file > >> is missing. > >> > >> Hi, > >> > >> this are the changes needed to make the windows dll_locate_lib > >> universally applicable. I also merge the three similar jio_snprintf > >> calls into one method. > >> I do some gymnastics to avoid another buffer of MAX_PATH_LEN > >> at the first call to conc_path_file_and_check. > >> I'll test this tonight. > >> > >> Best regards, > >> Goetz. > >> > >> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp > >> --- a/src/os/windows/vm/os_windows.cpp Thu Aug 17 17:26:02 > 2017 > >> +0200 > >> +++ b/src/os/windows/vm/os_windows.cpp Mon Aug 28 12:02:26 > 2017 > >> +0200 > >> @@ -1205,6 +1205,17 @@ > >> return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES; > >> } > >> > >> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t > >> printbuflen, > >> + const char* pname, char lastchar, const char* fname) { > >> + char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == > >> os::file_seperator()[0]) ? "" : os::file_separator(); > >> + int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", path, filesep, > >> fullfname); > >> + if (ret != -1) { > >> + struct stat statbuf; > >> + return os::stat(buffer, &statbuf) == 0; > >> + } > >> + return false; > >> +} > >> + > >> bool os::dll_locate_lib(char *buffer, size_t buflen, > >> const char* pname, const char* fname) { > >> bool retval = false; > >> @@ -1220,11 +1231,8 @@ > >> if (p != NULL) { > >> const size_t plen = strlen(buffer); > >> const char lastchar = buffer[plen - 1]; > >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? 
"" : "\\"; > >> - int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, > >> fullfname); > >> - if (ret != -1) { > >> - retval = file_exists(buffer); > >> - } > >> + retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - > >> plen, > >> + "", lastchar, fullfname); > >> } > >> } else if (strchr(pname, *os::path_separator()) != NULL) { > >> int n; > >> @@ -1238,12 +1246,8 @@ > >> continue; // skip the empty path values > >> } > >> const char lastchar = path[plen - 1]; > >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; > >> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, > >> fullfname); > >> - if (ret != -1 && file_exists(buffer)) { > >> - retval = true; > >> - break; > >> - } > >> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, > >> lastchar, fullfname); > >> + if (retval) break; > >> } > >> // release the storage > >> for (int i = 0; i < n; i++) { > >> @@ -1255,11 +1259,7 @@ > >> } > >> } else { > >> const char lastchar = pname[pnamelen-1]; > >> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; > >> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, > >> fullfname); > >> - if (ret != -1) { > >> - retval = file_exists(buffer); > >> - } > >> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, > lastchar, > >> fullfname); > >> } > >> } > >> > >> > >>> -----Original Message----- > >>> From: David Holmes [mailto:david.holmes at oracle.com] > >>> Sent: Montag, 28. August 2017 07:38 > >>> To: Lindenmaier, Goetz > >>> Cc: hotspot-runtime-dev at openjdk.java.net > >>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if > >> file > >>> is missing. 
> >>> > >>> Hi Goetz, > >>> > >>> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> I please need a second review and a sponsor: > >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- > >> dllBuildName/webrev.04 > >>>> > >>>> To update my description of the change to the status after Thomas' > >> review: > >>>> > >>>> dll_build_name builds the proper path to a library given a list of paths > >>> separated by > >>>> path_seperator and a library name. It adds in the platform specific > >> endings > >>> etc. > >>>> It is documented to return whether the file exists, but only does so if a > >>> path_seperator > >>>> exists in the path. > >>>> Especially if the path is empty, it just returns ?true? without checking. > >>>> > >>>> Dll_build_name is usually used before calling dll_load. If dll_load does > >> not > >>> get a full path it searches > >>>> in well known unix/windows locations. This is intended in the two cases > >>> where dll_build_name > >>>> is called with an empty path. > >>>> > >>>> I renamed dll_build_name to dll_locate_lib and changed it's behavior to > >>> always return > >>>> a full path to the lib, inserting current working directory if no path is > given. > >>>> For the use case where "" was actually passed to the function, I added > a > >>> new function > >>>> (reusing the old function name) dll_build_name that just adds system > >>> dependent prefix and suffix > >>>> to the name. > >>>> I merged all unix implementations to the posix os branch. > >>> > >>> I started to look at this and have applied the patch to run through some > >>> basic testing. The overall approach seems reasonable. But it is hard to > >>> track all the details - in particular whether there were any subtle > >>> differences across the "posix" systems? > >>> > >>> I'm wondering what, if any, significant differences exist between the > >>> Windows and POSIX versions? 
I would hope the platform differences > could > >>> easily be hidden behind macros (for path separator, library suffix etc). > >>> Then perhaps this could just go in shared code (os.hpp, os.cpp)? > >>> > >>> That aside, in the Windows code shouldn't the hardwired .dll strings > >>> actually be JNI_LIB_SUFFIX? > >>> > >>> Thanks, > >>> David > >>> > >>>> Best regards, > >>>> Goetz. > >>>> > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] > >>>>> Sent: Dienstag, 22. August 2017 17:30 > >>>>> To: Lindenmaier, Goetz > >>>>> Cc: hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file > is > >>>>> missing. > >>>>> > >>>>> Looks good. > >>>>> > >>>>> ..Thomas > >>>>> > >>>>> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz > >>>>> > > > >>> wrote: > >>>>> > >>>>> > >>>>> I mistyped the path to webrev, this should work: > >>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> dllBuildName/webrev.04 > >>> >>>>> dllBuildName/webrev.04> > >>>>> > >>>>> Sorry, > >>>>> Goetz > >>>>> > >>>>> > >>>>> > >>>>> > -----Original Message----- > >>>>> > From: Lindenmaier, Goetz > >>>>> > Sent: Dienstag, 22. August 2017 15:48 > >>>>> > To: 'Thomas St?fe' >>>>> > > >>>>> > Cc: hotspot-runtime-dev at openjdk.java.net >>> runtime- > >>>>> dev at openjdk.java.net> > >>>>> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if > >>>>> file is > >>>>> > missing. > >>>>> > > >>>>> > Hi, > >>>>> > > >>>>> > could I please get a second review? > >>>>> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName- > >>>>> hs/webrev.04 >>> dllBuildName- > >>>>> hs/webrev.04> > >>>>> > > >>>>> > I had to update the webrev because of a problem on windows. > >>>>> > @Thomas I had edited os.hpp, but not saved :( > >>>>> > > >>>>> > Best regards, > >>>>> > Goetz. > >>>>> > > >>>>> > PS: Didn't double-check the webrev as cr server is slow. 
> >>>>> > > >>>>> > > -----Original Message----- > >>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com > >>>>> ] > >>>>> > > Sent: Donnerstag, 17. August 2017 19:54 > >>>>> > > To: Lindenmaier, Goetz >>>>> > > >>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>> runtime-dev at openjdk.java.net> > >>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even > >>> if > >>>>> file is > >>>>> > > missing. > >>>>> > > > >>>>> > > Hi Goetz, > >>>>> > > > >>>>> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz > >>>>> > > >>>>> > >>> >>>>> > > wrote: > >>>>> > > > >>>>> > > > >>>>> > > Hi Thomas, > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > I adapted the comments in os.hpp. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > If I move the call to dll_build_name out of dll_locate_lib > >>>>> > > > >>>>> > > I have to do a lot of coding in all the places where it is called. > >>>>> > > > >>>>> > > That seems not useful to me. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Fixed the type to size_t. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > One could merge posix/windows if putting the check for ?:? > >>>>> > > > >>>>> > > into a WINDOWS_ONLY() I guess. The check for \ could be > >>>>> > > > >>>>> > > done in posix as well, if using file_seperator(). > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > * Not your change, but: why does the code in > >>> os::dll_locate_lib() > >>>>> even > >>>>> > > > >>>>> > > * differentiate between a PATH containing no > >>>>> os::path_separator() > >>>>> > > > >>>>> > > * and a path containing os::path_separator()? > >>>>> > > > >>>>> > > I assume this was done to avoid all the allocations and copying > >>> of > >>>>> the > >>>>> > > path. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Also adapted the comment in jvmtiExport.cpp. 
> >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > New webrev: > >>>>> > > > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > dllBuildName/webrev.03/ > >>>>> >>>>> > >>>>> > > dllBuildName/webrev.03/> > >>>>> > > > >>>>> > > incremental diff: > >>>>> > > > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > dllBuildName/webrev.03/diffs-incremental.patch > >>>>> > > >>>>> > >>>>> > > dllBuildName/webrev.03/diffs-incremental.patch> > >>>>> > > > >>>>> > > (fixed indentation on windows) > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Best regards, > >>>>> > > > >>>>> > > Goetz. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Comments in os.hpp seem unchanged ? > >>>>> > > > >>>>> > > But looks fine otherwise. I do not need another webrev. > >>>>> > > > >>>>> > > Thanks, Thomas > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com > >>>>> > >>>>> > > >>>>> > ] > >>>>> > > Sent: Thursday, August 17, 2017 3:48 PM > >>>>> > > To: Lindenmaier, Goetz >>>>> > >>>>> > > >>>>> > > > >>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>> runtime-dev at openjdk.java.net> >>> >>>>> runtime-> > >>>>> > > dev at openjdk.java.net > > >>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true > >>> even > >>>>> if file > >>>>> > > is missing. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Hi Goetz, > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz > >>>>> > > >>>>> > >>> >>>>> > > wrote: > >>>>> > > > >>>>> > > Hi Thomas, > >>>>> > > > >>>>> > > I reworked the whole thing. > >>>>> > > > >>>>> > > First, there is dll_build_name. It just does -> > >>>>> > > lib.so. > >>>>> > > > >>>>> > > Second, I renamed the legacy dll_build_name to > >>>>> dll_locate_lib. 
> >>>>> > > > >>>>> > > I merged all the unix variants to one in os_posix. > >>>>> > > > >>>>> > > I removed the buffer overflow check at the top. > >>>>> > > It's too restrictive because the path argument > >>>>> > > can contain several paths. I added the overflow > >>>>> > > checks into the single cases. > >>>>> > > > >>>>> > > Also, I first assemble the pure name using the new, simple > >>>>> > > dll_build_name. This is for reuse and readability. > >>>>> > > > >>>>> > > In case of an empty directory, I use get_current_directory > >>>>> > > to complete the path as indicated by the original > >>>>> > > documentation > >>>>> > > where it was called with "". > >>>>> > > Dll_locate_lib now always returns a name with a full path if > >>>>> > > the file exists. > >>>>> > > > >>>>> > > Also, on windows, I think I fixed a bug by reversing the > >>> order > >>>>> > > of checks. A path list ending in ':' or '\' would not have > >>>>> > > been recognized. > >>>>> > > > >>>>> > > On Bsd, I removed JNI_LIB_* because that already is > >>> defined > >>>>> > > in jvm_bsh.h > >>>>> > > > >>>>> > > New webrev: > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > dllBuildName/webrev.02/ > >>>>> >>>>> > >>>>> > > dllBuildName/webrev.02/> > >>>>> > > > >>>>> > > Best regards, > >>>>> > > Goetz. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > I like this better than before. Remarks: > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > > >>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html > >>>>> > > >>>>> > >>>>> > > > >>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > + // Builds the platform-specific name of a library. > >>>>> > > > >>>>> > > + // Returns false on __buffer overflow__. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Hopefully not! 
:D > >>>>> > > > >>>>> > > How about: "Returns false no truncation" instead. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > + // Builds a platform-specific full library path given an ld path > >>>>> and lib > >>>>> > > name. > >>>>> > > > >>>>> > > + // Returns true if the buffer contains a full path to an existing > >>>>> file, > >>>>> > > false > >>>>> > > > >>>>> > > + // otherwise. If pathname is empty, checks the current > >>>>> directory. > >>>>> > > > >>>>> > > + static bool dll_locate_lib(char* buffer, size_t size, > >>>>> > > > >>>>> > > const char* pathname, const char* > >>>>> fname); > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Might be worth mentioning that "fname" is the unadorned > >>> library > >>>>> > > name, e.g. "verify" for libverify.so or verify.dll. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Would the following alternative be valid: > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > one could make dll_locate_lib take the real file name, and let > >>>>> caller > >>>>> > > use dll_build_name() to build the libary name first before handing > >>> it > >>>>> to > >>>>> > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to > >>> a > >>>>> generic > >>>>> > > "find_file_in_path" because it would work for any kind of file. > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > As an added bonus, there would be no need to create a > >>>>> temporary > >>>>> > > array in dll_build_name/dll_locate_lib, and no need to call free() > >>> so > >>>>> no > >>>>> > > cleanup-related control flow changes in these functions. 
> >>>>> > >
> >>>>> > > =====
> >>>>> > >
> >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
> >>>>> > >
> >>>>> > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
> >>>>> > >
> >>>>> > > int -> size_t (does that even compile without warning?)
> >>>>> > >
> >>>>> > > + // Check current working directory.
> >>>>> > > + const char* p = get_current_directory(buffer, buflen);
> >>>>> > > + if (p != NULL &&
> >>>>> > > +     strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
> >>>>> > > +   strcat(buffer, "\\");
> >>>>> > > +   strcat(buffer, fullfname);
> >>>>> > > +   retval = file_exists(buffer);
> >>>>> > >
> >>>>> > > Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
> >>>>> > > will make scanners (e.g. coverity) happy.
One could then avoid > >>> the > >>>>> length > >>>>> > > calculation and rely on jio_snprintf truncation: > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > const char* p = get_current_directory(buffer, buflen); > >>>>> > > > >>>>> > > if (p != NULL) { > >>>>> > > > >>>>> > > const size_t end = strlen(p); > >>>>> > > > >>>>> > > if (jio_snprintf(end, buflen - end, "\\%s", fullname) != -1) { > >>>>> > > > >>>>> > > retval = file_exists(buffer); > >>>>> > > > >>>>> > > } > >>>>> > > > >>>>> > > } > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > -- > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Not your change, but: why does the code in os::dll_locate_lib() > >>>>> even > >>>>> > > differentiate between a PATH containing no os::path_separator() > >>>>> and a path > >>>>> > > containing os::path_separator()? > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Would the former not be just a PATH with only one directory > >>> and > >>>>> hence > >>>>> > > need no special treatment? > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > ===== > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > > >>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html > >>>>> > > >>>>> > >>>>> > > > >>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Could os::dll_locate_lib be consolidated between windows and > >>>>> unix? > >>>>> > > Seems to be the implementation is almost identical. 
> >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > ==== > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > > >>>>> > >> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html > >>>>> > > >>>>> > >>>>> > > > >>>>> > >>> > dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > + // not found - try library path > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Proposal: "not found - try OS default library path" > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Find some comments inline: > >>>>> > > > >>>>> > > > >>>>> > > > Especially if the path is empty, it just returns 'true'. > >>>>> > > > Dll_build_name is usually used before calling dll_load. > >>>>> If > >>>>> > > dll_load does not get a full path it searches > >>>>> > > > in well known unix/windows locations. This is intended > >>>>> in > >>>>> > > the two cases where dll_build_name > >>>>> > > > is called with an empty path. > >>>>> > > > > >>>>> > > > So, for both cases (thread.cpp, jvmtiExport.cpp), > >>>>> > > > > >>>>> > > > before, we would call os::dll_build_name() with an empty > >>>>> > > string for the path > >>>>> > > > which, for relative paths, would result in feeding that path > >>>>> > > unexpanded to > >>>>> > > > dlopen(), which would use whatever the OS does in those > >>>>> > > cases (LIBPATH, > >>>>> > > > LD_LIBRARY_PATH, PATH on windows). Note that this > >>> does > >>>>> > > not necessarily > >>>>> > > > include searching the current directory. > >>>>> > > Right. With changed dll_biuld_name it's again exactly as > >>>>> > > before. > >>>>> > > > >>>>> > > > With your change, we now use java.library.path, which is > >>>>> not > >>>>> > > necessarily the > >>>>> > > > same? > >>>>> > > You are right, I oversaw that java.library.path can be > >>>>> > > overwritten. Initially, > >>>>> > > it's set to the right thing. 
> >>>>> > > > >>>>> > > > (BTW, I think the old comments in thread.cpp and > >>>>> > > jniExport.cpp were wrong:"// > >>>>> > > > Try the local directory" - if "local" means "current", this is > >>>>> not > >>>>> > > what did > >>>>> > > > happen). > >>>>> > > Right, I tried to adapt them, did I miss one? > >>>>> > > > >>>>> > > > I added a second variant of dll_build_name without > >>> the > >>>>> > > path argument that adds the path > >>>>> > > > from system property java.lang.path and use that in > >>>>> these > >>>>> > > two cases. > >>>>> > > > I changed the original function to actually check file > >>>>> > > availability in all cases, > >>>>> > > > and to check . if the path is empty. > >>>>> > > > I think that may be a bit confusing. We would then have > >>>>> three > >>>>> > > options: > >>>>> > > > > >>>>> > > > - call os::dll_build_name with a real ";;.." PATH > >>>>> and > >>>>> > > get a file name > >>>>> > > > resolved from that path > >>>>> > > > - call os::dll_build_name with "" for the PATH and get OS > >>>>> dll > >>>>> > > resolution > >>>>> > > No, in that case, as I called file_exists(), it would only work if > >>>>> > > the dll is in the > >>>>> > > current working directory. But I changed this now, anyways. > >>>>> > > > >>>>> > > > - call your new overloaded version of > >>> os::dll_build_name(), > >>>>> > > which uses - > >>>>> > > > Djava.library.path. > >>>>> > > > > >>>>> > > > Please review this change. I please need a sponsor. > >>>>> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > >>>>> > >>>>> > > >>>>> > > >>>>> > > > dllBuildName/webrev.01/ > >>>>> > > >>>>> > >>>>> > > >>>>> > > >>>>> > > > dllBuildName/webrev.01/> > >>>>> > > > >>>>> > > > > >>>>> > > > Best regards, > >>>>> > > > Goetz. 
> >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > Kind Regards, Thomas > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > Best Regards, Thomas > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > >>>>> > >>>>> > >>>> From david.holmes at oracle.com Tue Aug 29 08:41:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Aug 2017 18:41:40 +1000 Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing. In-Reply-To: <11ed4c208b8641088b99a2066dfc992b@sap.com> References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> Message-ID: <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com> On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote: > Hi David, > > I fixed the indentation and added you as reviewer. > I replaced the webrev in-place: > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/ > The new code went through all our testing ... except for some ppc/s390 builds > that failed because of an other change pushed to hs tonight. But that should > not matter, it all passed with the jdk10/jdk10 testing. > > Would you mind sponsoring? No problem - but can we get Thomas to sign-off on this latest version please. Thanks, David > Best regards, > Goetz. > >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Dienstag, 29. August 2017 09:53 >> To: Lindenmaier, Goetz >> Cc: hotspot-runtime-dev at openjdk.java.net >> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file >> is missing. >> >> Hi Goetz, >> >> On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> this is a webrev with merged windows and posix implementations: >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >> dllBuildName/webrev.05/ >> >> I like the look of this. 
>> >> There are a couple of indention nits in os.cpp: >> >> 247 static bool conc_path_file_and_check(char *buffer, char >> *printbuffer, size_t printbuflen, >> 248 const char* pname, char lastchar, >> const char* fname) { >> >> >> 251 const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar >> == os::file_separator()[0]) ? >> 252 "" : os::file_separator(); >> >> >> Thanks, >> David >> >>> Best regards, >>> Goetz >>> >>>> -----Original Message----- >>>> From: Lindenmaier, Goetz >>>> Sent: Montag, 28. August 2017 12:10 >>>> To: 'David Holmes' >>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if >> file >>>> is missing. >>>> >>>> Hi, >>>> >>>> this are the changes needed to make the windows dll_locate_lib >>>> universally applicable. I also merge the three similar jio_snprintf >>>> calls into one method. >>>> I do some gymnastics to avoid another buffer of MAX_PATH_LEN >>>> at the first call to conc_path_file_and_check. >>>> I'll test this tonight. >>>> >>>> Best regards, >>>> Goetz. >>>> >>>> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp >>>> --- a/src/os/windows/vm/os_windows.cpp Thu Aug 17 17:26:02 >> 2017 >>>> +0200 >>>> +++ b/src/os/windows/vm/os_windows.cpp Mon Aug 28 12:02:26 >> 2017 >>>> +0200 >>>> @@ -1205,6 +1205,17 @@ >>>> return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES; >>>> } >>>> >>>> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t >>>> printbuflen, >>>> + const char* pname, char lastchar, const char* fname) { >>>> + char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == >>>> os::file_seperator()[0]) ? 
"" : os::file_separator(); >>>> + int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", path, filesep, >>>> fullfname); >>>> + if (ret != -1) { >>>> + struct stat statbuf; >>>> + return os::stat(buffer, &statbuf) == 0; >>>> + } >>>> + return false; >>>> +} >>>> + >>>> bool os::dll_locate_lib(char *buffer, size_t buflen, >>>> const char* pname, const char* fname) { >>>> bool retval = false; >>>> @@ -1220,11 +1231,8 @@ >>>> if (p != NULL) { >>>> const size_t plen = strlen(buffer); >>>> const char lastchar = buffer[plen - 1]; >>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; >>>> - int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, >>>> fullfname); >>>> - if (ret != -1) { >>>> - retval = file_exists(buffer); >>>> - } >>>> + retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - >>>> plen, >>>> + "", lastchar, fullfname); >>>> } >>>> } else if (strchr(pname, *os::path_separator()) != NULL) { >>>> int n; >>>> @@ -1238,12 +1246,8 @@ >>>> continue; // skip the empty path values >>>> } >>>> const char lastchar = path[plen - 1]; >>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\"; >>>> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, >>>> fullfname); >>>> - if (ret != -1 && file_exists(buffer)) { >>>> - retval = true; >>>> - break; >>>> - } >>>> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, >>>> lastchar, fullfname); >>>> + if (retval) break; >>>> } >>>> // release the storage >>>> for (int i = 0; i < n; i++) { >>>> @@ -1255,11 +1259,7 @@ >>>> } >>>> } else { >>>> const char lastchar = pname[pnamelen-1]; >>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? 
"" : "\\"; >>>> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, >>>> fullfname); >>>> - if (ret != -1) { >>>> - retval = file_exists(buffer); >>>> - } >>>> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, >> lastchar, >>>> fullfname); >>>> } >>>> } >>>> >>>> >>>>> -----Original Message----- >>>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>>> Sent: Montag, 28. August 2017 07:38 >>>>> To: Lindenmaier, Goetz >>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if >>>> file >>>>> is missing. >>>>> >>>>> Hi Goetz, >>>>> >>>>> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: >>>>>> Hi, >>>>>> >>>>>> I please need a second review and a sponsor: >>>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> dllBuildName/webrev.04 >>>>>> >>>>>> To update my description of the change to the status after Thomas' >>>> review: >>>>>> >>>>>> dll_build_name builds the proper path to a library given a list of paths >>>>> separated by >>>>>> path_seperator and a library name. It adds in the platform specific >>>> endings >>>>> etc. >>>>>> It is documented to return whether the file exists, but only does so if a >>>>> path_seperator >>>>>> exists in the path. >>>>>> Especially if the path is empty, it just returns ?true? without checking. >>>>>> >>>>>> Dll_build_name is usually used before calling dll_load. If dll_load does >>>> not >>>>> get a full path it searches >>>>>> in well known unix/windows locations. This is intended in the two cases >>>>> where dll_build_name >>>>>> is called with an empty path. >>>>>> >>>>>> I renamed dll_build_name to dll_locate_lib and changed it's behavior to >>>>> always return >>>>>> a full path to the lib, inserting current working directory if no path is >> given. 
>>>>>> For the use case where "" was actually passed to the function, I added >> a >>>>> new function >>>>>> (reusing the old function name) dll_build_name that just adds system >>>>> dependent prefix and suffix >>>>>> to the name. >>>>>> I merged all unix implementations to the posix os branch. >>>>> >>>>> I started to look at this and have applied the patch to run through some >>>>> basic testing. The overall approach seems reasonable. But it is hard to >>>>> track all the details - in particular whether there were any subtle >>>>> differences across the "posix" systems? >>>>> >>>>> I'm wondering what, if any, significant differences exist between the >>>>> Windows and POSIX versions? I would hope the platform differences >> could >>>>> easily be hidden behind macros (for path separator, library suffix etc). >>>>> Then perhaps this could just go in shared code (os.hpp, os.cpp)? >>>>> >>>>> That aside, in the Windows code shouldn't the hardwired .dll strings >>>>> actually be JNI_LIB_SUFFIX? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Best regards, >>>>>> Goetz. >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] >>>>>>> Sent: Dienstag, 22. August 2017 17:30 >>>>>>> To: Lindenmaier, Goetz >>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file >> is >>>>>>> missing. >>>>>>> >>>>>>> Looks good. >>>>>>> >>>>>>> ..Thomas >>>>>>> >>>>>>> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz >>>>>>> >>> >>>>> wrote: >>>>>>> >>>>>>> >>>>>>> I mistyped the path to webrev, this should work: >>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> dllBuildName/webrev.04 >>>>> >>>>>> dllBuildName/webrev.04> >>>>>>> >>>>>>> Sorry, >>>>>>> Goetz >>>>>>> >>>>>>> >>>>>>> >>>>>>> > -----Original Message----- >>>>>>> > From: Lindenmaier, Goetz >>>>>>> > Sent: Dienstag, 22. 
August 2017 15:48 >>>>>>> > To: 'Thomas St?fe' >>>>>> > >>>>>>> > Cc: hotspot-runtime-dev at openjdk.java.net >>>> runtime- >>>>>>> dev at openjdk.java.net> >>>>>>> > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if >>>>>>> file is >>>>>>> > missing. >>>>>>> > >>>>>>> > Hi, >>>>>>> > >>>>>>> > could I please get a second review? >>>>>>> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName- >>>>>>> hs/webrev.04 >>>> dllBuildName- >>>>>>> hs/webrev.04> >>>>>>> > >>>>>>> > I had to update the webrev because of a problem on windows. >>>>>>> > @Thomas I had edited os.hpp, but not saved :( >>>>>>> > >>>>>>> > Best regards, >>>>>>> > Goetz. >>>>>>> > >>>>>>> > PS: Didn't double-check the webrev as cr server is slow. >>>>>>> > >>>>>>> > > -----Original Message----- >>>>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>>>> ] >>>>>>> > > Sent: Donnerstag, 17. August 2017 19:54 >>>>>>> > > To: Lindenmaier, Goetz >>>>>> > >>>>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> runtime-dev at openjdk.java.net> >>>>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even >>>>> if >>>>>>> file is >>>>>>> > > missing. >>>>>>> > > >>>>>>> > > Hi Goetz, >>>>>>> > > >>>>>>> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz >>>>>>> > > >>>>>> >>>>> >>>>>> > > wrote: >>>>>>> > > >>>>>>> > > >>>>>>> > > Hi Thomas, >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > I adapted the comments in os.hpp. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > If I move the call to dll_build_name out of dll_locate_lib >>>>>>> > > >>>>>>> > > I have to do a lot of coding in all the places where it is called. >>>>>>> > > >>>>>>> > > That seems not useful to me. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Fixed the type to size_t. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > One could merge posix/windows if putting the check for ?:? >>>>>>> > > >>>>>>> > > into a WINDOWS_ONLY() I guess. 
The check for \ could be >>>>>>> > > >>>>>>> > > done in posix as well, if using file_seperator(). >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > * Not your change, but: why does the code in >>>>> os::dll_locate_lib() >>>>>>> even >>>>>>> > > >>>>>>> > > * differentiate between a PATH containing no >>>>>>> os::path_separator() >>>>>>> > > >>>>>>> > > * and a path containing os::path_separator()? >>>>>>> > > >>>>>>> > > I assume this was done to avoid all the allocations and copying >>>>> of >>>>>>> the >>>>>>> > > path. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Also adapted the comment in jvmtiExport.cpp. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > New webrev: >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > dllBuildName/webrev.03/ >>>>>>> >>>>>> >>>>>>> > > dllBuildName/webrev.03/> >>>>>>> > > >>>>>>> > > incremental diff: >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > dllBuildName/webrev.03/diffs-incremental.patch >>>>>>> > > >>>>>> >>>>>>> > > dllBuildName/webrev.03/diffs-incremental.patch> >>>>>>> > > >>>>>>> > > (fixed indentation on windows) >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Best regards, >>>>>>> > > >>>>>>> > > Goetz. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Comments in os.hpp seem unchanged ? >>>>>>> > > >>>>>>> > > But looks fine otherwise. I do not need another webrev. 
>>>>>>> > > >>>>>>> > > Thanks, Thomas >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>>>> >>>>>>> > > >>>>>> > ] >>>>>>> > > Sent: Thursday, August 17, 2017 3:48 PM >>>>>>> > > To: Lindenmaier, Goetz >>>>>> >>>>>>> > > >>>>>> > > >>>>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> runtime-dev at openjdk.java.net> >>>> >>>>>> runtime-> >>>>>>> > > dev at openjdk.java.net > >>>>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns true >>>>> even >>>>>>> if file >>>>>>> > > is missing. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Hi Goetz, >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz >>>>>>> > > >>>>>> >>>>> >>>>>> > > wrote: >>>>>>> > > >>>>>>> > > Hi Thomas, >>>>>>> > > >>>>>>> > > I reworked the whole thing. >>>>>>> > > >>>>>>> > > First, there is dll_build_name. It just does -> >>>>>>> > > lib.so. >>>>>>> > > >>>>>>> > > Second, I renamed the legacy dll_build_name to >>>>>>> dll_locate_lib. >>>>>>> > > >>>>>>> > > I merged all the unix variants to one in os_posix. >>>>>>> > > >>>>>>> > > I removed the buffer overflow check at the top. >>>>>>> > > It's too restrictive because the path argument >>>>>>> > > can contain several paths. I added the overflow >>>>>>> > > checks into the single cases. >>>>>>> > > >>>>>>> > > Also, I first assemble the pure name using the new, simple >>>>>>> > > dll_build_name. This is for reuse and readability. >>>>>>> > > >>>>>>> > > In case of an empty directory, I use get_current_directory >>>>>>> > > to complete the path as indicated by the original >>>>>>> > > documentation >>>>>>> > > where it was called with "". >>>>>>> > > Dll_locate_lib now always returns a name with a full path if >>>>>>> > > the file exists. 
>>>>>>> > > >>>>>>> > > Also, on windows, I think I fixed a bug by reversing the >>>>> order >>>>>>> > > of checks. A path list ending in ':' or '\' would not have >>>>>>> > > been recognized. >>>>>>> > > >>>>>>> > > On Bsd, I removed JNI_LIB_* because that already is >>>>> defined >>>>>>> > > in jvm_bsh.h >>>>>>> > > >>>>>>> > > New webrev: >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > dllBuildName/webrev.02/ >>>>>>> >>>>>> >>>>>>> > > dllBuildName/webrev.02/> >>>>>>> > > >>>>>>> > > Best regards, >>>>>>> > > Goetz. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > I like this better than before. Remarks: >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > >>>>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html >>>>>>> > > >>>>>> >>>>>>> > > >>>>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > + // Builds the platform-specific name of a library. >>>>>>> > > >>>>>>> > > + // Returns false on __buffer overflow__. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Hopefully not! :D >>>>>>> > > >>>>>>> > > How about: "Returns false no truncation" instead. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > + // Builds a platform-specific full library path given an ld path >>>>>>> and lib >>>>>>> > > name. >>>>>>> > > >>>>>>> > > + // Returns true if the buffer contains a full path to an existing >>>>>>> file, >>>>>>> > > false >>>>>>> > > >>>>>>> > > + // otherwise. If pathname is empty, checks the current >>>>>>> directory. >>>>>>> > > >>>>>>> > > + static bool dll_locate_lib(char* buffer, size_t size, >>>>>>> > > >>>>>>> > > const char* pathname, const char* >>>>>>> fname); >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Might be worth mentioning that "fname" is the unadorned >>>>> library >>>>>>> > > name, e.g. "verify" for libverify.so or verify.dll. 
>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Would the following alternative be valid: >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > one could make dll_locate_lib take the real file name, and let >>>>>>> caller >>>>>>> > > use dll_build_name() to build the libary name first before handing >>>>> it >>>>>>> to >>>>>>> > > dll_locate_lib(). In that case, dll_locate_lib() could be renamed to >>>>> a >>>>>>> generic >>>>>>> > > "find_file_in_path" because it would work for any kind of file. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > As an added bonus, there would be no need to create a >>>>>>> temporary >>>>>>> > > array in dll_build_name/dll_locate_lib, and no need to call free() >>>>> so >>>>>>> no >>>>>>> > > cleanup-related control flow changes in these functions. >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > ===== >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > >>>>>>> >>>>> >> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html >>>>>>> > > >>>>>> >>>>>>> > > >>>>>>> >>>>> >>>> >> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html> >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + >>>>>>> > > strlen(JNI_LIB_SUFFIX); >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > int -> size_t (does that even compile without warning?) >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > + // Check current working directory. >>>>>>> > > >>>>>>> > > + const char* p = get_current_directory(buffer, buflen); >>>>>>> > > >>>>>>> > > + if (p != NULL && >>>>>>> > > >>>>>>> > > + strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) { >>>>>>> > > >>>>>>> > > + strcat(buffer, "\\"); >>>>>>> > > >>>>>>> > > + strcat(buffer, fullfname); >>>>>>> > > >>>>>>> > > + retval = file_exists(buffer); >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Small nit: I'd use jio_snprintf instead of strcat. 
Functionally >>>>>>> identical but >>>>>>> > > will make scanners (e.g. coverity) happy. One could then avoid >>>>> the >>>>>>> length >>>>>>> > > calculation and rely on jio_snprintf truncation: >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > const char* p = get_current_directory(buffer, buflen); >>>>>>> > > >>>>>>> > > if (p != NULL) { >>>>>>> > > >>>>>>> > > const size_t end = strlen(p); >>>>>>> > > >>>>>>> > > if (jio_snprintf(end, buflen - end, "\\%s", fullname) != -1) { >>>>>>> > > >>>>>>> > > retval = file_exists(buffer); >>>>>>> > > >>>>>>> > > } >>>>>>> > > >>>>>>> > > } >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > -- >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Not your change, but: why does the code in os::dll_locate_lib() >>>>>>> even >>>>>>> > > differentiate between a PATH containing no os::path_separator() >>>>>>> and a path >>>>>>> > > containing os::path_separator()? >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Would the former not be just a PATH with only one directory >>>>> and >>>>>>> hence >>>>>>> > > need no special treatment? >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > ===== >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > >>>>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html >>>>>>> > > >>>>>> >>>>>>> > > >>>>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Could os::dll_locate_lib be consolidated between windows and >>>>>>> unix? >>>>>>> > > Seems to be the implementation is almost identical. 
>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > ==== >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > >>>>>>> >>>> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html >>>>>>> > > >>>>>> >>>>>>> > > >>>>>>> >>>>> >> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > + // not found - try library path >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Proposal: "not found - try OS default library path" >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > Find some comments inline: >>>>>>> > > >>>>>>> > > >>>>>>> > > > Especially if the path is empty, it just returns 'true'. >>>>>>> > > > Dll_build_name is usually used before calling dll_load. >>>>>>> If >>>>>>> > > dll_load does not get a full path it searches >>>>>>> > > > in well known unix/windows locations. This is intended >>>>>>> in >>>>>>> > > the two cases where dll_build_name >>>>>>> > > > is called with an empty path. >>>>>>> > > > >>>>>>> > > > So, for both cases (thread.cpp, jvmtiExport.cpp), >>>>>>> > > > >>>>>>> > > > before, we would call os::dll_build_name() with an empty >>>>>>> > > string for the path >>>>>>> > > > which, for relative paths, would result in feeding that path >>>>>>> > > unexpanded to >>>>>>> > > > dlopen(), which would use whatever the OS does in those >>>>>>> > > cases (LIBPATH, >>>>>>> > > > LD_LIBRARY_PATH, PATH on windows). Note that this >>>>> does >>>>>>> > > not necessarily >>>>>>> > > > include searching the current directory. >>>>>>> > > Right. With changed dll_biuld_name it's again exactly as >>>>>>> > > before. >>>>>>> > > >>>>>>> > > > With your change, we now use java.library.path, which is >>>>>>> not >>>>>>> > > necessarily the >>>>>>> > > > same? >>>>>>> > > You are right, I oversaw that java.library.path can be >>>>>>> > > overwritten. Initially, >>>>>>> > > it's set to the right thing. 
>>>>>>> > > >>>>>>> > > > (BTW, I think the old comments in thread.cpp and >>>>>>> > > jniExport.cpp were wrong:"// >>>>>>> > > > Try the local directory" - if "local" means "current", this is >>>>>>> not >>>>>>> > > what did >>>>>>> > > > happen). >>>>>>> > > Right, I tried to adapt them, did I miss one? >>>>>>> > > >>>>>>> > > > I added a second variant of dll_build_name without >>>>> the >>>>>>> > > path argument that adds the path >>>>>>> > > > from system property java.lang.path and use that in >>>>>>> these >>>>>>> > > two cases. >>>>>>> > > > I changed the original function to actually check file >>>>>>> > > availability in all cases, >>>>>>> > > > and to check . if the path is empty. >>>>>>> > > > I think that may be a bit confusing. We would then have >>>>>>> three >>>>>>> > > options: >>>>>>> > > > >>>>>>> > > > - call os::dll_build_name with a real ";;.." PATH >>>>>>> and >>>>>>> > > get a file name >>>>>>> > > > resolved from that path >>>>>>> > > > - call os::dll_build_name with "" for the PATH and get OS >>>>>>> dll >>>>>>> > > resolution >>>>>>> > > No, in that case, as I called file_exists(), it would only work if >>>>>>> > > the dll is in the >>>>>>> > > current working directory. But I changed this now, anyways. >>>>>>> > > >>>>>>> > > > - call your new overloaded version of >>>>> os::dll_build_name(), >>>>>>> > > which uses - >>>>>>> > > > Djava.library.path. >>>>>>> > > > >>>>>>> > > > Please review this change. I please need a sponsor. >>>>>>> > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>>> > > >>>>>> > >>>>>>> > > > dllBuildName/webrev.01/ >>>>>>> > > >>>>>> >>>>>>> > > >>>>>> > >>>>>>> > > > dllBuildName/webrev.01/> >>>>>>> > > >>>>>>> > > > >>>>>>> > > > Best regards, >>>>>>> > > > Goetz. 
>>>>>>> > > > Kind Regards, Thomas
>>>>>>> > > Best Regards, Thomas

From thomas.stuefe at gmail.com Tue Aug 29 09:44:26 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Aug 2017 11:44:26 +0200
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com>
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com>
Message-ID:

Hi,

I am fine with this change going in in its current form. Let's get this patch in.

I see some issues, but they can be addressed in follow-up items:

1) os::path_separator() should really return a char, not a string. Most callers seem to reference os::path_separator()[0].

2) If the user-provided buffer is too small, we will fail, which looks as if the dll could not be located. I am not sure we have to be shy about allocating memory: internally, we malloc a buffer for assembling the filename, and then os::splitpath() will malloc a whole bunch of arrays too. So I think we could just return the dll path location in a malloced buffer and require the caller to free it.

3) I do not understand ':' as a file separator on Windows. So a path is allowed to contain e.g. "C:;"? That would mean "a path relative to the current directory currently active on drive C", if I am not mistaken. Do we want to support this, and what is the use case?

Kind Regards, Thomas

On Tue, Aug 29, 2017 at 10:41 AM, David Holmes wrote:

> On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote:
>
>> Hi David,
>>
>> I fixed the indentation and added you as reviewer.
>> I replaced the webrev in-place: >> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/ >> The new code went through all our testing ... except for some ppc/s390 >> builds >> that failed because of an other change pushed to hs tonight. But that >> should >> not matter, it all passed with the jdk10/jdk10 testing. >> >> Would you mind sponsoring? >> > > No problem - but can we get Thomas to sign-off on this latest version > please. > > > Thanks, > David > > Best regards, >> Goetz. >> >> -----Original Message----- >>> From: David Holmes [mailto:david.holmes at oracle.com] >>> Sent: Dienstag, 29. August 2017 09:53 >>> To: Lindenmaier, Goetz >>> Cc: hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if >>> file >>> is missing. >>> >>> Hi Goetz, >>> >>> On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote: >>> >>>> Hi, >>>> >>>> this is a webrev with merged windows and posix implementations: >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> >>> dllBuildName/webrev.05/ >>> >>> I like the look of this. >>> >>> There are a couple of indention nits in os.cpp: >>> >>> 247 static bool conc_path_file_and_check(char *buffer, char >>> *printbuffer, size_t printbuflen, >>> 248 const char* pname, char lastchar, >>> const char* fname) { >>> >>> >>> 251 const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar >>> == os::file_separator()[0]) ? >>> 252 "" : os::file_separator(); >>> >>> >>> Thanks, >>> David >>> >>> Best regards, >>>> Goetz >>>> >>>> -----Original Message----- >>>>> From: Lindenmaier, Goetz >>>>> Sent: Montag, 28. August 2017 12:10 >>>>> To: 'David Holmes' >>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even >>>>> if >>>>> >>>> file >>> >>>> is missing. >>>>> >>>>> Hi, >>>>> >>>>> this are the changes needed to make the windows dll_locate_lib >>>>> universally applicable. 
I also merge the three similar jio_snprintf >>>>> calls into one method. >>>>> I do some gymnastics to avoid another buffer of MAX_PATH_LEN >>>>> at the first call to conc_path_file_and_check. >>>>> I'll test this tonight. >>>>> >>>>> Best regards, >>>>> Goetz. >>>>> >>>>> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp >>>>> --- a/src/os/windows/vm/os_windows.cpp Thu Aug 17 17:26:02 >>>>> >>>> 2017 >>> >>>> +0200 >>>>> +++ b/src/os/windows/vm/os_windows.cpp Mon Aug 28 12:02:26 >>>>> >>>> 2017 >>> >>>> +0200 >>>>> @@ -1205,6 +1205,17 @@ >>>>> return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES; >>>>> } >>>>> >>>>> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t >>>>> printbuflen, >>>>> + const char* pname, char lastchar, const >>>>> char* fname) { >>>>> + char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == >>>>> os::file_seperator()[0]) ? "" : os::file_separator(); >>>>> + int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", path, >>>>> filesep, >>>>> fullfname); >>>>> + if (ret != -1) { >>>>> + struct stat statbuf; >>>>> + return os::stat(buffer, &statbuf) == 0; >>>>> + } >>>>> + return false; >>>>> +} >>>>> + >>>>> bool os::dll_locate_lib(char *buffer, size_t buflen, >>>>> const char* pname, const char* fname) { >>>>> bool retval = false; >>>>> @@ -1220,11 +1231,8 @@ >>>>> if (p != NULL) { >>>>> const size_t plen = strlen(buffer); >>>>> const char lastchar = buffer[plen - 1]; >>>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? 
"" : >>>>> "\\"; >>>>> - int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", >>>>> filesep, >>>>> fullfname); >>>>> - if (ret != -1) { >>>>> - retval = file_exists(buffer); >>>>> - } >>>>> + retval = conc_path_file_and_check(buffer, &buffer[plen], >>>>> buflen - >>>>> plen, >>>>> + "", lastchar, fullfname); >>>>> } >>>>> } else if (strchr(pname, *os::path_separator()) != NULL) { >>>>> int n; >>>>> @@ -1238,12 +1246,8 @@ >>>>> continue; // skip the empty path values >>>>> } >>>>> const char lastchar = path[plen - 1]; >>>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" >>>>> : "\\"; >>>>> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, >>>>> filesep, >>>>> fullfname); >>>>> - if (ret != -1 && file_exists(buffer)) { >>>>> - retval = true; >>>>> - break; >>>>> - } >>>>> + retval = conc_path_file_and_check(buffer, buffer, buflen, >>>>> path, >>>>> lastchar, fullfname); >>>>> + if (retval) break; >>>>> } >>>>> // release the storage >>>>> for (int i = 0; i < n; i++) { >>>>> @@ -1255,11 +1259,7 @@ >>>>> } >>>>> } else { >>>>> const char lastchar = pname[pnamelen-1]; >>>>> - char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : >>>>> "\\"; >>>>> - int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, >>>>> fullfname); >>>>> - if (ret != -1) { >>>>> - retval = file_exists(buffer); >>>>> - } >>>>> + retval = conc_path_file_and_check(buffer, buffer, buflen, path, >>>>> >>>> lastchar, >>> >>>> fullfname); >>>>> } >>>>> } >>>>> >>>>> >>>>> -----Original Message----- >>>>>> From: David Holmes [mailto:david.holmes at oracle.com] >>>>>> Sent: Montag, 28. August 2017 07:38 >>>>>> To: Lindenmaier, Goetz >>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even >>>>>> if >>>>>> >>>>> file >>>>> >>>>>> is missing. 
>>>>>> >>>>>> Hi Goetz, >>>>>> >>>>>> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I please need a second review and a sponsor: >>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>> >>>>>> dllBuildName/webrev.04 >>>>> >>>>>> >>>>>>> To update my description of the change to the status after Thomas' >>>>>>> >>>>>> review: >>>>> >>>>>> >>>>>>> dll_build_name builds the proper path to a library given a list of >>>>>>> paths >>>>>>> >>>>>> separated by >>>>>> >>>>>>> path_seperator and a library name. It adds in the platform specific >>>>>>> >>>>>> endings >>>>> >>>>>> etc. >>>>>> >>>>>>> It is documented to return whether the file exists, but only does so >>>>>>> if a >>>>>>> >>>>>> path_seperator >>>>>> >>>>>>> exists in the path. >>>>>>> Especially if the path is empty, it just returns ?true? without >>>>>>> checking. >>>>>>> >>>>>>> Dll_build_name is usually used before calling dll_load. If dll_load >>>>>>> does >>>>>>> >>>>>> not >>>>> >>>>>> get a full path it searches >>>>>> >>>>>>> in well known unix/windows locations. This is intended in the two >>>>>>> cases >>>>>>> >>>>>> where dll_build_name >>>>>> >>>>>>> is called with an empty path. >>>>>>> >>>>>>> I renamed dll_build_name to dll_locate_lib and changed it's behavior >>>>>>> to >>>>>>> >>>>>> always return >>>>>> >>>>>>> a full path to the lib, inserting current working directory if no >>>>>>> path is >>>>>>> >>>>>> given. >>> >>>> For the use case where "" was actually passed to the function, I added >>>>>>> >>>>>> a >>> >>>> new function >>>>>> >>>>>>> (reusing the old function name) dll_build_name that just adds system >>>>>>> >>>>>> dependent prefix and suffix >>>>>> >>>>>>> to the name. >>>>>>> I merged all unix implementations to the posix os branch. >>>>>>> >>>>>> >>>>>> I started to look at this and have applied the patch to run through >>>>>> some >>>>>> basic testing. The overall approach seems reasonable. 
But it is hard >>>>>> to >>>>>> track all the details - in particular whether there were any subtle >>>>>> differences across the "posix" systems? >>>>>> >>>>>> I'm wondering what, if any, significant differences exist between the >>>>>> Windows and POSIX versions? I would hope the platform differences >>>>>> >>>>> could >>> >>>> easily be hidden behind macros (for path separator, library suffix etc). >>>>>> Then perhaps this could just go in shared code (os.hpp, os.cpp)? >>>>>> >>>>>> That aside, in the Windows code shouldn't the hardwired .dll strings >>>>>> actually be JNI_LIB_SUFFIX? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> Best regards, >>>>>>> Goetz. >>>>>>> >>>>>>> >>>>>>> >>>>>>> -----Original Message----- >>>>>>>> From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] >>>>>>>> Sent: Dienstag, 22. August 2017 17:30 >>>>>>>> To: Lindenmaier, Goetz >>>>>>>> Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>>> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if >>>>>>>> file >>>>>>>> >>>>>>> is >>> >>>> missing. >>>>>>>> >>>>>>>> Looks good. >>>>>>>> >>>>>>>> ..Thomas >>>>>>>> >>>>>>>> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz >>>>>>>> >>>>>>>> >>>>>>> >>>> wrote: >>>>>> >>>>>>> >>>>>>>> >>>>>>>> I mistyped the path to webrev, this should work: >>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> dllBuildName/webrev.04 >>>>>>>> >>>>>>> >>>>> >>>>>>> dllBuildName/webrev.04> >>>>>>>> >>>>>>>> Sorry, >>>>>>>> Goetz >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> > -----Original Message----- >>>>>>>> > From: Lindenmaier, Goetz >>>>>>>> > Sent: Dienstag, 22. August 2017 15:48 >>>>>>>> > To: 'Thomas St?fe' >>>>>>> > >>>>>>>> > Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> >>>>>>> runtime- >>>>>> >>>>>>> dev at openjdk.java.net> >>>>>>>> > Subject: RE: RFR(M): 8186072: dll_build_name returns true >>>>>>>> even if >>>>>>>> file is >>>>>>>> > missing. 
>>>>>>>> > >>>>>>>> > Hi, >>>>>>>> > >>>>>>>> > could I please get a second review? >>>>>>>> > http://cr.openjdk.java.net/~go >>>>>>>> etz/wr17/8186072-dllBuildName- >>>>>>>> hs/webrev.04 >>>>>>> >>>>>>> dllBuildName- >>>>>> >>>>>>> hs/webrev.04> >>>>>>>> > >>>>>>>> > I had to update the webrev because of a problem on >>>>>>>> windows. >>>>>>>> > @Thomas I had edited os.hpp, but not saved :( >>>>>>>> > >>>>>>>> > Best regards, >>>>>>>> > Goetz. >>>>>>>> > >>>>>>>> > PS: Didn't double-check the webrev as cr server is slow. >>>>>>>> > >>>>>>>> > > -----Original Message----- >>>>>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>>>>> ] >>>>>>>> > > Sent: Donnerstag, 17. August 2017 19:54 >>>>>>>> > > To: Lindenmaier, Goetz >>>>>>> > >>>>>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> hotspot- >>>>>>>> runtime-dev at openjdk.java.net> >>>>>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name returns >>>>>>>> true even >>>>>>>> >>>>>>> if >>>>>> >>>>>>> file is >>>>>>>> > > missing. >>>>>>>> > > >>>>>>>> > > Hi Goetz, >>>>>>>> > > >>>>>>>> > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz >>>>>>>> > > >>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>>>> > > wrote: >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Hi Thomas, >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > I adapted the comments in os.hpp. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > If I move the call to dll_build_name out of >>>>>>>> dll_locate_lib >>>>>>>> > > >>>>>>>> > > I have to do a lot of coding in all the places >>>>>>>> where it is called. >>>>>>>> > > >>>>>>>> > > That seems not useful to me. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Fixed the type to size_t. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > One could merge posix/windows if putting the check >>>>>>>> for ?:? >>>>>>>> > > >>>>>>>> > > into a WINDOWS_ONLY() I guess. The check for \ >>>>>>>> could be >>>>>>>> > > >>>>>>>> > > done in posix as well, if using file_seperator(). 
>>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > * Not your change, but: why does the code in >>>>>>>> >>>>>>> os::dll_locate_lib() >>>>>> >>>>>>> even >>>>>>>> > > >>>>>>>> > > * differentiate between a PATH containing no >>>>>>>> os::path_separator() >>>>>>>> > > >>>>>>>> > > * and a path containing os::path_separator()? >>>>>>>> > > >>>>>>>> > > I assume this was done to avoid all the allocations >>>>>>>> and copying >>>>>>>> >>>>>>> of >>>>>> >>>>>>> the >>>>>>>> > > path. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Also adapted the comment in jvmtiExport.cpp. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > New webrev: >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > dllBuildName/webrev.03/ >>>>>>>> >>>>>>> >>>>>>>> > > dllBuildName/webrev.03/> >>>>>>>> > > >>>>>>>> > > incremental diff: >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > dllBuildName/webrev.03/diffs-incremental.patch >>>>>>>> > > >>>>>>> >>>>>>>> > > dllBuildName/webrev.03/diffs-incremental.patch> >>>>>>>> > > >>>>>>>> > > (fixed indentation on windows) >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Best regards, >>>>>>>> > > >>>>>>>> > > Goetz. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Comments in os.hpp seem unchanged ? >>>>>>>> > > >>>>>>>> > > But looks fine otherwise. I do not need another webrev. 
>>>>>>>> > > >>>>>>>> > > Thanks, Thomas >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > From: Thomas St?fe [mailto:thomas.stuefe at gmail.com >>>>>>>> >>>>>>>> > > >>>>>>> > ] >>>>>>>> > > Sent: Thursday, August 17, 2017 3:48 PM >>>>>>>> > > To: Lindenmaier, Goetz >>>>>>> >>>>>>>> > > >>>>>>> > > >>>>>>>> > > Cc: hotspot-runtime-dev at openjdk.java.net >>>>>>> hotspot- >>>>>>>> runtime-dev at openjdk.java.net> >>>>>>> >>>>>>> >>>>> >>>>>>> runtime-> >>>>>>>> > > dev at openjdk.java.net > >>>>>>>> > > Subject: Re: RFR(M): 8186072: dll_build_name >>>>>>>> returns true >>>>>>>> >>>>>>> even >>>>>> >>>>>>> if file >>>>>>>> > > is missing. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Hi Goetz, >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz >>>>>>>> > > >>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>>>> > > wrote: >>>>>>>> > > >>>>>>>> > > Hi Thomas, >>>>>>>> > > >>>>>>>> > > I reworked the whole thing. >>>>>>>> > > >>>>>>>> > > First, there is dll_build_name. It just >>>>>>>> does -> >>>>>>>> > > lib.so. >>>>>>>> > > >>>>>>>> > > Second, I renamed the legacy dll_build_name >>>>>>>> to >>>>>>>> dll_locate_lib. >>>>>>>> > > >>>>>>>> > > I merged all the unix variants to one in >>>>>>>> os_posix. >>>>>>>> > > >>>>>>>> > > I removed the buffer overflow check at the >>>>>>>> top. >>>>>>>> > > It's too restrictive because the path >>>>>>>> argument >>>>>>>> > > can contain several paths. I added the >>>>>>>> overflow >>>>>>>> > > checks into the single cases. >>>>>>>> > > >>>>>>>> > > Also, I first assemble the pure name using >>>>>>>> the new, simple >>>>>>>> > > dll_build_name. This is for reuse and >>>>>>>> readability. 
>>>>>>>> > > >>>>>>>> > > In case of an empty directory, I use >>>>>>>> get_current_directory >>>>>>>> > > to complete the path as indicated by the >>>>>>>> original >>>>>>>> > > documentation >>>>>>>> > > where it was called with "". >>>>>>>> > > Dll_locate_lib now always returns a name >>>>>>>> with a full path if >>>>>>>> > > the file exists. >>>>>>>> > > >>>>>>>> > > Also, on windows, I think I fixed a bug by >>>>>>>> reversing the >>>>>>>> >>>>>>> order >>>>>> >>>>>>> > > of checks. A path list ending in ':' or '\' >>>>>>>> would not have >>>>>>>> > > been recognized. >>>>>>>> > > >>>>>>>> > > On Bsd, I removed JNI_LIB_* because that >>>>>>>> already is >>>>>>>> >>>>>>> defined >>>>>> >>>>>>> > > in jvm_bsh.h >>>>>>>> > > >>>>>>>> > > New webrev: >>>>>>>> > > http://cr.openjdk.java.net/~g >>>>>>>> oetz/wr17/8186072- >>>>>>>> >>>>>>>> > > dllBuildName/webrev.02/ >>>>>>>> >>>>>>> >>>>>>>> > > dllBuildName/webrev.02/> >>>>>>>> > > >>>>>>>> > > Best regards, >>>>>>>> > > Goetz. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > I like this better than before. Remarks: >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html >>>>>> >>>>>>> > > >>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> >>>>>> >>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > + // Builds the platform-specific name of a >>>>>>>> library. >>>>>>>> > > >>>>>>>> > > + // Returns false on __buffer overflow__. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Hopefully not! :D >>>>>>>> > > >>>>>>>> > > How about: "Returns false no truncation" instead. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > + // Builds a platform-specific full library path >>>>>>>> given an ld path >>>>>>>> and lib >>>>>>>> > > name. 
>>>>>>>> > > >>>>>>>> > > + // Returns true if the buffer contains a full >>>>>>>> path to an existing >>>>>>>> file, >>>>>>>> > > false >>>>>>>> > > >>>>>>>> > > + // otherwise. If pathname is empty, checks the >>>>>>>> current >>>>>>>> directory. >>>>>>>> > > >>>>>>>> > > + static bool dll_locate_lib(char* >>>>>>>> buffer, size_t size, >>>>>>>> > > >>>>>>>> > > const char* >>>>>>>> pathname, const char* >>>>>>>> fname); >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Might be worth mentioning that "fname" is the >>>>>>>> unadorned >>>>>>>> >>>>>>> library >>>>>> >>>>>>> > > name, e.g. "verify" for libverify.so or verify.dll. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Would the following alternative be valid: >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > one could make dll_locate_lib take the real file >>>>>>>> name, and let >>>>>>>> caller >>>>>>>> > > use dll_build_name() to build the libary name first >>>>>>>> before handing >>>>>>>> >>>>>>> it >>>>>> >>>>>>> to >>>>>>>> > > dll_locate_lib(). In that case, dll_locate_lib() could >>>>>>>> be renamed to >>>>>>>> >>>>>>> a >>>>>> >>>>>>> generic >>>>>>>> > > "find_file_in_path" because it would work for any kind >>>>>>>> of file. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > As an added bonus, there would be no need to create >>>>>>>> a >>>>>>>> temporary >>>>>>>> > > array in dll_build_name/dll_locate_lib, and no need to >>>>>>>> call free() >>>>>>>> >>>>>>> so >>>>>> >>>>>>> no >>>>>>>> > > cleanup-related control flow changes in these functions. 
>>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > ===== >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>>> >>>>>> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html >>> >>>> > > >>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>>> >>>>>> >>>>> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html> >>> >>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > + int fullfnamelen = strlen(JNI_LIB_PREFIX) + >>>>>>>> strlen(fname) + >>>>>>>> > > strlen(JNI_LIB_SUFFIX); >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > int -> size_t (does that even compile without >>>>>>>> warning?) >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > + // Check current working directory. >>>>>>>> > > >>>>>>>> > > + const char* p = get_current_directory(buffer, >>>>>>>> buflen); >>>>>>>> > > >>>>>>>> > > + if (p != NULL && >>>>>>>> > > >>>>>>>> > > + strlen(buffer) + 1 + fullfnamelen + 1 <= >>>>>>>> buflen) { >>>>>>>> > > >>>>>>>> > > + strcat(buffer, "\\"); >>>>>>>> > > >>>>>>>> > > + strcat(buffer, fullfname); >>>>>>>> > > >>>>>>>> > > + retval = file_exists(buffer); >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Small nit: I'd use jio_snprintf instead of strcat. >>>>>>>> Functionally >>>>>>>> identical but >>>>>>>> > > will make scanners (e.g. coverity) happy. 
One could >>>>>>>> then avoid >>>>>>>> >>>>>>> the >>>>>> >>>>>>> length >>>>>>>> > > calculation and rely on jio_snprintf truncation: >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > const char* p = get_current_directory(buffer, >>>>>>>> buflen); >>>>>>>> > > >>>>>>>> > > if (p != NULL) { >>>>>>>> > > >>>>>>>> > > const size_t end = strlen(p); >>>>>>>> > > >>>>>>>> > > if (jio_snprintf(end, buflen - end, "\\%s", >>>>>>>> fullname) != -1) { >>>>>>>> > > >>>>>>>> > > retval = file_exists(buffer); >>>>>>>> > > >>>>>>>> > > } >>>>>>>> > > >>>>>>>> > > } >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > -- >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Not your change, but: why does the code in >>>>>>>> os::dll_locate_lib() >>>>>>>> even >>>>>>>> > > differentiate between a PATH containing no >>>>>>>> os::path_separator() >>>>>>>> and a path >>>>>>>> > > containing os::path_separator()? >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Would the former not be just a PATH with only one >>>>>>>> directory >>>>>>>> >>>>>>> and >>>>>> >>>>>>> hence >>>>>>>> > > need no special treatment? >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > ===== >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html >>>>>> >>>>>>> > > >>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html> >>>>>> >>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Could os::dll_locate_lib be consolidated between >>>>>>>> windows and >>>>>>>> unix? >>>>>>>> > > Seems to be the implementation is almost identical. >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > ==== >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>>> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp. 
>>>>> udiff.html >>>>> >>>>>> > > >>>>>>> >>>>>>>> > > >>>>>>>> >>>>>>>> >>>>>> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> >>> >>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > + // not found - try library path >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Proposal: "not found - try OS default library path" >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Find some comments inline: >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > > Especially if the path is empty, it >>>>>>>> just returns 'true'. >>>>>>>> > > > Dll_build_name is usually used >>>>>>>> before calling dll_load. >>>>>>>> If >>>>>>>> > > dll_load does not get a full path it searches >>>>>>>> > > > in well known unix/windows >>>>>>>> locations. This is intended >>>>>>>> in >>>>>>>> > > the two cases where dll_build_name >>>>>>>> > > > is called with an empty path. >>>>>>>> > > > >>>>>>>> > > > So, for both cases (thread.cpp, >>>>>>>> jvmtiExport.cpp), >>>>>>>> > > > >>>>>>>> > > > before, we would call >>>>>>>> os::dll_build_name() with an empty >>>>>>>> > > string for the path >>>>>>>> > > > which, for relative paths, would result >>>>>>>> in feeding that path >>>>>>>> > > unexpanded to >>>>>>>> > > > dlopen(), which would use whatever the OS >>>>>>>> does in those >>>>>>>> > > cases (LIBPATH, >>>>>>>> > > > LD_LIBRARY_PATH, PATH on windows). Note >>>>>>>> that this >>>>>>>> >>>>>>> does >>>>>> >>>>>>> > > not necessarily >>>>>>>> > > > include searching the current directory. >>>>>>>> > > Right. With changed dll_biuld_name it's >>>>>>>> again exactly as >>>>>>>> > > before. >>>>>>>> > > >>>>>>>> > > > With your change, we now use >>>>>>>> java.library.path, which is >>>>>>>> not >>>>>>>> > > necessarily the >>>>>>>> > > > same? >>>>>>>> > > You are right, I oversaw that >>>>>>>> java.library.path can be >>>>>>>> > > overwritten. Initially, >>>>>>>> > > it's set to the right thing. 
>>>>>>>> > > >>>>>>>> > > > (BTW, I think the old comments in >>>>>>>> thread.cpp and >>>>>>>> > > jniExport.cpp were wrong:"// >>>>>>>> > > > Try the local directory" - if "local" >>>>>>>> means "current", this is >>>>>>>> not >>>>>>>> > > what did >>>>>>>> > > > happen). >>>>>>>> > > Right, I tried to adapt them, did I miss >>>>>>>> one? >>>>>>>> > > >>>>>>>> > > > I added a second variant of >>>>>>>> dll_build_name without >>>>>>>> >>>>>>> the >>>>>> >>>>>>> > > path argument that adds the path >>>>>>>> > > > from system property java.lang.path >>>>>>>> and use that in >>>>>>>> these >>>>>>>> > > two cases. >>>>>>>> > > > I changed the original function to >>>>>>>> actually check file >>>>>>>> > > availability in all cases, >>>>>>>> > > > and to check . if the path is empty. >>>>>>>> > > > I think that may be a bit confusing. We >>>>>>>> would then have >>>>>>>> three >>>>>>>> > > options: >>>>>>>> > > > >>>>>>>> > > > - call os::dll_build_name with a real >>>>>>>> ";;.." PATH >>>>>>>> and >>>>>>>> > > get a file name >>>>>>>> > > > resolved from that path >>>>>>>> > > > - call os::dll_build_name with "" for the >>>>>>>> PATH and get OS >>>>>>>> dll >>>>>>>> > > resolution >>>>>>>> > > No, in that case, as I called >>>>>>>> file_exists(), it would only work if >>>>>>>> > > the dll is in the >>>>>>>> > > current working directory. But I changed >>>>>>>> this now, anyways. >>>>>>>> > > >>>>>>>> > > > - call your new overloaded version of >>>>>>>> >>>>>>> os::dll_build_name(), >>>>>> >>>>>>> > > which uses - >>>>>>>> > > > Djava.library.path. >>>>>>>> > > > >>>>>>>> > > > Please review this change. I please >>>>>>>> need a sponsor. >>>>>>>> > > > http://cr.openjdk.java.net/~g >>>>>>>> oetz/wr17/8186072- >>>>>>>> >>>>>>>> > > >>>>>>> > >>>>>>>> > > > dllBuildName/webrev.01/ >>>>>>>> > > >>>>>>> >>>>>>>> > > >>>>>>> > >>>>>>>> > > > dllBuildName/webrev.01/> >>>>>>>> > > >>>>>>>> > > > >>>>>>>> > > > Best regards, >>>>>>>> > > > Goetz. 
>>>>>>>> > > > Kind Regards, Thomas
>>>>>>>> > > Best Regards, Thomas

From david.holmes at oracle.com Tue Aug 29 09:57:53 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 19:57:53 +1000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com>
Message-ID:

Hi Thomas,

On 29/08/2017 7:44 PM, Thomas Stüfe wrote:
> Hi,
>
> I am fine with this change going in in the current form. Let's get this
> patch in.

Now I'm questioning some things ...

> I see some issues, but they can be addressed in follow-up items:
>
> 1) os::path_separator() should really return a char, not a string. Most
> callers seem to reference os::path_separator()[0].

I've often wondered about that too, but that's a separate RFE.

> 2) If the user-provided buffer is too small, we will fail, which looks
> like the dll could not have been located. I am not sure we have to be
> shy with allocating memory - internally, we malloc a buffer for
> assembling the filename, and then os::splitpath() will malloc a whole
> bunch of arrays too. So I think we could just return the dll path
> location in a malloced buffer and require the caller to free.

That seems a much bigger change. My question before pushing this is: have we in any way reduced the size of path that we may accept on some platforms? If we have, that would be bad.

> 3) I do not understand ':' as a file separator on Windows. So, a path is
> allowed to contain e.g. "C:;" ? Which would mean "a path relative to the
> current directory currently active on drive C". If I am not mistaken. Do
> we want to support this, what is the use case?

IIUC the existing Windows code supports this. So yes, c:foo.dll is a reference to foo.dll in whatever the current directory on drive c: is. As for a use case ... perhaps a way to work around long paths or historical path format restrictions (i.e. no spaces)? But the main thing is to not break what we currently support.

Thanks,
David

> Kind Regards, Thomas
>
> On Tue, Aug 29, 2017 at 10:41 AM, David Holmes wrote:
>
> On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote:
>
> Hi David,
>
> I fixed the indentation and added you as reviewer.
> I replaced the webrev in-place:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>
> The new code went through all our testing ... except for some ppc/s390 builds
> that failed because of an other change pushed to hs tonight. But that should
> not matter, it all passed with the jdk10/jdk10 testing.
>
> Would you mind sponsoring?
>
> No problem - but can we get Thomas to sign-off on this latest version please.
>
> Thanks,
> David
>
> Best regards,
>   Goetz.
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Dienstag, 29. August 2017 09:53
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>
> Hi Goetz,
>
> On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote:
>
> Hi,
>
> this is a webrev with merged windows and posix implementations:
> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>
> I like the look of this.
>
> There are a couple of indention nits in os.cpp:
>
>  247 static bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
>  248
?const char* pname, > char lastchar, > const char* fname) { > > > ? ?251? ?const char *filesep = (WINDOWS_ONLY(lastchar == > ':' ||) lastchar > == os::file_separator()[0]) ? > ? ?252? ? ? ? ? ? ? ? ? ?"" : os::file_separator(); > > > Thanks, > David > > Best regards, > ? ? Goetz > > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Montag, 28. August 2017 12:10 > To: 'David Holmes' > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: RE: [ping] RFR(M): 8186072: dll_build_name > returns true even if > > file > > is missing. > > Hi, > > this are the changes needed to make the windows > dll_locate_lib > universally applicable. I also merge the three > similar jio_snprintf > calls into one method. > I do some gymnastics to avoid another buffer of > MAX_PATH_LEN > at the first call to conc_path_file_and_check. > I'll test this tonight. > > Best regards, > ? ? Goetz. > > diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp > --- a/src/os/windows/vm/os_windows.cpp? Thu Aug 17 > 17:26:02 > > 2017 > > +0200 > +++ b/src/os/windows/vm/os_windows.cpp? Mon Aug 28 > 12:02:26 > > 2017 > > +0200 > @@ -1205,6 +1205,17 @@ > ? ? ?return GetFileAttributes(filename) != > INVALID_FILE_ATTRIBUTES; > ? ?} > > +bool conc_path_file_and_check(char *buffer, char > *printbuffer, size_t > printbuflen, > +? ? ? ? ? ? ? ? ? ? ? ? ? ? ? const char* pname, > char lastchar, const char* fname) { > +? char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) > lastchar == > os::file_seperator()[0]) ? "" : os::file_separator(); > +? int ret = jio_snprintf(printbuffer, printbuflen, > "%s%s%s", path, filesep, > fullfname); > +? if (ret != -1) { > +? ? struct stat statbuf; > +? ? return os::stat(buffer, &statbuf) == 0; > +? } > +? return false; > +} > + > ? ?bool os::dll_locate_lib(char *buffer, size_t buflen, > ? ? ? ? ? ? ? ? ? ? ? ? ? ?const char* pname, const > char* fname) { > ? ? ?bool retval = false; > @@ -1220,11 +1231,8 @@ > ? ? ? ? ?if (p != NULL) { > ? ? ? ? ? 
?const size_t plen = strlen(buffer); > ? ? ? ? ? ?const char lastchar = buffer[plen - 1]; > -? ? ? ? char *filesep = (lastchar == ':' || > lastchar == '\\') ? "" : "\\"; > -? ? ? ? int ret = jio_snprintf(&buffer[plen], > buflen - plen, "%s%s", filesep, > fullfname); > -? ? ? ? if (ret != -1) { > -? ? ? ? ? retval = file_exists(buffer); > -? ? ? ? } > +? ? ? ? retval = conc_path_file_and_check(buffer, > &buffer[plen], buflen - > plen, > +? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? "", > lastchar, fullfname); > ? ? ? ? ?} > ? ? ? ?} else if (strchr(pname, > *os::path_separator()) != NULL) { > ? ? ? ? ?int n; > @@ -1238,12 +1246,8 @@ > ? ? ? ? ? ? ? ?continue; // skip the empty path values > ? ? ? ? ? ? ?} > ? ? ? ? ? ? ?const char lastchar = path[plen - 1]; > -? ? ? ? ? char *filesep = (lastchar == ':' || > lastchar == '\\') ? "" : "\\"; > -? ? ? ? ? int ret = jio_snprintf(buffer, buflen, > "%s%s%s", path, filesep, > fullfname); > -? ? ? ? ? if (ret != -1 && file_exists(buffer)) { > -? ? ? ? ? ? retval = true; > -? ? ? ? ? ? break; > -? ? ? ? ? } > +? ? ? ? ? retval = conc_path_file_and_check(buffer, > buffer, buflen, path, > lastchar, fullfname); > +? ? ? ? ? if (retval) break; > ? ? ? ? ? ?} > ? ? ? ? ? ?// release the storage > ? ? ? ? ? ?for (int i = 0; i < n; i++) { > @@ -1255,11 +1259,7 @@ > ? ? ? ? ?} > ? ? ? ?} else { > ? ? ? ? ?const char lastchar = pname[pnamelen-1]; > -? ? ? char *filesep = (lastchar == ':' || lastchar > == '\\') ? "" : "\\"; > -? ? ? int ret = jio_snprintf(buffer, buflen, > "%s%s%s", pname, filesep, > fullfname); > -? ? ? if (ret != -1) { > -? ? ? ? retval = file_exists(buffer); > -? ? ? } > +? ? ? retval = conc_path_file_and_check(buffer, > buffer, buflen, path, > > lastchar, > > fullfname); > ? ? ? ?} > ? ? ?} > > > -----Original Message----- > From: David Holmes > [mailto:david.holmes at oracle.com > ] > Sent: Montag, 28. 
August 2017 07:38 > To: Lindenmaier, Goetz > > > Cc: hotspot-runtime-dev at openjdk.java.net > > Subject: Re: [ping] RFR(M): 8186072: > dll_build_name returns true even if > > file > > is missing. > > Hi Goetz, > > On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote: > > Hi, > > I please need a second review and a sponsor: > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > dllBuildName/webrev.04 > > > To update my description of the change to > the status after Thomas' > > review: > > > dll_build_name builds the proper path to a > library given a list of paths > > separated by > > path_seperator and a library name. It adds > in the platform specific > > endings > > etc. > > It is documented to return whether the file > exists, but only does so if a > > path_seperator > > exists in the path. > Especially if the path is empty, it just > returns ?true? without checking. > > Dll_build_name is usually used before > calling dll_load.? If dll_load does > > not > > get a full path it searches > > in well known unix/windows locations. This > is intended in the two cases > > where dll_build_name > > is called with an empty path. > > I renamed dll_build_name to dll_locate_lib > and changed it's behavior to > > always return > > a full path to the lib, inserting current > working directory if no path is > > given. > > For the use case where "" was actually > passed to the function, I added > > a > > new function > > (reusing the old function name) > dll_build_name that just adds system > > dependent prefix and suffix > > to the name. > I merged all unix implementations to the > posix os branch. > > > I started to look at this and have applied the > patch to run through some > basic testing. The overall approach seems > reasonable. But it is hard to > track all the details - in particular whether > there were any subtle > differences across the "posix" systems? > > I'm wondering what, if any, significant > differences exist between the > Windows and POSIX versions? 
> I would hope the platform differences could
> easily be hidden behind macros (for path separator, library suffix etc).
> Then perhaps this could just go in shared code (os.hpp, os.cpp)?
>
> That aside, in the Windows code shouldn't the hardwired .dll strings
> actually be JNI_LIB_SUFFIX?
>
> Thanks,
> David
>
> Best regards,
>      Goetz.
>
> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Tuesday, 22 August 2017 17:30
> To: Lindenmaier, Goetz
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>
> Looks good.
>
> ..Thomas
>
> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
>
>     I mistyped the path to webrev, this should work:
>     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
>
>     Sorry,
>       Goetz
>
>     > -----Original Message-----
>     > From: Lindenmaier, Goetz
>     > Sent: Tuesday, 22 August 2017 15:48
>     > To: 'Thomas Stüfe'
>     > Cc: hotspot-runtime-dev at openjdk.java.net
>     > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>     >
>     > Hi,
>     >
>     > could I please get a second review?
>     > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
>     >
>     > I had to update the webrev because of a problem on windows.
>     > @Thomas I had edited os.hpp, but not saved :(
>     >
>     > Best regards,
>     >   Goetz.
>     >
>     > PS: Didn't double-check the webrev as cr server is slow.
>     >
>     > > -----Original Message-----
>     > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>     > > Sent: Thursday, 17 August 2017 19:54
>     > > To: Lindenmaier, Goetz
>     > > Cc: hotspot-runtime-dev at openjdk.java.net
>     > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>     > >
>     > > Hi Goetz,
>     > >
>     > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
>     > >
>     > >     Hi Thomas,
>     > >
>     > >     I adapted the comments in os.hpp.
>     > >
>     > >     If I move the call to dll_build_name out of dll_locate_lib
>     > >     I have to do a lot of coding in all the places where it is called.
>     > >     That seems not useful to me.
>     > >
>     > >     Fixed the type to size_t.
>     > >
>     > >     One could merge posix/windows if putting the check for ':'
>     > >     into a WINDOWS_ONLY() I guess. The check for \ could be
>     > >     done in posix as well, if using file_separator().
>     > >
>     > >     *  Not your change, but: why does the code in os::dll_locate_lib() even
>     > >     *  differentiate between a PATH containing no os::path_separator()
>     > >     *  and a path containing os::path_separator()?
>     > >
>     > >     I assume this was done to avoid all the allocations and copying of the path.
>     > >
>     > >     Also adapted the comment in jvmtiExport.cpp.
>     > >
>     > >     New webrev:
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
>     > >
>     > >     incremental diff:
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
>     > >     (fixed indentation on windows)
>     > >
>     > >     Best regards,
>     > >       Goetz.
>     > >
>     > > Comments in os.hpp seem unchanged ?
>     > >
>     > > But looks fine otherwise. I do not need another webrev.
>     > >
>     > > Thanks, Thomas
>     > >
>     > >     From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>     > >     Sent: Thursday, August 17, 2017 3:48 PM
>     > >     To: Lindenmaier, Goetz
>     > >     Cc: hotspot-runtime-dev at openjdk.java.net
>     > >     Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>     > >
>     > >     Hi Goetz,
>     > >
>     > >     On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
>     > >
>     > >             Hi Thomas,
>     > >
>     > >             I reworked the whole thing.
>     > >
>     > >             First, there is dll_build_name. It just does -> lib.so.
>     > >
>     > >             Second, I renamed the legacy dll_build_name to dll_locate_lib.
>     > >
>     > >             I merged all the unix variants to one in os_posix.
>     > >
>     > >             I removed the buffer overflow check at the top.
>     > >             It's too restrictive because the path argument
>     > >             can contain several paths. I added the overflow
>     > >             checks into the single cases.
>     > >
>     > >             Also, I first assemble the pure name using the new, simple
>     > >             dll_build_name. This is for reuse and readability.
>     > >
>     > >             In case of an empty directory, I use get_current_directory
>     > >             to complete the path as indicated by the original documentation
>     > >             where it was called with "".
>     > >             dll_locate_lib now always returns a name with a full path if
>     > >             the file exists.
>     > >
>     > >             Also, on windows, I think I fixed a bug by reversing the order
>     > >             of checks. A path list ending in ':' or '\' would not have
>     > >             been recognized.
>     > >
>     > >             On Bsd, I removed JNI_LIB_* because that already is defined
>     > >             in jvm_bsd.h
>     > >
>     > >             New webrev:
>     > >             http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>     > >
>     > >             Best regards,
>     > >               Goetz.
>     > >
>     > >     I like this better than before. Remarks:
>     > >
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
>     > >
>     > >     +  // Builds the platform-specific name of a library.
>     > >     +  // Returns false on __buffer overflow__.
>     > >
>     > >     Hopefully not! :D
>     > >     How about: "Returns false on truncation" instead.
>     > >
>     > >     +  // Builds a platform-specific full library path given an ld path and lib name.
>     > >     +  // Returns true if the buffer contains a full path to an existing file, false
>     > >     +  // otherwise. If pathname is empty, checks the current directory.
>     > >     +  static bool dll_locate_lib(char* buffer, size_t size,
>     > >     +                             const char* pathname, const char* fname);
>     > >
>     > >     Might be worth mentioning that "fname" is the unadorned library
>     > >     name, e.g. "verify" for libverify.so or verify.dll.
>     > >
>     > >     Would the following alternative be valid:
>     > >
>     > >     one could make dll_locate_lib take the real file name, and let the caller
>     > >     use dll_build_name() to build the library name first before handing it to
>     > >     dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic
>     > >     "find_file_in_path" because it would work for any kind of file.
>     > >
>     > >     As an added bonus, there would be no need to create a temporary
>     > >     array in dll_build_name/dll_locate_lib, and no need to call free() so no
>     > >     cleanup-related control flow changes in these functions.
>     > >
>     > >     =====
>     > >
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
>     > >
>     > >     +  int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
>     > >
>     > >     int -> size_t (does that even compile without warning?)
>     > >
>     > >     +    // Check current working directory.
>     > >     +    const char* p = get_current_directory(buffer, buflen);
>     > >     +    if (p != NULL &&
>     > >     +        strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
>     > >     +      strcat(buffer, "\\");
>     > >     +      strcat(buffer, fullfname);
>     > >     +      retval = file_exists(buffer);
>     > >
>     > >     Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
>     > >     will make scanners (e.g. coverity) happy. One could then avoid the length
>     > >     calculation and rely on jio_snprintf truncation:
>     > >
>     > >     const char* p = get_current_directory(buffer, buflen);
>     > >     if (p != NULL) {
>     > >       const size_t end = strlen(p);
>     > >       if (jio_snprintf(end, buflen - end, "\\%s", fullname) != -1) {
>     > >         retval = file_exists(buffer);
>     > >       }
>     > >     }
>     > >
>     > >     --
>     > >
>     > >     Not your change, but: why does the code in os::dll_locate_lib() even
>     > >     differentiate between a PATH containing no os::path_separator() and a path
>     > >     containing os::path_separator()?
>     > >
>     > >     Would the former not be just a PATH with only one directory and hence
>     > >     need no special treatment?
>     > >
>     > >     =====
>     > >
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.html
>     > >
>     > >     Could os::dll_locate_lib be consolidated between windows and unix?
>     > >     Seems to be the implementation is almost identical.
>     > >
>     > >     ====
>     > >
>     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
>     > >
>     > >     +        // not found - try library path
>     > >
>     > >     Proposal: "not found - try OS default library path"
>     > >
>     > >             Find some comments inline:
>     > >
>     > >             >       Especially if the path is empty, it just returns 'true'.
>     > >             >       Dll_build_name is usually used before calling dll_load. If
>     > >             >       dll_load does not get a full path it searches
>     > >             >       in well known unix/windows locations. This is intended in
>     > >             >       the two cases where dll_build_name
>     > >             >       is called with an empty path.
>     > >             >
>     > >             > So, for both cases (thread.cpp, jvmtiExport.cpp),
>     > >             >
>     > >             > before, we would call os::dll_build_name() with an empty string for the path
>     > >             > which, for relative paths, would result in feeding that path unexpanded to
>     > >             > dlopen(), which would use whatever the OS does in those cases (LIBPATH,
>     > >             > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
>     > >             > include searching the current directory.
>     > >             Right. With the changed dll_build_name it's again exactly as before.
>     > >
>     > >             > With your change, we now use java.library.path, which is not necessarily the
>     > >             > same?
>     > >             You are right, I overlooked that java.library.path can be
>     > >             overwritten. Initially, it's set to the right thing.
>     > >
>     > >             > (BTW, I think the old comments in thread.cpp and jniExport.cpp were wrong:"//
>     > >             > Try the local directory" - if "local" means "current", this is not what did
>     > >             > happen).
>     > >             Right, I tried to adapt them, did I miss one?
>     > >
>     > >             >       I added a second variant of dll_build_name without the
>     > >             >       path argument that adds the path
>     > >             >       from system property java.lang.path and use that in these
>     > >             >       two cases.
>     > >             >       I changed the original function to actually check file
>     > >             >       availability in all cases,
>     > >             >       and to check . if the path is empty.
>     > >             > I think that may be a bit confusing. We would then have three options:
>     > >             >
>     > >             > - call os::dll_build_name with a real ";;.." PATH and get a file name
>     > >             > resolved from that path
>     > >             > - call os::dll_build_name with "" for the PATH and get OS dll resolution
>     > >             No, in that case, as I called file_exists(), it would only work if
>     > >             the dll is in the current working directory. But I changed this now, anyways.
>     > >
>     > >             > - call your new overloaded version of os::dll_build_name(), which uses
>     > >             > -Djava.library.path.
>     > >             >
>     > >             >       Please review this change. I please need a sponsor.
>     > >             >       http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
>     > >             >
>     > >             >       Best regards,
>     > >             >         Goetz.
>     > >             >
>     > >             > Kind Regards, Thomas
>     > >
>     > >     Best Regards, Thomas


From goetz.lindenmaier at sap.com Tue Aug 29 10:11:51 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 29 Aug 2017 10:11:51 +0000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com>
Message-ID:

Hi,

Thomas, thanks for looking at the new webrev ad hoc.

> > 2) If the user provided buffer is too small, we will fail, which looks
> > like the dll could not have been located. I am not sure we have to be
> > shy with allocating memory - internally, we malloc a buffer for
> > assembling the filename, and then os::splitpath() will malloc a whole
> > bunch of arrays too. So I think we could just return the dll path
> > location in a malloced buffer and require the caller to free.
>
> That seems a much bigger change. My question before pushing this is:
> have we in any way reduced the size of path that we may accept on some
> platforms? If we have that would be bad.

No. I do not change the size of the buffer passed into this method anywhere,
and that is the limit. On posix, paths no longer contain // where they are
concatenated, so the supported paths can be one char longer in this case.
This is an improvement.
The buffer passed in usually has size MAX_PATH, so the overflow checks
should never cause a problem. They are mostly there for safety.
Allocating the memory instead would reduce memory consumption, but I think
that is not that relevant.
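
To illustrate the two points made here (no doubled separator where directory and file name are concatenated, and an overflow of the caller's buffer reported as "not found"), a minimal sketch follows. It is not the HotSpot code: the names join_path and join_and_check are hypothetical, it assumes a POSIX system with '/' as the only file separator, and it uses plain snprintf and stat where the patch uses jio_snprintf and os::stat.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sys/stat.h>

// Hypothetical helper, not part of HotSpot: writes "dir/file" into buf,
// adding '/' only if dir does not already end with one.
// Returns false if dir is empty or the result does not fit into buflen bytes.
static bool join_path(char* buf, size_t buflen,
                      const char* dir, const char* file) {
  const size_t dlen = std::strlen(dir);
  if (dlen == 0) {
    return false;
  }
  const char* sep = (dir[dlen - 1] == '/') ? "" : "/";
  const int n = std::snprintf(buf, buflen, "%s%s%s", dir, sep, file);
  // snprintf returns the length it wanted to write; n >= buflen means truncation.
  return n >= 0 && static_cast<size_t>(n) < buflen;
}

// Hypothetical helper: joins the path and reports whether the file exists.
// A too-small buffer is reported the same way as a missing file.
static bool join_and_check(char* buf, size_t buflen,
                           const char* dir, const char* file) {
  if (!join_path(buf, buflen, dir, file)) {
    return false;
  }
  struct stat st;
  return ::stat(buf, &st) == 0;
}
```

With a buffer of MAX_PATH size the truncation branch should practically never trigger, which matches the remark that the checks are mostly there for safety. Because no second separator is inserted, "/usr/lib/" and "/usr/lib" produce the same result.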
Overall, I think it's a mismatch that some functions allocate the memory while
others require a buffer ... but that's really out of scope.

Best regards,
  Goetz.

> > 3) I do not understand ':' as a file separator on windows. So, a path is
> > allowed to contain e.g. "C:;" ? Which would mean "a path relative to the
> > current directory currently active on drive C". If I am not mistaken. Do
> > we want to support this, what is the use case?
>
> IIUC the existing windows code supports this. So yes c:foo.dll is a
> reference to foo.dll in whatever the current directory on drive c: is.
> As for a usecase ... perhaps a way to workaround long paths or
> historical path format restrictions (ie no spaces) ? But the main thing
> is to not break what we currently support.
>
> Thanks,
> David
>
> > Kind Regards, Thomas
> >
> > On Tue, Aug 29, 2017 at 10:41 AM, David Holmes wrote:
> >
> > On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote:
> >
> > Hi David,
> >
> > I fixed the indentation and added you as reviewer.
> > I replaced the webrev in-place:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
> > The new code went through all our testing ... except for some
> > ppc/s390 builds that failed because of another change pushed to hs
> > tonight. But that should not matter, it all passed with the
> > jdk10/jdk10 testing.
> >
> > Would you mind sponsoring?
> >
> >
> > No problem - but can we get Thomas to sign-off on this latest
> > version please.
> >
> > Thanks,
> > David
> >
> > Best regards,
> >    Goetz.
> >
> > -----Original Message-----
> > From: David Holmes [mailto:david.holmes at oracle.com]
> > Sent: Tuesday, 29 August 2017 09:53
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >
> > Hi Goetz,
> >
> > On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote:
> >
> > Hi,
> >
> > this is a webrev with merged windows and posix implementations:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
> >
> > I like the look of this.
> >
> > There are a couple of indentation nits in os.cpp:
> >
> >    247 static bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
> >    248                                const char* pname, char lastchar, const char* fname) {
> >
> >    251   const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ?
> >    252                    "" : os::file_separator();
> >
> > Thanks,
> > David
> >
> > Best regards,
> >     Goetz
> >
> > -----Original Message-----
> > From: Lindenmaier, Goetz
> > Sent: Monday, 28 August 2017 12:10
> > To: 'David Holmes'
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >
> > Hi,
> >
> > these are the changes needed to make the windows dll_locate_lib
> > universally applicable. I also merge the three similar jio_snprintf
> > calls into one method.
> > I do some gymnastics to avoid another buffer of MAX_PATH_LEN
> > at the first call to conc_path_file_and_check.
> > I'll test this tonight.
> >
> > Best regards,
> >     Goetz.
> >
> > diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp
> > --- a/src/os/windows/vm/os_windows.cpp  Thu Aug 17 17:26:02 2017 +0200
> > +++ b/src/os/windows/vm/os_windows.cpp  Mon Aug 28 12:02:26 2017 +0200
> > @@ -1205,6 +1205,17 @@
> >    return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES;
> >  }
> >
> > +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
> > +                              const char* pname, char lastchar, const char* fname) {
> > +  char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_seperator()[0]) ? "" : os::file_separator();
> > +  int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", path, filesep, fullfname);
> > +  if (ret != -1) {
> > +    struct stat statbuf;
> > +    return os::stat(buffer, &statbuf) == 0;
> > +  }
> > +  return false;
> > +}
> > +
> >  bool os::dll_locate_lib(char *buffer, size_t buflen,
> >                          const char* pname, const char* fname) {
> >    bool retval = false;
> > @@ -1220,11 +1231,8 @@
> >        if (p != NULL) {
> >          const size_t plen = strlen(buffer);
> >          const char lastchar = buffer[plen - 1];
> > -        char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> > -        int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, fullfname);
> > -        if (ret != -1) {
> > -          retval = file_exists(buffer);
> > -        }
> > +        retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - plen,
> > +                                          "", lastchar, fullfname);
> >        }
> >      } else if (strchr(pname, *os::path_separator()) != NULL) {
> >        int n;
> > @@ -1238,12 +1246,8 @@
> >              continue; // skip the empty path values
> >            }
> >            const char lastchar = path[plen - 1];
> > -          char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> > -          int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, fullfname);
> > -          if (ret != -1 && file_exists(buffer)) {
> > -            retval = true;
> > -            break;
> > -          }
> > +          retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, fullfname);
> > +          if (retval) break;
> >          }
> >          // release the storage
> >          for (int i = 0; i < n; i++) {
> > @@ -1255,11 +1259,7 @@
> >        }
> >      } else {
> >        const char lastchar = pname[pnamelen-1];
> > -      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
> > -      int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, fullfname);
> > -      if (ret != -1) {
> > -        retval = file_exists(buffer);
> > -      }
> > +      retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, fullfname);
> >      }
> >    }
> >
> > -----Original Message-----
> > From: David Holmes [mailto:david.holmes at oracle.com]
> > Sent: Monday, 28 August 2017 07:38
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >
> > Hi Goetz,
> >
> > On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote:
> >
> > Hi,
> >
> > I please need a second review and a sponsor:
> > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> >
> > To update my description of the change to the status after Thomas' review:
> >
> > dll_build_name builds the proper path to a library given a list of paths
> > separated by path_separator and a library name. It adds in the
> > platform-specific endings etc.
> > It is documented to return whether the file exists, but only does so if a
> > path_separator exists in the path.
> > Especially if the path is empty, it just returns 'true' without checking.
> > dll_build_name is usually used before calling dll_load. If dll_load does not
> > get a full path, it searches in well-known unix/windows locations. This is
> > intended in the two cases where dll_build_name is called with an empty path.
> >
> > I renamed dll_build_name to dll_locate_lib and changed its behavior to always
> > return a full path to the lib, inserting the current working directory if no
> > path is given.
> > For the use case where "" was actually passed to the function, I added a new
> > function (reusing the old function name) dll_build_name that just adds the
> > system-dependent prefix and suffix to the name.
> > I merged all unix implementations to the posix os branch.
> >
> >
> > I started to look at this and have applied the patch to run through some
> > basic testing. The overall approach seems reasonable. But it is hard to
> > track all the details - in particular whether there were any subtle
> > differences across the "posix" systems?
> >
> > I'm wondering what, if any, significant differences exist between the
> > Windows and POSIX versions? I would hope the platform differences could
> > easily be hidden behind macros (for path separator, library suffix etc).
> > Then perhaps this could just go in shared code (os.hpp, os.cpp)?
> >
> > That aside, in the Windows code shouldn't the hardwired .dll strings
> > actually be JNI_LIB_SUFFIX?
> >
> > Thanks,
> > David
> >
> > Best regards,
> >      Goetz.
> >
> > -----Original Message-----
> > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> > Sent: Tuesday, 22 August 2017 17:30
> > To: Lindenmaier, Goetz
> > Cc: hotspot-runtime-dev at openjdk.java.net
> > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >
> > Looks good.
> >
> > ..Thomas
> >
> > On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
> >
> >     I mistyped the path to webrev, this should work:
> >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
> >
> >     Sorry,
> >       Goetz
> >
> >     > -----Original Message-----
> >     > From: Lindenmaier, Goetz
> >     > Sent: Tuesday, 22 August 2017 15:48
> >     > To: 'Thomas Stüfe'
> >     > Cc: hotspot-runtime-dev at openjdk.java.net
> >     > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >     >
> >     > Hi,
> >     >
> >     > could I please get a second review?
> >     > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
> >     >
> >     > I had to update the webrev because of a problem on windows.
> >     > @Thomas I had edited os.hpp, but not saved :(
> >     >
> >     > Best regards,
> >     >   Goetz.
> >     >
> >     > PS: Didn't double-check the webrev as cr server is slow.
> >     >
> >     > > -----Original Message-----
> >     > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> >     > > Sent: Thursday, 17 August 2017 19:54
> >     > > To: Lindenmaier, Goetz
> >     > > Cc: hotspot-runtime-dev at openjdk.java.net
> >     > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
> >     > >
> >     > > Hi Goetz,
> >     > >
> >     > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
> >     > >
> >     > >     Hi Thomas,
> >     > >
> >     > >     I adapted the comments in os.hpp.
> >     > >
> >     > >     If I move the call to dll_build_name out of dll_locate_lib
> >     > >     I have to do a lot of coding in all the places where it is called.
> >     > >     That seems not useful to me.
> >     > >
> >     > >     Fixed the type to size_t.
> >     > >
> >     > >     One could merge posix/windows if putting the check for ':'
> >     > >     into a WINDOWS_ONLY() I guess. The check for \ could be
> >     > >     done in posix as well, if using file_separator().
> >     > >
> >     > >     *  Not your change, but: why does the code in os::dll_locate_lib() even
> >     > >     *  differentiate between a PATH containing no os::path_separator()
> >     > >     *  and a path containing os::path_separator()?
> >     > >
> >     > >     I assume this was done to avoid all the allocations and copying of the path.
> >     > >
> >     > >     Also adapted the comment in jvmtiExport.cpp.
> >     > >
> >     > >     New webrev:
> >     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
> >     > >
> >     > >     incremental diff:
> >     > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
> > > > dllBuildName/webrev.03/diffs-incremental.patch> > > ? ? ? ? > > > > ? ? ? ? > >? ? ?(fixed indentation on > > windows) > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Best regards, > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ?Goetz. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > Comments in os.hpp seem > > unchanged ? > > ? ? ? ? > > > > ? ? ? ? > > But looks fine otherwise. I > > do not need another webrev. > > ? ? ? ? > > > > ? ? ? ? > > Thanks, Thomas > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?From: Thomas St?fe > > [mailto:thomas.stuefe at gmail.com > > > > > > > > ? ? ? ? > > > > > > > > > > ] > > ? ? ? ? > >? ? ?Sent: Thursday, August > > 17, 2017 3:48 PM > > ? ? ? ? > >? ? ?To: Lindenmaier, Goetz > > > > > > > > > ? ? ? ? > > > > > > > > > > > > > ? ? ? ? > >? ? ?Cc: > > hotspot-runtime-dev at openjdk.java.net > > > > > > runtime-dev at openjdk.java.net > > > > > > > > > > > > > > runtime-> > > ? ? ? ? > > dev at openjdk.java.net > > > > > > > > > ? ? ? ? > >? ? ?Subject: Re: RFR(M): > > 8186072: dll_build_name returns true > > > > even > > > > if file > > ? ? ? ? > > is missing. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Hi Goetz, > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?On Thu, Aug 17, 2017 at > > 1:35 PM, Lindenmaier, Goetz > > ? ? ? ? > > > > > > > > > > > > > > > > > > > > > > wrote: > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?Hi Thomas, > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?I reworked the > > whole thing. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?First, there is > > dll_build_name. It just does -> > > ? ? ? ? > > lib.so. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? 
?Second, I > > renamed the legacy dll_build_name to > > dll_locate_lib. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?I merged all > > the unix variants to one in os_posix. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?I removed the > > buffer overflow check at the top. > > ? ? ? ? > >? ? ? ? ? ? ?It's too > > restrictive because the path argument > > ? ? ? ? > >? ? ? ? ? ? ?can contain > > several paths.? I added the overflow > > ? ? ? ? > >? ? ? ? ? ? ?checks into the > > single cases. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?Also, I first > > assemble the pure name using the new, simple > > ? ? ? ? > >? ? ? ? ? ? ?dll_build_name. > > This is for reuse and readability. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?In case of an > > empty directory, I use get_current_directory > > ? ? ? ? > >? ? ? ? ? ? ?to complete the > > path as indicated by the original > > ? ? ? ? > > documentation > > ? ? ? ? > >? ? ? ? ? ? ?where it was > > called with "". > > ? ? ? ? > >? ? ? ? ? ? ?Dll_locate_lib > > now always returns a name with a full > > path if > > ? ? ? ? > >? ? ? ? ? ? ?the file exists. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?Also, on > > windows, I think I fixed a bug by > > reversing the > > > > order > > > > ? ? ? ? > >? ? ? ? ? ? ?of checks. A > > path list ending in ':' or '\' would not > > have > > ? ? ? ? > >? ? ? ? ? ? ?been recognized. > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?On Bsd, I > > removed JNI_LIB_* because that already is > > > > defined > > > > ? ? ? ? > >? ? ? ? ? ? ?in jvm_bsh.h > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?New webrev: > > ? ? ? ? > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > > ? ? ? ? > > dllBuildName/webrev.02/ > > > > > > > > > ? ? ? ? > > dllBuildName/webrev.02/> > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ? ? ?Best regards, > > ? ? ? ? > >? ? ? ? ? ? ? ?Goetz. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?I like this better than > > before. Remarks: > > ? ? ? ? > > > > ? ? ? ? 
> > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > > ? ? ? ? > > > > > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html > > > > ? ? ? ? > > > > > > > > > > > ? ? ? ? > > > > > > > dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? // Builds the > > platform-specific name of a library. > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? // Returns false on > > __buffer overflow__. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Hopefully not! :D > > ? ? ? ? > > > > ? ? ? ? > >? ? ?How about: "Returns > > false no truncation" instead. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? // Builds a > > platform-specific full library path > > given an ld path > > and lib > > ? ? ? ? > > name. > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? // Returns true if > > the buffer contains a full path to an > > existing > > file, > > ? ? ? ? > > false > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? // otherwise. If > > pathname is empty, checks the current > > directory. > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? static bool > > ?dll_locate_lib(char* buffer, size_t size, > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? ? ? ? ? ?const char* pathname, > > const char* > > fname); > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Might be worth > > mentioning that "fname" is the unadorned > > > > library > > > > ? ? ? ? > > name, e.g. "verify" for > > libverify.so or verify.dll. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Would the following > > alternative be valid: > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?one could make > > dll_locate_lib take the real file name, > > and let > > caller > > ? ? ? ? 
> > use dll_build_name() to > > build the libary name first before handing > > > > it > > > > to > > ? ? ? ? > > dll_locate_lib(). In that > > case, dll_locate_lib() could be renamed to > > > > a > > > > generic > > ? ? ? ? > > "find_file_in_path" because > > it would work for any kind of file. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?As an added bonus, > > there would be no need to create a > > temporary > > ? ? ? ? > > array in > > dll_build_name/dll_locate_lib, and no > > need to call free() > > > > so > > > > no > > ? ? ? ? > > cleanup-related control > > flow changes in these functions. > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?===== > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > > ? ? ? ? > > > > > > > > > dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html > > > > ? ? ? ? > > > > > > > > > > > ? ? ? ? > > > > > > > > > > > dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html> > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? int fullfnamelen = > > strlen(JNI_LIB_PREFIX) + strlen(fname) + > > ? ? ? ? > > strlen(JNI_LIB_SUFFIX); > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?int -> size_t (does > > that even compile without warning?) > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? // Check current > > working directory. > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? const char* p = > > get_current_directory(buffer, buflen); > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? if (p != NULL && > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? ? ? strlen(buffer) > > + 1 + fullfnamelen + 1 <= buflen) { > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? ? strcat(buffer, > > "\\"); > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? ? strcat(buffer, > > fullfname); > > ? ? ? ? > > > > ? ? ? ? > >? ? ?+? ? ? retval = > > file_exists(buffer); > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? 
? ? > > > > ? ? ? ? > >? ? ?Small nit: I'd use > > jio_snprintf instead of strcat. Functionally > > identical but > > ? ? ? ? > > will make scanners (e.g. > > coverity) happy. One could then avoid > > > > the > > > > length > > ? ? ? ? > > calculation and rely on > > jio_snprintf truncation: > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?const char* p = > > get_current_directory(buffer, buflen); > > ? ? ? ? > > > > ? ? ? ? > >? ? ?if (p != NULL) { > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ?const size_t end = > > strlen(p); > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ?if (jio_snprintf(end, > > buflen - end, "\\%s", fullname) != -1) { > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ? ?retval = > > file_exists(buffer); > > ? ? ? ? > > > > ? ? ? ? > >? ? ? ?} > > ? ? ? ? > > > > ? ? ? ? > >? ? ?} > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?-- > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Not your change, but: > > why does the code in os::dll_locate_lib() > > even > > ? ? ? ? > > differentiate between a > > PATH containing no os::path_separator() > > and a path > > ? ? ? ? > > containing > > os::path_separator()? > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Would the former not be > > just a PATH with only one directory > > > > and > > > > hence > > ? ? ? ? > > need no special treatment? > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?===== > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > http://cr.openjdk.java.net/~goetz/wr17/8186072- > > > > > > > > ? ? ? ? > > > > > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht > > ml > > > > ? ? ? ? > > > > > > > > > > > ? ? ? ? > > > > > > > dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht > > ml> > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > > > > ? ? ? ? > >? ? ?Could > > os::dll_locate_lib be consolidated > > between windows and > > unix? > > ? ? ? ? 
> >         > >     It seems the implementation is almost identical.
> >         > >
> >         > >     ====
> >         > >
> >         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html
> >         > >
> >         > >     +        // not found - try library path
> >         > >
> >         > >     Proposal: "not found - try OS default library path"
> >         > >
> >         > >             Find some comments inline:
> >         > >
> >         > >             >  Especially if the path is empty, it just returns 'true'.
> >         > >             >  Dll_build_name is usually used before calling dll_load. If dll_load does not get a full path it searches
> >         > >             >       in well known unix/windows locations. This is intended in the two cases where dll_build_name
> >         > >             >       is called with an empty path.
> >         > >             >
> >         > >             > So, for both cases (thread.cpp, jvmtiExport.cpp),
> >         > >             >
> >         > >             > before, we would call os::dll_build_name() with an empty string for the path
> >         > >             > which, for relative paths, would result in feeding that path unexpanded to
> >         > >             > dlopen(), which would use whatever the OS does in those cases (LIBPATH,
> >         > >             > LD_LIBRARY_PATH, PATH on windows). Note that this does not necessarily
> >         > >             > include searching the current directory.
> >         > >             Right. With the changed dll_build_name it's again exactly as before.
> >         > >
> >         > >             > With your change, we now use java.library.path, which is not necessarily the
> >         > >             > same?
> >         > >             You are right, I overlooked that java.library.path can be overwritten.  Initially,
> >         > >             it's set to the right thing.
> >         > >
> >         > >             > (BTW, I think the old comments in thread.cpp and jvmtiExport.cpp were wrong: "//
> >         > >             > Try the local directory" - if "local" means "current", this is not what
> >         > >             > happened).
> >         > >             Right, I tried to adapt them, did I miss one?
> >         > >
> >         > >             >       I added a second variant of dll_build_name without the path argument that adds the path
> >         > >             >       from system property java.library.path and use that in these two cases.
> >         > >             >       I changed the original function to actually check file availability in all cases,
> >         > >             >       and to check . if the path is empty.
> >         > >             > I think that may be a bit confusing. We would then have three options:
> >         > >             >
> >         > >             > - call os::dll_build_name with a real "<path>;<path>;.." PATH and get a file name
> >         > >             > resolved from that path
> >         > >             > - call os::dll_build_name with "" for the PATH and get OS dll resolution
> >         > >             No, in that case, as I called file_exists(), it would only work if the dll is in the
> >         > >             current working directory. But I changed this now, anyways.
> >         > >
> >         > >             > - call your new overloaded version of os::dll_build_name(), which uses
> >         > >             > -Djava.library.path.
> >         > >             >
> >         > >             >       Please review this change. I need a sponsor, please.
> >         > >             > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
> >         > >             >
> >         > >             >       Best regards,
> >         > >             >         Goetz.
> >         > >             >
> >         > >             > Kind Regards, Thomas
> >         > >
> >         > >     Best Regards, Thomas
> >         > >

From david.holmes at oracle.com Tue Aug 29 10:14:13 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 20:14:13 +1000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To:
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com>
Message-ID: <168e0aa6-eb7c-d6c6-fef6-82aadd12fb77@oracle.com>

On 29/08/2017 8:11 PM, Lindenmaier, Goetz wrote:
> Hi,
>
> Thomas, thanks for looking at the new webrev ad hoc.
>
>>> 2) If the user provided buffer is too small, we will fail, which looks
>>> like the dll could not have been located. I am not sure we have to be
>>> shy with allocating memory - internally, we malloc a buffer for
>>> assembling the filename, and then os::split_path() will malloc a whole
>>> bunch of arrays too. So I think we could just return the dll path
>>> location in a malloced buffer and require the caller to free.
>>
>> That seems a much bigger change. My question before pushing this is:
>> have we in any way reduced the size of path that we may accept on some
>> platforms? If we have that would be bad.
> No. I do not change the size of the buffer passed into this method anywhere,
> and that's the limit.

Thanks for confirming - I will push this now.

David
-----

> On posix, paths no longer contain // where concatenated, so the
> paths supported can contain one more char in this case. This is an
> improvement.
> The buffer passed in usually has size MAX_PATH, so the checks for
> overflow should never cause a problem. They are mostly there
> for safety. Allocating the memory instead would reduce memory
> consumption, but I think it's not that relevant. Overall, I think it's
> a mismatch that some functions allocate the memory, others
> require a buffer ... but that's really out of scope.
>
> Best regards,
>    Goetz.
>
>>
>>> 3) I do not understand ':' as a file separator on windows. So, a path is
>>> allowed to contain e.g. "C:;" ? Which would mean "a path relative to the
>>> current directory currently active on drive C". If I am not mistaken. Do
>>> we want to support this, what is the use case?
>>
>> IIUC the existing windows code supports this. So yes c:foo.dll is a
>> reference to foo.dll in whatever the current directory on drive c: is.
>> As for a use case ... perhaps a way to work around long paths or
>> historical path format restrictions (i.e. no spaces)? But the main thing
>> is to not break what we currently support.
>>
>> Thanks,
>> David
>>
>>> Kind Regards, Thomas
>>>
>>>
>>> On Tue, Aug 29, 2017 at 10:41 AM, David Holmes wrote:
>>>
>>> On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote:
>>>
>>> Hi David,
>>>
>>> I fixed the indentation and added you as reviewer.
>>> I replaced the webrev in-place:
>>> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>>> The new code went through all our testing ... except for some ppc/s390 builds
>>> that failed because of another change pushed to hs tonight. But that should
>>> not matter, it all passed with the jdk10/jdk10 testing.
>>>
>>> Would you mind sponsoring?
>>>
>>>
>>> No problem - but can we get Thomas to sign-off on this latest
>>> version please.
>>>
>>>
>>> Thanks,
>>> David
>>>
>>> Best regards,
>>>    Goetz.
>>>
>>> -----Original Message-----
>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>> Sent: Tuesday, 29 August 2017 09:53
>>> To: Lindenmaier, Goetz
>>> Cc: hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>
>>> Hi Goetz,
>>>
>>> On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote:
>>>
>>> Hi,
>>>
>>> this is a webrev with merged windows and posix implementations:
>>> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>>>
>>> I like the look of this.
>>>
>>> There are a couple of indentation nits in os.cpp:
>>>
>>>    247 static bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
>>>    248                                const char* pname, char lastchar, const char* fname) {
>>>
>>>
>>>    251   const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ?
>>>    252                    "" : os::file_separator();
>>>
>>>
>>> Thanks,
>>> David
>>>
>>> Best regards,
>>>     Goetz
>>>
>>> -----Original Message-----
>>> From: Lindenmaier, Goetz
>>> Sent: Monday, 28 August 2017 12:10
>>> To: 'David Holmes'
>>> Cc: hotspot-runtime-dev at openjdk.java.net
>>> Subject: RE: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>
>>> Hi,
>>>
>>> these are the changes needed to make the windows dll_locate_lib
>>> universally applicable. I also merge the three similar jio_snprintf
>>> calls into one method.
>>> I do some gymnastics to avoid another buffer of MAX_PATH_LEN
>>> at the first call to conc_path_file_and_check.
>>> I'll test this tonight.
>>>
>>> Best regards,
>>>     Goetz.
>>>
>>> diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp
>>> --- a/src/os/windows/vm/os_windows.cpp  Thu Aug 17 17:26:02 2017 +0200
>>> +++ b/src/os/windows/vm/os_windows.cpp  Mon Aug 28 12:02:26 2017 +0200
>>> @@ -1205,6 +1205,17 @@
>>>    return GetFileAttributes(filename) != INVALID_FILE_ATTRIBUTES;
>>>  }
>>>
>>> +bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
>>> +                              const char* pname, char lastchar, const char* fname) {
>>> +  const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ? "" : os::file_separator();
>>> +  int ret = jio_snprintf(printbuffer, printbuflen, "%s%s%s", pname, filesep, fname);
>>> +  if (ret != -1) {
>>> +    struct stat statbuf;
>>> +    return os::stat(buffer, &statbuf) == 0;
>>> +  }
>>> +  return false;
>>> +}
>>> +
>>>  bool os::dll_locate_lib(char *buffer, size_t buflen,
>>>                          const char* pname, const char* fname) {
>>>    bool retval = false;
>>> @@ -1220,11 +1231,8 @@
>>>        if (p != NULL) {
>>>          const size_t plen = strlen(buffer);
>>>          const char lastchar = buffer[plen - 1];
>>> -        char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
>>> -        int ret = jio_snprintf(&buffer[plen], buflen - plen, "%s%s", filesep, fullfname);
>>> -        if (ret != -1) {
>>> -          retval = file_exists(buffer);
>>> -        }
>>> +        retval = conc_path_file_and_check(buffer, &buffer[plen], buflen - plen,
>>> +                                          "", lastchar, fullfname);
>>>        }
>>>      } else if (strchr(pname, *os::path_separator()) != NULL) {
>>>        int n;
>>> @@ -1238,12 +1246,8 @@
>>>              continue; // skip the empty path values
>>>            }
>>>            const char lastchar = path[plen - 1];
>>> -          char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
>>> -          int ret = jio_snprintf(buffer, buflen, "%s%s%s", path, filesep, fullfname);
>>> -          if (ret != -1 && file_exists(buffer)) {
>>> -            retval = true;
>>> -            break;
>>> -          }
>>> +          retval = conc_path_file_and_check(buffer, buffer, buflen, path, lastchar, fullfname);
>>> +          if (retval) break;
>>>          }
>>>          // release the storage
>>>          for (int i = 0; i < n; i++) {
>>> @@ -1255,11 +1259,7 @@
>>>        }
>>>      } else {
>>>        const char lastchar = pname[pnamelen-1];
>>> -      char *filesep = (lastchar == ':' || lastchar == '\\') ? "" : "\\";
>>> -      int ret = jio_snprintf(buffer, buflen, "%s%s%s", pname, filesep, fullfname);
>>> -      if (ret != -1) {
>>> -        retval = file_exists(buffer);
>>> -      }
>>> +      retval = conc_path_file_and_check(buffer, buffer, buflen, pname, lastchar, fullfname);
>>>      }
>>>    }
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>> Sent: Monday, 28 August 2017 07:38
>>> To: Lindenmaier, Goetz
>>> Cc: hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>
>>> Hi Goetz,
>>>
>>> On 25/08/2017 12:19 AM, Lindenmaier, Goetz wrote:
>>>
>>> Hi,
>>>
>>> I please need a second review and a sponsor:
>>> http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
>>>
>>> To update my description of the change to the status after Thomas' review:
>>>
>>> dll_build_name builds the proper path to a library given a list of paths separated by
>>> path_separator and a library name. It adds in the platform specific endings etc.
>>> It is documented to return whether the file exists, but only does so if a path_separator
>>> exists in the path.
>>> Especially if the path is empty, it just returns 'true' without checking.
>>>
>>> Dll_build_name is usually used before calling dll_load.  If dll_load does not
>>> get a full path it searches in well known unix/windows locations. This is intended
>>> in the two cases where dll_build_name is called with an empty path.
>>>
>>> I renamed dll_build_name to dll_locate_lib and changed its behavior to always return
>>> a full path to the lib, inserting current working directory if no path is given.
>>>
>>> For the use case where "" was actually passed to the function, I added a new function
>>> (reusing the old function name) dll_build_name that just adds system dependent prefix and suffix
>>> to the name.
>>> I merged all unix implementations to the posix os branch.
>>>
>>>
>>> I started to look at this and have applied the patch to run through some
>>> basic testing. The overall approach seems reasonable. But it is hard to
>>> track all the details - in particular whether there were any subtle
>>> differences across the "posix" systems?
>>>
>>> I'm wondering what, if any, significant differences exist between the
>>> Windows and POSIX versions? I would hope the platform differences could
>>> easily be hidden behind macros (for path separator, library suffix etc).
>>> Then perhaps this could just go in shared code (os.hpp, os.cpp)?
>>>
>>> That aside, in the Windows code shouldn't the hardwired .dll strings
>>> actually be JNI_LIB_SUFFIX?
>>>
>>> Thanks,
>>> David
>>>
>>> Best regards,
>>>      Goetz.
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>> Sent: Tuesday, 22 August 2017 17:30
>>> To: Lindenmaier, Goetz
>>> Cc: hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>
>>> Looks good.
>>>
>>> ..Thomas
>>>
>>> On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
>>>
>>>         I mistyped the path to webrev,
>>>         this should work:
>>>         http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
>>>
>>>         Sorry,
>>>           Goetz
>>>
>>>         > -----Original Message-----
>>>         > From: Lindenmaier, Goetz
>>>         > Sent: Tuesday, 22 August 2017 15:48
>>>         > To: 'Thomas Stüfe'
>>>         > Cc: hotspot-runtime-dev at openjdk.java.net
>>>         > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>         >
>>>         > Hi,
>>>         >
>>>         > could I please get a second review?
>>>         > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
>>>         >
>>>         > I had to update the webrev because of a problem on windows.
>>>         > @Thomas I had edited os.hpp, but not saved :(
>>>         >
>>>         > Best regards,
>>>         >   Goetz.
>>>         >
>>>         > PS: Didn't double-check the webrev as cr server is slow.
>>>         >
>>>         > > -----Original Message-----
>>>         > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>         > > Sent: Thursday, 17 August 2017 19:54
>>>         > > To: Lindenmaier, Goetz
>>>         > > Cc: hotspot-runtime-dev at openjdk.java.net
>>>         > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>         > >
>>>         > > Hi Goetz,
>>>         > >
>>>         > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz wrote:
>>>         > >
>>>         > >     Hi Thomas,
>>>         > >
>>>         > >     I adapted the comments in os.hpp.
>>>         > >
>>>         > >     If I move the call to dll_build_name out of dll_locate_lib
>>>         > >     I have to do a lot of coding in all the places where it is called.
>>>         > >     That seems not useful to me.
>>>         > >
>>>         > >     Fixed the type to size_t.
>>>         > >
>>>         > >     One could merge posix/windows if putting the check for ':'
>>>         > >     into a WINDOWS_ONLY() I guess. The check for \ could be
>>>         > >     done in posix as well, if using file_separator().
>>>         > >
>>>         > >     *  Not your change, but: why does the code in os::dll_locate_lib() even
>>>         > >     *  differentiate between a PATH containing no os::path_separator()
>>>         > >     *  and a path containing os::path_separator()?
>>>         > >
>>>         > >     I assume this was done to avoid all the allocations and copying of the path.
>>>         > >
>>>         > >     Also adapted the comment in jvmtiExport.cpp.
>>>         > >
>>>         > >     New webrev:
>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
>>>         > >
>>>         > >     incremental diff:
>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
>>>         > >
>>>         > >     (fixed indentation on windows)
>>>         > >
>>>         > >     Best regards,
>>>         > >       Goetz.
>>>         > >
>>>         > > Comments in os.hpp seem unchanged?
>>>         > >
>>>         > > But looks fine otherwise. I do not need another webrev.
>>>         > >
>>>         > > Thanks, Thomas
>>>         > >
>>>         > >     From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>         > >     Sent: Thursday, August 17, 2017 3:48 PM
>>>         > >     To: Lindenmaier, Goetz
>>>         > >     Cc: hotspot-runtime-dev at openjdk.java.net
>>>         > >     Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>         > >
>>>         > >     Hi Goetz,
>>>         > >
>>>         > >     On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
>>>         > >
>>>         > >             Hi Thomas,
>>>         > >
>>>         > >             I reworked the whole thing.
>>>         > >
>>>         > >             First, there is dll_build_name. It just does <name> -> lib<name>.so.
>>>         > >
>>>         > >             Second, I renamed the legacy dll_build_name to dll_locate_lib.
>>>         > >
>>>         > >             I merged all the unix variants to one in os_posix.
>>>         > >
>>>         > >             I removed the buffer overflow check at the top.
>>>         > >             It's too restrictive because the path argument
>>>         > >             can contain several paths.  I added the overflow
>>>         > >             checks into the single cases.
>>>         > >
>>>         > >             Also, I first assemble the pure name using the new, simple
>>>         > >             dll_build_name. This is for reuse and readability.
>>>         > >
>>>         > >             In case of an empty directory, I use get_current_directory
>>>         > >             to complete the path as indicated by the original documentation
>>>         > >             where it was called with "".
>>>         > >             Dll_locate_lib now always returns a name with a full path if
>>>         > >             the file exists.
>>>         > >
>>>         > >             Also, on windows, I think I fixed a bug by reversing the order
>>>         > >             of checks. A path list ending in ':' or '\' would not have
>>>         > >             been recognized.
>>>         > >
>>>         > >             On Bsd, I removed JNI_LIB_* because that already is defined
>>>         > >             in jvm_bsd.h
>>>         > >
>>>         > >             New webrev:
>>>         > >             http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>>>         > >
>>>         > >             Best regards,
>>>         > >               Goetz.
>>>         > >
>>>         > >     I like this better than before. Remarks:
>>>         > >
>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
>>>         > >
I added the overflow >>> ? ? ? ? > >? ? ? ? ? ? ?checks into the >>> single cases. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?Also, I first >>> assemble the pure name using the new, simple >>> ? ? ? ? > >? ? ? ? ? ? ?dll_build_name. >>> This is for reuse and readability. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?In case of an >>> empty directory, I use get_current_directory >>> ? ? ? ? > >? ? ? ? ? ? ?to complete the >>> path as indicated by the original >>> ? ? ? ? > > documentation >>> ? ? ? ? > >? ? ? ? ? ? ?where it was >>> called with "". >>> ? ? ? ? > >? ? ? ? ? ? ?Dll_locate_lib >>> now always returns a name with a full >>> path if >>> ? ? ? ? > >? ? ? ? ? ? ?the file exists. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?Also, on >>> windows, I think I fixed a bug by >>> reversing the >>> >>> order >>> >>> ? ? ? ? > >? ? ? ? ? ? ?of checks. A >>> path list ending in ':' or '\' would not >>> have >>> ? ? ? ? > >? ? ? ? ? ? ?been recognized. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?On Bsd, I >>> removed JNI_LIB_* because that already is >>> >>> defined >>> >>> ? ? ? ? > >? ? ? ? ? ? ?in jvm_bsh.h >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?New webrev: >>> ? ? ? ? > > >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>> >>> >> > >>> ? ? ? ? > > dllBuildName/webrev.02/ >>> >> >>> >> > >>> ? ? ? ? > > dllBuildName/webrev.02/> >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?Best regards, >>> ? ? ? ? > >? ? ? ? ? ? ? ?Goetz. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?I like this better than >>> before. Remarks: >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>> >>> >> > >>> ? ? ? ? > > >>> >>> >> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html >>> >>> ? ? ? ? > > >>> >> >>> >> > >>> ? ? ? ? > > >>> >>> >> dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html> >>> >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? 
? ? ? > >? ? ?+? // Builds the >>> platform-specific name of a library. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? // Returns false on >>> __buffer overflow__. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Hopefully not! :D >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?How about: "Returns >>> false no truncation" instead. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? // Builds a >>> platform-specific full library path >>> given an ld path >>> and lib >>> ? ? ? ? > > name. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? // Returns true if >>> the buffer contains a full path to an >>> existing >>> file, >>> ? ? ? ? > > false >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? // otherwise. If >>> pathname is empty, checks the current >>> directory. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? static bool >>> ?dll_locate_lib(char* buffer, size_t size, >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? ? ? ? ? ?const char* pathname, >>> const char* >>> fname); >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Might be worth >>> mentioning that "fname" is the unadorned >>> >>> library >>> >>> ? ? ? ? > > name, e.g. "verify" for >>> libverify.so or verify.dll. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Would the following >>> alternative be valid: >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?one could make >>> dll_locate_lib take the real file name, >>> and let >>> caller >>> ? ? ? ? > > use dll_build_name() to >>> build the libary name first before handing >>> >>> it >>> >>> to >>> ? ? ? ? > > dll_locate_lib(). In that >>> case, dll_locate_lib() could be renamed to >>> >>> a >>> >>> generic >>> ? ? ? ? > > "find_file_in_path" because >>> it would work for any kind of file. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?As an added bonus, >>> there would be no need to create a >>> temporary >>> ? ? ? ? 
> > array in >>> dll_build_name/dll_locate_lib, and no >>> need to call free() >>> >>> so >>> >>> no >>> ? ? ? ? > > cleanup-related control >>> flow changes in these functions. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?===== >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>> >>> >> > >>> ? ? ? ? > > >>> >>> >>> >> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html >>> >>> ? ? ? ? > > >>> >> >>> >> > >>> ? ? ? ? > > >>> >>> >>> >>> >> dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html> >>> >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? int fullfnamelen = >>> strlen(JNI_LIB_PREFIX) + strlen(fname) + >>> ? ? ? ? > > strlen(JNI_LIB_SUFFIX); >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?int -> size_t (does >>> that even compile without warning?) >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? // Check current >>> working directory. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? const char* p = >>> get_current_directory(buffer, buflen); >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? if (p != NULL && >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? ? ? strlen(buffer) >>> + 1 + fullfnamelen + 1 <= buflen) { >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? ? strcat(buffer, >>> "\\"); >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? ? strcat(buffer, >>> fullfname); >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? ? retval = >>> file_exists(buffer); >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Small nit: I'd use >>> jio_snprintf instead of strcat. Functionally >>> identical but >>> ? ? ? ? > > will make scanners (e.g. >>> coverity) happy. One could then avoid >>> >>> the >>> >>> length >>> ? ? ? ? > > calculation and rely on >>> jio_snprintf truncation: >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?const char* p = >>> get_current_directory(buffer, buflen); >>> ? ? ? ? > > >>> ? ? 
? ? > >? ? ?if (p != NULL) { >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ?const size_t end = >>> strlen(p); >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ?if (jio_snprintf(end, >>> buflen - end, "\\%s", fullname) != -1) { >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ?retval = >>> file_exists(buffer); >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ?} >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?} >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?-- >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Not your change, but: >>> why does the code in os::dll_locate_lib() >>> even >>> ? ? ? ? > > differentiate between a >>> PATH containing no os::path_separator() >>> and a path >>> ? ? ? ? > > containing >>> os::path_separator()? >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Would the former not be >>> just a PATH with only one directory >>> >>> and >>> >>> hence >>> ? ? ? ? > > need no special treatment? >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?===== >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>> >>> >> > >>> ? ? ? ? > > >>> >>> >> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht >>> ml >>> >>> ? ? ? ? > > >>> >> >>> >> > >>> ? ? ? ? > > >>> >>> >> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht >>> ml> >>> >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Could >>> os::dll_locate_lib be consolidated >>> between windows and >>> unix? >>> ? ? ? ? > > Seems to be the >>> implementation is almost identical. >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?==== >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>> >>> >> > >>> ? ? ? ? > > >>> >>> >> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html >>> >>> ? ? ? ? > > >>> >> >>> >> > >>> ? ? ? ? 
> > >>> >>> >>> >> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> >>> >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?+? ? ? ? // not found - >>> try library path >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ?Proposal: "not found - >>> try OS default library path" >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?Find some >>> comments inline: >>> ? ? ? ? > > >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> ?Especially if the path is empty, it >>> just returns 'true'. >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> ?Dll_build_name is usually used before >>> calling dll_load. >>> If >>> ? ? ? ? > > dll_load does not get a >>> full path it searches >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?in well >>> known unix/windows locations. This is >>> intended >>> in >>> ? ? ? ? > > the two cases where >>> dll_build_name >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?is >>> called with an empty path. >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> ? ? ? ? > >? ? ? ? ? ? ?> So, for both >>> cases (thread.cpp, jvmtiExport.cpp), >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> ? ? ? ? > >? ? ? ? ? ? ?> before, we >>> would call os::dll_build_name() with an >>> empty >>> ? ? ? ? > > string for the path >>> ? ? ? ? > >? ? ? ? ? ? ?> which, for >>> relative paths, would result in feeding >>> that path >>> ? ? ? ? > > unexpanded to >>> ? ? ? ? > >? ? ? ? ? ? ?> dlopen(), >>> which would use whatever the OS does in >>> those >>> ? ? ? ? > > cases (LIBPATH, >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> LD_LIBRARY_PATH, PATH on windows). Note >>> that this >>> >>> does >>> >>> ? ? ? ? > > not necessarily >>> ? ? ? ? > >? ? ? ? ? ? ?> include >>> searching the current directory. >>> ? ? ? ? > >? ? ? ? ? ? ?Right. With >>> changed dll_biuld_name it's again exactly as >>> ? ? ? ? > > before. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?> With your >>> change, we now use java.library.path, >>> which is >>> not >>> ? ? ? ? 
> > necessarily the >>> ? ? ? ? > >? ? ? ? ? ? ?> same? >>> ? ? ? ? > >? ? ? ? ? ? ?You are right, >>> I oversaw that java.library.path can be >>> ? ? ? ? > > overwritten.? Initially, >>> ? ? ? ? > >? ? ? ? ? ? ?it's set to the >>> right thing. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?> (BTW, I think >>> the old comments in thread.cpp and >>> ? ? ? ? > > jniExport.cpp were wrong:"// >>> ? ? ? ? > >? ? ? ? ? ? ?> Try the local >>> directory" - if "local" means "current", >>> this is >>> not >>> ? ? ? ? > > what did >>> ? ? ? ? > >? ? ? ? ? ? ?> happen). >>> ? ? ? ? > >? ? ? ? ? ? ?Right, I tried >>> to adapt them, did I miss one? >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?I added >>> a second variant of dll_build_name without >>> >>> the >>> >>> ? ? ? ? > > path argument that adds the >>> path >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?from >>> system property java.lang.path and use >>> that in >>> these >>> ? ? ? ? > > two cases. >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?I >>> changed the original function to >>> actually check file >>> ? ? ? ? > > availability in all cases, >>> ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?and to >>> check . if the path is empty. >>> ? ? ? ? > >? ? ? ? ? ? ?> I think that >>> may be a bit confusing. We would then have >>> three >>> ? ? ? ? > > options: >>> ? ? ? ? > >? ? ? ? ? ? ?> >>> ? ? ? ? > >? ? ? ? ? ? ?> - call >>> os::dll_build_name with a real >>> ";;.." PATH >>> and >>> ? ? ? ? > > get a file name >>> ? ? ? ? > >? ? ? ? ? ? ?> resolved from >>> that path >>> ? ? ? ? > >? ? ? ? ? ? ?> - call >>> os::dll_build_name with "" for the PATH >>> and get OS >>> dll >>> ? ? ? ? > > resolution >>> ? ? ? ? > >? ? ? ? ? ? ?No, in that >>> case, as I called file_exists(), it >>> would only work if >>> ? ? ? ? > > the dll is in the >>> ? ? ? ? > >? ? ? ? ? ? ?current working >>> directory. But I changed this now, anyways. >>> ? ? ? ? > > >>> ? ? ? ? > >? ? ? ? ? ? ?> - call your >>> new overloaded version of >>> >>> os::dll_build_name(), >>> >>> ? ? 
>>>         > >             > which uses -Djava.library.path.
>>>         > >             >
>>>         > >             >       Please review this change. I please need a sponsor.
>>>         > >             > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.01/
>>>         > >
>>>         > >             >
>>>         > >             >       Best regards,
>>>         > >             >         Goetz.
>>>         > >             >
>>>         > >             >
>>>         > >             > Kind Regards, Thomas
>>>         > >
>>>         > >
>>>         > >     Best Regards, Thomas
>>>         > >
>>>         > >
>>>
>>>

From robbin.ehn at oracle.com Tue Aug 29 10:31:17 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 29 Aug 2017 12:31:17 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
Message-ID: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>

Hi please review,

The issue 8180932 - "Parallelize safepoint cleanup" changed
_stack_traversal_mark to load acquire/store release, this is at least
half wrong.
Instead for simplicity the write side storestore fence should be matched
with loadload on read side and the changes to _stack_traversal_mark
undone (kept it volatile).

Bug:
https://bugs.openjdk.java.net/browse/JDK-8186837

Code:
http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/

It's not clear in this code if there are other concurrent dependent
read/writes.
Is it true that only when reading/writing _state and _stack_traversal_mark
proper memory ordering is needed?
To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839

Thanks Robbin

From mikael.gerdin at oracle.com Tue Aug 29 11:42:40 2017
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Tue, 29 Aug 2017 13:42:40 +0200
Subject: RFR(S) 8186897: semaphore_posix.hpp should not be included on OSX
Message-ID: <98bd8b1f-9a83-3bbd-32f5-fe23d611023d@oracle.com>

Hi,
Please review this small fix to not include the semaphore_posix header
in os_posix.cpp on OSX.
If the transitive includes of os_posix.cpp are changed such that it
includes semaphore.hpp a name clash will otherwise occur since all
platform specific semaphore headers define the SemaphoreImpl typedef.

Webrev: http://cr.openjdk.java.net/~mgerdin/8186897/webrev.0
Bug: https://bugs.openjdk.java.net/browse/JDK-8186897
Testing: JPRT buildonly

Thanks
/Mikael

From stefan.karlsson at oracle.com Tue Aug 29 11:44:40 2017
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 29 Aug 2017 13:44:40 +0200
Subject: RFR(S) 8186897: semaphore_posix.hpp should not be included on OSX
In-Reply-To: <98bd8b1f-9a83-3bbd-32f5-fe23d611023d@oracle.com>
References: <98bd8b1f-9a83-3bbd-32f5-fe23d611023d@oracle.com>
Message-ID: <9f16be74-0112-84b1-9719-10732c0fcc9f@oracle.com>

Looks good.

StefanK

On 2017-08-29 13:42, Mikael Gerdin wrote:
> Hi,
> Please review this small fix to not include the semaphore_posix header
> in os_posix.cpp on OSX.
> If the transitive includes of os_posix.cpp are changed such that it
> includes semaphore.hpp a name clash will otherwise occur since all
> platform specific semaphore headers define the SemaphoreImpl typedef.
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8186897/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8186897
> Testing: JPRT buildonly
>
> Thanks
> /Mikael

From david.holmes at oracle.com Tue Aug 29 12:30:42 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 22:30:42 +1000
Subject: RFR(S) 8186897: semaphore_posix.hpp should not be included on OSX
In-Reply-To: <98bd8b1f-9a83-3bbd-32f5-fe23d611023d@oracle.com>
References: <98bd8b1f-9a83-3bbd-32f5-fe23d611023d@oracle.com>
Message-ID:

Looks good!

Thanks,
David

On 29/08/2017 9:42 PM, Mikael Gerdin wrote:
> Hi,
> Please review this small fix to not include the semaphore_posix header
> in os_posix.cpp on OSX.
> If the transitive includes of os_posix.cpp are changed such that it
> includes semaphore.hpp a name clash will otherwise occur since all
> platform specific semaphore headers define the SemaphoreImpl typedef.
>
> Webrev: http://cr.openjdk.java.net/~mgerdin/8186897/webrev.0
> Bug: https://bugs.openjdk.java.net/browse/JDK-8186897
> Testing: JPRT buildonly
>
> Thanks
> /Mikael

From david.holmes at oracle.com Tue Aug 29 12:35:42 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 22:35:42 +1000
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>
Message-ID: <3cfddca1-12d8-eef4-1bfe-f6a8b9059634@oracle.com>

Hi Robbin,

On 29/08/2017 8:31 PM, Robbin Ehn wrote:
> Hi please review,
>
> The issue 8180932 - "Parallelize safepoint cleanup" changed
> _stack_traversal_mark to load acquire/store release, this is at least
> half wrong.
> Instead for simplicity the write side storestore fence should be matched
> with loadload on read side and the changes to _stack_traversal_mark
> undone (kept it volatile).
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8186837
>
> Code:
> http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/

This seems okay to me.

> It's not clear in this code if there are other concurrent dependent
> read/writes.
> Is it true that only when reading/writing _state and _stack_traversal_mark
> proper memory ordering is needed?
> To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839

Okay. We need to understand how concurrent lock-free accesses can occur
to ensure we have the right ordering constraints in place.

Thanks,
David

> Thanks Robbin

From david.holmes at oracle.com Tue Aug 29 12:42:33 2017
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Aug 2017 22:42:33 +1000
Subject: [ping] RFR(M): 8186072: dll_build_name returns true even if file is missing.
In-Reply-To: <168e0aa6-eb7c-d6c6-fef6-82aadd12fb77@oracle.com>
References: <974f3cb4010a46ddbac4d2cfb89c82a2@sap.com> <85087682-405c-d3cd-7ad6-5e6e21d15165@oracle.com> <726e7be2591f481d8d0ee50af6486f75@sap.com> <11ed4c208b8641088b99a2066dfc992b@sap.com> <7653afda-fe01-1115-df4a-da0c2e7356f0@oracle.com> <168e0aa6-eb7c-d6c6-fef6-82aadd12fb77@oracle.com>
Message-ID: <582a82cf-0322-f3ec-7c6b-7ad632d0ee42@oracle.com>

Hi Goetz,

This has been pushed but there is something odd with the changeset
timestamp:

date Thu, 17 Aug 2017 17:26:02 +0200 (11 days ago)

???

David

On 29/08/2017 8:14 PM, David Holmes wrote:
> On 29/08/2017 8:11 PM, Lindenmaier, Goetz wrote:
>> Hi,
>>
>> Thomas, thanks for looking at the new webrev ad hoc.
>>
>>>> 2) If the user provided buffer is too small, we will fail, which looks
>>>> like the dll could not have been located. I am not sure we have to be
>>>> shy with allocating memory - internally, we malloc a buffer for
>>>> assembling the filename, and then os::splitpath() will malloc a whole
>>>> bunch of arrays too. So I think we could just return the dll path
>>>> location in a malloced buffer and require the caller to free.
>>>
>>> That seems a much bigger change. My question before pushing this is:
>>> have we in any way reduced the size of path that we may accept on some
>>> platforms? If we have that would be bad.
>> No. I nowhere change the size of the buffer passed into this method,
>> and that's the limit.
>
> Thanks for confirming - I will push this now.
>
> David
> -----
>
>> On posix, paths no more contain // where concatenated, so the
>> paths supported can contain one more char in this case. This is an
>> improvement.
>> The buffer passed in usually has size MAX_PATH, so the checks for
>> overflow should never cause a problem. They are mostly there
>> for safety.  Allocation of the memory rather would reduce memory
>> consumption, but I think it's not that relevant.  Overall, I think it's
>> a mismatch that some functions allocate the memory, others
>> require a buffer ... but that's really out of scope.
>>
>> Best regards,
>>    Goetz.
>>
>>>
>>>> 3) I do not understand ':' as a file separator on windows. So, a path is
>>>> allowed to contain e.g. "C:;" ? Which would mean "a path relative to the
>>>> current directory currently active on drive C". If I am not mistaken. Do
>>>> we want to support this, what is the use case?
>>>
>>> IIUC the existing windows code supports this. So yes c:foo.dll is a
>>> reference to foo.dll in whatever the current directory on drive c: is.
>>> As for a usecase ... perhaps a way to workaround long paths or
>>> historical path format restrictions (ie no spaces) ? But the main thing
>>> is to not break what we currently support.
>>>
>>> Thanks,
>>> David
>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>> On Tue, Aug 29, 2017 at 10:41 AM, David Holmes wrote:
>>>>
>>>>      On 29/08/2017 6:29 PM, Lindenmaier, Goetz wrote:
>>>>
>>>>          Hi David,
>>>>
>>>>          I fixed the indentation and added you as reviewer.
>>>>          I replaced the webrev in-place:
>>>>          http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>>>>          The new code went through all our testing ... except for some
>>>>          ppc/s390 builds that failed because of another change pushed
>>>>          to hs tonight. But that should not matter, it all passed with
>>>>          the jdk10/jdk10 testing.
>>>>
>>>>          Would you mind sponsoring?
>>>>
>>>>
>>>>      No problem - but can we get Thomas to sign-off on this latest
>>>>      version please.
>>>>
>>>>
>>>>      Thanks,
>>>>      David
>>>>
>>>>          Best regards,
>>>>             Goetz.
>>>>
>>>>              -----Original Message-----
>>>>              From: David Holmes [mailto:david.holmes at oracle.com]
>>>>              Sent: Tuesday, 29 August 2017 09:53
>>>>              To: Lindenmaier, Goetz
>>>>              Cc: hotspot-runtime-dev at openjdk.java.net
>>>>              Subject: Re: [ping] RFR(M): 8186072: dll_build_name returns
>>>>              true even if file is missing.
>>>>
>>>>              Hi Goetz,
>>>>
>>>>              On 29/08/2017 4:18 PM, Lindenmaier, Goetz wrote:
>>>>
>>>>                  Hi,
>>>>
>>>>                  this is a webrev with merged windows and posix
>>>>                  implementations:
>>>>                  http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.05/
>>>>
>>>>              I like the look of this.
>>>>
>>>>              There are a couple of indentation nits in os.cpp:
>>>>
>>>>                 247 static bool conc_path_file_and_check(char *buffer, char *printbuffer, size_t printbuflen,
>>>>                 248                                const char* pname, char lastchar, const char* fname) {
>>>>
>>>>
>>>>                 251   const char *filesep = (WINDOWS_ONLY(lastchar == ':' ||) lastchar == os::file_separator()[0]) ?
>>>>                 252                    "" : os::file_separator();
>>>>
>>>>
>>>>              Thanks,
>>>>              David
>>>>
>>>>                  Best regards,
>>>>                     Goetz
>>>>
>>>>                      -----Original Message-----
>>>>                      From: Lindenmaier, Goetz
>>>>                      Sent: Monday, 28 August 2017 12:10
>>>>                      To: 'David Holmes'
>>>>                      Cc: hotspot-runtime-dev at openjdk.java.net
>>>>                      Subject: RE: [ping] RFR(M): 8186072: dll_build_name
>>>>                      returns true even if file is missing.
>>>>
>>>>                      Hi,
>>>>
>>>>                      these are the changes needed to make the windows
>>>>                      dll_locate_lib universally applicable. I also merge
>>>>                      the three similar jio_snprintf calls into one method.
>>>>                      I do some gymnastics to avoid another buffer of
>>>>                      MAX_PATH_LEN at the first call to
>>>>                      conc_path_file_and_check.
>>>>                      I'll test this tonight.
>>>>
>>>>                      Best regards,
>>>>                         Goetz.
>>>>
>>>>                      diff -r e09e4eb985c5 src/os/windows/vm/os_windows.cpp
>>>>                      --- a/src/os/windows/vm/os_windows.cpp  Thu Aug 17 17:26:02 2017 +0200
>>>>                      +++ b/src/os/windows/vm/os_windows.cpp  Mon Aug 28 12:02:26 2017 +0200
@@ -1205,6 +1205,17 @@ >>>> ????????????????????? ? ? ?return GetFileAttributes(filename) != >>>> ???????????????????? INVALID_FILE_ATTRIBUTES; >>>> ????????????????????? ? ?} >>>> >>>> ???????????????????? +bool conc_path_file_and_check(char *buffer, char >>>> ???????????????????? *printbuffer, size_t >>>> ???????????????????? printbuflen, >>>> ???????????????????? +? ? ? ? ? ? ? ? ? ? ? ? ? ? ? const char* pname, >>>> ???????????????????? char lastchar, const char* fname) { >>>> ???????????????????? +? char *filesep = (WINDOWS_ONLY(lastchar == >>>> ':' ||) >>>> ???????????????????? lastchar == >>>> ???????????????????? os::file_seperator()[0]) ? "" : >>>> os::file_separator(); >>>> ???????????????????? +? int ret = jio_snprintf(printbuffer, >>>> printbuflen, >>>> ???????????????????? "%s%s%s", path, filesep, >>>> ???????????????????? fullfname); >>>> ???????????????????? +? if (ret != -1) { >>>> ???????????????????? +? ? struct stat statbuf; >>>> ???????????????????? +? ? return os::stat(buffer, &statbuf) == 0; >>>> ???????????????????? +? } >>>> ???????????????????? +? return false; >>>> ???????????????????? +} >>>> ???????????????????? + >>>> ????????????????????? ? ?bool os::dll_locate_lib(char *buffer, >>>> size_t buflen, >>>> ????????????????????? ? ? ? ? ? ? ? ? ? ? ? ? ? ?const char* pname, >>>> const >>>> ???????????????????? char* fname) { >>>> ????????????????????? ? ? ?bool retval = false; >>>> ???????????????????? @@ -1220,11 +1231,8 @@ >>>> ????????????????????? ? ? ? ? ?if (p != NULL) { >>>> ????????????????????? ? ? ? ? ? ?const size_t plen = strlen(buffer); >>>> ????????????????????? ? ? ? ? ? ?const char lastchar = buffer[plen - >>>> 1]; >>>> ???????????????????? -? ? ? ? char *filesep = (lastchar == ':' || >>>> ???????????????????? lastchar == '\\') ? "" : "\\"; >>>> ???????????????????? -? ? ? ? int ret = jio_snprintf(&buffer[plen], >>>> ???????????????????? buflen - plen, "%s%s", filesep, >>>> ???????????????????? 
fullfname); >>>> ???????????????????? -? ? ? ? if (ret != -1) { >>>> ???????????????????? -? ? ? ? ? retval = file_exists(buffer); >>>> ???????????????????? -? ? ? ? } >>>> ???????????????????? +? ? ? ? retval = conc_path_file_and_check(buffer, >>>> ???????????????????? &buffer[plen], buflen - >>>> ???????????????????? plen, >>>> ???????????????????? +? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? "", >>>> ???????????????????? lastchar, fullfname); >>>> ????????????????????? ? ? ? ? ?} >>>> ????????????????????? ? ? ? ?} else if (strchr(pname, >>>> ???????????????????? *os::path_separator()) != NULL) { >>>> ????????????????????? ? ? ? ? ?int n; >>>> ???????????????????? @@ -1238,12 +1246,8 @@ >>>> ????????????????????? ? ? ? ? ? ? ? ?continue; // skip the empty >>>> path values >>>> ????????????????????? ? ? ? ? ? ? ?} >>>> ????????????????????? ? ? ? ? ? ? ?const char lastchar = path[plen - >>>> 1]; >>>> ???????????????????? -? ? ? ? ? char *filesep = (lastchar == ':' || >>>> ???????????????????? lastchar == '\\') ? "" : "\\"; >>>> ???????????????????? -? ? ? ? ? int ret = jio_snprintf(buffer, buflen, >>>> ???????????????????? "%s%s%s", path, filesep, >>>> ???????????????????? fullfname); >>>> ???????????????????? -? ? ? ? ? if (ret != -1 && file_exists(buffer)) { >>>> ???????????????????? -? ? ? ? ? ? retval = true; >>>> ???????????????????? -? ? ? ? ? ? break; >>>> ???????????????????? -? ? ? ? ? } >>>> ???????????????????? +? ? ? ? ? retval = >>>> conc_path_file_and_check(buffer, >>>> ???????????????????? buffer, buflen, path, >>>> ???????????????????? lastchar, fullfname); >>>> ???????????????????? +? ? ? ? ? if (retval) break; >>>> ????????????????????? ? ? ? ? ? ?} >>>> ????????????????????? ? ? ? ? ? ?// release the storage >>>> ????????????????????? ? ? ? ? ? ?for (int i = 0; i < n; i++) { >>>> ???????????????????? @@ -1255,11 +1259,7 @@ >>>> ????????????????????? ? ? ? ? ?} >>>> ????????????????????? ? ? ? ?} else { >>>> ????????????????????? ? ? ? ? 
?const char lastchar = pname[pnamelen-1]; >>>> ???????????????????? -? ? ? char *filesep = (lastchar == ':' || >>>> lastchar >>>> ???????????????????? == '\\') ? "" : "\\"; >>>> ???????????????????? -? ? ? int ret = jio_snprintf(buffer, buflen, >>>> ???????????????????? "%s%s%s", pname, filesep, >>>> ???????????????????? fullfname); >>>> ???????????????????? -? ? ? if (ret != -1) { >>>> ???????????????????? -? ? ? ? retval = file_exists(buffer); >>>> ???????????????????? -? ? ? } >>>> ???????????????????? +? ? ? retval = conc_path_file_and_check(buffer, >>>> ???????????????????? buffer, buflen, path, >>>> >>>> ???????????? lastchar, >>>> >>>> ???????????????????? fullfname); >>>> ????????????????????? ? ? ? ?} >>>> ????????????????????? ? ? ?} >>>> >>>> >>>> ???????????????????????? -----Original Message----- >>>> ???????????????????????? From: David Holmes >>>> ???????????????????????? [mailto:david.holmes at oracle.com >>>> ???????????????????????? ] >>>> ???????????????????????? Sent: Montag, 28. August 2017 07:38 >>>> ???????????????????????? To: Lindenmaier, Goetz >>>> ???????????????????????? >>> ???????????????????????? > >>>> ???????????????????????? Cc: hotspot-runtime-dev at openjdk.java.net >>>> ???????????????????????? >>>> ???????????????????????? Subject: Re: [ping] RFR(M): 8186072: >>>> ???????????????????????? dll_build_name returns true even if >>>> >>>> ???????????????????? file >>>> >>>> ???????????????????????? is missing. >>>> >>>> ???????????????????????? Hi Goetz, >>>> >>>> ???????????????????????? On 25/08/2017 12:19 AM, Lindenmaier, Goetz >>>> wrote: >>>> >>>> ???????????????????????????? Hi, >>>> >>>> ???????????????????????????? I please need a second review and a >>>> sponsor: >>>> >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> >>>> >>>> >>>> ???????????????????? dllBuildName/webrev.04 >>>> >>>> >>>> ???????????????????????????? To update my description of the change to >>>> ???????????????????????????? 
the status after Thomas' >>>> >>>> ???????????????????? review: >>>> >>>> >>>> ???????????????????????????? dll_build_name builds the proper path to a >>>> ???????????????????????????? library given a list of paths >>>> >>>> ???????????????????????? separated by >>>> >>>> ???????????????????????????? path_seperator and a library name. It adds >>>> ???????????????????????????? in the platform specific >>>> >>>> ???????????????????? endings >>>> >>>> ???????????????????????? etc. >>>> >>>> ???????????????????????????? It is documented to return whether the >>>> file >>>> ???????????????????????????? exists, but only does so if a >>>> >>>> ???????????????????????? path_seperator >>>> >>>> ???????????????????????????? exists in the path. >>>> ???????????????????????????? Especially if the path is empty, it just >>>> ???????????????????????????? returns ?true? without checking. >>>> >>>> ???????????????????????????? Dll_build_name is usually used before >>>> ???????????????????????????? calling dll_load.? If dll_load does >>>> >>>> ???????????????????? not >>>> >>>> ???????????????????????? get a full path it searches >>>> >>>> ???????????????????????????? in well known unix/windows locations. This >>>> ???????????????????????????? is intended in the two cases >>>> >>>> ???????????????????????? where dll_build_name >>>> >>>> ???????????????????????????? is called with an empty path. >>>> >>>> ???????????????????????????? I renamed dll_build_name to dll_locate_lib >>>> ???????????????????????????? and changed it's behavior to >>>> >>>> ???????????????????????? always return >>>> >>>> ???????????????????????????? a full path to the lib, inserting current >>>> ???????????????????????????? working directory if no path is >>>> >>>> ???????????? given. >>>> >>>> ???????????????????????????? For the use case where "" was actually >>>> ???????????????????????????? passed to the function, I added >>>> >>>> ???????????? a >>>> >>>> ???????????????????????? 
>>>>     new function (reusing the old function name) dll_build_name that just
>>>>     adds system dependent prefix and suffix to the name.
>>>>     I merged all unix implementations to the posix os branch.
>>>>
>>>>   I started to look at this and have applied the patch to run through some
>>>>   basic testing. The overall approach seems reasonable. But it is hard to
>>>>   track all the details - in particular whether there were any subtle
>>>>   differences across the "posix" systems?
>>>>
>>>>   I'm wondering what, if any, significant differences exist between the
>>>>   Windows and POSIX versions? I would hope the platform differences could
>>>>   easily be hidden behind macros (for path separator, library suffix etc).
>>>>   Then perhaps this could just go in shared code (os.hpp, os.cpp)?
>>>>
>>>>   That aside, in the Windows code shouldn't the hardwired .dll strings
>>>>   actually be JNI_LIB_SUFFIX?
>>>>
>>>>   Thanks,
>>>>   David
>>>>
>>>>     Best regards,
>>>>       Goetz.
>>>>
>>>>       -----Original Message-----
>>>>       From: Thomas Stüfe
>>>>       [mailto:thomas.stuefe at gmail.com]
>>>>       Sent: Tuesday, 22 August 2017 17:30
>>>>       To: Lindenmaier, Goetz
>>>>       Cc: hotspot-runtime-dev at openjdk.java.net
>>>>       Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>>
>>>>       Looks good.
>>>>
>>>>       ..Thomas
>>>>
>>>>       On Tue, Aug 22, 2017 at 4:33 PM, Lindenmaier, Goetz wrote:
>>>>
>>>>         I mistyped the path to webrev, this should work:
>>>>         http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.04
>>>>
>>>>         Sorry,
>>>>           Goetz
>>>>
>>>>         > -----Original Message-----
>>>>         > From: Lindenmaier, Goetz
>>>>         > Sent: Tuesday, 22 August 2017 15:48
>>>>         > To: 'Thomas Stüfe'
>>>>         > Cc: hotspot-runtime-dev at openjdk.java.net
>>>>         > Subject: RE: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>>         >
>>>>         > Hi,
>>>>         >
>>>>         > could I please get a second review?
>>>>         > http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName-hs/webrev.04
>>>>         >
>>>>         > I had to update the webrev because of a problem on windows.
>>>>         > @Thomas I had edited os.hpp, but not saved :(
>>>>         >
>>>>         > Best regards,
>>>>         >   Goetz.
>>>>         >
>>>>         > PS: Didn't double-check the webrev as cr server is slow.
>>>>         >
>>>>         > > -----Original Message-----
>>>>         > > From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>>         > > Sent: Thursday, 17 August 2017 19:54
>>>>         > > To: Lindenmaier, Goetz
>>>>         > > Cc: hotspot-runtime-dev at openjdk.java.net
>>>>         > > Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>>         > >
>>>>         > > Hi Goetz,
>>>>         > >
>>>>         > > On Thu, Aug 17, 2017 at 6:03 PM, Lindenmaier, Goetz
>>>>         > > wrote:
>>>>         > >
>>>>         > >     Hi Thomas,
>>>>         > >
>>>>         > >     I adapted the comments in os.hpp.
>>>>         > >
>>>>         > >     If I move the call to dll_build_name out of dll_locate_lib
>>>>         > >     I have to do a lot of coding in all the places where it is called.
>>>>         > >     That seems not useful to me.
>>>>         > >
>>>>         > >     Fixed the type to size_t.
>>>>         > >
>>>>         > >     One could merge posix/windows if putting the check for ':'
>>>>         > >     into a WINDOWS_ONLY() I guess. The check for \ could be
>>>>         > >     done in posix as well, if using file_separator().
>>>>         > >
>>>>         > >     *  Not your change, but: why does the code in os::dll_locate_lib() even
>>>>         > >     *  differentiate between a PATH containing no os::path_separator()
>>>>         > >     *  and a path containing os::path_separator()?
>>>>         > >
>>>>         > >     I assume this was done to avoid all the allocations and copying of the
>>>>         > >     path.
>>>>         > >
>>>>         > >     Also adapted the comment in jvmtiExport.cpp.
>>>>         > >
>>>>         > >     New webrev:
>>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/
>>>>         > >
>>>>         > >     incremental diff:
>>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.03/diffs-incremental.patch
>>>>         > >     (fixed indentation on windows)
>>>>         > >
>>>>         > >     Best regards,
>>>>         > >       Goetz.
>>>>         > >
>>>>         > > Comments in os.hpp seem unchanged?
>>>>         > >
>>>>         > > But looks fine otherwise.
>>>>         > > I do not need another webrev.
>>>>         > >
>>>>         > > Thanks, Thomas
>>>>         > >
>>>>         > >     From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
>>>>         > >     Sent: Thursday, August 17, 2017 3:48 PM
>>>>         > >     To: Lindenmaier, Goetz
>>>>         > >     Cc: hotspot-runtime-dev at openjdk.java.net
>>>>         > >     Subject: Re: RFR(M): 8186072: dll_build_name returns true even if file is missing.
>>>>         > >
>>>>         > >     Hi Goetz,
>>>>         > >
>>>>         > >     On Thu, Aug 17, 2017 at 1:35 PM, Lindenmaier, Goetz wrote:
>>>>         > >
>>>>         > >             Hi Thomas,
>>>>         > >
>>>>         > >             I reworked the whole thing.
>>>>         > >
>>>>         > >             First, there is dll_build_name. It just does -> lib.so.
>>>>         > >
>>>>         > >             Second, I renamed the legacy dll_build_name to dll_locate_lib.
>>>>         > >
>>>>         > >             I merged all the unix variants to one in os_posix.
>>>>         > >
>>>>         > >             I removed the buffer overflow check at the top.
>>>>         > >             It's too restrictive because the path argument
>>>>         > >             can contain several paths. I added the overflow
>>>>         > >             checks into the single cases.
>>>>         > >
>>>>         > >             Also, I first assemble the pure name using the new, simple
>>>>         > >             dll_build_name. This is for reuse and readability.
>>>>         > >
>>>>         > >             In case of an
>>>>         > >             empty directory, I use get_current_directory
>>>>         > >             to complete the path as indicated by the original documentation
>>>>         > >             where it was called with "".
>>>>         > >             dll_locate_lib now always returns a name with a full path if
>>>>         > >             the file exists.
>>>>         > >
>>>>         > >             Also, on windows, I think I fixed a bug by reversing the order
>>>>         > >             of checks. A path list ending in ':' or '\' would not have
>>>>         > >             been recognized.
>>>>         > >
>>>>         > >             On Bsd, I removed JNI_LIB_* because that already is defined
>>>>         > >             in jvm_bsd.h
>>>>         > >
>>>>         > >             New webrev:
>>>>         > >             http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/
>>>>         > >
>>>>         > >             Best regards,
>>>>         > >               Goetz.
>>>>         > >
>>>>         > >     I like this better than before. Remarks:
>>>>         > >
>>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/share/vm/runtime/os.hpp.udiff.html
>>>>         > >
>>>>         > >     +  // Builds the platform-specific name of a library.
>>>>         > >     +  // Returns false on __buffer overflow__.
>>>>         > >
>>>>         > >     Hopefully not! :D
>>>>         > >
>>>>         > >     How about: "Returns false on truncation" instead.
>>>>         > >
>>>>         > >     +  // Builds a platform-specific full library path given an ld path and lib name.
>>>>         > >     +  // Returns true if the buffer contains a full path to an existing file, false
>>>>         > >     +  // otherwise. If pathname is empty, checks the current directory.
>>>>         > >     +  static bool dll_locate_lib(char* buffer, size_t size,
>>>>         > >     +                             const char* pathname, const char*
>>>>         > >                                   fname);
>>>>         > >
>>>>         > >     Might be worth mentioning that "fname" is the unadorned library
>>>>         > >     name, e.g. "verify" for libverify.so or verify.dll.
>>>>         > >
>>>>         > >     Would the following alternative be valid:
>>>>         > >
>>>>         > >     one could make dll_locate_lib take the real file name, and let the caller
>>>>         > >     use dll_build_name() to build the library name first before handing it to
>>>>         > >     dll_locate_lib(). In that case, dll_locate_lib() could be renamed to a generic
>>>>         > >     "find_file_in_path" because it would work for any kind of file.
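A rough, self-contained sketch of the alternative proposed above, not the HotSpot code: `build_lib_name` stands in for `os::dll_build_name`, while `find_file_in_path`, its parameter list, and the injected `exists` predicate are hypothetical names chosen for illustration. Standard `snprintf` is used in place of `jio_snprintf`, and the separators are passed in explicitly so the same function serves both platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <functional>

// Hypothetical stand-in for os::dll_build_name: adorns a bare library name
// with a platform prefix and suffix. Returns false on truncation.
static bool build_lib_name(char* buf, std::size_t buflen, const char* prefix,
                           const char* fname, const char* suffix) {
  const int n = std::snprintf(buf, buflen, "%s%s%s", prefix, fname, suffix);
  return n >= 0 && static_cast<std::size_t>(n) < buflen;
}

// Hypothetical generic search: tries each directory of 'pathlist' (entries
// separated by 'path_sep') and returns true once '<dir><file_sep><filename>'
// fits into 'buf' and 'exists' accepts it. A path list without any separator
// is simply a list with one entry, so it needs no special case.
static bool find_file_in_path(char* buf, std::size_t buflen, const char* pathlist,
                              char path_sep, char file_sep, const char* filename,
                              const std::function<bool(const char*)>& exists) {
  const char sep_str[2] = { file_sep, '\0' };
  const char* start = pathlist;
  for (;;) {
    const char* end = std::strchr(start, path_sep);
    const std::size_t dirlen =
        (end != nullptr) ? static_cast<std::size_t>(end - start) : std::strlen(start);
    if (dirlen > 0) {
      // Do not add a second separator if the directory already ends in one.
      const bool has_sep = (start[dirlen - 1] == file_sep);
      const int n = std::snprintf(buf, buflen, "%.*s%s%s", static_cast<int>(dirlen),
                                  start, has_sep ? "" : sep_str, filename);
      if (n >= 0 && static_cast<std::size_t>(n) < buflen && exists(buf)) {
        return true;
      }
    }
    if (end == nullptr) break;
    start = end + 1;
  }
  if (buflen > 0) buf[0] = '\0';  // nothing found: leave an empty result
  return false;
}
```

With this split the caller builds the adorned name once and hands it to the search, so the search itself needs no temporary allocation and no cleanup path.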
>>>>         > >
>>>>         > >     As an added bonus, there would be no need to create a temporary
>>>>         > >     array in dll_build_name/dll_locate_lib, and no need to call free() so no
>>>>         > >     cleanup-related control flow changes in these functions.
>>>>         > >
>>>>         > >     =====
>>>>         > >
>>>>         > >     http://cr.openjdk.java.net/~goetz/wr17/8186072-dllBuildName/webrev.02/src/os/windows/vm/os_windows.cpp.udiff.html
>>>>         > >
>>>>         > >     +  int fullfnamelen = strlen(JNI_LIB_PREFIX) + strlen(fname) + strlen(JNI_LIB_SUFFIX);
>>>>         > >
>>>>         > >     int -> size_t (does that even compile without warning?)
>>>>         > >
>>>>         > >     +    // Check current working directory.
>>>>         > >     +    const char* p = get_current_directory(buffer, buflen);
>>>>         > >     +    if (p != NULL &&
>>>>         > >     +        strlen(buffer) + 1 + fullfnamelen + 1 <= buflen) {
>>>>         > >     +      strcat(buffer, "\\");
>>>>         > >     +      strcat(buffer, fullfname);
>>>>         > >     +      retval = file_exists(buffer);
>>>>         > >
>>>>         > >     Small nit: I'd use jio_snprintf instead of strcat. Functionally identical but
>>>>         > >     will make scanners (e.g.
>>>>         > >     Coverity) happy. One could then avoid the length
>>>>         > >     calculation and rely on jio_snprintf truncation:
>>>>         > >
>>>>         > >     const char* p = get_current_directory(buffer, buflen);
>>>>         > >     if (p != NULL) {
>>>>         > >       const size_t end = strlen(p);
>>>>         > >       if (jio_snprintf(buffer + end, buflen - end, "\\%s", fullfname) != -1) {
>>>>         > >         retval = file_exists(buffer);
>>>>         > >       }
>>>>         > >     }
>>>>         > >
>>>>         > >     --
>>>>         > >
>>>>         > >     Not your change, but:
why does the code in >>>> os::dll_locate_lib() >>>> ???????????????????????????????? even >>>> ????????????????????????????????? ? ? ? ? > > differentiate between a >>>> ???????????????????????????????? PATH containing no >>>> os::path_separator() >>>> ???????????????????????????????? and a path >>>> ????????????????????????????????? ? ? ? ? > > containing >>>> ???????????????????????????????? os::path_separator()? >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?Would the former >>>> not be >>>> ???????????????????????????????? just a PATH with only one directory >>>> >>>> ???????????????????????? and >>>> >>>> ???????????????????????????????? hence >>>> ????????????????????????????????? ? ? ? ? > > need no special >>>> treatment? >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?===== >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht >>>> ???????????????????????? ml >>>> >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> dllBuildName/webrev.02/src/os/posix/vm/os_posix.cpp.udiff.ht >>>> ???????????????????????? ml> >>>> >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? 
> > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?Could >>>> ???????????????????????????????? os::dll_locate_lib be consolidated >>>> ???????????????????????????????? between windows and >>>> ???????????????????????????????? unix? >>>> ????????????????????????????????? ? ? ? ? > > Seems to be the >>>> ???????????????????????????????? implementation is almost identical. >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?==== >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html >>>> >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>>> >>> dllBuildName/webrev.02/src/share/vm/prims/jvmtiExport.cpp.udiff.html> >>>> >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?+? ? ? ? // not >>>> found - >>>> ???????????????????????????????? try library path >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ?Proposal: "not >>>> found - >>>> ???????????????????????????????? try OS default library path" >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? 
? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?Find some >>>> ???????????????????????????????? comments inline: >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ?Especially if the path is empty, it >>>> ???????????????????????????????? just returns 'true'. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ?Dll_build_name is usually used >>>> before >>>> ???????????????????????????????? calling dll_load. >>>> ???????????????????????????????? If >>>> ????????????????????????????????? ? ? ? ? > > dll_load does not get a >>>> ???????????????????????????????? full path it searches >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?in >>>> well >>>> ???????????????????????????????? known unix/windows locations. This is >>>> ???????????????????????????????? intended >>>> ???????????????????????????????? in >>>> ????????????????????????????????? ? ? ? ? > > the two cases where >>>> ???????????????????????????????? dll_build_name >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?is >>>> ???????????????????????????????? called with an empty path. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> So, for >>>> both >>>> ???????????????????????????????? cases (thread.cpp, jvmtiExport.cpp), >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> before, we >>>> ???????????????????????????????? would call os::dll_build_name() >>>> with an >>>> ???????????????????????????????? empty >>>> ????????????????????????????????? ? ? ? ? 
> > string for the path >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> which, for >>>> ???????????????????????????????? relative paths, would result in >>>> feeding >>>> ???????????????????????????????? that path >>>> ????????????????????????????????? ? ? ? ? > > unexpanded to >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> dlopen(), >>>> ???????????????????????????????? which would use whatever the OS >>>> does in >>>> ???????????????????????????????? those >>>> ????????????????????????????????? ? ? ? ? > > cases (LIBPATH, >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ???????????????????????????????? LD_LIBRARY_PATH, PATH on windows). >>>> Note >>>> ???????????????????????????????? that this >>>> >>>> ???????????????????????? does >>>> >>>> ????????????????????????????????? ? ? ? ? > > not necessarily >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> include >>>> ???????????????????????????????? searching the current directory. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?Right. With >>>> ???????????????????????????????? changed dll_biuld_name it's again >>>> exactly as >>>> ????????????????????????????????? ? ? ? ? > > before. >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> With your >>>> ???????????????????????????????? change, we now use java.library.path, >>>> ???????????????????????????????? which is >>>> ???????????????????????????????? not >>>> ????????????????????????????????? ? ? ? ? > > necessarily the >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> same? >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?You are >>>> right, >>>> ???????????????????????????????? I oversaw that java.library.path >>>> can be >>>> ????????????????????????????????? ? ? ? ? > > overwritten.? Initially, >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? 
?it's set >>>> to the >>>> ???????????????????????????????? right thing. >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> (BTW, I >>>> think >>>> ???????????????????????????????? the old comments in thread.cpp and >>>> ????????????????????????????????? ? ? ? ? > > jniExport.cpp were >>>> wrong:"// >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> Try the >>>> local >>>> ???????????????????????????????? directory" - if "local" means >>>> "current", >>>> ???????????????????????????????? this is >>>> ???????????????????????????????? not >>>> ????????????????????????????????? ? ? ? ? > > what did >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> happen). >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?Right, I >>>> tried >>>> ???????????????????????????????? to adapt them, did I miss one? >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?I >>>> added >>>> ???????????????????????????????? a second variant of dll_build_name >>>> without >>>> >>>> ???????????????????????? the >>>> >>>> ????????????????????????????????? ? ? ? ? > > path argument that >>>> adds the >>>> ???????????????????????????????? path >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?from >>>> ???????????????????????????????? system property java.lang.path and use >>>> ???????????????????????????????? that in >>>> ???????????????????????????????? these >>>> ????????????????????????????????? ? ? ? ? > > two cases. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?I >>>> ???????????????????????????????? changed the original function to >>>> ???????????????????????????????? actually check file >>>> ????????????????????????????????? ? ? ? ? > > availability in all >>>> cases, >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? 
?> >>>> ?and to >>>> ???????????????????????????????? check . if the path is empty. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> I think >>>> that >>>> ???????????????????????????????? may be a bit confusing. We would >>>> then have >>>> ???????????????????????????????? three >>>> ????????????????????????????????? ? ? ? ? > > options: >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> - call >>>> ???????????????????????????????? os::dll_build_name with a real >>>> ???????????????????????????????? ";;.." PATH >>>> ???????????????????????????????? and >>>> ????????????????????????????????? ? ? ? ? > > get a file name >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> resolved >>>> from >>>> ???????????????????????????????? that path >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> - call >>>> ???????????????????????????????? os::dll_build_name with "" for the >>>> PATH >>>> ???????????????????????????????? and get OS >>>> ???????????????????????????????? dll >>>> ????????????????????????????????? ? ? ? ? > > resolution >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?No, in that >>>> ???????????????????????????????? case, as I called file_exists(), it >>>> ???????????????????????????????? would only work if >>>> ????????????????????????????????? ? ? ? ? > > the dll is in the >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?current >>>> working >>>> ???????????????????????????????? directory. But I changed this now, >>>> anyways. >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> - call your >>>> ???????????????????????????????? new overloaded version of >>>> >>>> ???????????????????????? os::dll_build_name(), >>>> >>>> ????????????????????????????????? ? ? ? ? > > which uses - >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? 
?> >>>> ???????????????????????????????? Djava.library.path. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ?Please >>>> ???????????????????????????????? review this change. I please need a >>>> sponsor. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> >>>> http://cr.openjdk.java.net/~goetz/wr17/8186072- >>>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ???????????????????????????????? dllBuildName/webrev.01/ >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? ? ? ? ? > > >>>> >>>> >>> >>>> >>>> >>>> >>> >>>> > >>>> ????????????????????????????????? > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ???????????????????????????????? dllBuildName/webrev.01/> >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?>? ? ? ?Best >>>> ???????????????????????????????? regards, >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ?Goetz. >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> >>>> ????????????????????????????????? ? ? ? ? > >? ? ? ? ? ? ?> Kind >>>> Regards, >>>> ???????????????????????????????? Thomas >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > > >>>> ????????????????????????????????? ? ? ? ? > >? ? 
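[Editorial sketch, not part of the original thread.] The quoted review suggests replacing the two strcat calls with a single bounded formatted write. As quoted, the suggested call passes the offset `end` as its first argument and uses `fullname` where the patch uses `fullfname`; presumably `buffer + end` and `fullfname` were intended. A minimal sketch of that idea follows. It assumes standard `std::snprintf` as a stand-in for HotSpot's `jio_snprintf` (which the quoted code checks against -1), and the helper name `append_file_name` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not HotSpot code): appends "<sep><fname>" to the
// NUL-terminated directory already stored in buffer.
// Returns false if the result does not fit into buflen bytes.
inline bool append_file_name(char* buffer, std::size_t buflen,
                             char sep, const char* fname) {
  assert(buffer != nullptr && fname != nullptr);
  const std::size_t end = std::strlen(buffer);
  if (end >= buflen) {
    return false;
  }
  // Write at buffer + end (a pointer), not at `end` (an offset).
  const int n = std::snprintf(buffer + end, buflen - end, "%c%s", sep, fname);
  // std::snprintf returns the length it wanted to write; a value that does
  // not fit into the remaining space means the output was truncated.
  return n >= 0 && static_cast<std::size_t>(n) < buflen - end;
}
```

With this shape the caller needs no separate length calculation: a false return covers the "does not fit" case that the original `strlen(buffer) + 1 + fullfnamelen + 1 <= buflen` check guarded against.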
>>>> > >     Best Regards, Thomas
>>>>

From robbin.ehn at oracle.com Tue Aug 29 12:44:23 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 29 Aug 2017 14:44:23 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To: <3cfddca1-12d8-eef4-1bfe-f6a8b9059634@oracle.com>
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com> <3cfddca1-12d8-eef4-1bfe-f6a8b9059634@oracle.com>
Message-ID:

Thanks David!

/Robbin

On 08/29/2017 02:35 PM, David Holmes wrote:
> Hi Robbin,
>
> On 29/08/2017 8:31 PM, Robbin Ehn wrote:
>> Hi please review,
>>
>> The issue 8180932 - "Parallelize safepoint cleanup" changed _stack_traversal_mark to load acquire/store release, this is at least half wrong.
>> Instead for simplicity the write side storestore fence should be matched with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile).
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8186837
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/
>
> This seems okay to me.
>
>> It's not clear in this code if there are other concurrent dependent read/writes.
>> Is it true that only when reading/writing _state and _stack_traversal_mark proper memory ordering is needed?
>> To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839
>
> Okay. We need to understand how concurrent lock-free accesses can occur to ensure we have the right ordering constraints in place.
>
> Thanks,
> David
>
>> Thanks Robbin

From rkennke at redhat.com Tue Aug 29 13:00:22 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 29 Aug 2017 15:00:22 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>
Message-ID:

Hi Robbin,

I doubt that we can assume a symmetry between loadload and storestore like there is with load-acquire and release-store. This doesn't seem right. In my experience loadload and storestore are rather special purpose: loadload ensures ordering between otherwise unrelated loads and storestore likewise with stores.

And even symmetric use of load-acquire and release-store are often done wrong: those are not meant to protect concurrent access to the field, but to the stuff that is protected by the field access (think locks), i.e. what happens between the LA and RS. At least that is my understanding.

I suggest to do what David said and try to understand what concurrent accesses to which fields we have, and which fences are actually needed to ensure correct ordering.

And thanks for revisiting this!

Cheers, Roman

On 29 August 2017 12:31:17 CEST, Robbin Ehn wrote:
>Hi please review,
>
>The issue 8180932 - "Parallelize safepoint cleanup" changed
>_stack_traversal_mark to load acquire/store release, this is at least
>half wrong.
>Instead for simplicity the write side storestore fence should be matched
>with loadload on read side and the changes to _stack_traversal_mark
>undone (kept it volatile).
>
>Bug:
>https://bugs.openjdk.java.net/browse/JDK-8186837
>
>Code:
>http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/
>
>It's not clear in this code if there are other concurrent dependent
>read/writes.
>Is it true that only when reading/writing _state and _stack_traversal_mark
>proper memory ordering is needed?
>To track that I created:
>https://bugs.openjdk.java.net/browse/JDK-8186839
>
>Thanks Robbin

--
This message was sent from my Android device with K-9 Mail.

From adinn at redhat.com Tue Aug 29 13:53:16 2017
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 29 Aug 2017 14:53:16 +0100
Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary
In-Reply-To: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com>
References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com>
Message-ID: <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com>

On 28/08/17 22:19, Zhengyu Gu wrote:
> This enhancement allows NMT to report class metadata information.
>
> NMT has no visibility into metaspace so far, it has become an obstacle
> to estimate real memory cost for classes. While estimating the cost, we
> usually assume that class metadata occupies whole committed space, which
> results higher than actual number.

Of course, it's very important to know the committed size because that's what you have to provision for. However, any disparity between that size and the space actually occupied by metadata is also highly valuable information. That's true whether the disparity arises because of waste caused by fragmentation or because chunks have been added back to the free list or have not yet been carved off recently malloced regions. So, I think this extra info is very helpful.

> The patch uses existing metaspace APIs, and reports counters in NMT
> *Class* summary section.
> . . .
> Sample outputs:
>
> Class summary:
>
> -                     Class (reserved=1071790KB, committed=24750KB)
>                             (classes #3078)
>                             (malloc=686KB #7122)
>                             (mmap: reserved=1071104KB, committed=24064KB)
>                             (  Metadata:   )
>                             (    reserved=22528KB, committed=21504KB)
>                             (    capacity=21327KB, used=20654KB)
>                             (    free chunks=113KB)
>                             (    available=0KB)
>                             (  Class space:   )
>                             (    reserved=1048576KB, committed=2560KB)
>                             (    capacity=2525KB, used=2268KB)
>                             (    free chunks=0KB)
>                             (    available=35KB)

I think this change is to ship modulo a few small quibbles.

The four figures quoted here for each of the data and class metaspace regions don't quite add up i.e.

  used + free chunks + available = total_in_use =/= capacity

  2268 + 0 + 35 = 2303 =/= 2525

  20654 + 113 + 0 = 20767 =/= 21327

As I understand it this is because of waste caused either by the need to insert block and chunk headers or by the inability to allocate objects out of small fragments at the end of in use chunks. Is that correct?

If so then would it not be clearer to account for this waste explicitly? e.g.

                            (  Metadata:   )
                            (    reserved=22528KB, committed=21504KB)
                            (    capacity=21327KB, used=20654KB)
                            (    free chunks=113KB, available=0KB)
                            (    waste = 560KB = 2.6%)

n.b. the above figures are calculated as

  waste = capacity - total_in_use
  waste% = waste / capacity

Also, whether or not waste space gets reported, I think the output would look cleaner if you were to report free and available space on one line.

I'm happ

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From zgu at redhat.com Tue Aug 29 14:35:34 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 29 Aug 2017 10:35:34 -0400
Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary
In-Reply-To: <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com>
References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com>
Message-ID: <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com>

Hi Andrew,

>> Class summary:
>>
>> -                     Class (reserved=1071790KB, committed=24750KB)
>>                             (classes #3078)
>>                             (malloc=686KB #7122)
>>                             (mmap: reserved=1071104KB, committed=24064KB)
>>                             (  Metadata:   )
>>                             (    reserved=22528KB, committed=21504KB)
>>                             (    capacity=21327KB, used=20654KB)
>>                             (    free chunks=113KB)
>>                             (    available=0KB)
>>                             (  Class space:   )
>>                             (    reserved=1048576KB, committed=2560KB)
>>                             (    capacity=2525KB, used=2268KB)
>>                             (    free chunks=0KB)
>>                             (    available=35KB)
> I think this change is to ship modulo a few small quibbles.
>
> The four figures quoted here for each of the data and class metaspace
> regions don't quite add up i.e.
>
>   used + free chunks + available = total_in_use =/= capacity
>
>   2268 + 0 + 35 = 2303 =/= 2525
>
>   20654 + 113 + 0 = 20767 =/= 21327
>

I struggled to come out an intuitive way for representing the numbers. Hopefully, we can get it right through this review process.

Actually, the formula should be:
  committed = capacity + free chunks + available + waste

capacity:    amount of all in-used chunks.
used:        used amount out of capacity
free chunks: amount of free chunk memory
available:   memory that was committed, but has yet to slice into chunks.

>
> As I understand it this is because of waste caused either by the need to
> insert block and chunk headers or by the inability to allocate objects
> out of small fragments at the end of in use chunks. Is that correct?
>

chunk headers are counted in *used* memory.
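
[Editorial sketch, not part of the original thread.] The formula above can be checked against the sample output with a few lines of arithmetic. The struct and function names below are hypothetical and only mirror the counters named in the message; all values are in KB. Under this formula the Metadata sample yields 64KB of waste, whereas the earlier definition in the thread (capacity minus used, free chunks and available) yields 560KB, so the two messages are measuring different things.

```cpp
#include <cassert>

// Hypothetical container for one region's counters from the sample output (KB).
struct MetaspaceCounters {
  long committed;
  long capacity;
  long used;
  long free_chunks;
  long available;
};

// Rearranged from: committed = capacity + free chunks + available + waste
inline long waste_kb(const MetaspaceCounters& c) {
  assert(c.committed >= c.capacity + c.free_chunks + c.available);
  return c.committed - c.capacity - c.free_chunks - c.available;
}

// Earlier definition in the thread: waste = capacity - total_in_use, where
// total_in_use = used + free chunks + available.
inline long capacity_gap_kb(const MetaspaceCounters& c) {
  return c.capacity - (c.used + c.free_chunks + c.available);
}
```

For the Metadata region, `MetaspaceCounters{21504, 21327, 20654, 113, 0}` gives `waste_kb` = 64 and `capacity_gap_kb` = 560; for the Class space region, `MetaspaceCounters{2560, 2525, 2268, 0, 35}` gives 0 and 222 respectively.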
> If so then would it not be clearer to account for this waste explicitly?
> e.g.
>
>                             (  Metadata:   )
>                             (    reserved=22528KB, committed=21504KB)
>                             (    capacity=21327KB, used=20654KB)
>                             (    free chunks=113KB, available=0KB)
>                             (    waste = 560KB = 2.6%)

Make sense to report *waste*. How about

                            (  Metadata:   )
                            (    reserved=22528KB, committed=21504KB)
                            (    capacity=21327KB, used=20654KB)
                            (    free chunks=113KB)
                            (    available=0KB)
                            (    waste = 560KB = 2.6%)

Thanks,

-Zhengyu

> n.b. the above figures are calculated as
>
>   waste = capacity - total_in_use
>   waste% = waste / capacity
>
> Also, whether or not waste space gets reported, I think the output would
> look cleaner if you were to report free and available space on one line.
>
> I'm happ
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From adinn at redhat.com Tue Aug 29 14:48:21 2017
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 29 Aug 2017 15:48:21 +0100
Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary
In-Reply-To: <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com>
References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com>
Message-ID:

On 29/08/17 15:35, Zhengyu Gu wrote:
> I struggled to come out an intuitive way for representing the numbers.
> Hopefully, we can get it right through this review process.
>
> Actually, the formula should be:
>   committed = capacity + free chunks + available + waste

Ok, that is essentially what I computed

> capacity:    amount of all in-used chunks.
> used:        used amount out of capacity
> free chunks: amount of free chunk memory
> available:   memory that was committed, but has yet to slice into chunks.

> chunk headers are counted in *used* memory.

Ok, putting headers into the used space doesn't seem an unreasonable way to account for 'use'. So, that's fine -- waste is only the small regions on the end of chunks that are not able to be used to allocate an object.

>> If so then would it not be clearer to account for this waste explicitly?
>> e.g.
>>
>>                             (  Metadata:   )
>>                             (    reserved=22528KB, committed=21504KB)
>>                             (    capacity=21327KB, used=20654KB)
>>                             (    free chunks=113KB, available=0KB)
>>                             (    waste = 560KB = 2.6%)
>
> Make sense to report *waste*. How about
>                             (  Metadata:   )
>                             (    reserved=22528KB, committed=21504KB)
>                             (    capacity=21327KB, used=20654KB)
>                             (    free chunks=113KB)
>                             (    available=0KB)
>                             (    waste = 560KB = 2.6%)

Yes, agreed except that I mentioned I think it is tidier to put free chunks and available on the same line as I did above? Do you have a reason for not following that suggestion?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From robbin.ehn at oracle.com Tue Aug 29 14:50:03 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 29 Aug 2017 16:50:03 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To:
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com>
Message-ID: <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com>

Hi Roman, thanks for having a look,

On 08/29/2017 03:00 PM, Roman Kennke wrote:
> Hi Robbin,
>
> I doubt that we can assume a symmetry between loadload and storestore like there is with load-acquire and release-store. This doesn't seem right. In my experience loadload
> and storestore are rather special purpose: loadload ensures ordering between otherwise unrelated loads and storestore likewise with stores.

This is exactly why I added loadload, to stop reordering of unrelated loads:

#######################
The original code did:

//nmethod::make_not_entrant_or_zombie
store _stack_traversal_mark
storestore
store _state

//NMethodSweeper::process_compiled_method
load _state
load _stack_traversal_mark // this is a non-volatile load, can be reordered by both gcc and hardware

#######################
Adding la/sr + volatile to _stack_traversal_mark:

store _stack_traversal_mark
release // release not needed, we have a following storestore for the unrelated stores
storestore
store _state

load _state
load _stack_traversal_mark
acquire // acquire not needed since we already loaded _state and any following writes/reads will be done after we have taken a Mutex.

#######################
So therefore my conclusion was that, in this particular case:

store _stack_traversal_mark
storestore
store _state

load _state
loadload
load _stack_traversal_mark

would be correct, agree?

And as I said I have created another jira issue for the concerns you, David and I share.
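
[Editorial sketch, not part of the original thread.] The store/storestore/store and load/loadload/load pairing proposed above can be illustrated in standard C++11. This is not the HotSpot code: the struct, field values and function names are hypothetical. Standard C++ has no pure storestore or loadload fence, so release and acquire fences, which are at least as strong, stand in for HotSpot's OrderAccess::storestore() and OrderAccess::loadload().

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the two nmethod fields discussed in the thread.
struct NMethodSketch {
  std::atomic<long> stack_traversal_mark{0};
  std::atomic<int> state{0};  // 0 = in use, 1 = not entrant (illustrative values)
};

// Writer side: publish the mark first, then the state.
inline void make_not_entrant(NMethodSketch& nm, long mark) {
  nm.stack_traversal_mark.store(mark, std::memory_order_relaxed);
  // Keeps the mark store ordered before the state store (covers storestore).
  std::atomic_thread_fence(std::memory_order_release);
  nm.state.store(1, std::memory_order_relaxed);
}

// Reader side: returns the mark if the state is observed as not entrant,
// otherwise -1.
inline long read_mark_if_not_entrant(const NMethodSketch& nm) {
  if (nm.state.load(std::memory_order_relaxed) != 1) {
    return -1;
  }
  // Keeps the state load ordered before the mark load (covers loadload).
  std::atomic_thread_fence(std::memory_order_acquire);
  return nm.stack_traversal_mark.load(std::memory_order_relaxed);
}
```

A reader that observes the new state is then guaranteed to observe the mark written before it. The sketch says nothing about other fields read or written concurrently, which is the open question tracked in JDK-8186839.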
Thanks Robbin > > And even symmetric use of load-acquire and release-store are often done wrong: those are not meant to protect concurrent access to the field, but to the stuff that is > protected by the field access (think locks), I.e. what happens between the LA and RS. At least that is my understanding. > > I suggest to do what David said and try to understand what concurrent accesses to which fields we have, and which fences are actually needed to ensure correct ordering. > > And thanks for revisiting this! > > Cheers, Roman > > Am 29. August 2017 12:31:17 MESZ schrieb Robbin Ehn : > > Hi please review, > > The issue 8180932 - "Parallelize safepoint cleanup" changed _stack_traversal_mark to load acquire/store release, this is at least half wrong. > Instead for simplicity the write side storestore fence should be match with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile). > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8186837 > > Code: > http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/ > > It's not clear in this code if there other concurrent dependent read/writes. > Is true that only when reading/writing _state and _stack_traversal_mark proper memory ordering is needed? > To track that I created:https://bugs.openjdk.java.net/browse/JDK-8186839 > > Thanks Robbin > > > -- > Diese Nachricht wurde von meinem Android-Ger?t mit K-9 Mail gesendet. From zgu at redhat.com Tue Aug 29 14:59:45 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 29 Aug 2017 10:59:45 -0400 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary In-Reply-To: References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com> Message-ID: <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> >> >> Make sense to report *waste*. 
How about >> ( Metadata: ) >> ( reserved=22528KB, committed=21504KB) >> ( capacity=21327KB, used=20654KB) >> ( free chunks=113KB) >> ( available=0KB) >> ( waste = 560KB = 2.6%) > Yes, agreed except that I mentioned I think it is tidier to put free > chinks and available on the same line as I did above? Do you have a > reason for not following that suggestion? Yes. Because they are unrelated. e.g. *committed* belongs to *reserved* *used* is part of *capacity* but free chunks and available do not relate to each other. Thanks, -Zhengyu > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From adinn at redhat.com Tue Aug 29 15:48:52 2017 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 29 Aug 2017 16:48:52 +0100 Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary In-Reply-To: <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com> <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com> <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com> <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com> Message-ID: <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com> On 29/08/17 15:59, Zhengyu Gu wrote: >>> Make sense to report *waste*. How about >>> ?????????????????????????????? (? Metadata:??????????????????????????? ) >>> ?????????????????????????????? (??? reserved=22528KB,? >>> committed=21504KB) >>> ?????????????????????????????? (??? capacity=21327KB,?? used=20654KB) >>> ?????????????????????????????? (??? free chunks=113KB) >>> ?????????????????????????????? (??? available=0KB) >>> ?????????????????????????????? (??? waste = 560KB = 2.6%) >> Yes, agreed except that I mentioned I think it is tidier to put free >> chinks and available on the same line as I did above? 
>> Do you have a reason for not following that suggestion?
> Yes. Because they are unrelated.
>
> e.g.
>
> *committed* belongs to *reserved*
> *used* is part of *capacity*
>
> but free chunks and available do not relate to each other.

Ok, I can see where you are coming from now. However, while you are
right that used is part of capacity you can also consider it true that
free_chunks, available and waste belong to/are part of capacity. Those
four figures constitute four different components of the total capacity.
Perhaps something like this might make that clearer?

  (  Metadata:   )
  (    reserved=22528KB, committed=21504KB)
  (    capacity=21327KB, used=20654KB)
  (    free chunks=113KB)
  (    available=0KB)
  (    waste = 560KB = 2.6%)

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From zgu at redhat.com Tue Aug 29 16:31:14 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 29 Aug 2017 12:31:14 -0400
Subject: RFR(S) 8186770: NMT: Report metadata information in NMT summary
In-Reply-To: <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com>
References: <83e0586b-aa58-084a-fdcf-428cf55669fe@redhat.com>
 <2e3ab1a3-03a5-9aa8-47f8-1224a00e9d0f@redhat.com>
 <72c3f342-654d-2fb2-58ac-959ad7f0de37@redhat.com>
 <4b0549a0-a4ec-0fce-a39e-d8c60e5d665d@redhat.com>
 <6f01f8a3-8911-d5fc-9208-7dcac5d1874b@redhat.com>
Message-ID:

>> *committed* belongs to *reserved*
>> *used* is part of *capacity*
>>
>> but free chunks and available do not relate to each other.
> Ok, I can see where you are coming from now. However, while you are
> right that used is part of capacity you can also consider it true that
> free_chunks, available and waste belong to/are part of capacity. Those
> four figures constitute four different components of the total capacity.
> Perhaps something like this might make that clearer?
>
>   (  Metadata:   )
>   (    reserved=22528KB, committed=21504KB)
>   (    capacity=21327KB, used=20654KB)
>   (    free chunks=113KB)
>   (    available=0KB)
>   (    waste = 560KB = 2.6%)

Okay, I see what you mean. But in this case, capacity = committed.

I wonder if it would be cleaner to just report free, used and waste, e.g.

  (  Metadata:   )
  (    reserved=22528KB, committed=21504KB)
  (    used=20654KB)
  (    free=786KB)
  (    waste=64KB =0.30%)

where
  free  = (capacity - used) + free_chunks + available
  waste = committed - capacity - free_chunks - available
  total = committed

Thanks,

-Zhengyu

>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From jiangli.zhou at Oracle.COM Tue Aug 29 19:55:14 2017
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Tue, 29 Aug 2017 12:55:14 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <59A45413.70800@oracle.com>
References: <59A45413.70800@oracle.com>
Message-ID: <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>

Hi Calvin,

These changes look good. I have a few remaining comments below.

- src/share/vm/classfile/classLoader.cpp

The '_num_boot_entries' variable is probably unnecessary. It's initialized to _num_entries and never modified. In places where _num_boot_entries is used, can you use _num_entries directly?

I think we should remove the special boot classpath handling code for dump time. With the use of Java class loaders at CDS/AppCDS dump time, we no longer need to append the -cp path to the boot classpath. With that, we can remove the CDS special cases from ClassLoader::load_class, ClassPathImageEntry::open_stream, etc. That would make the generic class loading code much cleaner. Since you are planning to integrate this change soon, making the suggested change can be risky.
I'll file a new RFE; we can do the cleanup separately.

 150 int ClassLoader::_num_boot_entries = -1;

1519     if (DumpSharedSpaces && classpath_index >= _num_boot_entries) {
1520       // Do not load any class from the app classpath using the boot loader. Let
1521       // the built-in app class laoder load them.
1522       break;
1523     }

1635   if (classpath_index < _num_boot_entries) {
1636     // ik is either:
1637     // 1) a boot class loaded from the runtime image during vm initialization (classpath_index = 0); or
1638     // 2) a user's class from -Xbootclasspath/a (classpath_index > 0)
1639     // In the second case, the classpath_index, classloader_type will be recorded via
1640     // context.record_result() in ClassLoader::load_class(Symbol* name, bool search_append_only, TRAPS).
1641     if (classpath_index > 0) {
1642       return;
1643     }
1644   }

I'm wondering why the following was never needed before in get_package_entry() for the non-CDS case. Do you have additional details?

 253   // PackageEntryTable could be NULL for classes like java/lang/invoke/LambdaForm$MH
 254   if (pkgEntryTable == NULL) {
 255     return NULL;
 256   }

If I understand it correctly, the following is to find if the class is from the runtime image. Can you please change the following to check with module->location()->starts_with("jrt:")?

1610   if ((strcmp(_jrt_entry->name(), src) == 0) ||
1611       (module != NULL && (module->name() != NULL) &&
1612        (strcmp(module->name()->as_C_string(), src) == 0))) {
1613     e = _jrt_entry;
1614     classpath_index = 0;

I was looking for code that handles anonymous classes. There is the following code in classLoader.cpp and metaspaceShared.cpp. However, I can't find any code that specifically removes anonymous classes from the system dictionary at CDS dump time. I'm probably missing something: how do we guarantee (besides the assert) that anonymous classes are not being archived?
1582 void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) {
1583   assert(DumpSharedSpaces, "sanity");
1584   assert(stream != NULL, "sanity");
1585
1586   if (ik->is_anonymous()) {
1587     // We do not archive anonymous classes.
1588     return;
1589   }

 487 NOT_PRODUCT(
 488 static void assert_not_anonymous_class(InstanceKlass* k) {
 489   if (k->is_instance_klass()) {
 490     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
 491   }
 492 }
 494 static void assert_no_anonymoys_classes_in_dictionaries() {
 495   ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class);
 496 })

- src/share/vm/classfile/klassFactory.cpp

The following code can be simplified to get the module from 'ik'.

  74     if (path_index < 0) {
  75       // AppCDSv2 class.
  76       // Get the pkg_entry from the classloader
  77       PackageEntry* pkg_entry = NULL;
  78       TempNewSymbol pkg_name = InstanceKlass::package_from_name(class_name, CHECK_NULL);
  79       if (pkg_name != NULL) {
  80         const char* pkg_string = pkg_name->as_C_string();
  81         ClassLoaderData* loader_data = ClassLoaderData::class_loader_data(class_loader());
  82         if (loader_data != NULL) {
  83           pkg_entry = loader_data->packages()->lookup_only(pkg_name);
  84         }
  85       }
  86       if (pkg_entry != NULL) {
  87         ModuleEntry* mod_entry = pkg_entry->module();

Could you please remove the following from KlassFactory::create_from_stream() and call it from ClassLoaderExt::record_result()?

 229 #if INCLUDE_CDS
 230   if (DumpSharedSpaces) {
 231     ClassLoader::record_shared_class_loader_type(result, stream);
 232   }
 233 #endif

Thanks,
Jiangli

> On Aug 28, 2017, at 10:34 AM, Calvin Cheung wrote:
>
> Hi,
>
> This is a re-post of a previous RFR for 8172218 using the correct bug id.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8186842
>
> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/
>
> Please refer to the comment section of the bug for description of the change.
>
> Tests executed so far:
>     JPRT
>     hs-tier2 through hs-tier4
>     hs-tier5 (linux-x64)
>
> thanks,
> Calvin

From ioi.lam at oracle.com Tue Aug 29 20:44:13 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 29 Aug 2017 13:44:13 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>
References: <59A45413.70800@oracle.com>
 <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>
Message-ID:

On 8/29/17 12:55 PM, Jiangli Zhou wrote:
> I was looking for code that handles anonymous classes. There is the following code in classLoader.cpp and metaspaceShared.cpp. However, I can't find any code that specifically removes anonymous classes from the system dictionary at CDS dump time. I'm probably missing something: how do we guarantee (besides the assert) that anonymous classes are not being archived?
>
> 1582 void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) {
> 1583   assert(DumpSharedSpaces, "sanity");
> 1584   assert(stream != NULL, "sanity");
> 1585
> 1586   if (ik->is_anonymous()) {
> 1587     // We do not archive anonymous classes.
> 1588     return;
> 1589   }
>  487 NOT_PRODUCT(
>  488 static void assert_not_anonymous_class(InstanceKlass* k) {
>  489   if (k->is_instance_klass()) {
>  490     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
>  491   }
>  492 }
>  494 static void assert_no_anonymoys_classes_in_dictionaries() {
>  495   ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class);
>  496 })

Hi Jiangli,

Anonymous classes are not stored inside any Dictionaries. They are created by SystemDictionary::parse_stream() with a non-null host_klass:

// Note: this method is much like resolve_from_stream, but
// does not publish the classes via the SystemDictionary.
// Handles unsafe_DefineAnonymousClass and redefineclasses
// RedefinedClasses do not add to the class hierarchy
InstanceKlass* SystemDictionary::parse_stream(Symbol* class_name,
                                              Handle class_loader,
                                              Handle protection_domain,
                                              ClassFileStream* st,
                                              const InstanceKlass* host_klass,
                                              GrowableArray<Handle>* cp_patches,
                                              TRAPS)

Anonymous classes are referenced in other data structures, such as in this loader_data:

    loader_data = ClassLoaderData::anonymous_class_loader_data(class_loader(), CHECK_NULL);

However, all these places are ignored by CDS during dump time (we only archive classes that are directly stored in the Dictionary objects of the boot/platform/app loaders; see SystemDictionary::combine_shared_dictionaries() in the new code).

Therefore, we can guarantee that no anonymous classes are archived. The assert_no_anonymous_classes_in_dictionaries() just ensures this is true (i.e., in future VM development, if people start putting anonymous classes in the Dictionary objects of the boot/platform/app loaders, this assert would catch it).

Thanks
- Ioi

From mikhailo.seledtsov at oracle.com Tue Aug 29 21:53:59 2017
From: mikhailo.seledtsov at oracle.com (mikhailo)
Date: Tue, 29 Aug 2017 14:53:59 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <59A45413.70800@oracle.com>
References: <59A45413.70800@oracle.com>
Message-ID: <590ac776-a630-944c-5673-9c3b12d15624@oracle.com>

Hi Calvin,

I have reviewed the test portion of the change; it looks good.

Thank you,
Misha

On 08/28/2017 10:34 AM, Calvin Cheung wrote:
> Hi,
>
> This is a re-post of a previous RFR for 8172218 using the correct bug id.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8186842
>
> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/
>
> Please refer to the comment section of the bug for description of the change.
>
> Tests executed so far:
>     JPRT
>     hs-tier2 through hs-tier4
>     hs-tier5 (linux-x64)
>
> thanks,
> Calvin

From jiangli.zhou at oracle.com Tue Aug 29 22:48:43 2017
From: jiangli.zhou at oracle.com (Jiangli Zhou)
Date: Tue, 29 Aug 2017 15:48:43 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To:
References: <59A45413.70800@oracle.com>
 <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>
Message-ID: <41671ADF-7BBF-4121-9315-14FE98E00838@oracle.com>

> On Aug 29, 2017, at 1:44 PM, Ioi Lam wrote:
>
>
>
> On 8/29/17 12:55 PM, Jiangli Zhou wrote:
>> I was looking for code that handles anonymous classes. There is the following code in classLoader.cpp and metaspaceShared.cpp. However, I can't find any code that specifically removes anonymous classes from the system dictionary at CDS dump time. I'm probably missing something: how do we guarantee (besides the assert) that anonymous classes are not being archived?
>>
>> 1582 void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) {
>> 1583   assert(DumpSharedSpaces, "sanity");
>> 1584   assert(stream != NULL, "sanity");
>> 1585
>> 1586   if (ik->is_anonymous()) {
>> 1587     // We do not archive anonymous classes.
>> 1588     return;
>> 1589   }
>>  487 NOT_PRODUCT(
>>  488 static void assert_not_anonymous_class(InstanceKlass* k) {
>>  489   if (k->is_instance_klass()) {
>>  490     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
>>  491   }
>>  492 }
>>  494 static void assert_no_anonymoys_classes_in_dictionaries() {
>>  495   ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class);
>>  496 })
> Hi Jiangli,
>
> Anonymous classes are not stored inside any Dictionaries.
> They are created by SystemDictionary::parse_stream() with a non-null host_klass:
>
> // Note: this method is much like resolve_from_stream, but
> // does not publish the classes via the SystemDictionary.
> // Handles unsafe_DefineAnonymousClass and redefineclasses
> // RedefinedClasses do not add to the class hierarchy
> InstanceKlass* SystemDictionary::parse_stream(Symbol* class_name,
>                                               Handle class_loader,
>                                               Handle protection_domain,
>                                               ClassFileStream* st,
>                                               const InstanceKlass* host_klass,
>                                               GrowableArray<Handle>* cp_patches,
>                                               TRAPS)
>
> Anonymous classes are referenced in other data structures, such as in this loader_data:
>
>     loader_data = ClassLoaderData::anonymous_class_loader_data(class_loader(), CHECK_NULL);
>
> However, all these places are ignored by CDS during dump time (we only archive classes that are directly stored in the Dictionary objects of the boot/platform/app loaders; see SystemDictionary::combine_shared_dictionaries() in the new code).
>
> Therefore, we can guarantee that no anonymous classes are archived. The assert_no_anonymous_classes_in_dictionaries() just ensures this is true (i.e., in future VM development, if people start putting anonymous classes in the Dictionary objects of the boot/platform/app loaders, this assert would catch it).

That's exactly what I was looking for! I suspected that must be the case but didn't find the related code. Thanks for the details.

Calvin, could you please add comments above assert_no_anonymoys_classes_in_dictionaries() explaining that anonymous classes are not stored in the system dictionary?
Thanks,
Jiangli

>
> Thanks
> - Ioi
>
>

From coleen.phillimore at oracle.com Wed Aug 30 00:18:19 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 29 Aug 2017 20:18:19 -0400
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <59A45413.70800@oracle.com>
References: <59A45413.70800@oracle.com>
Message-ID:

http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoader.cpp.udiff.html

Can you put some comment like "Find if the class is from the runtime image" above this? I couldn't guess reading this, so I used Jiangli's review as a hint.

+ if ((strcmp(_jrt_entry->name(), src) == 0) ||
+     (module != NULL && (module->name() != NULL) &&
+      (strcmp(module->name()->as_C_string(), src) == 0))) {
+   e = _jrt_entry;

Can you change this:

+ if (!get_canonical_path(e->name(), canonical_path, JVM_MAXPATHLEN)) {
+   continue;
+ }
+ if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {
+   break;
+ }
+ classpath_index ++;

To:

+ if (get_canonical_path(e->name(), canonical_path, JVM_MAXPATHLEN)) {
+   if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {
+     break;
+   }
+   classpath_index ++;
+ }

So the confusing "continue" goes away?

+ const char* const class_name = ik->name()->as_C_string();

I think you need another ResourceMark here.

+ ClassLoaderExt::Context context(class_name, file_name, THREAD);
+ context.record_result(ik->name(), e, classpath_index, ik, THREAD); // this is a tail call so doesn't need CATCH or CHECK

Could these throw exceptions when you don't expect them to? Or do they just need a thread argument? If the former, change THREAD to CATCH.

+ #endif

Can you add a comment saying which #if this #endif closes, since it's far away from the #if?
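As a rough illustration of the "continue" restructuring requested above, the sketch below shows both loop shapes side by side on a toy search. It is not the classLoader.cpp code: `canonicalize`, `find_index_with_continue` and `find_index_nested` are hypothetical names, `std::string` stands in for the path buffers, and it assumes the loop advance happens in the loop header (if the advance were at the bottom of a while loop body, `continue` would skip it and the two forms would differ).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for get_canonical_path(): fails on an empty name.
static bool canonicalize(const std::string& name, std::string& out) {
  if (name.empty()) return false;
  out = name;
  return true;
}

// Original shape: an early 'continue' skips entries that cannot be canonicalized.
int find_index_with_continue(const std::vector<std::string>& entries,
                             const std::string& src) {
  int index = 0;
  for (const std::string& e : entries) {
    std::string canonical;
    if (!canonicalize(e, canonical)) {
      continue;  // entry skipped, index not incremented
    }
    if (canonical == src) {
      break;
    }
    index++;
  }
  return index;
}

// Suggested shape: the same logic with the test inverted and nested,
// so no 'continue' is needed.
int find_index_nested(const std::vector<std::string>& entries,
                      const std::string& src) {
  int index = 0;
  for (const std::string& e : entries) {
    std::string canonical;
    if (canonicalize(e, canonical)) {
      if (canonical == src) {
        break;
      }
      index++;
    }
  }
  return index;
}
```

Both functions count only the entries that canonicalize successfully before the match, so under the stated assumption the change is purely a readability one.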
http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoaderExt.hpp.udiff.html

+ oop h_loader = result->class_loader();

Nit, can you remove h_ from the name since it's not a Handle.

http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/klassFactory.cpp.udiff.html

+ ClassLoaderData* loader_data = ClassLoaderData::class_loader_data(class_loader());
+ if (loader_data != NULL) {
+   pkg_entry = loader_data->packages()->lookup_only(pkg_name);
+ }

The ClassLoaderData should never be null at this point, and why would it be different from the one you fetched above? I think it would not be legal to change the class loader with CFLH, and the original loader_data is used below, so this should be the same one.

I think 12 or more inserted lines should be a new static function above, that's called here, like

   const char* pathname = get_package_name(loader_data, class_name, path_index, CHECK);

http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/systemDictionary.cpp.udiff.html

The combining of entries looks good. I think it needs a comment that it's only done during dump time (or an assert).

+ Dictionary* master_dictionary = ClassLoaderData::the_null_class_loader_data()->dictionary();

It's been bothering me that the shared dictionary at dump time is the NULL_CLD one. With the combining, I think the dictionary at dump time should be shared_dictionary(). Can this be a follow-on RFE to clean this up to use shared_dictionary()? I think this change enables that.

Do you have to free the initiating entries? Can you leave them around?
http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/memory/metaspaceShared.cpp.udiff.html

+ NOT_PRODUCT(
+ static void assert_not_anonymous_class(InstanceKlass* k) {
+   if (k->is_instance_klass()) {
+     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
+   }
+ }

You don't have to ask if k->is_instance_klass() since it passes in an InstanceKlass.

It surprises me that there are no anonymous classes loaded (I'll read Ioi's reply later). I don't know if that will remain the case though.

+ tty->print_cr("Preload Warning: Cannot find %s", parser.current_class_name());

You should have an RFE to use log_warning() instead of tty->print_cr for all the CDS messages.

http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/arrayKlass.cpp.udiff.html

Because we use ClassLoaderDataGraph::classes_do() I thought all the array dimension Klasses are walked and this isn't needed.

http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/constantPool.cpp.udiff.html

Why are you unresolving Klasses? I thought that was a good thing for performance. Can you add a comment why? There's some leftover #if0 code.

I've completed my review and these are only minor comments and questions. I might need to see an incremental or new review depending on how much you change. It looks good.

Thanks,
Coleen

On 8/28/17 1:34 PM, Calvin Cheung wrote:
> Hi,
>
> This is a re-post of a previous RFR for 8172218 using the correct bug id.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8186842
>
> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/
>
> Please refer to the comment section of the bug for description of the change.
>
> Tests executed so far:
>     JPRT
>     hs-tier2 through hs-tier4
>     hs-tier5 (linux-x64)
>
> thanks,
> Calvin

From ioi.lam at oracle.com Wed Aug 30 05:00:49 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 29 Aug 2017 22:00:49 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To:
References: <59A45413.70800@oracle.com>
Message-ID:

Hi Calvin, there's one more place where we can get rid of 'continue':

void ConstantPool::archive_resolved_references(Thread* THREAD) {
    ....
    for (int i = 0; i < rr_len; i++) {
      oop p = rr->obj_at(i);
      if (p != NULL && i < ref_map_len) {
+       rr->obj_at_put(i, NULL);
        int index = object_to_cp_index(i);
        // Skip the entry if the string hash code is 0 since the string
        // is not included in the shared string_table, see StringTable::copy_shared_string.
        if (tag_at(index).is_string() && java_lang_String::hash_code(p) != 0) {
          oop op = StringTable::create_archived_string(p, THREAD);
          // If the String object is not archived (possibly too large),
          // NULL is returned. Also set it in the array, so we won't
          // have a 'bad' reference in the archived resolved_reference
          // array.
          rr->obj_at_put(i, op);
-         continue;
        }
      }
-     rr->obj_at_put(i, NULL);
    }

Thanks
- Ioi

From coleen.phillimore at oracle.com Wed Aug 30 11:14:40 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 30 Aug 2017 07:14:40 -0400
Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp
In-Reply-To: <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com>
References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com>
 <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com>
 <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com>
 <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com>
 <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com>
 <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com>
 <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com>
Message-ID:

Hi, I changed the edit for David to only use ordering semantics in the places where needed in the lock-free access to pd_set. Since only contains_protection_domain reads it lock-free, it should be ok.

open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8164207

Thanks,
Coleen

On 8/29/17 2:28 AM, David Holmes wrote:
> Hi Coleen,
>
> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Here is the third webrev with the names of pd_set and set_pd_set
>>> renamed to pd_set_acquire and release_set_pd_set.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207
>
> This API should also be renamed:
>
> !   ProtectionDomainEntry* pd_set() const            { return _inner.pd_set_acquire(); }
> !   void set_pd_set(ProtectionDomainEntry* new_head) { _inner.release_set_pd_set(new_head); }
>
> These are the ones that need to give visibility to the fact we're
> accessing things lock-free (if indeed we are).
>
> More below ...
>
>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote:
>>>> On 8/28/17 12:25 AM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Thank you Zhengyu for noticing this change was wrong, and
>>>>>> Christian for the idea. New webrev:
>>>>>>
>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207
>>>>>
>>>>> The idea of a load-acquire accessor and release_store-setter is
>>>>> fine in principle, but it seems to me that we now use these
>>>>> everywhere, even if we may not need them because there is no
>>>>> concurrent/lock-free access. Overall I find it very difficult to
>>>>> determine what the concurrent access patterns are for a Dictionary
>>>>> versus a DictionaryEntry, and which paths are in fact lock and/or
>>>>> safepoint free, and may be racing with locked or safepointed code.
>>>>
>>>> That's exactly the point of making them accessors. So one doesn't
>>>> have to visit each individual call site and spend time answering
>>>> the question for each case. And probably getting it wrong. The
>>>> performance delta for these accesses is minimal since it's only
>>>> getting the head of the list, not each element.
>>>>
>>>> Then it's also future proof so that if a lock is removed, then we
>>>> don't miss one of the accessors at a later time. Note that
>>>> observing bugs caused by this is very difficult to do, and can only
>>>> be done by inspection. That's why I erred on the side of safety
>>>> and consistency.
>
> Sorry, it may sound strange to say that I don't agree with "erring on
> the side of safety and consistency" but I do not agree with just using
> acquire/release semantics everywhere just in case! If we don't know
> the lock-free paths then how can we possibly know things are correct?
> The whole point of these accessors is to make it obvious where the
> lock-free accesses are.
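A minimal sketch of the accessor naming discussed above, for readers who want to see the acquire/release pairing spelled out. This is not the dictionary.cpp code: it uses C++11 `std::atomic` as a stand-in for HotSpot's OrderAccess primitives, and `Entry`, `EntryList`, `head_acquire`, `release_set_head`, `add` and `contains` are hypothetical names. Writers are assumed to be serialized by an external lock that is not shown; only `contains` is meant to run lock-free.

```cpp
#include <atomic>

// Hypothetical stand-in for a list node such as ProtectionDomainEntry.
struct Entry {
  int    value;
  Entry* next;
};

// Hypothetical holder. The head pointer is private, so every access has to
// go through an accessor whose name states the memory ordering it relies on.
class EntryList {
 private:
  std::atomic<Entry*> _head{nullptr};

 public:
  ~EntryList() {
    Entry* cur = _head.load(std::memory_order_relaxed);
    while (cur != nullptr) {
      Entry* next = cur->next;
      delete cur;
      cur = next;
    }
  }

  // Acquire load: pairs with the release store below, so a reader that
  // observes a node also observes that node's initialized fields.
  Entry* head_acquire() const {
    return _head.load(std::memory_order_acquire);
  }

  // Release store: publishes a fully constructed node to lock-free readers.
  void release_set_head(Entry* new_head) {
    _head.store(new_head, std::memory_order_release);
  }

  // Writer path; assumed to be called with an external lock held, so a
  // relaxed read of the current head is sufficient here.
  void add(int value) {
    Entry* e = new Entry{value, _head.load(std::memory_order_relaxed)};
    release_set_head(e);
  }

  // Lock-free reader path.
  bool contains(int value) const {
    for (Entry* cur = head_acquire(); cur != nullptr; cur = cur->next) {
      if (cur->value == value) return true;
    }
    return false;
  }
};
```

Keeping the raw field private and naming the accessors after their ordering is one way to make the lock-free read site visible at a glance, which is the property the review comments are asking for.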
>
>>>>>
>>>>> That aside I don't understand why you added a level of indirection
>>>>> with the ProtectionDomainSet class?
>>>>
>>>> Only the code is a level of indirection not the access. That is to
>>>> avoid what I said above. See Christian's and Zhengyu's comments.
>
> Okay - I see what you did but I would not expect to have to protect
> _pd_set from direct use within its own class - anyone messing with
> that class should be aware of the need to use the accessors. Though I
> suppose this encapsulation is little different to defining the field
> as some kind of "Atomic" type rather than a "raw" type.
>
> Thanks,
> David
> -----
>
>>>>>
>>>>> Also we have been trying to include release/acquire in the names
>>>>> of such accessors so that it is clear when we are relying on
>>>>> memory ordering properties, i.e. pd_set_acquire and release_set_pd_set
>>>>>
>>>>
>>>> I will change the names of these functions.
>>>>
>>>> thanks,
>>>> Coleen
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>
>>>>>> I reran parallel class loading tests and jck testing is in
>>>>>> progress, but order access requires inspection.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote:
>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote:
>>>>>>>>>> Hi Coleen,
>>>>>>>>>>
>>>>>>>>>> There are two instances probably overlooked?
>>>>>>>>>>
>>>>>>>>>> dictionary.cpp #103 and #124
>>>>>>>>>>
>>>>>>>>>>     for (ProtectionDomainEntry* current = _pd_set;
>>>>>>>>>> =>
>>>>>>>>>>     for (ProtectionDomainEntry* current = pd_set();
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and
>>>>>>>>> retest.
>>>>>>>> Which also shows that there is a potential for future mistakes.
>>>>>>>> Can we isolate the field better so it?s only accessible via >>>>>>>> setter and getter? >>>>>>> >>>>>>> Yes, great idea. >>>>>>> Coleen >>>>>>> >>>>>>>>> Thank you!! >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> -Zhengyu >>>>>>>>>> >>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>> >>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>> tier1 tests. See bug for details. >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>> >>>>>> >>>> >>> >> From adinn at redhat.com Wed Aug 30 11:36:27 2017 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 30 Aug 2017 12:36:27 +0100 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: <2f5a0a19-7a7a-436d-1297-28c5db386989@redhat.com> On 30/08/17 12:14, coleen.phillimore at oracle.com wrote: > > Hi, I changed the edit for David to only use ordering semantics in the > places where needed in the lock free access to pd_set.? Since only > contains_protection_domain is read lock free, it should be ok. > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 Looks good to me! 
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From coleen.phillimore at oracle.com Wed Aug 30 11:38:55 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 30 Aug 2017 07:38:55 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <2f5a0a19-7a7a-436d-1297-28c5db386989@redhat.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> <2f5a0a19-7a7a-436d-1297-28c5db386989@redhat.com> Message-ID: <301355fa-16d6-30cf-b2b2-0ac1fd332fc2@oracle.com> On 8/30/17 7:36 AM, Andrew Dinn wrote: > On 30/08/17 12:14, coleen.phillimore at oracle.com wrote: >> Hi, I changed the edit for David to only use ordering semantics in the >> places where needed in the lock free access to pd_set.? Since only >> contains_protection_domain is read lock free, it should be ok. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > > Looks good to me! Thanks, Andrew! Coleen > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From david.holmes at oracle.com Wed Aug 30 11:54:34 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Aug 2017 21:54:34 +1000 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: On 30/08/2017 9:14 PM, coleen.phillimore at oracle.com wrote: > > Hi, I changed the edit for David to only use ordering semantics in the > places where needed in the lock free access to pd_set.? Since only > contains_protection_domain is read lock free, it should be ok. > > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 Thanks Coleen. This looks good to me. David ----- > Thanks, > Coleen > > On 8/29/17 2:28 AM, David Holmes wrote: >> Hi Coleen, >> >> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Here is the third webrev with the names of pd_set and set_pd_set >>>> renamed to pd_set_acquire and release_set_pd_set. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> This API should also be renamed: >> >> !?? ProtectionDomainEntry* pd_set() const??????????? { return >> _inner.pd_set_acquire(); } >> !?? 
void set_pd_set(ProtectionDomainEntry* new_head) { >> _inner.release_set_pd_set(new_head); } >> >> These are the ones that need to give visibility to the fact we're >> accessing things lock-free (if indeed we are). >> >> More below ... >> >>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Thank you Zhengyu for noticing this change was wrong, and >>>>>>> Christian for the idea.?? New webrev: >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> The idea of a load-acquire accessor and release_store-setter is >>>>>> fine in principal, but it seems to me that we now use these >>>>>> everywhere, even if we may not need them because there is no >>>>>> concurrent/lock-free access. Overall I find it very difficult to >>>>>> determine what the concurrent access patterns are for a Dictionary >>>>>> versus a DictionaryEntry, and which paths are in fact lock and/or >>>>>> safepoint free, and may be racing with locked or safepointed code. ?? >>>>> >>>>> That's exactly the point of making them accessors.? So one doesn't >>>>> have to visit each individual call site and spend time answering >>>>> the question for each case.? And probably getting it wrong.?? The >>>>> performance delta for these accesses is minimal since it's only >>>>> getting the head of the list, not each element. >>>>> >>>>> Then it's also future proof so that if a lock is removed, then we >>>>> don't miss one of the accessors at a later time. Note that >>>>> observing bugs caused by this is very difficult to do, and can only >>>>> be done by inspection.?? That's why I erred on the side of safety >>>>> and consistency. 
>> >> Sorry, it may sound strange to say that I don't agree with "erring on >> the side of safety and consistency" but I do not agree with just using >> acquire/release semantics everywhere just in case! If we don't know >> the lock-free paths then how can we possibly know things are correct. >> The whole point of these accessors is to make it obvious where the >> lock-free accesses are. >> >>>>>> >>>>>> That aside I don't understand why you added a level of indirection >>>>>> with the ProtectionDomainSet class? >>>>> >>>>> Only the code is a level of indirection not the access. That is to >>>>> avoid what I said above.? See Christian's and Zhengyu's comments. >> >> Okay - I see what you did but I would not expect to have to protect >> _pd_set from direct use within its own class - anyone messing with >> that class should be aware of the need to use the accessors. Though I >> suppose this encapsulation is little different to defining the field >> as some kind of "Atomic" type rather than a "raw" type. >> >> Thanks, >> David >> ----- >> >>>>>> >>>>>> Also we have been trying to include release/acquire in the names >>>>>> of such accessors so that it is clear when we are relying on >>>>>> memory ordering properties ie. pd_set_acquire and release_set_pd_set >>>>>> >>>>> >>>>> I will change the names of these functions. >>>>> >>>>> thanks, >>>>> Coleen >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>>> I reran parallel class loading tests and jck testing is in >>>>>>> progress, but order access requires inspection. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> There are two instances probably overlooked? 
>>>>>>>>>>> >>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>> >>>>>>>>>>> ??? for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>> => >>>>>>>>>>> ??? for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Oh yeah, you're right.? That's embarrasing.?? I'll fix and >>>>>>>>>> retest. >>>>>>>>> Which also shows that there is a potential for future mistakes. >>>>>>>>> Can we isolate the field better so it?s only accessible via >>>>>>>>> setter and getter? >>>>>>>> >>>>>>>> Yes, great idea. >>>>>>>> Coleen >>>>>>>> >>>>>>>>>> Thank you!! >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> -Zhengyu >>>>>>>>>>> >>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>>> >>>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>>> tier1 tests. See bug for details. >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >>> > From thomas.stuefe at gmail.com Wed Aug 30 12:33:52 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 30 Aug 2017 14:33:52 +0200 Subject: RFR(m): 8185712: [windows] Improve native symbol decoder Message-ID: Hi all, May I please have reviews for the following change. Issue: https://bugs.openjdk.java.net/browse/JDK-8185712 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01/webrev/ (This is the followup to: https://bugs.openjdk.java.net/browse/JDK-8186349) ------------- Basically, this is a reimplementation of the layer around the Windows Symbol API (the API used to resolve debug symbols). 
The old implementation had a number of errors and shortcomings which together caused the Windows native symbol resolution (and hence callstacks in error logs) to be a bit of a lottery. The aim of this reimplementation is to make the code more robust and easier to maintain.

The problems with the existing implementation are listed in detail in the bug description.

The new implementation:

- Uses the new centralized WindowsDbgHelper class, which wraps the dbghelp.dll loading, introduced with JDK-8186349.

- Completely bypasses the "create two instances of AbstractDecoder class and synchronize access to them" scheme in decoder.cpp. That scheme does not make sense for Windows, where we have to synchronize each access to the dbghelp.dll anyway; this is done one layer below in WindowsDbgHelper. The static methods of the shared Decoder class now directly access the static methods in the new SymbolEngine class, see decoder_windows.cpp.

- The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp files. The code takes care of properly initializing (once) the symbol API and of assembling the pdb search path.

- Pdb search path construction is changed: where before we just added the jdk and jvm bin directories, we now add the directories of all loaded DLLs (which, of course, include the jdk and jvm bin directories). That way we have a high chance of catching pdb files of third party libraries, as long as they follow the convention of putting the pdb files beside the DLLs. This makes it easier to analyse crashes where third party DLLs are involved.

- On Windows, we now have source file and line number in the callstack.

- There is a new parameter, diagnostic and Windows-only, called "InitializeDbgHelpEarly". It is off by default. If on, it causes the symbol engine to be initialized early, which increases the chance of good callstacks later on (because the initialization does not have to run in an error situation).

- Added tests: gtests and a jtreg test which tests the callstack printing. All tests are Windows-only. There is no technical reason for making them Windows-only, but I wanted to keep disturbances to other platforms to a minimum and these kinds of tests can be shaky.

Thanks a lot for reviewing this!

Kind Regards, Thomas

From thomas.stuefe at gmail.com Wed Aug 30 12:35:31 2017
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 30 Aug 2017 14:35:31 +0200
Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
In-Reply-To:
References:
Message-ID:

P.S. As usual, I built slowdebug and release on x86 and x64. I ran gtests on these platforms and jtreg tests from "hotspot/runtime/ErrorHandling".

On Wed, Aug 30, 2017 at 2:33 PM, Thomas Stüfe wrote:

> Hi all,
>
> May I please have reviews for the following change.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712
> Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.01/webrev/
>
> (This is the followup to: https://bugs.openjdk.java.net/browse/JDK-8186349)
>
> -------------
>
> Basically, this is a reimplementation of the layer around the Windows Symbol API (the API used to resolve debug symbols). The old implementation had a number of errors and shortcomings which together caused the Windows native symbol resolution (and hence callstacks in error logs) to be a bit of a lottery. The aim of this reimplementation is to make the code more robust and easier to maintain.
>
> The problems with the existing implementation are listed in detail in the bug description.
>
> The new implementation:
>
> - uses the new centralized WindowsDbgHelper class, which wraps the dbghelp.dll loading, introduced with JDK-8186349
>
> - Completely bypasses the "create two instances of AbstractDecoder class and synchronize access to them" scheme in decoder.cpp.
It does not make > sense for windows, where we have to synchronize each access to the > dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper. The > static methods of the shared Decoder class now directly access the static > methods in the new SymbolEngine class, see decoder_windows.cpp. > > - The layer wrapping the Symbol API lives in the new symbolengine.cpp/hpp > files. The coding takes care of properly initializing (once) the symbol API > and of assembling the pdb search path. > > - Pdb search path construction is changed: where before we just added jdk > and jvm bin directories, we now just add all directories of all loaded DLLs > (which, of course, include the jdk and jvm bin directories). That way we > have a high chance of catching pdb files of third party libraries, as long > as they follow the convention of putting the pdb files beside the dlls. > This means it is easier to analyse crashes where third party DLLs are > involved. > > - On Windows, we now have source file and line number in the callstack. > > - There is a new parameter, diagnostic and windows-only, called "InitializeDbgHelpEarly". > That parameter is by default off. If on, it causes the symbol engine to be > initialized early, which increases the chance of good callstacks later on > (because the initialization does not have to run in an error situation). > > - Added tests: gtests and a jtreg test which tests the callstack printing. > All tests windows only. There is no technical reason for making them > windows only, but I wanted to keep disturbances to other platforms to a > minimum and these kind of tests can be shaky. > > Thanks a lot for reviewing this! 
> > Kind Regards, Thomas > > > > > From coleen.phillimore at oracle.com Wed Aug 30 13:49:55 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 30 Aug 2017 09:49:55 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: <891e08bd-5496-55ee-923f-811c65f65d96@oracle.com> On 8/30/17 7:54 AM, David Holmes wrote: > On 30/08/2017 9:14 PM, coleen.phillimore at oracle.com wrote: >> >> Hi, I changed the edit for David to only use ordering semantics in >> the places where needed in the lock free access to pd_set. Since only >> contains_protection_domain is read lock free, it should be ok. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > > Thanks Coleen. This looks good to me. Thanks, David. Coleen > > David > ----- > >> Thanks, >> Coleen >> >> On 8/29/17 2:28 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Here is the third webrev with the names of pd_set and set_pd_set >>>>> renamed to pd_set_acquire and release_set_pd_set. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>> >>> This API should also be renamed: >>> >>> !?? ProtectionDomainEntry* pd_set() const??????????? { return >>> _inner.pd_set_acquire(); } >>> !?? 
void set_pd_set(ProtectionDomainEntry* new_head) { >>> _inner.release_set_pd_set(new_head); } >>> >>> These are the ones that need to give visibility to the fact we're >>> accessing things lock-free (if indeed we are). >>> >>> More below ... >>> >>>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >>>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> Thank you Zhengyu for noticing this change was wrong, and >>>>>>>> Christian for the idea.?? New webrev: >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> The idea of a load-acquire accessor and release_store-setter is >>>>>>> fine in principal, but it seems to me that we now use these >>>>>>> everywhere, even if we may not need them because there is no >>>>>>> concurrent/lock-free access. Overall I find it very difficult to >>>>>>> determine what the concurrent access patterns are for a >>>>>>> Dictionary versus a DictionaryEntry, and which paths are in fact >>>>>>> lock and/or safepoint free, and may be racing with locked or >>>>>>> safepointed code. ?? >>>>>> >>>>>> That's exactly the point of making them accessors.? So one >>>>>> doesn't have to visit each individual call site and spend time >>>>>> answering the question for each case.? And probably getting it >>>>>> wrong.?? The performance delta for these accesses is minimal >>>>>> since it's only getting the head of the list, not each element. >>>>>> >>>>>> Then it's also future proof so that if a lock is removed, then we >>>>>> don't miss one of the accessors at a later time. Note that >>>>>> observing bugs caused by this is very difficult to do, and can >>>>>> only be done by inspection.?? That's why I erred on the side of >>>>>> safety and consistency. 
>>> >>> Sorry, it may sound strange to say that I don't agree with "erring >>> on the side of safety and consistency" but I do not agree with just >>> using acquire/release semantics everywhere just in case! If we don't >>> know the lock-free paths then how can we possibly know things are >>> correct. The whole point of these accessors is to make it obvious >>> where the lock-free accesses are. >>> >>>>>>> >>>>>>> That aside I don't understand why you added a level of >>>>>>> indirection with the ProtectionDomainSet class? >>>>>> >>>>>> Only the code is a level of indirection not the access. That is >>>>>> to avoid what I said above.? See Christian's and Zhengyu's comments. >>> >>> Okay - I see what you did but I would not expect to have to protect >>> _pd_set from direct use within its own class - anyone messing with >>> that class should be aware of the need to use the accessors. Though >>> I suppose this encapsulation is little different to defining the >>> field as some kind of "Atomic" type rather than a "raw" type. >>> >>> Thanks, >>> David >>> ----- >>> >>>>>>> >>>>>>> Also we have been trying to include release/acquire in the names >>>>>>> of such accessors so that it is clear when we are relying on >>>>>>> memory ordering properties ie. pd_set_acquire and >>>>>>> release_set_pd_set >>>>>>> >>>>>> >>>>>> I will change the names of these functions. >>>>>> >>>>>> thanks, >>>>>> Coleen >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> I reran parallel class loading tests and jck testing is in >>>>>>>> progress, but order access requires inspection. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> There are two instances probably overlooked? >>>>>>>>>>>> >>>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>>> >>>>>>>>>>>> ??? for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>>> => >>>>>>>>>>>> ??? for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> Oh yeah, you're right.? That's embarrasing. I'll fix and >>>>>>>>>>> retest. >>>>>>>>>> Which also shows that there is a potential for future >>>>>>>>>> mistakes. Can we isolate the field better so it?s only >>>>>>>>>> accessible via setter and getter? >>>>>>>>> >>>>>>>>> Yes, great idea. >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>>> Thank you!! >>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> -Zhengyu >>>>>>>>>>>> >>>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>>>> >>>>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>>>> tier1 tests. See bug for details. 
>>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> From rkennke at redhat.com Wed Aug 30 10:14:02 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 30 Aug 2017 12:14:02 +0200 Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark In-Reply-To: <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com> References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com> <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com> Message-ID: <9480F78C-F0FB-4F26-B272-E7522B8A92F5@redhat.com> It sounds to me like LA/RL would be required and sufficient on _state ? Roman Am 29. August 2017 16:50:03 MESZ schrieb Robbin Ehn : >Hi Roman thanks for having a look, > >On 08/29/2017 03:00 PM, Roman Kennke wrote: >> Hi Robin, >> >> I doubt that we can assume a symmetry between loadload and storestore >like there is with load-acquire and release-store. This doesn't seem >right. In my experience loadload >> and storestore are rather special purpose: loadload ensures ordering >between otherwise unrelated loads and storestore likewise with stores. 
>
>This exactly why I add loadload, to stop reordering of unrelated loads:
>
>#######################
>The original code did:
>
>//nmethod::make_not_entrant_or_zombie
>store _stack_traversal_mark
>storestore
>store _state
>
>//NMethodSweeper::process_compiled_method
>load _state
>load _stack_traversal_mark // this is a non-volatile load, can be reordered by both gcc and hardware
>
>#######################
>Adding la/sr + volatile to _stack_traversal_mark:
>
>store _stack_traversal_mark release // release not needed, we have a following storestore for the unrelated stores
>storestore
>store _state
>
>load _state
>load _stack_traversal_mark acquire // acquire not needed since we already loaded _state and any following writes/reads will be done after we have taken a Mutex.
>
>#######################
>So therefore my conclusion was that, in this particular case:
>
>store _stack_traversal_mark
>storestore
>store _state
>
>load _state
>loadload
>load _stack_traversal_mark
>
>would be correct, agree?
>
>And as I said I have created another jira issue for the concerns me, you and David share.
>
>Thanks Robbin
>
>>
>> And even symmetric use of load-acquire and release-store are often done wrong: those are not meant to protect concurrent access to the field, but to the stuff that is
>> protected by the field access (think locks), i.e. what happens between the LA and RS. At least that is my understanding.
>>
>> I suggest to do what David said and try to understand what concurrent accesses to which fields we have, and which fences are actually needed to ensure correct ordering.
>>
>> And thanks for revisiting this!
>>
>> Cheers, Roman
>>
>> On 29 August 2017 at 12:31:17 CEST, Robbin Ehn wrote:
>>
>> Hi please review,
>>
>> The issue 8180932 - "Parallelize safepoint cleanup" changed _stack_traversal_mark to load acquire/store release, this is at least half wrong.
>> Instead for simplicity the write side storestore fence should be matched with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile).
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8186837
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/
>>
>> It's not clear in this code if there are other concurrent dependent read/writes.
>> Is it true that only when reading/writing _state and _stack_traversal_mark proper memory ordering is needed?
>> To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839
>>
>> Thanks Robbin
>>
>>
>> --
>> This message was sent from my Android device with K-9 Mail.

--
This message was sent from my Android device with K-9 Mail.

From robbin.ehn at oracle.com Wed Aug 30 14:50:26 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 30 Aug 2017 16:50:26 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To: <9480F78C-F0FB-4F26-B272-E7522B8A92F5@redhat.com>
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com> <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com> <9480F78C-F0FB-4F26-B272-E7522B8A92F5@redhat.com>
Message-ID:

Hi Roman,

On 08/30/2017 12:14 PM, Roman Kennke wrote:
> It sounds to me like LA/RL would be required and sufficient on _state ?

Yes, but:

- Using la/rs just sometimes can be confusing. Changing to get/set with la/rs semantics also means we need to change assembly to use proper memory ordering, e.g. lda instead of ldr on aarch64 and armv7 add e.g. ldr, dmb, etc... (I asked the compiler team; their thinking was that dirty reads are most likely okay in all other cases since _state only goes 'up'.)

- Using la/rs in just this case means either duplicating methods with acquire semantics or adding a bool to a lot of methods, since these are accessed in deep call hierarchies.

- The write side is using storestore today.

For those reasons I said:

> Instead for simplicity the write side storestore fence should be matched with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile).

I'm well aware this is nowhere near perfect, but we need that loadload fence, so can you live with my proposed change?
And can we discuss the bigger picture in 8186839?

Thanks, Robbin

>
> Roman
>
> On 29 August 2017 at 16:50:03 CEST, Robbin Ehn wrote:
>
> Hi Roman thanks for having a look,
>
> On 08/29/2017 03:00 PM, Roman Kennke wrote:
>
> Hi Robin,
>
> I doubt that we can assume a symmetry between loadload and storestore like there is with load-acquire and release-store. This doesn't seem right. In my experience loadload and storestore are rather special purpose: loadload ensures ordering between otherwise unrelated loads and storestore likewise with stores.
>
>
> This exactly why I add loadload, to stop reordering of unrelated loads:
>
> #######################
> The original code did:
>
> //nmethod::make_not_entrant_or_zombie
> store _stack_traversal_mark
> storestore
> store _state
>
> //NMethodSweeper::process_compiled_method
> load _state
> load _stack_traversal_mark // this is a non-volatile load, can be reordered by both gcc and hardware
>
> #######################
> Adding la/sr + volatile to _stack_traversal_mark:
>
> store _stack_traversal_mark release // release not needed, we have a following storestore for the unrelated stores
> storestore
> store _state
>
> load _state
> load _stack_traversal_mark acquire // acquire not needed since we already loaded _state and any following writes/reads will be done after we have taken a Mutex.
>
> #######################
> So therefore my conclusion was that, in this particular case:
>
> store _stack_traversal_mark
> storestore
> store _state
>
> load _state
> loadload
> load _stack_traversal_mark
>
> would be correct, agree?
>
> And as I said I have created another jira issue for the concerns me, you and David share.
>
> Thanks Robbin
>
>
> And even symmetric use of load-acquire and release-store are often done wrong: those are not meant to protect concurrent access to the field, but to the stuff that is protected by the field access (think locks), i.e. what happens between the LA and RS. At least that is my understanding.
>
> I suggest to do what David said and try to understand what concurrent accesses to which fields we have, and which fences are actually needed to ensure correct ordering.
>
> And thanks for revisiting this!
>
> Cheers, Roman
>
> On 29 August 2017 at 12:31:17 CEST, Robbin Ehn wrote:
>
> Hi please review,
>
> The issue 8180932 - "Parallelize safepoint cleanup" changed _stack_traversal_mark to load acquire/store release, this is at least half wrong.
> Instead for simplicity the write side storestore fence should be matched with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile).
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8186837
>
> Code:
> http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/
>
> It's not clear in this code if there are other concurrent dependent read/writes.
> Is it true that only when reading/writing _state and _stack_traversal_mark proper memory ordering is needed?
> To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839
>
> Thanks Robbin
>
>
> --
> This message was sent from my Android device with K-9 Mail.
>
>
> --
> This message was sent from my Android device with K-9 Mail.
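
[Editorial sketch, not part of the original thread.] The store/storestore/store and load/loadload/load pairing discussed above can be sketched roughly as follows. This is a minimal illustration only, assuming standard C++11 atomics rather than HotSpot's OrderAccess API; the `Method` struct and the functions `make_not_entrant`, `sweeper_sees_consistent` and `demo` are hypothetical names, not HotSpot code. Note also that the C++ release and acquire fences used here are at least as strong as pure storestore and loadload barriers, so they approximate the proposal rather than reproduce it exactly.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an nmethod; NOT HotSpot code.
struct Method {
    std::atomic<long> stack_traversal_mark{0};
    std::atomic<int>  state{0};  // 0 = in use, 1 = not entrant
};

// Writer side, modelled on "store mark; storestore; store state".
void make_not_entrant(Method& m, long traversal) {
    m.stack_traversal_mark.store(traversal, std::memory_order_relaxed);
    // Release fence: keeps the mark store ordered before the state store
    // (at least as strong as a storestore barrier).
    std::atomic_thread_fence(std::memory_order_release);
    m.state.store(1, std::memory_order_relaxed);
}

// Reader side, modelled on "load state; loadload; load mark".
// Returns false only if the new state is seen together with a mark
// different from the expected one.
bool sweeper_sees_consistent(const Method& m, long expected) {
    if (m.state.load(std::memory_order_relaxed) != 1) {
        return true;  // state not yet changed, nothing to check
    }
    // Acquire fence: keeps the state load ordered before the mark load
    // (at least as strong as a loadload barrier).
    std::atomic_thread_fence(std::memory_order_acquire);
    return m.stack_traversal_mark.load(std::memory_order_relaxed) == expected;
}

// Single-threaded walk-through of both sides.
bool demo() {
    Method m;
    bool before = sweeper_sees_consistent(m, 7);  // state is still 0
    make_not_entrant(m, 7);
    bool after = sweeper_sees_consistent(m, 7);   // state 1, mark 7
    return before && after;
}
```

In real use the two functions would run on different threads; the fence pair is what guarantees that a reader observing the new state also observes the mark written before it. Without the reader-side fence, the two loads may be reordered by the compiler or the hardware, which is the problem the thread describes.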
From rkennke at redhat.com Wed Aug 30 15:03:43 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 30 Aug 2017 17:03:43 +0200
Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark
In-Reply-To:
References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com> <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com> <9480F78C-F0FB-4F26-B272-E7522B8A92F5@redhat.com>
Message-ID:

OK :-)

Cheers, Roman

On 30 August 2017 16:50:26 CEST, Robbin Ehn wrote:
>Hi Roman,
>
>On 08/30/2017 12:14 PM, Roman Kennke wrote:
>> It sounds to me like LA/RL would be required and sufficient on _state?
>
>Yes, but
>- Using la/rs just sometimes can be confusing, changing to get/set with
>la/rs semantics also means we need to change assembly to use proper
>memory ordering,
>  e.g. lda instead of ldr on aarch64 and armv7 add e.g. ldr, dmb, etc...
>(I asked compiler, their thought was that dirty reads are most likely okay
>in all other cases since _state only goes 'up')
>- Using la/rs in just this case means either duplicate methods with
>acquire semantics or adding a bool to a lot of methods since these are
>accessed in deep call hierarchies.
>- The write side is using storestore today
>
>For those reasons I said:
>
>> Instead for simplicity the write side storestore fence should be
>matched with loadload on read side and the changes to
>_stack_traversal_mark undone (kept it volatile).
>
>I'm well aware this is nowhere near perfect, but we need that loadload fence,
>so can you live with my proposed change?
>And can we discuss the bigger picture in 8186839?
>
>Thanks, Robbin

--
This message was sent from my Android device with K-9 Mail.

From calvin.cheung at oracle.com Wed Aug 30 16:48:37 2017
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 30 Aug 2017 09:48:37 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>
References: <59A45413.70800@oracle.com> <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com>
Message-ID: <59A6EC65.2010301@oracle.com>

Hi Jiangli,

Thanks for your review.

On 8/29/17, 12:55 PM, Jiangli Zhou wrote:
> Hi Calvin,
>
> These changes look good. I have a few remaining comments below.
>
> - src/share/vm/classfile/classLoader.cpp
> The '_num_boot_entries' variable probably is unnecessary. It's
> initialized to _num_entries and never modified. In places where
> _num_boot_entries is used, can you use _num_entries directly?

No.
Because at the time _num_boot_entries is initialized, _num_entries only holds the number of entries in the boot class path, including the runtime image. Later on, _num_entries can change. Take a look at ClassLoader::add_to_list(ClassPathEntry *new_entry) and its callers.

>
> I think we should remove the special boot classpath handling code for
> dump time. With the use of java class loaders at CDS/AppCDS dump time,
> we no longer need to append the -cp path to the boot classpath. With
> that, we can remove the CDS special cases from
> ClassLoader::load_class, ClassPathImageEntry::open_stream, etc. That
> would make the generic class loading code much cleaner. Since you are
> planning to integrate this change soon, making the suggested change
> can be risky. I'll file a new RFE, we can do the clean up separately.

That's a good idea.

> int ClassLoader::_num_boot_entries = -1;
>
> if (DumpSharedSpaces && classpath_index >= _num_boot_entries) {
>   // Do not load any class from the app classpath using the boot loader. Let
>   // the built-in app class loader load them.
>   break;
> }
>
> if (classpath_index < _num_boot_entries) {
>   // ik is either:
>   // 1) a boot class loaded from the runtime image during vm initialization (classpath_index = 0); or
>   // 2) a user's class from -Xbootclasspath/a (classpath_index > 0)
>   // In the second case, the classpath_index, classloader_type will be recorded via
>   // context.record_result() in ClassLoader::load_class(Symbol* name, bool search_append_only, TRAPS).
>   if (classpath_index > 0) {
>     return;
>   }
> }
>
> I'm wondering why the following is never needed before in
> get_package_entry() for non-CDS case. Do you have additional details?
>
> // PackageEntryTable could be NULL for classes like java/lang/invoke/LambdaForm$MH
> if (pkgEntryTable == NULL) {
>   return NULL;
> }

It turns out it is no longer needed so I've removed it.
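The `_num_boot_entries` explanation earlier in this reply can be illustrated with a small stand-alone sketch. It is not the HotSpot code: `ClassPathList`, `freeze_boot_entries` and `is_boot_entry` are hypothetical simplifications, assuming only what the reply states, namely that the boot count is captured once while the total number of entries keeps growing as entries are appended.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a class path list.
class ClassPathList {
public:
    // Appends an entry; the total therefore changes over time.
    void add_to_list(const std::string& path) { entries_.push_back(path); }

    // Captures the current size once, after the boot entries are set up.
    void freeze_boot_entries() {
        num_boot_entries_ = static_cast<int>(entries_.size());
    }

    int num_entries() const { return static_cast<int>(entries_.size()); }
    int num_boot_entries() const { return num_boot_entries_; }

    // Mirrors the "classpath_index < _num_boot_entries" test in the quoted code.
    bool is_boot_entry(int classpath_index) const {
        return classpath_index >= 0 && classpath_index < num_boot_entries_;
    }

private:
    std::vector<std::string> entries_;
    int num_boot_entries_ = -1;  // -1 until captured, as in the quoted declaration
};
```

Once an application entry is appended after the capture, `num_entries()` and `num_boot_entries()` differ, which is why the two counters are not interchangeable in this model.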
> If I understand it correctly, the following is to find if the class is
> from the runtime image. Can you please change the following to check
> with module->location()->starts_with("jrt:")?
>
> if ((strcmp(_jrt_entry->name(), src) == 0) ||
>     (module != NULL && (module->name() != NULL) &&
>      (strcmp(module->name()->as_C_string(), src) == 0))) {
>   e = _jrt_entry;
>   classpath_index = 0;

I've made the change.

> I was looking for code that handles anonymous classes. There is the
> following code in classLoader.cpp and metaspaceShared.cpp. However, I
> can't find any code that specifically removes anonymous classes from
> the system dictionary at CDS dump time. I'm probably missing
> something, how do we guarantee (besides the assert) anonymous classes
> are not being archived?
>
> void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) {
>   assert(DumpSharedSpaces, "sanity");
>   assert(stream != NULL, "sanity");
>
>   if (ik->is_anonymous()) {
>     // We do not archive anonymous classes.
>     return;
>   }
>
> NOT_PRODUCT(
> static void assert_not_anonymous_class(InstanceKlass* k) {
>   if (k->is_instance_klass()) {
>     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
>   }
> }
> static void assert_no_anonymoys_classes_in_dictionaries() {
>   ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class);
> })
>

Thanks Ioi for answering this one.

> - src/share/vm/classfile/klassFactory.cpp
> The following code can be simplified to get the module from 'ik'.
>
> if (path_index < 0) {
>   // AppCDSv2 class.
>   // Get the pkg_entry from the classloader
>   PackageEntry* pkg_entry = NULL;
>   TempNewSymbol pkg_name = InstanceKlass::package_from_name(class_name, CHECK_NULL);
>   if (pkg_name != NULL) {
>     const char* pkg_string = pkg_name->as_C_string();
>     ClassLoaderData* loader_data = ClassLoaderData::class_loader_data(class_loader());
>     if (loader_data != NULL) {
>       pkg_entry = loader_data->packages()->lookup_only(pkg_name);
>     }
>   }
>   if (pkg_entry != NULL) {
>     ModuleEntry* mod_entry = pkg_entry->module();

Yes, I'm using ik->module() in my new webrev.

> Could you please remove the following from
> KlassFactory::create_from_stream() and call it from
> ClassLoaderExt::record_result()?
>
> #if INCLUDE_CDS
>   if (DumpSharedSpaces) {
>     ClassLoader::record_shared_class_loader_type(result, stream);
>   }
> #endif

Based on our off-list discussion, we've decided to keep the code but move it further down in the same function, together with another existing "#if INCLUDE_CDS" block.

updated webrevs:

incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/
complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/

thanks,
Calvin

> Thanks,
>
> Jiangli
>
>> On Aug 28, 2017, at 10:34 AM, Calvin Cheung wrote:
>>
>> Hi,
>>
>> This is a re-post of a previous RFR for 8172218 using the correct bug id.
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8186842
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/
>>
>> Please refer to the comment section of the bug for description of the change.
>>
>> Tests executed so far:
>> JPRT
>> hs-tier2 through hs-tier4
>> hs-tier5 (linux-x64)
>>
>> thanks,
>> Calvin
>

From calvin.cheung at oracle.com Wed Aug 30 17:04:17 2017
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 30 Aug 2017 10:04:17 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To:
References: <59A45413.70800@oracle.com>
Message-ID: <59A6F011.1000205@oracle.com>

Hi Coleen,

Thanks for your review.

On 8/29/17, 5:18 PM, coleen.phillimore at oracle.com wrote:
>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoader.cpp.udiff.html
>
>
> Can you put some comment like "Find if the class is from the runtime
> image" above this. I couldn't guess reading this so used Jiangli's
> review as a hint.
>
> + if ((strcmp(_jrt_entry->name(), src) == 0) ||
> +     (module != NULL && (module->name() != NULL) &&
> +      (strcmp(module->name()->as_C_string(), src) == 0))) {
> +   e = _jrt_entry;
>

I've added a comment. The conditions have been simplified per Jiangli's suggestion.

>
> Can you change this:
>
> + if (!get_canonical_path(e->name(), canonical_path, JVM_MAXPATHLEN)) {
> +   continue;
> + }
> + if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {
> +   break;
> + }
> + classpath_index ++;
>
> To:
>
> + if (get_canonical_path(e->name(), canonical_path, JVM_MAXPATHLEN)) {
> +   if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {
> +     break;
> +   }
> +   classpath_index ++;
> + }
>
>
> So the confusing "continue" goes away?

I've fixed it per your suggestion.

>
> + const char* const class_name = ik->name()->as_C_string();
>
>
> I think you need another ResourceMark here.

Added.

>
> + ClassLoaderExt::Context context(class_name, file_name, THREAD);
> + context.record_result(ik->name(), e, classpath_index, ik, THREAD);
> // this is a tail call so doesn't need CATCH or CHECK
>
> Could these throw exceptions and you don't expect them to?
> Or do they just need a thread argument? If the former, change THREAD to CATCH.

I've changed the THREAD to CATCH in both lines.

>
> + #endif
>
>
> Can you put what this is an #endif to as a comment since it's far away
> from the #if ?

Done.

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoaderExt.hpp.udiff.html
>
>
> + oop h_loader = result->class_loader();
>
>
> Nit, can you remove h_ from the name since it's not a Handle.

Done.

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/klassFactory.cpp.udiff.html
>
>
> + ClassLoaderData* loader_data =
>     ClassLoaderData::class_loader_data(class_loader());
> + if (loader_data != NULL) {
> +   pkg_entry = loader_data->packages()->lookup_only(pkg_name);
> + }
>
> The ClassLoaderData should never be null at this point, and why would
> it be different than the one you fetched above. I think it would not
> be legal to change the class loader with CFLH, and the original
> loader_data is used below, so this should be the same one.
>
> I think 12 or more inserted lines should be a new static function
> above, that's called here, like
> const char* pathname = get_package_name(loader_data, class_name,
> path_index, CHECK);

The above has been changed to use ik->module() per Jiangli's suggestion.

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/systemDictionary.cpp.udiff.html
>
>
> The combining entries looks good. I think it needs a comment that
> it's only done during dump time (or an assert).

There's already an assert in the caller:

void SystemDictionary::combine_shared_dictionaries() {
  assert(DumpSharedSpaces, "dump time only");

>
> + Dictionary* master_dictionary =
>     ClassLoaderData::the_null_class_loader_data()->dictionary();
>
>
> It's been bothering me that the shared dictionary at dump time is the
> NULL_CLD one. With the combining, I think the dictionary at dump time
> should be shared_dictionary(). Can this be a follow on RFE to clean
> this up to use shared_dictionary()? I think this change enables that.

Yes, we can file an RFE to investigate this.

>
> Do you have to free the initiating entries? Can you leave them around?

If I don't free the entries, it will hit the following assert in BasicHashtable::verify_table:

  guarantee(number_of_entries() == element_count,
            "Verify of %s failed", table_name);

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/memory/metaspaceShared.cpp.udiff.html
>
>
> + NOT_PRODUCT(
> + static void assert_not_anonymous_class(InstanceKlass* k) {
> +   if (k->is_instance_klass()) {
> +     assert(!(k->is_anonymous()), "cannot archive anonymous classes");
> +   }
> + }
>
> You don't have to ask if k->is_instance_klass() since it passes in an
> InstanceKlass.

Fixed.

>
> It surprises me that there are no anonymous classes loaded (I'll read
> Ioi's reply later). I don't know if that will remain the case though.
>
> + tty->print_cr("Preload Warning: Cannot find %s",
>     parser.current_class_name());
>
>
> You should have an RFE to use log_warning() instead of tty->print_cr
> for all the CDS messages.

I've filed https://bugs.openjdk.java.net/browse/JDK-8186988

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/arrayKlass.cpp.udiff.html
>
>
> Because we use ClassLoaderDataGraph::classes_do() I thought all the
> array dimension Klasses are walked and this isn't needed.

This would involve mostly removing code in a few files but requires running a lot of tests to make sure the change is good.
I'll file a follow-up RFE to clean up the code.

>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/constantPool.cpp.udiff.html
>
>
> Why are you unresolving Klasses? I thought that was a good thing for
> performance. Can you add a comment why? There's some leftover #if0
> code.

Added comment and removed the #if0 block in the new webrev.
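The loop restructuring Coleen asked for earlier in this reply (dropping the `continue`) can be sketched outside HotSpot as follows. This is a hypothetical simplification rather than the webrev code: `canonicalize` stands in for `get_canonical_path`, plain strings stand in for class path entries, and the `-1` result for "no match" is an assumption of the sketch. As in the suggested version, the index only advances for entries whose path could be canonicalized.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for get_canonical_path(): fails on an empty path.
bool canonicalize(const std::string& path, std::string& out) {
    if (path.empty()) {
        return false;
    }
    out = path;  // a real implementation would resolve the path here
    return true;
}

// Returns the index of the entry matching src, counting only entries that
// could be canonicalized, or -1 if no entry matches.
int find_classpath_index(const std::vector<std::string>& entries,
                         const std::string& src) {
    int classpath_index = 0;
    for (const std::string& e : entries) {
        std::string canonical_path;
        if (canonicalize(e, canonical_path)) {
            if (canonical_path == src) {
                return classpath_index;
            }
            classpath_index++;
        }
    }
    return -1;
}
```

Nesting the comparison and the increment under the success branch keeps the skip case implicit, so no `continue` is needed and the index bookkeeping stays in one place.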
>
> I've completed my review and these are only minor comments and
> questions. I might need to see an incremental or new review depending
> on how much you change. It looks good.

updated webrevs:

incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/
complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/

thanks,
Calvin

>
> Thanks,
> Coleen

From calvin.cheung at oracle.com Wed Aug 30 17:06:07 2017
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 30 Aug 2017 10:06:07 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <590ac776-a630-944c-5673-9c3b12d15624@oracle.com>
References: <59A45413.70800@oracle.com> <590ac776-a630-944c-5673-9c3b12d15624@oracle.com>
Message-ID: <59A6F07F.5030505@oracle.com>

Hi Misha,

Thanks for reviewing the test cases.

Calvin

On 8/29/17, 2:53 PM, mikhailo wrote:
> Hi Calvin,
>
> I have reviewed the test portion of the change; it looks good.
>
>
> Thank you,
>
> Misha

From calvin.cheung at oracle.com Wed Aug 30 17:23:37 2017
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 30 Aug 2017 10:23:37 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To:
References: <59A45413.70800@oracle.com>
Message-ID: <59A6F499.7010401@oracle.com>

Hi Ioi,

Thanks for the suggestion. Based on our off-list discussion, I've made the change with a slight modification:

void ConstantPool::archive_resolved_references(Thread* THREAD) {
....
  for (int i = 0; i < rr_len; i++) {
    oop p = rr->obj_at(i);
+   rr->obj_at_put(i, NULL);
    if (p != NULL && i < ref_map_len) {

updated webrevs:

incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/
complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/

thanks,
Calvin

On 8/29/17, 10:00 PM, Ioi Lam wrote:
> Hi Calvin, there's one more place where we can get rid of 'continue'
>
> void ConstantPool::archive_resolved_references(Thread* THREAD) {
> ....
>   for (int i = 0; i < rr_len; i++) {
>     oop p = rr->obj_at(i);
>     if (p != NULL && i < ref_map_len) {
> +     rr->obj_at_put(i, NULL);
>       int index = object_to_cp_index(i);
>       // Skip the entry if the string hash code is 0 since the string
>       // is not included in the shared string_table, see
>       // StringTable::copy_shared_string.
>       if (tag_at(index).is_string() &&
>           java_lang_String::hash_code(p) != 0) {
>         oop op = StringTable::create_archived_string(p, THREAD);
>         // If the String object is not archived (possibly too large),
>         // NULL is returned. Also set it in the array, so we won't
>         // have a 'bad' reference in the archived resolved_reference
>         // array.
>         rr->obj_at_put(i, op);
> -       continue;
>       }
>     }
> -   rr->obj_at_put(i, NULL);
>   }
>
> Thanks
> - Ioi

From coleen.phillimore at oracle.com Wed Aug 30 20:02:51 2017
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 30 Aug 2017 16:02:51 -0400
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <59A6F011.1000205@oracle.com>
References: <59A45413.70800@oracle.com> <59A6F011.1000205@oracle.com>
Message-ID: <7267241f-b8be-e107-7757-a181ad909251@oracle.com>

Hi Calvin,

Your changes look good. Thank you for answering my questions. One minor change.

http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/src/share/vm/classfile/classLoader.cpp.udiff.html

! context.record_result(ik->name(), e, classpath_index, ik, CATCH);

This should go back to THREAD because it's at the end of the function. Sorry for the confusion.

I don't need to see another webrev. Thanks!

Coleen

From calvin.cheung at oracle.com Wed Aug 30 21:29:14 2017
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 30 Aug 2017 14:29:14 -0700
Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive
In-Reply-To: <7267241f-b8be-e107-7757-a181ad909251@oracle.com>
References: <59A45413.70800@oracle.com> <59A6F011.1000205@oracle.com> <7267241f-b8be-e107-7757-a181ad909251@oracle.com>
Message-ID: <59A72E2A.4070507@oracle.com>

Hi Coleen,

Thanks for taking a look again. I'll make the one-line change.

Calvin

On 8/30/17, 1:02 PM, coleen.phillimore at oracle.com wrote:
>
> Hi Calvin,
> Your changes look good. Thank you for answering my questions. One
> minor change.
>
> http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/src/share/vm/classfile/classLoader.cpp.udiff.html
>
> ! context.record_result(ik->name(), e, classpath_index, ik, CATCH);
> This should go back to THREAD because it's at the end of the
> function. Sorry for the confusion.
>
> I don't need to see another webrev. Thanks!
>
> Coleen
>
> On 8/30/17 1:04 PM, Calvin Cheung wrote:
>> Hi Coleen,
>>
>> Thanks for your review.
>>
>> On 8/29/17, 5:18 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoader.cpp.udiff.html
>>>
>>>
>>> Can you put some comment like "Find if the class is from the runtime
>>> image" above this. I couldn't guess reading this so used Jiangli's
>>> review as a hint.
>>>
>>> + if ((strcmp(_jrt_entry->name(), src) == 0) ||
>>> +     (module != NULL && (module->name() != NULL) &&
>>> +      (strcmp(module->name()->as_C_string(), src) == 0))) {
>>> +   e = _jrt_entry;
>>>
>> I've added a comment. The conditions have been simplified per
>> Jiangli's suggestion.
>>> >>> Can you change this: >>> >>> *+ if (!get_canonical_path(e->name(), canonical_path, >>> JVM_MAXPATHLEN)) {* >>> *+ continue;* >>> *+ }* >>> *+ if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {* >>> *+ break;* >>> *+ }* >>> *+ classpath_index ++;* >>> >>> To: >>> >>> *+ if (get_canonical_path(e->name(), canonical_path, >>> JVM_MAXPATHLEN)) {* >>> *+ if (strcmp(canonical_path, os::native_path((char*)src)) == 0) {* >>> *+ break;* >>> *+ }* >>> *+ classpath_index ++; + } * >>> >>> >>> So the confusing "continue" goes away? >> I've fixed it per your suggestion. >>> >>> *+ const char* const class_name = ik->name()->as_C_string();* >>> >>> >>> I think you need another ResourceMark here. >> Added. >>> >>> *+ ClassLoaderExt::Context context(class_name, file_name, THREAD);* >>> *+ context.record_result(ik->name(), e, classpath_index, ik, >>> THREAD); // this is a tail call so doesn't need CATCH or CHECK * >>> >>> Could these throw exceptions and you don't expect them too? Or do >>> they just need a thread argument? If the former, chagne THREAD to >>> CATCH. >> I've changed the TREAD to CATCH in both lines. >>> >>> *+ #endif* >>> >>> >>> Can you put what this is an #endif to as a comment since it's far >>> away from the #if ? >> Done. >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/classLoaderExt.hpp.udiff.html >>> >>> >>> *+oop h_loader = result->class_loader();* >>> >>> >>> Nit, can you remove h_ from the name since it's not a Handle. >> Done. >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/klassFactory.cpp.udiff.html >>> >>> >>> *+ ClassLoaderData* loader_data = >>> ClassLoaderData::class_loader_data(class_loader());* >>> *+ if (loader_data != NULL) {* >>> *+ pkg_entry = loader_data->packages()->lookup_only(pkg_name);* >>> *+ } * >>> >>> The ClassLoaderData should never be null at this point, and why >>> would it be different than the one you fetched above. 
I think this >>> would be not legal to change the class loader with CFLH, and the >>> original loader_data is used below, so this should be the same one. >>> >>> I think 12 or more inserted lines should be a new static function >>> above, that's called here, like >>> const char* pathname = get_package_name(loader_data, class_name, >>> path_index, CHECK); >> The above have been changed to using ik->module() per Jiangli's >> suggestion. >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/classfile/systemDictionary.cpp.udiff.html >>> >>> >>> The combining entries looks good. I think it needs a comment that >>> it's only done during dump time (or an assert). >> There's already an assert in the caller: >> void SystemDictionary::combine_shared_dictionaries() { >> assert(DumpSharedSpaces, "dump time only"); >>> >>> *+ Dictionary* master_dictionary = >>> ClassLoaderData::the_null_class_loader_data()->dictionary();* >>> >>> >>> It's been bothering me that the shared dictionary at dump time is >>> the NULL_CLD one. With the combining, I think the dictionary at >>> dump time should be shared_dictionary(). Can this be a follow on >>> RFE to clean this up to use shared_dictionary()? I think this >>> change enables that. >> Yes, we can file an RFE to investigate this. >>> >>> Do you have to free the initiating entries? Can you leave them >>> around? 
>> If I don't free the entries, it will hit the following assert in >> BasicHashtable::verify_table >> >> guarantee(number_of_entries() == element_count, >> "Verify of %s failed", table_name); >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/memory/metaspaceShared.cpp.udiff.html >>> >>> >>> *+ NOT_PRODUCT(* >>> *+ static void assert_not_anonymous_class(InstanceKlass* k) {* >>> *+ if (k->is_instance_klass()) {* >>> *+ assert(!(k->is_anonymous()), "cannot archive anonymous classes");* >>> *+ }* >>> *+ }* >>> >>> You don't have to ask if k->is_instance_klass() since it passes in >>> an InstanceKlass. >> Fixed. >>> >>> It surprises me that there are no anonymous classes loaded (I'll >>> read Ioi's reply later). I don't know if that will remain the case >>> though. >>> >>> *+ tty->print_cr("Preload Warning: Cannot find %s", >>> parser.current_class_name());* >>> >>> >>> You should have an RFE to use log_warning() instead of tty->print_cr >>> for all the CDS messages. >> I've filed https://bugs.openjdk.java.net/browse/JDK-8186988 >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/arrayKlass.cpp.udiff.html >>> >>> >>> Because we use ClassLoaderDataGraph::classes_do() I thought all the >>> array dimension Klasses are walked and this isn't needed. >> This would involve mostly removing code in a few files but requires >> running a lot of tests to make sure the change is good. >> I'll file a follow-up RFE to clean up the code. >>> >>> http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/src/share/vm/oops/constantPool.cpp.udiff.html >>> >>> >>> Why are you unresolveing Klasses? I thought that was a good thing >>> for performance. Can you add a comment why? There's some leftover >>> #if0 code. >> Added comment and removed the #if0 block in the new webrev. >>> >>> I've completed my review and these are only minor comments and >>> questions. 
I might need to see an incremental or new review >>> depending on how much you change. It looks good. >> updated webrevs: >> >> incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/ >> complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/ >> >> thanks, >> Calvin >>> >>> Thanks, >>> Coleen >>> >>> >>> On 8/28/17 1:34 PM, Calvin Cheung wrote: >>>> Hi, >>>> >>>> This is a re-post of a previous RFR for 8172218 using the correct >>>> bug id. >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186842 >>>> >>>> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/ >>>> >>>> Please refer to the comment >>>> >>>> section of the bug for description of the change. >>>> >>>> Tests executed so far: >>>> JPRT >>>> hs-tier2 though hs-tier4 >>>> hs-tier5 (linux-x64) >>>> >>>> thanks, >>>> Calvin >>> > From cthalinger at twitter.com Wed Aug 30 23:00:44 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 30 Aug 2017 13:00:44 -1000 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> Message-ID: <11F20E4C-3B9D-444F-BD48-398A5423D162@twitter.com> > On Aug 30, 2017, at 1:14 AM, coleen.phillimore at oracle.com wrote: > > > Hi, I changed the edit for David to only use ordering semantics in the places where needed in the lock free access to pd_set. Since only contains_protection_domain is read lock free, it should be ok. Sorry, I lost track. Are there now raw accesses to _pd_set? And if yes, does a comment say why? 
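For readers following the accessor discussion in this thread, the pattern under review can be sketched roughly as below. This is only an illustration under stated assumptions: it uses standard C++ `std::atomic` instead of HotSpot's own ordering primitives, and `Entry`, `EntrySet`, `head_acquire`, `release_set_head` and `head_locked` are hypothetical names, not the code in the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a protection domain list node.
struct Entry {
  int    value;
  Entry* next;
};

// Hypothetical wrapper that hides the raw head pointer, so every access
// has to go through an accessor whose name states the ordering it uses.
class EntrySet {
  std::atomic<Entry*> _head{nullptr};

 public:
  // For lock-free readers: acquire pairs with the release store below.
  Entry* head_acquire() const { return _head.load(std::memory_order_acquire); }

  // For writers: publishes a fully initialized node to lock-free readers.
  void release_set_head(Entry* e) { _head.store(e, std::memory_order_release); }

  // For callers that are assumed to hold the writer lock; no ordering needed.
  Entry* head_locked() const { return _head.load(std::memory_order_relaxed); }
};

// Lock-free membership test that walks the list from the acquired head.
bool contains(const EntrySet& set, int value) {
  for (Entry* cur = set.head_acquire(); cur != nullptr; cur = cur->next) {
    if (cur->value == value) return true;
  }
  return false;
}

// Prepends a node; the caller is assumed to hold the writer lock.
void add(EntrySet& set, Entry* e) {
  e->next = set.head_locked();
  set.release_set_head(e);
}
```

Naming the accessors after their ordering makes the lock-free call sites visible at a glance, which is the property the reviewers ask for.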
> > open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8164207 > Thanks, > Coleen > > On 8/29/17 2:28 AM, David Holmes wrote: >> Hi Coleen, >> >> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Here is the third webrev with the names of pd_set and set_pd_set renamed to pd_set_acquire and release_set_pd_set. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> This API should also be renamed: >> >> ! ProtectionDomainEntry* pd_set() const { return _inner.pd_set_acquire(); } >> ! void set_pd_set(ProtectionDomainEntry* new_head) { _inner.release_set_pd_set(new_head); } >> >> These are the ones that need to give visibility to the fact we're accessing things lock-free (if indeed we are). >> >> More below ... >> >>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com wrote: >>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> Thank you Zhengyu for noticing this change was wrong, and Christian for the idea. New webrev: >>>>>>> >>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>> >>>>>> The idea of a load-acquire accessor and release_store-setter is fine in principal, but it seems to me that we now use these everywhere, even if we may not need them because there is no concurrent/lock-free access. Overall I find it very difficult to determine what the concurrent access patterns are for a Dictionary versus a DictionaryEntry, and which paths are in fact lock and/or safepoint free, and may be racing with locked or safepointed code. ?? >>>>> >>>>> That's exactly the point of making them accessors. 
So one doesn't have to visit each individual call site and spend time answering the question for each case. And probably getting it wrong. The performance delta for these accesses is minimal since it's only getting the head of the list, not each element. >>>>> >>>>> Then it's also future proof so that if a lock is removed, then we don't miss one of the accessors at a later time. Note that observing bugs caused by this is very difficult to do, and can only be done by inspection. That's why I erred on the side of safety and consistency. >> >> Sorry, it may sound strange to say that I don't agree with "erring on the side of safety and consistency" but I do not agree with just using acquire/release semantics everywhere just in case! If we don't know the lock-free paths then how can we possibly know things are correct. The whole point of these accessors is to make it obvious where the lock-free accesses are. >> >>>>>> >>>>>> That aside I don't understand why you added a level of indirection with the ProtectionDomainSet class? >>>>> >>>>> Only the code is a level of indirection not the access. That is to avoid what I said above. See Christian's and Zhengyu's comments. >> >> Okay - I see what you did but I would not expect to have to protect _pd_set from direct use within its own class - anyone messing with that class should be aware of the need to use the accessors. Though I suppose this encapsulation is little different to defining the field as some kind of "Atomic" type rather than a "raw" type. >> >> Thanks, >> David >> ----- >> >>>>>> >>>>>> Also we have been trying to include release/acquire in the names of such accessors so that it is clear when we are relying on memory ordering properties ie. pd_set_acquire and release_set_pd_set >>>>>> >>>>> >>>>> I will change the names of these functions. 
>>>>> >>>>> thanks, >>>>> Coleen >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>>> I reran parallel class loading tests and jck testing is in progress, but order access requires inspection. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> There are two instances probably overlooked? >>>>>>>>>>> >>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>> >>>>>>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>> => >>>>>>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Oh yeah, you're right. That's embarrasing. I'll fix and retest. >>>>>>>>> Which also shows that there is a potential for future mistakes. Can we isolate the field better so it?s only accessible via setter and getter? >>>>>>>> >>>>>>>> Yes, great idea. >>>>>>>> Coleen >>>>>>>> >>>>>>>>>> Thank you!! >>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> -Zhengyu >>>>>>>>>>> >>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Use load_acquire for accessing DictionaryEntry::_pd_set since it's accessed outside the SystemDictionary_lock >>>>>>>>>>>> >>>>>>>>>>>> Ran parallel class loading tests that we have as well as tier1 tests. See bug for details. 
>>>>>>>>>>>> >>>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >>> > From coleen.phillimore at oracle.com Wed Aug 30 23:14:45 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 30 Aug 2017 19:14:45 -0400 Subject: RFR (S) 8164207: Checking missing load-acquire in relation to _pd_set in dictionary.cpp In-Reply-To: <11F20E4C-3B9D-444F-BD48-398A5423D162@twitter.com> References: <6853e2cf-ddd3-f933-eb4e-db35ad85e62d@oracle.com> <84e488c9-e145-b5d1-202f-0ef9b7693608@redhat.com> <3901c1e0-fe69-8d96-2887-8e58d7dc5fd9@oracle.com> <6b416d62-f20d-1cd4-11d1-c3fb40c1bc82@oracle.com> <269e2ea2-c7d9-048f-156b-e12d0238ee14@oracle.com> <660a2310-2e64-fa49-6857-236bb59b6293@oracle.com> <7461b194-8f3c-bd0d-264c-a0806bbcd759@oracle.com> <11F20E4C-3B9D-444F-BD48-398A5423D162@twitter.com> Message-ID: On 8/30/17 7:00 PM, Christian Thalinger wrote: > >> On Aug 30, 2017, at 1:14 AM, coleen.phillimore at oracle.com >> wrote: >> >> >> Hi, I changed the edit for David to only use ordering semantics in >> the places where needed in the lock free access to pd_set. Since >> only contains_protection_domain is read lock free, it should be ok. > > Sorry, I lost track. Are there now raw accesses to _pd_set? And if > yes, does a comment say why? There aren't raw accesses but ones that don't use acquire semantics. I'll add a comment before the 3 that says // The pd_set accessed inside SystemDictionary_lock here.
Thanks, Coleen > >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8164207.04/webrev >> >> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >> >> Thanks, >> Coleen >> >> On 8/29/17 2:28 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 29/08/2017 5:39 AM, coleen.phillimore at oracle.com >>> wrote: >>>> >>>> >>>> On 8/28/17 3:38 PM, coleen.phillimore at oracle.com >>>> wrote: >>>>> >>>>> Here is the third webrev with the names of pd_set and set_pd_set >>>>> renamed to pd_set_acquire and release_set_pd_set. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/8164207.03/webrev >>>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>> >>> This API should also be renamed: >>> >>> ! ProtectionDomainEntry* pd_set() const { return >>> _inner.pd_set_acquire(); } >>> ! void set_pd_set(ProtectionDomainEntry* new_head) { >>> _inner.release_set_pd_set(new_head); } >>> >>> These are the ones that need to give visibility to the fact we're >>> accessing things lock-free (if indeed we are). >>> >>> More below ... >>> >>>>> On 8/28/17 8:07 AM, coleen.phillimore at oracle.com >>>>> wrote: >>>>>> On 8/28/17 12:25 AM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> On 25/08/2017 11:26 PM, coleen.phillimore at oracle.com >>>>>>> wrote: >>>>>>>> >>>>>>>> Thank you Zhengyu for noticing this change was wrong, and >>>>>>>> Christian for the idea. New webrev: >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.02/webrev >>>>>>>> >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>> >>>>>>> The idea of a load-acquire accessor and release_store-setter is >>>>>>> fine in principle, but it seems to me that we now use these >>>>>>> everywhere, even if we may not need them because there is no >>>>>>> concurrent/lock-free access.
Overall I find it very difficult to >>>>>>> determine what the concurrent access patterns are for a >>>>>>> Dictionary versus a DictionaryEntry, and which paths are in fact >>>>>>> lock and/or safepoint free, and may be racing with locked or >>>>>>> safepointed code. ?? >>>>>> >>>>>> That's exactly the point of making them accessors. So one >>>>>> doesn't have to visit each individual call site and spend time >>>>>> answering the question for each case. And probably getting it >>>>>> wrong. The performance delta for these accesses is minimal >>>>>> since it's only getting the head of the list, not each element. >>>>>> >>>>>> Then it's also future proof so that if a lock is removed, then we >>>>>> don't miss one of the accessors at a later time. Note that >>>>>> observing bugs caused by this is very difficult to do, and can >>>>>> only be done by inspection. That's why I erred on the side of >>>>>> safety and consistency. >>> >>> Sorry, it may sound strange to say that I don't agree with "erring >>> on the side of safety and consistency" but I do not agree with just >>> using acquire/release semantics everywhere just in case! If we don't >>> know the lock-free paths then how can we possibly know things are >>> correct. The whole point of these accessors is to make it obvious >>> where the lock-free accesses are. >>> >>>>>>> >>>>>>> That aside I don't understand why you added a level of >>>>>>> indirection with the ProtectionDomainSet class? >>>>>> >>>>>> Only the code is a level of indirection not the access. That is >>>>>> to avoid what I said above. See Christian's and Zhengyu's comments. >>> >>> Okay - I see what you did but I would not expect to have to protect >>> _pd_set from direct use within its own class - anyone messing with >>> that class should be aware of the need to use the accessors. Though >>> I suppose this encapsulation is little different to defining the >>> field as some kind of "Atomic" type rather than a "raw" type.
>>> >>> Thanks, >>> David >>> ----- >>> >>>>>>> >>>>>>> Also we have been trying to include release/acquire in the names >>>>>>> of such accessors so that it is clear when we are relying on >>>>>>> memory ordering properties i.e. pd_set_acquire and release_set_pd_set >>>>>>> >>>>>> >>>>>> I will change the names of these functions. >>>>>> >>>>>> thanks, >>>>>> Coleen >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> I reran parallel class loading tests and jck testing is in >>>>>>>> progress, but order access requires inspection. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>> >>>>>>>> On 8/24/17 5:11 PM, coleen.phillimore at oracle.com >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/24/17 5:00 PM, Christian Thalinger wrote: >>>>>>>>>>> On Aug 24, 2017, at 10:54 AM, coleen.phillimore at oracle.com >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 8/24/17 4:07 PM, Zhengyu Gu wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> There are two instances probably overlooked? >>>>>>>>>>>> >>>>>>>>>>>> dictionary.cpp #103 and #124 >>>>>>>>>>>> >>>>>>>>>>>> for (ProtectionDomainEntry* current = _pd_set; >>>>>>>>>>>> => >>>>>>>>>>>> for (ProtectionDomainEntry* current = pd_set(); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> Oh yeah, you're right. That's embarrassing. I'll fix and >>>>>>>>>>> retest. >>>>>>>>>> Which also shows that there is a potential for future >>>>>>>>>> mistakes. Can we isolate the field better so it's only >>>>>>>>>> accessible via setter and getter? >>>>>>>>> >>>>>>>>> Yes, great idea. >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>>> Thank you!!
>>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> -Zhengyu >>>>>>>>>>>> >>>>>>>>>>>> On 08/24/2017 02:28 PM, coleen.phillimore at oracle.com >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Summary: Use load_acquire for accessing >>>>>>>>>>>>> DictionaryEntry::_pd_set since it's accessed outside the >>>>>>>>>>>>> SystemDictionary_lock >>>>>>>>>>>>> >>>>>>>>>>>>> Ran parallel class loading tests that we have as well as >>>>>>>>>>>>> tier1 tests. See bug for details. >>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8164207.01/webrev >>>>>>>>>>>>> >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8164207 >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> > From jiangli.zhou at Oracle.COM Wed Aug 30 23:57:26 2017 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Wed, 30 Aug 2017 16:57:26 -0700 Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive In-Reply-To: <59A6EC65.2010301@oracle.com> References: <59A45413.70800@oracle.com> <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com> <59A6EC65.2010301@oracle.com> Message-ID: <4AA0D920-4A44-4641-9261-511D4A94054B@oracle.com> Hi Calvin, Thank you for the additional changes and testing. Following are two minor issues from the latest webrev. No need for new webrev after you fix them. Could you please change the following comment in klassFactory.cpp to be something more meaningful. How about "shared classes loaded by user defined class loader do not have shared_classpath_index"? 75 // AppCDSv2 class. In the same file, please remove the "&& INCLUDE_JVMTI" from line 230. The call to record_shared_class_loader_type() should always be done when CDS is enabled.
230 #if INCLUDE_CDS && INCLUDE_JVMTI 231 if (DumpSharedSpaces) { <> 232 ClassLoader::record_shared_class_loader_type(result, stream); 233 #if INCLUDE_JVMTI Thanks, Jiangli > On Aug 30, 2017, at 9:48 AM, Calvin Cheung wrote: > > Hi Jiangli, > > Thanks for your review. > > On 8/29/17, 12:55 PM, Jiangli Zhou wrote: >> >> Hi Calvin, >> >> These changes look good. I have a few remaining comments below. >> >> - src/share/vm/classfile/classLoader.cpp >> The ?_num_boot_entries' variable probably is unnecessary. It?s initialized to _num_entries and never modified. In places where _num_boot_entries is used, can you use _num_entries directly? > No. Because at the time of the initialization of _num_boot_entries, the _num_entries only has the number of entries in the boot class path including the runtime image. Later on, the _num_entries can change. Take a look at ClassLoader::add_to_list(ClassPathEntry *new_entry) and its callers. >> >> I think we should remove the special boot classpath handling code for dump time. With the use of java class loaders at CDS/AppCDS dump time, we no longer need to append the -cp path to the boot classpath. With that, we can remove the CDS special cases from ClassLoader::load_class, ClassPathImageEntry::open_stream, etc. That would make the generic class loading code much cleaner. Since you are planning to integrate this change soon, making the suggested change can be risky. I?ll file a new RFE, we can do the clean up separately. > That's a good idea. >> 150 int ClassLoader::_num_boot_entries = -1; >> 1519 if (DumpSharedSpaces && classpath_index >= _num_boot_entries) { >> 1520 // Do not load any class from the app classpath using the boot loader. Let >> 1521 // the built-in app class laoder load them. 
>> 1522 break; >> 1523 } >> 1635 if (classpath_index < _num_boot_entries) { >> 1636 // ik is either: >> 1637 // 1) a boot class loaded from the runtime image during vm initialization (classpath_index = 0); or >> 1638 // 2) a user's class from -Xbootclasspath/a (classpath_index > 0) >> 1639 // In the second case, the classpath_index, classloader_type will be recorded via >> 1640 // context.record_result() in ClassLoader::load_class(Symbol* name, bool search_append_only, TRAPS). >> 1641 if (classpath_index > 0) { >> 1642 return; >> 1643 } >> 1644 } >> >> I?m wondering why the following is never needed before in get_package_entry() for non-CDS case. Do you have additional details? >> 253 // PackageEntryTable could be NULL for classes like java/lang/invoke/LambdaForm$MH >> 254 if (pkgEntryTable == NULL) { >> 255 return NULL; >> 256 } > It turns out it is no longer needed so I've removed it. >> If I understand it correctly, the following is to find if the class is from the runtime image. Can you please change the following to check with module->location()->starts_with(?jrt:?)? >> 1610 if ((strcmp(_jrt_entry->name(), src) == 0) || >> 1611 (module != NULL && (module->name() != NULL) && >> 1612 (strcmp(module->name()->as_C_string(), src) == 0))) { >> 1613 e = _jrt_entry; >> 1614 classpath_index = 0; > I've made the change. >> I was looking for code that handles anonymous classes. There are following code in classLoader.cpp and metaspaceShared.cpp. However, I can?t find any code that specifically removes anonymous classes from the system dictionary at CDS dump time. I?m probably missing something, how do we guarantee (besides the assert) anonymous classes are not being archived? >> 1582 void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) { >> 1583 assert(DumpSharedSpaces, "sanity"); >> 1584 assert(stream != NULL, "sanity"); >> 1585 >> 1586 if (ik->is_anonymous()) { >> 1587 // We do not archive anonymous classes. 
>> 1588 return; >> 1589 } >> 487 NOT_PRODUCT( >> 488 static void assert_not_anonymous_class(InstanceKlass* k) { >> 489 if (k->is_instance_klass()) { >> 490 assert(!(k->is_anonymous()), "cannot archive anonymous classes"); >> 491 } >> 492 } >> 494 static void assert_no_anonymoys_classes_in_dictionaries() { >> 495 ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class); >> 496 }) >> > Thanks Ioi for answering this one. >> - src/share/vm/classfile/klassFactory.cpp >> The following code can be simplified to get the module from ?ik?. >> 74 if (path_index < 0) { >> 75 // AppCDSv2 class. >> 76 // Get the pkg_entry from the classloader >> 77 PackageEntry* pkg_entry = NULL; >> 78 TempNewSymbol pkg_name = InstanceKlass::package_from_name(class_name, CHECK_NULL); >> 79 if (pkg_name != NULL) { >> 80 const char* pkg_string = pkg_name->as_C_string(); >> 81 ClassLoaderData* loader_data = ClassLoaderData::class_loader_data(class_loader()); >> 82 if (loader_data != NULL) { >> 83 pkg_entry = loader_data->packages()->lookup_only(pkg_name); >> 84 } >> 85 } >> 86 if (pkg_entry != NULL) { >> 87 ModuleEntry* mod_entry = pkg_entry->module(); > Yes, I'm using ik->module() in my new webrev. >> Could you please remove the following from KlassFactory::create_from_stream() and call it from ClassLoaderExt::record_result()? >> 229 #if INCLUDE_CDS >> 230 if (DumpSharedSpaces) { >> 231 ClassLoader::record_shared_class_loader_type(result, stream); >> 232 } >> 233 #endif > Based on our off-list discussion, we've decided to keep the code but moving it further down in the same function together with another existing "#if INCLUDE_CDS" block. 
> > updated webrevs: > > incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/ > complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/ > > thanks, > Calvin > >> Thanks, >> >> Jiangli >> >>> On Aug 28, 2017, at 10:34 AM, Calvin Cheung > wrote: >>> >>> Hi, >>> >>> This is a re-post of a previous RFR for 8172218 using the correct bug id. >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8186842 >>> >>> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/ >>> >>> Please refer to the comment > section of the bug for description of the change. >>> >>> Tests executed so far: >>> JPRT >>> hs-tier2 though hs-tier4 >>> hs-tier5 (linux-x64) >>> >>> thanks, >>> Calvin >> From robbin.ehn at oracle.com Thu Aug 31 06:13:26 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 31 Aug 2017 08:13:26 +0200 Subject: RFR: 8186837: Memory ordering nmethod, _state and _stack_traversal_mark In-Reply-To: References: <9fc2a70d-70bb-362b-16dd-6b5eaf018991@oracle.com> <6b4e31ae-8c3c-92f7-8743-75f6791cae0d@oracle.com> <9480F78C-F0FB-4F26-B272-E7522B8A92F5@redhat.com> Message-ID: On 08/30/2017 05:03 PM, Roman Kennke wrote: > OK :-) Thanks, Robbin > > Cheers, Roman > > > Am 30. August 2017 16:50:26 MESZ schrieb Robbin Ehn : > > Hi Roman, > > On 08/30/2017 12:14 PM, Roman Kennke wrote: > > It sounds to me like LA/RL would be required and sufficient on _state ? > > > Yes, but > - Using la/rs just sometimes can be confusing, changing to get/set with la/rs semantics also means we need to change assembly to use proper memory ordering, > e.g. lda instead of ldr on aarch64 and armv7 add e.g. ldr, dmb, etc... > (I asked compiler, their thoughts was dirty reads are most likely okay in all other cases since _state only goes 'up') > - Using la/rs in just this case means either duplicate methods with acquire semantic or adding a bool to a lot of methods since these are accessed in deep call hierarchies. 
> - The write side is using storestore today > > For those reasons I said: > > Instead for simplicity the write side storestore fence should be match with loadload on read side and the changes to _stack_traversal_mark undone (kept it volatile). > > > I'm well aware this no near perfect, but we need that loadload fence, so can you can live with my proposed change? > And we can discussed the bigger picture in 8186839? > > Thanks, Robbin > > > Roman > > Am 29. August 2017 16:50:03 MESZ schrieb Robbin Ehn : > > Hi Roman thanks for having a look, > > On 08/29/2017 03:00 PM, Roman Kennke wrote: > > Hi Robin, > > I doubt that we can assume a symmetry between loadload and storestore like there is with load-acquire and release-store. This doesn't seem right. In my experience > loadload > and storestore are rather special purpose: loadload ensures ordering between otherwise unrelated loads and storestore likewise with stores. > > > This exactly why I add loadload, to stop reordering of unrelated loads: > > ####################### > The original code did: > > //nmethod::make_not_entrant_or_zombie > store _stack_traversal_mark > storestore > store _state > > //NMethodSweeper::process_compiled_method > load _state > load _stack_traversal_mark // this is a none-volatile load, can be reordered by both gcc and hardware > > ####################### > Adding la/sr + volatile to _stack_traversal_mark: > > store _stack_traversal_mark release // release not needed, we have a following storestore for the unrelated stores > storestore > store _state > > load _state > load _stack_traversal_mark acquire // acquire not needed since we already loaded _state and any following writes/reads will be done after we have taken a Mutex. > > ####################### > So therefore my conclusion was that, in this particular case: > > store _stack_traversal_mark > storestore > store _state > > load _state > loadload > load _stack_traversal_mark > > would be correct, agree? 
>
> And as I said, I have created another JIRA issue for the concerns that you, David and I share.
>
> Thanks, Robbin
>
>
> And even symmetric uses of load-acquire and release-store are often done wrong: those are not meant to protect concurrent access to the field, but to the stuff that is protected by the field access (think locks), i.e. what happens between the LA and RS. At least that is my understanding.
>
> I suggest doing what David said and trying to understand what concurrent accesses to which fields we have, and which fences are actually needed to ensure correct ordering.
>
> And thanks for revisiting this!
>
> Cheers, Roman
>
> On 29 August 2017 12:31:17 CEST, Robbin Ehn wrote:
>
> Hi, please review,
>
> The issue 8180932 - "Parallelize safepoint cleanup" changed _stack_traversal_mark to load-acquire/store-release; this is at least half wrong.
> Instead, for simplicity, the write-side storestore fence should be matched with a loadload on the read side, and the changes to _stack_traversal_mark undone (keeping it volatile).
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8186837
>
> Code:
> http://cr.openjdk.java.net/~rehn/8186837/hotspot.01/webrev/
>
> It's not clear in this code whether there are other concurrent dependent reads/writes.
> Is it true that proper memory ordering is needed only when reading/writing _state and _stack_traversal_mark?
> To track that I created: https://bugs.openjdk.java.net/browse/JDK-8186839
>
> Thanks, Robbin
>
>
> --
> Sent from my Android device with K-9 Mail.
>
>
> --
> Sent from my Android device with K-9 Mail.
>
>
> --
> Sent from my Android device with K-9 Mail.
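
The fence pairing proposed in this thread (storestore on the write side, loadload on the read side) can be sketched in portable C++ as below. This is only an illustration under stated assumptions, not the HotSpot code: ISO C++ has no storestore or loadload fence, so a release fence and an acquire fence, which are at least as strong, stand in for them. The variables stack_traversal_mark and state, and the functions writer and reader, are hypothetical stand-ins for the nmethod fields and the two code paths named in the pseudo-code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the two nmethod fields discussed in the thread.
static std::atomic<long> stack_traversal_mark{0};
static std::atomic<int>  state{0};  // 0 = in use, 1 = not entrant (assumed encoding)

// Write side, modelled on the make_not_entrant_or_zombie pseudo-code:
//   store _stack_traversal_mark; storestore; store _state
void writer(long mark) {
  stack_traversal_mark.store(mark, std::memory_order_relaxed);
  // Assumption: a release fence stands in for storestore (it is at least as strong).
  std::atomic_thread_fence(std::memory_order_release);
  state.store(1, std::memory_order_relaxed);
}

// Read side, modelled on the sweeper pseudo-code:
//   load _state; loadload; load _stack_traversal_mark
// Returns the mark if the state change was observed, otherwise -1.
long reader() {
  if (state.load(std::memory_order_relaxed) != 1) {
    return -1;  // state change not visible yet, the mark is not consulted
  }
  // Assumption: an acquire fence stands in for loadload (it is at least as strong).
  std::atomic_thread_fence(std::memory_order_acquire);
  return stack_traversal_mark.load(std::memory_order_relaxed);
}
```

If writer runs on one thread and reader on another, a reader that observes state == 1 is guaranteed by the two fences to also observe the mark stored before it. Dropping the read-side fence would allow the second load to be satisfied early, which is the reordering the thread is concerned about.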
From dmitry.samersoff at bell-sw.com Thu Aug 31 07:49:18 2017
From: dmitry.samersoff at bell-sw.com (dmitry.samersov)
Date: Thu, 31 Aug 2017 10:49:18 +0300
Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup
Message-ID: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>

Everybody,

Please review:

http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/

I would propose a different approach to fix JDK-8133740 in a
platform-independent way: record all frames but strip unnecessary
NMT-internal ones on printing.

This approach is safe (we don't depend on compiler inlining and we never
strip non-NMT frames) and platform independent, but costs us some extra
memory.

-Dmitry

From aph at redhat.com Thu Aug 31 08:29:00 2017
From: aph at redhat.com (Andrew Haley)
Date: Thu, 31 Aug 2017 09:29:00 +0100
Subject: RFR: Generic fix for 8163011 [Was: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup]
In-Reply-To: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
Message-ID:

On 31/08/17 08:49, dmitry.samersov wrote:
> Everybody,
>
> Please review:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/

I'm just forwarding this to make sure people know that this is NOT an
AArch64-specific patch.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From adinn at redhat.com Thu Aug 31 08:56:57 2017
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 31 Aug 2017 09:56:57 +0100
Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup
In-Reply-To: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
Message-ID:

On 31/08/17 08:49, dmitry.samersov wrote:
> Please review:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/
>
> I would propose a different approach to fix JDK-8133740 in a
> platform-independent way: record all frames but strip unnecessary
> NMT-internal ones on printing.
>
> This approach is safe (we don't depend on compiler inlining and we never
> strip non-NMT frames) and platform independent, but costs us some extra
> memory.

I don't think this is going to work well when symbols are not present
(meaning you cannot resolve return pc addresses to function names).

In that case the NMT frames that would otherwise get skipped will be
printed, leading to differences in which calls are displayed in the
caller stack relative to the case where symbols are present. What is
more, these changes would vary across architectures which use different
inlining strategies.

That may seem unimportant; one could take the view that an address which
is not associated with a symbolic name is just a meaningless hex value.
However, even without names it is still possible for someone who
understands the NMT code to correlate allocations which have the same
pattern of caller addresses, including correlation of such patterns
across builds or architectures. Throwing one or more NMT addresses into
the stack in place of a genuine caller will change these call patterns
in ways that might make it impossible to spot such correlations.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From thomas.stuefe at gmail.com Thu Aug 31 11:49:21 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 31 Aug 2017 13:49:21 +0200
Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup
In-Reply-To:
References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
Message-ID:

Hi,

On Thu, Aug 31, 2017 at 10:56 AM, Andrew Dinn wrote:

> On 31/08/17 08:49, dmitry.samersov wrote:
> > Please review:
> >
> > http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/
> >
> > I would propose a different approach to fix JDK-8133740 in a
> > platform-independent way: record all frames but strip unnecessary
> > NMT-internal ones on printing.
> >
> > This approach is safe (we don't depend on compiler inlining and we never
> > strip non-NMT frames) and platform independent, but costs us some extra
> > memory.
> I don't think this is going to work well when symbols are not present
> (meaning you cannot resolve return pc addresses to function names).
>
> In that case the NMT frames that would otherwise get skipped will be
> printed, leading to differences in which calls are displayed in the
> caller stack relative to the case where symbols are present. What is
> more, these changes would vary across architectures which use different
> inlining strategies.
>
>
The PCs of the last n frames of the call stacks collected by
NativeCallStack() should always be the same, shouldn't they? E.g. on
Windows:

NativeCallStack::NativeCallStack(int toSkip, bool fillStack) + x
os::get_native_stack + y
RtlCaptureStackBackTrace

I assume that x and y would always be the same?

If this is true, could we not find out these PCs in advance, store them,
and during printing compare the PCs to be printed with the stored ones,
skipping printing if they match?
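
The idea in the paragraph above (compare each PC against a stored set of recorder-internal PCs when printing, so that no symbol lookup is needed) could look roughly like the following sketch. Everything here is hypothetical and for illustration only: InternalFrames, format_stack and the address alias are invented names, a frame is modelled as a bare integer PC, and none of this is taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

using address = std::uintptr_t;  // a frame is modelled as a bare return PC

// Hypothetical container for the PCs of recorder-internal frames,
// assumed to have been captured once, in advance.
struct InternalFrames {
  std::vector<address> pcs;

  bool contains(address pc) const {
    for (address p : pcs) {
      if (p == pc) return true;
    }
    return false;
  }
};

// Formats a recorded stack (innermost frame first), dropping the leading
// frames whose PC matches a stored internal PC. Only leading frames are
// dropped, so a genuine caller further out is never removed.
std::string format_stack(const std::vector<address>& stack,
                         const InternalFrames& internal, int indent) {
  std::string out;
  std::size_t i = 0;
  while (i < stack.size() && internal.contains(stack[i])) {
    ++i;
  }
  for (; i < stack.size(); ++i) {
    char line[128];
    // "%*s" with an empty string emits exactly 'indent' spaces (none for 0).
    std::snprintf(line, sizeof line, "%*s[0x%016llx]\n", indent, "",
                  static_cast<unsigned long long>(stack[i]));
    out += line;
  }
  return out;
}
```

Because the comparison is on raw addresses, the output would be the same whether or not symbols can be resolved, which is the property the preceding objection asks for. Whether the internal PCs really are stable is exactly the open question raised in the message.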
To find out those PCs in advance, one could instantiate a NativeCallStack
object, let it fill in the call stack, then store the PCs of the last n
frames up to the current frame.

Sorry if this does not work, I have not tried it out yet.

> That may seem unimportant; one could take the view that an address which
> is not associated with a symbolic name is just a meaningless hex value.
> However, even without names it is still possible for someone who
> understands the NMT code to correlate allocations which have the same
> pattern of caller addresses, including correlation of such patterns
> across builds or architectures. Throwing one or more NMT addresses into
> the stack in place of a genuine caller will change these call patterns
> in ways that might make it impossible to spot such correlations.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

Kind Regards, Thomas

From thomas.stuefe at gmail.com Thu Aug 31 11:53:51 2017
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 31 Aug 2017 13:53:51 +0200
Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup
In-Reply-To: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com>
Message-ID:

Hi Dmitry,

On Thu, Aug 31, 2017 at 9:49 AM, dmitry.samersov <dmitry.samersoff at bell-sw.com> wrote:

> Everybody,
>
> Please review:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/
>
> I would propose a different approach to fix JDK-8133740 in a
> platform-independent way: record all frames but strip unnecessary
> NMT-internal ones on printing.
>
> This approach is safe (we don't depend on compiler inlining and we never
> strip non-NMT frames) and platform independent, but costs us some extra
> memory.
>
> -Dmitry
>
>
>
This looks good, I like it. The code is easier to read now and less
vulnerable to compiler decisions.

Small nits:

should_skip_frame() - as this is an implementation detail of the print
function, could we not just move it into the print function, or at least
into nativeCallStack.cpp?

---

for (int index = 0; index < indent; index ++) out->print(" ");

Could this be done with

out->print("%*c", indent, ' ');

instead?

---

The changes in os::_get_previous_fp() for the four platforms affect
os::get_native_stack() and method handle tracing
(trace_method_handle_stub). The former may now, for the assert case, print
more frames. The latter should, in theory, not change behavior at all. But
this is from reading the code; it might be good to test this.

Kind Regards, Thomas

From goetz.lindenmaier at sap.com Thu Aug 31 12:29:26 2017
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 31 Aug 2017 12:29:26 +0000
Subject: RFR(M): 8186978: Introduce configure argument enable-cds
Message-ID: <7cbe5b02e52e4a2e99512306d79800f3@sap.com>

Hi,

Tests for class data sharing (cds) are enabled if @requires vm.cds is true.
The property vm.cds depends on the preprocessor macro ENABLE_CDS.
This cannot yet be switched by configure. It's only disabled automatically
for the minimal build.

This change introduces enable-cds with default true, which only takes effect
in the non-minimal build. If disabled, generate-classlist is disabled, too.

Please review this change. I need a sponsor, please.
http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.01/index.html

Best regards,
Goetz.
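
As a rough illustration of the mechanism described in the message above, the sketch below shows how a compile-time feature macro can decide whether CDS support is reported as present. It is a minimal sketch under assumptions, not HotSpot source: the macro is spelled INCLUDE_CDS here because that is the name the follow-up replies in this thread use, the "on unless the build says otherwise" default is an assumption of this sketch, and cds_compiled_in and vm_cds_property are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Assumption for this sketch: the feature defaults to "on" unless the build
// passes -DINCLUDE_CDS=0 on the compiler command line.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 1
#endif

// Hypothetical helper: reports whether CDS code was compiled in.
static bool cds_compiled_in() {
#if INCLUDE_CDS
  return true;
#else
  return false;
#endif
}

// Hypothetical stand-in for the value a test harness would expose as the
// vm.cds property and match against "@requires vm.cds".
static std::string vm_cds_property() {
  return cds_compiled_in() ? "true" : "false";
}
```

With such a setup, a configure switch only has to control whether the build passes the define to the compiler; tests guarded by the property are then skipped automatically in builds without the feature.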
From david.holmes at oracle.com Thu Aug 31 12:47:53 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Aug 2017 22:47:53 +1000 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> Message-ID: <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> Hi Goetz, On 31/08/2017 10:29 PM, Lindenmaier, Goetz wrote: > Hi, > > Tests for class data sharing (cds) are enabled if @requires vm.cds is true. > The property vm.cds depends on the preprocessor macro ENABLE_CDS. > This can not yet be switched by configure. It's only disabled automatically > for the minimal build. > > This change introduces enable-cds with default true, which only takes effect > in the non-minimal build. If disabled, generate-classlist is disabled, too. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.01/index.html I'll let the build guys comment in detail, but the structure for this doesn't quite look right to me. I don't understand why you have in spec.gmk.in: + ENABLE_CDS:=@ENABLE_CDS@ when in the hotspot build CDS is controlled via the feature setting: ifneq ($(call check-jvm-feature, cds), true) which you are already handling. ?? Thanks, David > Best regards, > Goetz. > From erik.joelsson at oracle.com Thu Aug 31 12:57:51 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Thu, 31 Aug 2017 14:57:51 +0200 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> Message-ID: This looks ok to me, but I would value Magnus' input as well. /Erik On 2017-08-31 14:29, Lindenmaier, Goetz wrote: > Hi, > > Tests for class data sharing (cds) are enabled if @requires vm.cds is true. > The property vm.cds depends on the preprocessor macro ENABLE_CDS. 
> This can not yet be switched by configure. It's only disabled automatically > for the minimal build. > > This change introduces enable-cds with default true, which only takes effect > in the non-minimal build. If disabled, generate-classlist is disabled, too. > > Please review this change. I please need a sponsor. > http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.01/index.html > > Best regards, > Goetz. From zgu at redhat.com Thu Aug 31 12:59:54 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 31 Aug 2017 08:59:54 -0400 Subject: RFR(M): JDK-8163011: AArch64: NMT detail stack trace cleanup In-Reply-To: References: <061d9eac-e588-cc9c-8f96-7ea5ecdc6568@bell-sw.com> Message-ID: <7bd18ec1-e46a-75fc-a760-d4277c19921f@redhat.com> Just keep in mind, that original NMT specification has memory overhead restriction (I can not recall the exact number). The memory for extra tracking stacks might be prohibitive. -Zhengyu On 08/31/2017 07:53 AM, Thomas St?fe wrote: > Hi Dmitry, > > On Thu, Aug 31, 2017 at 9:49 AM, dmitry.samersov < > dmitry.samersoff at bell-sw.com> wrote: > >> Everybody, >> >> Please review: >> >> http://cr.openjdk.java.net/~dsamersoff/JDK-8163011/webrev.05/ >> >> I would propose different approach to fix JDK-8133740 >> platform-independent way: record all frames but strip unnecessary >> NMT-internal ones on printing. >> >> This approach is safe (we don't depend to compiler inlining and we never >> strip non-NMT frames) and platform independent, but cost us some extra >> memory. >> >> -Dmitry >> >> >> > This looks good, I like it. Code is easier to read now and less vulnerable > to compiler decisions. > > Small nits: > > should_skip_frame() - as this is an implementation detail for the print > function, could we not just move it into the print function, or at least > into nativeCallStack.cpp ? 
> > --- > > for (int index = 0; index < indent; index ++) out->print(" "); > > Could this be done with > > out->print("%*c", indent, ' '); > > instead? > > --- > > The changes in os::_get_previous_fp() for the four platforms affect > os::get_native_stack() and method handle tracing > (trace_method_handle_stub). The former may now, for the assert case, print > more frames. The latter should, in theory, not change behavior at all. But > this is from reading the code, it might be good to test this. > > Kind Regards, Thomas > From magnus.ihse.bursie at oracle.com Thu Aug 31 13:35:26 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 31 Aug 2017 15:35:26 +0200 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> Message-ID: <0cf2865e-bfc0-826f-8c6f-350a70b87ba7@oracle.com> On 2017-08-31 14:47, David Holmes wrote: > Hi Goetz, > > On 31/08/2017 10:29 PM, Lindenmaier, Goetz wrote: >> Hi, >> >> Tests for class data sharing (cds) are enabled if @requires vm.cds is >> true. >> The property vm.cds depends on the preprocessor macro ENABLE_CDS. ... but you mean INCLUDE_CDS. :-) >> This can not yet be switched by configure. It's only disabled >> automatically >> for the minimal build. >> >> This change introduces enable-cds with default true, which only takes >> effect >> in the non-minimal build. If disabled, generate-classlist is >> disabled, too. >> >> Please review this change. I please need a sponsor. >> http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.01/index.html >> > > I'll let the build guys comment in detail, but the structure for this > doesn't quite look right to me. 
I don't understand why you have in > spec.gmk.in: > > + ENABLE_CDS:=@ENABLE_CDS@ > > when in the hotspot build CDS is controlled via the feature setting: > > ifneq ($(call check-jvm-feature, cds), true) > > which you are already handling. ?? Agree, the ENABLE_CDS variable is only used internally in the configure script and need not/should not be exported in spec.gmk.in. As David says, the test ($(call check-jvm-feature, cds), true) is enough to determine if to send the -DINCLUDE_CDS to the compiler. Just remove the changes to spec.gmk.in, and I'm ok with the patch. /Magnus > > Thanks, > David > > >> Best regards, >> Goetz. >> From goetz.lindenmaier at sap.com Thu Aug 31 14:49:48 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 31 Aug 2017 14:49:48 +0000 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: <0cf2865e-bfc0-826f-8c6f-350a70b87ba7@oracle.com> References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> <0cf2865e-bfc0-826f-8c6f-350a70b87ba7@oracle.com> Message-ID: <0de4b0ee7c804280a29b76a6000f95e7@sap.com> Hi, thanks for reviewing everybody! Yes, works fine without that assignment. New webrev: http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.02/ Could someone please sponsor? I think autogen.sh needs to be run before submitting. Best regards, Goetz. > -----Original Message----- > From: Magnus Ihse Bursie [mailto:magnus.ihse.bursie at oracle.com] > Sent: Thursday, August 31, 2017 3:35 PM > To: David Holmes ; Lindenmaier, Goetz > ; hotspot-runtime-dev at openjdk.java.net; > build-dev (build-dev at openjdk.java.net) > Subject: Re: RFR(M): 8186978: Introduce configure argument enable-cds > > > > On 2017-08-31 14:47, David Holmes wrote: > > Hi Goetz, > > > > On 31/08/2017 10:29 PM, Lindenmaier, Goetz wrote: > >> Hi, > >> > >> Tests for class data sharing (cds) are enabled if @requires vm.cds is > >> true. 
> >> The property vm.cds depends on the preprocessor macro ENABLE_CDS. > ... but you mean INCLUDE_CDS. :-) > > >> This can not yet be switched by configure. It's only disabled > >> automatically > >> for the minimal build. > >> > >> This change introduces enable-cds with default true, which only takes > >> effect > >> in the non-minimal build. If disabled, generate-classlist is > >> disabled, too. > >> > >> Please review this change. I please need a sponsor. > >> http://cr.openjdk.java.net/~goetz/wr17/8186978- > disableCDS/webrev.01/index.html > >> > > > > I'll let the build guys comment in detail, but the structure for this > > doesn't quite look right to me. I don't understand why you have in > > spec.gmk.in: > > > > + ENABLE_CDS:=@ENABLE_CDS@ > > > > when in the hotspot build CDS is controlled via the feature setting: > > > > ifneq ($(call check-jvm-feature, cds), true) > > > > which you are already handling. ?? > > Agree, the ENABLE_CDS variable is only used internally in the configure > script and need not/should not be exported in spec.gmk.in. As David > says, the test ($(call check-jvm-feature, cds), true) is enough to > determine if to send the -DINCLUDE_CDS to the compiler. > > Just remove the changes to spec.gmk.in, and I'm ok with the patch. > > /Magnus > > > > > > Thanks, > > David > > > > > >> Best regards, > >> Goetz. > >> From calvin.cheung at oracle.com Thu Aug 31 15:22:20 2017 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Thu, 31 Aug 2017 08:22:20 -0700 Subject: RFR(L): 8186842: Use Java class loaders for creating the CDS archive In-Reply-To: <4AA0D920-4A44-4641-9261-511D4A94054B@oracle.com> References: <59A45413.70800@oracle.com> <6B35DD35-57B6-4179-826D-2D910BFD2D51@oracle.com> <59A6EC65.2010301@oracle.com> <4AA0D920-4A44-4641-9261-511D4A94054B@oracle.com> Message-ID: <59A829AC.8090103@oracle.com> Hi Jiangli, Thanks for another round of review. Nice catch on the #if statement. 
I've made those changes and added a missing ResourceMark in the same file. I've an incremental webrev if you want to take a look. http://cr.openjdk.java.net/~ccheung/8186842/webrev.01_plus/ (ignore the cpCache.cpp change, I dont' know why the closing brace shows up this time) thanks, Calvin On 8/30/17, 4:57 PM, Jiangli Zhou wrote: > Hi Calvin, > > Thank you for the additional changes and testing. Following are two > minor issues from the latest webrev. No need for new webrev after you > fix them. > > Could you please change the following comment in klassFactory.cpp to > be something more meaningful. How about ?shared classes loaded by user > defined class loader do not have shared_classpath_index?? > 75 // AppCDSv2 class. > In the same file, please remove the ?&& INCLUDE_JVMTI? from line 230. > The call to record_shared_class_loader_type() should always be done > when CDS is enabled. > 230 #if INCLUDE_CDS&& INCLUDE_JVMTI > 231 if (DumpSharedSpaces) { > 232 ClassLoader::record_shared_class_loader_type(result, stream); > 233 #if INCLUDE_JVMTI > Thanks, > Jiangli > >> On Aug 30, 2017, at 9:48 AM, Calvin Cheung > > wrote: >> >> Hi Jiangli, >> >> Thanks for your review. >> >> On 8/29/17, 12:55 PM, Jiangli Zhou wrote: >>> Hi Calvin, >>> >>> These changes look good. I have a few remaining comments below. >>> >>> - src/share/vm/classfile/classLoader.cpp >>> The ?_num_boot_entries' variable probably is unnecessary. It?s >>> initialized to _num_entries and never modified. In places where >>> _num_boot_entries is used, can you use _num_entries directly? >> No. Because at the time of the initialization of _num_boot_entries, >> the _num_entries only has the number of entries in the boot class >> path including the runtime image. Later on, the _num_entries can >> change. Take a look at ClassLoader::add_to_list(ClassPathEntry >> *new_entry) and its callers. >>> >>> I think we should remove the special boot classpath handling code >>> for dump time. 
With the use of java class loaders at CDS/AppCDS dump >>> time, we no longer need to append the -cp path to the boot >>> classpath. With that, we can remove the CDS special cases from >>> ClassLoader::load_class, ClassPathImageEntry::open_stream, etc. That >>> would make the generic class loading code much cleaner. Since you >>> are planning to integrate this change soon, making the suggested >>> change can be risky. I?ll file a new RFE, we can do the clean up >>> separately. >> That's a good idea. >>> 150 int ClassLoader::_num_boot_entries = -1; >>> 1519 if (DumpSharedSpaces&& classpath_index>= _num_boot_entries) { >>> 1520 // Do not load any class from the app classpath using the boot loader. Let >>> 1521 // the built-in app class laoder load them. >>> 1522 break; >>> 1523 } >>> 1635 if (classpath_index< _num_boot_entries) { >>> 1636 // ik is either: >>> 1637 // 1) a boot class loaded from the runtime image during vm initialization (classpath_index = 0); or >>> 1638 // 2) a user's class from -Xbootclasspath/a (classpath_index> 0) >>> 1639 // In the second case, the classpath_index, classloader_type will be recorded via >>> 1640 // context.record_result() in ClassLoader::load_class(Symbol* name, bool search_append_only, TRAPS). >>> 1641 if (classpath_index> 0) { >>> 1642 return; >>> 1643 } >>> 1644 } >>> >>> I?m wondering why the following is never needed before in >>> get_package_entry() for non-CDS case. Do you have additional details? >>> 253 // PackageEntryTable could be NULL for classes like java/lang/invoke/LambdaForm$MH >>> 254 if (pkgEntryTable == NULL) { >>> 255 return NULL; >>> 256 } >> It turns out it is no longer needed so I've removed it. >>> If I understand it correctly, the following is to find if the class >>> is from the runtime image. Can you please change the following to >>> check with module->location()->starts_with(?jrt:?)? 
>>> 1610 if ((strcmp(_jrt_entry->name(), src) == 0) || >>> 1611 (module != NULL&& (module->name() != NULL)&& >>> 1612 (strcmp(module->name()->as_C_string(), src) == 0))) { >>> 1613 e = _jrt_entry; >>> 1614 classpath_index = 0; >> I've made the change. >>> I was looking for code that handles anonymous classes. There are >>> following code in classLoader.cpp and metaspaceShared.cpp. However, >>> I can?t find any code that specifically removes anonymous classes >>> from the system dictionary at CDS dump time. I?m probably missing >>> something, how do we guarantee (besides the assert) anonymous >>> classes are not being archived? >>> 1582 void ClassLoader::record_shared_class_loader_type(InstanceKlass* ik, const ClassFileStream* stream) { >>> 1583 assert(DumpSharedSpaces, "sanity"); >>> 1584 assert(stream != NULL, "sanity"); >>> 1585 >>> 1586 if (ik->is_anonymous()) { >>> 1587 // We do not archive anonymous classes. >>> 1588 return; >>> 1589 } >>> 487 NOT_PRODUCT( >>> 488 static void assert_not_anonymous_class(InstanceKlass* k) { >>> 489 if (k->is_instance_klass()) { >>> 490 assert(!(k->is_anonymous()), "cannot archive anonymous classes"); >>> 491 } >>> 492 } >>> 494 static void assert_no_anonymoys_classes_in_dictionaries() { >>> 495 ClassLoaderDataGraph::dictionary_classes_do(assert_not_anonymous_class); >>> 496 }) >>> >> Thanks Ioi for answering this one. >>> - src/share/vm/classfile/klassFactory.cpp >>> The following code can be simplified to get the module from ?ik?. >>> 74 if (path_index< 0) { >>> 75 // AppCDSv2 class. 
>>> 76 // Get the pkg_entry from the classloader >>> 77 PackageEntry* pkg_entry = NULL; >>> 78 TempNewSymbol pkg_name = InstanceKlass::package_from_name(class_name, CHECK_NULL); >>> 79 if (pkg_name != NULL) { >>> 80 const char* pkg_string = pkg_name->as_C_string(); >>> 81 ClassLoaderData* loader_data = ClassLoaderData::class_loader_data(class_loader()); >>> 82 if (loader_data != NULL) { >>> 83 pkg_entry = loader_data->packages()->lookup_only(pkg_name); >>> 84 } >>> 85 } >>> 86 if (pkg_entry != NULL) { >>> 87 ModuleEntry* mod_entry = pkg_entry->module(); >> Yes, I'm using ik->module() in my new webrev. >>> Could you please remove the following from >>> KlassFactory::create_from_stream() and call it from >>> ClassLoaderExt::record_result()? >>> 229 #if INCLUDE_CDS >>> 230 if (DumpSharedSpaces) { >>> 231 ClassLoader::record_shared_class_loader_type(result, stream); >>> 232 } >>> 233 #endif >> Based on our off-list discussion, we've decided to keep the code but >> moving it further down in the same function together with another >> existing "#if INCLUDE_CDS" block. >> >> updated webrevs: >> >> incremental: http://cr.openjdk.java.net/~ccheung/8186842/webrev_00_01/ >> complete: http://cr.openjdk.java.net/~ccheung/8186842/webrev.01/ >> >> thanks, >> Calvin >> >>> Thanks, >>> >>> Jiangli >>> >>>> On Aug 28, 2017, at 10:34 AM, Calvin Cheung >>>> > wrote: >>>> >>>> Hi, >>>> >>>> This is a re-post of a previous RFR for 8172218 using the correct >>>> bug id. >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8186842 >>>> >>>> webrev: http://cr.openjdk.java.net/~ccheung/8186842/webrev.00/ >>>> >>>> >>>> Please refer to the comment >>>> >>> > >>>> section of the bug for description of the change. 
>>>> >>>> Tests executed so far: >>>> JPRT >>>> hs-tier2 though hs-tier4 >>>> hs-tier5 (linux-x64) >>>> >>>> thanks, >>>> Calvin >>> > From thomas.stuefe at gmail.com Thu Aug 31 19:06:26 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 31 Aug 2017 21:06:26 +0200 Subject: 8186982: [aix] Garbage output for CPU info in hs-err file In-Reply-To: References: Message-ID: (Added hotspot-runtime) Hi all @hotspot-runtime, may I please have additional reviews for this AIX only patch. I already pushed this change - it was AIX only and I had two reviewers, so I thought I was fine. But I was not aware that a rule exists requiring me to run changes like these through one of the generic mailing lists in addition to the AIX-only mailing lists. Sorry for that. I usually try to avoid crossposting to keep the noise down. So, this change. It removes the AIX os::pd_print_cpu_info() implementation, which is not only broken but also superfluous since the generic one (os::print_cpu_info()) does just fine and there is nothing exciting to add at os level. Thank you! Thomas On Thu, Aug 31, 2017 at 2:20 PM, Thomas St?fe wrote: > Hi, > > may I please have reviews for this tiny trivial change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8186982 > Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/ > 8186982-aix-cpu-info-garbage/webrev.00/webrev/ > > Thank you! > > ..Thomas > From goetz.lindenmaier at sap.com Thu Aug 31 19:36:51 2017 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 31 Aug 2017 19:36:51 +0000 Subject: 8186982: [aix] Garbage output for CPU info in hs-err file In-Reply-To: References: Message-ID: Hi Thomas, looks good, thanks. Best regards, Goetz. 
From: Thomas St?fe [mailto:thomas.stuefe at gmail.com] Sent: Thursday, August 31, 2017 9:06 PM To: ppc-aix-port-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Cc: Lindenmaier, Goetz ; Volker Simonis Subject: Re: 8186982: [aix] Garbage output for CPU info in hs-err file (Added hotspot-runtime) Hi all @hotspot-runtime, may I please have additional reviews for this AIX only patch. I already pushed this change - it was AIX only and I had two reviewers, so I thought I was fine. But I was not aware that a rule exists requiring me to run changes like these through one of the generic mailing lists in addition to the AIX-only mailing lists. Sorry for that. I usually try to avoid crossposting to keep the noise down. So, this change. It removes the AIX os::pd_print_cpu_info() implementation, which is not only broken but also superfluous since the generic one (os::print_cpu_info()) does just fine and there is nothing exciting to add at os level. Thank you! Thomas On Thu, Aug 31, 2017 at 2:20 PM, Thomas St?fe > wrote: Hi, may I please have reviews for this tiny trivial change: Bug: https://bugs.openjdk.java.net/browse/JDK-8186982 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8186982-aix-cpu-info-garbage/webrev.00/webrev/ Thank you! ..Thomas From david.holmes at oracle.com Thu Aug 31 21:15:04 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 1 Sep 2017 07:15:04 +1000 Subject: RFR(M): 8186978: Introduce configure argument enable-cds In-Reply-To: <0de4b0ee7c804280a29b76a6000f95e7@sap.com> References: <7cbe5b02e52e4a2e99512306d79800f3@sap.com> <08f05d7b-14fd-a51a-41e3-2c6d09201cd5@oracle.com> <0cf2865e-bfc0-826f-8c6f-350a70b87ba7@oracle.com> <0de4b0ee7c804280a29b76a6000f95e7@sap.com> Message-ID: Hi Goetz, I will sponsor this. Thanks, David On 1/09/2017 12:49 AM, Lindenmaier, Goetz wrote: > Hi, > > thanks for reviewing everybody! > Yes, works fine without that assignment. 
New webrev: > http://cr.openjdk.java.net/~goetz/wr17/8186978-disableCDS/webrev.02/ > > Could someone please sponsor? I think autogen.sh needs to be run > before submitting. > > Best regards, > Goetz. > >> -----Original Message----- >> From: Magnus Ihse Bursie [mailto:magnus.ihse.bursie at oracle.com] >> Sent: Thursday, August 31, 2017 3:35 PM >> To: David Holmes ; Lindenmaier, Goetz >> ; hotspot-runtime-dev at openjdk.java.net; >> build-dev (build-dev at openjdk.java.net) >> Subject: Re: RFR(M): 8186978: Introduce configure argument enable-cds >> >> >> >> On 2017-08-31 14:47, David Holmes wrote: >>> Hi Goetz, >>> >>> On 31/08/2017 10:29 PM, Lindenmaier, Goetz wrote: >>>> Hi, >>>> >>>> Tests for class data sharing (cds) are enabled if @requires vm.cds is >>>> true. >>>> The property vm.cds depends on the preprocessor macro ENABLE_CDS. >> ... but you mean INCLUDE_CDS. :-) >> >>>> This can not yet be switched by configure. It's only disabled >>>> automatically >>>> for the minimal build. >>>> >>>> This change introduces enable-cds with default true, which only takes >>>> effect >>>> in the non-minimal build. If disabled, generate-classlist is >>>> disabled, too. >>>> >>>> Please review this change. I please need a sponsor. >>>> http://cr.openjdk.java.net/~goetz/wr17/8186978- >> disableCDS/webrev.01/index.html >>>> >>> >>> I'll let the build guys comment in detail, but the structure for this >>> doesn't quite look right to me. I don't understand why you have in >>> spec.gmk.in: >>> >>> + ENABLE_CDS:=@ENABLE_CDS@ >>> >>> when in the hotspot build CDS is controlled via the feature setting: >>> >>> ifneq ($(call check-jvm-feature, cds), true) >>> >>> which you are already handling. ?? >> >> Agree, the ENABLE_CDS variable is only used internally in the configure >> script and need not/should not be exported in spec.gmk.in. As David >> says, the test ($(call check-jvm-feature, cds), true) is enough to >> determine if to send the -DINCLUDE_CDS to the compiler. 
>>
>> Just remove the changes to spec.gmk.in, and I'm ok with the patch.
>>
>> /Magnus
>>
>>
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>> Best regards,
>>>> Goetz.
>>>>
>

From adityailkal at gmail.com Fri Aug 11 10:56:44 2017
From: adityailkal at gmail.com (Aditya Ilkal)
Date: Fri, 11 Aug 2017 10:56:44 -0000
Subject: Fwd: openjdk runtime consumes almost 100% cpu on Thread.sleep call
In-Reply-To: References: Message-ID:

Hi,

We are running openjdk with the version below on Android OS (Linux localhost 4.9.31-android-x86):

*openjdk version "9-internal"*
*OpenJDK Runtime Environment (build 9-internal+0-adhoc.aditi.dev)*
*OpenJDK Client VM (build 9-internal+0-adhoc.aditi.dev, mixed mode)*

The following is the program:

    public static void main(String[] args) throws Throwable {
        while (true) {
            System.out.println("sd123");
            Thread.sleep(30000);
            System.out.println("sd12");
            Thread.sleep(10000);
        }
    }

The process details on the OS are as below.

*Main Process*

    PID   PPID  USER STAT VSZ  %VSZ CPU %CPU COMMAND
    7206  28501 root S    869m 24.6 0   87.8 /etc/jdk8/images/jre/bin/java

*Threads of above process*

    PID   PPID  USER STAT VSZ  %VSZ CPU %CPU COMMAND
    7208  28501 root R    869m 24.6 0   13.7 {VM Thread}
    7207  28501 root R    869m 24.6 1   13.3 /etc/jdk8/images/jre/bin/java
    7212  28501 root R    869m 24.6 0   12.8 {C1 CompilerThre}
    7222  28501 root R    869m 24.6 1   12.6 {VM Periodic Task thread}
    7216  28501 root R    869m 24.6 1   12.3 {Sweeper thread}
    7217  28501 root R    869m 24.6 1   11.7 {Common-Cleaner}

The same application runs very well on Ubuntu Linux, where the state of the above threads is 'S', whereas on Android OS it is shown as 'R' and the threads consume CPU. Also, the output "sd12", which should print after waiting 30 s, takes longer than that to appear. This suggests the sleep interval is not being calculated correctly.

This is easily reproducible, so I request that you look into this issue and provide possible insights.
Thanks, Aditya Ilkal From adam.farley at uk.ibm.com Thu Aug 24 13:33:27 2017 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Thu, 24 Aug 2017 13:33:27 -0000 Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java In-Reply-To: <15ed8720-4d13-ae95-dfbe-dd0e3d5acfd6@oracle.com> References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> <15ed8720-4d13-ae95-dfbe-dd0e3d5acfd6@oracle.com> Message-ID: Hi Alan, David, and Tom, First, thanks again for your efforts on this. As a new guy to OpenJDK contributions, it means a lot to see so much progress on this so quickly. :) >On 24/08/2017 07:33, David Holmes wrote: >> Hi Adam, >> >> cc'ing hotspot runtime dev as runtime own JNI and the invocation API - >> and some of the problematic code resides in the VM. >Yeah, the hotspot mailing list would be a better place to discuss this >as there are several issues here and several places where HotSpot aborts >the process when initialization fails. It's a long standing issue (going >back 15+ years) that I think is partly because it's not easy to release >all resources and cleanup before CreateJavaVM returns with an error. > According to the JNI spec, it is not possible (yet) to create a second VM in the same thread as the first. There is also a bug (dup'd against another bug I don't have the access for) which states that even a successful VM creation+destruction won't permit a second VM to be created. https://bugs.openjdk.java.net/browse/JDK-4712793 Both of these seem to imply that making a new VM after a failed VM-creation (in the same thread) is unsupported behaviour. So is it important to release all resources and cleanup, given that we won't be trying to create a new VM in this thread? By "important" I mean "more important than exiting with a return code and allowing the user's code to finish". >> >> This specific case seems like a bug to me as the logic is assuming it >> is only ever called by a launcher which it is okay to terminate. 
>> Though to be honest the very existence of the "help" option seems to >> me somewhat misguided in a hosted-VM environment. That said, I see >> unified logging in 9 also added a terminating "help" option . >The agent "help" option case is tricky and would likely need an update >to the JVM TI spec and the Agent_OnLoad return value. > To clarify, the agent "help" option is only an example of this problem. There are 19 locations both within and without hotspot that call exit(0) directly, plus more places where exit is passed a variable that can be 0 (e.g. the aforementioned agent "help", which calls the forceExit function with an argument of 0, which calls exit(arg) in turn). I understand that your comment was intended as an effort to effect a fix for this specific instance of the problem. I wanted to make sure we kept sight of the wider problem, as ideally we'd come up with an ideal solution that could be applied to all cases. My thought on this was a unique return code that tells the user's code that the VM is not in a usable state, but that no error has occurred. This should be a negative code (so the usual x<0 check will prevent the user's code from using the VM), but it shouldn't be one of the existing JNI codes; all of which seem to indicate either: a) The VM is fine and can be used (0). or b) The VM is not fine because an error occurred (-1 to -6). Ideally we need a c) The VM is not fine, but no error has occurred. Or is there another solution to the exit(0) problem? Other than putting a copy of the rest of your code on the exit hook, I mean. > >> >> Options processed by the VM will be recognized, while options >> processed by the Java launcher will not be. "-version", "-X", "-help" >> and numerous others are launcher options. Pure VM options are -XX >> options, but the VM also processes some -X flags and, as a result of >> jigsaw, now also processes a bunch of module-related flags that are >> simple --foo options. 
>Right because these options need to be passed to CreateJavaVM as they are >used when initializing the VM. Using system properties would just repeat >the issues of the past (e.g. java.class.path) and require documenting a slew >of system properties (which is complicated at repeating options). > >-Alan

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

From adam.farley at uk.ibm.com Wed Aug 30 17:00:03 2017
From: adam.farley at uk.ibm.com (Adam Farley8)
Date: Wed, 30 Aug 2017 17:00:03 -0000
Subject: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java
In-Reply-To: <3528158a-fda7-fda9-7126-d51fba0d3f28@oracle.com>
References: <29006bbc-562a-f181-e05d-153d6a13dace@oracle.com> <15ed8720-4d13-ae95-dfbe-dd0e3d5acfd6@oracle.com> <3528158a-fda7-fda9-7126-d51fba0d3f28@oracle.com>
Message-ID:

Hi All,

I've included the full text of my reply in-line below. A summary is:

I continue to support the idea of a new return code on the basis that, when the VM is unusable but no error has occurred, we have no suitable option. Right now we can:

- Report an error that has not occurred.
- Die and take the user's code with us (except for any exit hook code).
- Return a JNI_OK, and allow the user's next action to fail.

I think that if VM developers concur that the correct action is to leave the VM in an unusable state, but not to throw an error, then this RC gives us a better option than exit(0).

- Adam

P.S. Apologies for the delay. I was on vacation. :)

> Hi Alan, David, and Tom, >> >> First, thanks again for your efforts on this. As a new guy to OpenJDK >> contributions, it means a lot to see so much progress on this so >> quickly. :) > >All I see is discussion :) Progress would be something else entirely. True.
:) > >> >> >On 24/08/2017 07:33, David Holmes wrote: >> >> Hi Adam, >> >> >> >> cc'ing hotspot runtime dev as runtime own JNI and the invocation API - >> >> and some of the problematic code resides in the VM. >> >Yeah, the hotspot mailing list would be a better place to discuss this >> >as there are several issues here and several places where HotSpot aborts >> >the process when initialization fails. It's a long standing issue (going >> >back 15+ years) that I think is partly because it's not easy to release >> >all resources and cleanup before CreateJavaVM returns with an error. >> > >> >> According to the JNI spec, it is not possible (yet) to create a second VM >> in the same thread as the first. >> >> There is also a bug (dup'd against another bug I don't have the access for) >> which states that even a successful VM creation+destruction won't permit >> a second VM to be created. >> >> https://bugs.openjdk.java.net/browse/JDK-4712793 >> >> Both of these seem to imply that making a new VM after a failed VM-creation >> (in the same thread) is unsupported behaviour. >> >> So is it important to release all resources and cleanup, given that we >> won't >> be trying to create a new VM in this thread? By "important" I mean "more >> important than exiting with a return code and allowing the user's code >> to finish". > >Okay, so if there is no intention of attempting to reload the jvm again, >I'm unclear what the purpose of the hosting process actually is. To me >it is either a custom launcher - in which case the exit calls are >"harmless" (and atexit handlers could be used if the process has its own >clean up) - or it's something multi-purpose part of which is to launch a >VM. In the latter case given the inability to reload a VM, and assuming >the process does not want its Java launching powers to be removed, then >the only real option is to filter out the problematic arguments and >either ignore them or exec a separate process to handle them.
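
[Editorial sketch] The "filter out the problematic arguments" idea above could look roughly like the following on the embedding side. This is only a sketch under assumptions: the helper names (`looks_terminating`, `split_options`, `SplitOptions`) are hypothetical, and the list of "report and terminate" options is limited to the two examples named in this thread (an agent "help" sub-option and the unified logging "help" option), so it is illustrative rather than exhaustive.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true if `s` begins with `prefix`.
static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: true if `s` ends with `suffix`.
static bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Assumption: only the two option shapes mentioned in the thread are
// treated as "report and terminate". A real list would be longer.
bool looks_terminating(const std::string& opt) {
    if (opt == "-Xlog:help") return true;
    const bool is_agent = starts_with(opt, "-agentlib:") ||
                          starts_with(opt, "-agentpath:");
    return is_agent && ends_with(opt, "=help");
}

// Result of splitting the requested options (hypothetical type).
struct SplitOptions {
    std::vector<std::string> pass_to_vm;  // safe to hand to VM creation
    std::vector<std::string> held_back;   // ignore, or run in a child process
};

// Separate options that may terminate the host from the rest.
SplitOptions split_options(const std::vector<std::string>& requested) {
    SplitOptions out;
    for (const std::string& opt : requested) {
        (looks_terminating(opt) ? out.held_back : out.pass_to_vm).push_back(opt);
    }
    return out;
}

// Small usage example.
void self_check() {
    const SplitOptions s =
        split_options({"-Xmx64m", "-agentlib:jdwp=help", "-Xlog:help"});
    assert(s.pass_to_vm.size() == 1 && s.pass_to_vm[0] == "-Xmx64m");
    assert(s.held_back.size() == 2);
}
```

The host would pass only `pass_to_vm` on to VM creation and decide separately whether to drop the held-back options or hand them to a separate process.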
My assumption is that the user's code may be doing many things, of which the Java work is only a part. I'm trying not to be too specific here, as I don't know what the user is trying to do, nor what they want their code to do if Java returns an error. I think we should tell the user what has happened, and allow them to act on the information. Right now the VM developers don't have that option. They don't have a mechanism to tell the user that the VM is not in a usable state, but had found no errors. Therefore the VM *must* call exit(0) to indicate "pass", but also to prevent the user trying to do anything with the unusable VM. I would give them that option. If they can return an RC, they should have one available that fits this scenario. By providing this negative return code both within and without the VM, we can give future VM-upgrade projects the option to indicate an unusable VM with no error, removing the need for them to call exit(0) when the VM is unusable despite no error occurring. Also, in regards to the example option: I agree that this option should really be filtered out before we get to the exit(0)-slash-JNI_SILENT_EXIT RC. Perhaps we could abstract the "is this a help option" logic into a shared function, and tie that into the unrecognised options logic? > >> >> >> >> This specific case seems like a bug to me as the logic is assuming it >> >> is only ever called by a launcher which it is okay to terminate. >> >> Though to be honest the very existence of the "help" option seems to >> >> me somewhat misguided in a hosted-VM environment. That said, I see >> >> unified logging in 9 also added a terminating "help" option . >> >The agent "help" option case is tricky and would likely need an update >> >to the JVM TI spec and the Agent_OnLoad return value. >> > >> >> To clarify, the agent "help" option is only an example of this problem. 
>> There are 19 locations both within and without hotspot that call exit(0) >> directly, plus more places where exit is passed a variable that can be >> 0 (e.g. the aforementioned agent "help", which calls the forceExit function >> with an argument of 0, which calls exit(arg) in turn). >> >> I understand that your comment was intended as an effort to effect a fix >> for this specific instance of the problem. I wanted to make sure we kept >> sight of the wider problem, as ideally we'd come up with an ideal solution >> that could be applied to all cases. > >The fact there are numerous potential process termination points in the >VM and JDK native code, is something we just have to live with. I'm only >considering these kind of "report and terminate" flags to be the problem >cases that should be handled better. A fair statement. I posit that simply having this option available could prevent the need for further exit(0) calls in the future. Though I'm certainly not ruling out an enterprising VM developer fixing these issues in the future. I'm simply agreeing that resolving all of these issues is outside of this proposal's scope. > >> My thought on this was a unique return code that tells the user's code >> that the VM is not in a usable state, but that no error has occurred. This >> should be a negative code (so the usual x<0 check will prevent the user's >> code from using the VM), but it shouldn't be one of the existing JNI codes; >> all of which seem to indicate either: >> >> a) The VM is fine and can be used (0). >> or >> b) The VM is not fine because an error occurred (-1 to -6). >> >> Ideally we need a c) The VM is not fine, but no error has occurred. > >It's somewhat debatable how to classify the case where you ask the VM to >load and then perform a one-off action that effectively succeeds but >leaves the VM unusable.
Again ideally, to me, the VM would never do that >- such actions would occur as part of VM initialization, the VM would be >usable, but the launcher would do the termination because that is how >the flag is specified. But that is non-trivial to untangle. > >David Agreed. > >> Or is there another solution to the exit(0) problem? Other than putting >> a copy of the rest of your code on the exit hook, I mean. >> >> > >> >> >> >> Options processed by the VM will be recognized, while options >> >> processed by the Java launcher will not be. "-version", "-X", "-help" >> >> and numerous others are launcher options. Pure VM options are -XX >> >> options, but the VM also processes some -X flags and, as a result of >> >> jigsaw, now also processes a bunch of module-related flags that are >> >> simple --foo options. >> >Right because these options need to passed to CreateJavaVM as they are >> >used when initializing the VM. Using system properties would just repeat >> >the issues of past (e.g. java.class.path) and require documenting a slew >> >of system properties (which is complicated at repeating options). >> > >> >-Alan
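
[Editorial sketch] To make the proposed third outcome concrete, here is a minimal model of how a host application might classify the result of VM creation. It does not include jni.h or call JNI_CreateJavaVM; the constants simply mirror the existing JNI return codes (0 and -1 to -6). The name JNI_SILENT_EXIT comes from this thread, but no such code exists in the JNI specification, and the value -7 used here is purely a placeholder assumption.

```cpp
#include <cassert>

// Values mirroring the existing JNI return codes (0 and -1 .. -6).
constexpr int kJniOk        = 0;   // VM is usable
constexpr int kJniErr       = -1;  // unknown error
constexpr int kJniEDetached = -2;  // thread detached from the VM
constexpr int kJniEVersion  = -3;  // JNI version error
constexpr int kJniENoMem    = -4;  // not enough memory
constexpr int kJniEExist    = -5;  // VM already created
constexpr int kJniEInval    = -6;  // invalid arguments

// HYPOTHETICAL: the proposed "unusable, but no error occurred" code.
// Both the constant and its value are assumptions for illustration.
constexpr int kJniSilentExit = -7;

enum class VmOutcome { Usable, SilentExit, Failed };

// Map a creation return code onto the three cases a), b), c).
VmOutcome classify(int rc) {
    if (rc == kJniOk) return VmOutcome::Usable;
    if (rc == kJniSilentExit) return VmOutcome::SilentExit;
    return VmOutcome::Failed;
}

// Existing hosts typically only test rc < 0; a negative new code keeps
// them from touching the unusable VM without any source change.
bool legacy_check_allows_use(int rc) { return rc >= 0; }

// Exit status a host might choose once its own remaining work is done:
// only a genuine failure is reported as non-zero.
int host_exit_status(int rc) {
    return classify(rc) == VmOutcome::Failed ? 1 : 0;
}

// Small usage example.
void outcome_self_check() {
    assert(classify(kJniSilentExit) == VmOutcome::SilentExit);
    assert(!legacy_check_allows_use(kJniSilentExit));
    assert(host_exit_status(kJniSilentExit) == 0);
    assert(host_exit_status(kJniENoMem) == 1);
}
```

The point of the sketch is the design choice discussed above: because the new code is negative, hosts that only check `rc < 0` still avoid the unusable VM, while hosts that know about it can tell "nothing went wrong" apart from a real failure and carry on with their own work instead of being terminated by exit(0).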