From igor.ignatyev at oracle.com Tue May 2 07:54:53 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 2 May 2017 00:54:53 -0700 Subject: RFR(XXS) : 8179516 : add Utils.COMPILE_JDK constant Message-ID: <6C871C87-5160-4C3D-A735-9EACD5667427@oracle.com> http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html > 5 lines changed: 5 ins; 0 del; 0 mod Hi all, could you please review this tiny fix which adds the jdk.test.lib.Utils.COMPILE_JDK constant, which points to the JDK specified by the '-compilejdk' jtreg flag or, if that is not set, the JDK under test? webrev: http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8179516 Thanks, -- Igor From serguei.spitsyn at oracle.com Tue May 2 09:02:58 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 2 May 2017 02:02:58 -0700 Subject: RFR(M): 8172970: TESTBUG: need test coverage for the JVMTI functions allowed in the start phase In-Reply-To: References: <88938aee-0a83-99e9-1b95-6875173807a5@oracle.com> <5ea66d4b-f1bb-1729-ba51-5bdcd5b3c470@oracle.com> <1e443485-63d6-4473-8809-76f47a380436@oracle.com> Message-ID: <96edf450-1d0d-c21d-436d-bdbd378b4137@oracle.com> PING: I've got a thumbs up from David Holmes. One more review is needed for this JDK 10 test enhancement. Thanks! Serguei On 4/28/17 17:13, serguei.spitsyn at oracle.com wrote: > Hi David, > > > On 4/28/17 10:34, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> >> On 4/28/17 04:42, David Holmes wrote: >>> Hi Serguei, >>> >>> On 28/04/2017 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>> The updated webrev is: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.2/ >>>> >>> >>> Thanks for the updates (the issue with long is that on win64 it is >>> only 32-bit while void* is 64-bit). >> >> Ok, thanks. >> Then you are right; using long on win64 is not portable. 
>> >>> >>> I prefer to see fast-fail rather than potentially triggering >>> cascading failures (check_jvmti_error could even call exit() I >>> think). But let's see what others think - it's only a preference not >>> a necessity. >> >> Ok, I'll consider calling exit(), as that would keep it simple. >> > > New webrev version is: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ > > > > Thanks, > Serguei > > >> Thanks, >> Serguei >> >>> >>> Thanks, >>> David >>> >>>> >>>> I've re-arranged the code a little bit in the ClassPrepare callback and >>>> the >>>> function test_class_functions(). >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 4/28/17 00:47, serguei.spitsyn at oracle.com wrote: >>>>> Hi David, >>>>> >>>>> Thank you for looking at the test! >>>>> >>>>> >>>>> On 4/27/17 23:11, David Holmes wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> On 28/04/2017 3:14 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Please review the JDK 10 fix for the test enhancement: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8172970 >>>>>>> >>>>>>> >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.1/ >>>>>>> >>>>>>> >>>>>> >>>>>> Sorry but I can't quite figure out exactly what this test is doing. >>>>>> What is the overall call structure here? >>>>> >>>>> This is to make sure the functions allowed in the start and live >>>>> phases work OK. >>>>> As the list of functions is pretty big, the test does sanity checks >>>>> that the functions neither crash nor return errors. >>>>> >>>>> >>>>>> I was expecting to see a difference between things that can be >>>>>> called >>>>>> at early-start and those that can not - or are these all expected to >>>>>> work okay in either case? >>>>> >>>>> All these functions are expected to work okay in both cases. >>>>> Of course, the main concern is the early start. >>>>> But we have never had such coverage in the past, so the normal >>>>> start phase needs to be covered too. 
>>>>> >>>>> >>>>>> >>>>>> A few comments: >>>>>> >>>>>> 44 #define TranslateError(err) "JVMTI error" >>>>>> >>>>>> I don't see the point of the above. >>>>> >>>>> Good catch - removed. >>>>> It is a left over from another test that I used as initial template. >>>>> >>>>> >>>>>> --- >>>>>> >>>>>> 99 static long get_thread_local(jvmtiEnv *jvmti, jthread thread) { >>>>>> >>>>>> The thread local functions use "long" as the datatype but that will >>>>>> only be 32-bit on 64-bit Windows. I think you need to use intptr_t >>>>>> for complete portability. >>>>> >>>>> The type long has the same format as the type void* which has to be >>>>> portable even on win-32. >>>>> But maybe I'm missing something. >>>>> Anyway, I've replaced it with the intptr_t. >>>>> >>>>> >>>>>> >>>>>> --- >>>>>> >>>>>> 277 printf(" Filed declaring"); >>>>>> >>>>>> typo: filed -> field >>>>> >>>>> >>>>> Good catch - fixed. >>>>> >>>>>> >>>>>> --- >>>>>> >>>>>> All your little wrapper functions make the JVMTI call and then call >>>>>> check_jvmti_error - but all that does is record if it passed or >>>>>> failed. If it failed you still continue with the next operation even >>>>>> if it relies on the success of the first one eg: >>>>>> >>>>>> 378 set_thread_local(jvmti, thread, exp_val); >>>>>> 379 act_val = get_thread_local(jvmti, cur_thread); >>>>>> >>>>>> and the sequences in print_method_info: >>>>>> >>>>>> 228 err = (*jvmti)->IsMethodNative(jvmti, method, &is_native); >>>>>> 229 check_jvmti_error(jvmti, "IsMethodNative", err); >>>>>> 230 printf(" Method is native: %d\n", is_native); >>>>>> 231 >>>>>> 232 if (is_native == JNI_FALSE) { >>>>>> 233 err = (*jvmti)->GetMaxLocals(jvmti, method, >>>>>> &locals_max); >>>>>> >>>>>> The call at #233 may not be valid because the method actually is >>>>>> native but the IsMethodNative call failed for some reason. >>>>>> >>>>> >>>>> It is intentional. I have done it as a last cleanup. 
>>>>> The point is to simplify code by skipping all the extra checks if it >>>>> does not lead to any fatal errors. >>>>> The most important in such a case is that the static variable result >>>>> is set to FAILED. >>>>> It will cause the test to fail. >>>>> Then there is no point to analyze the printed results if a JVMTI >>>>> error >>>>> reported before. >>>>> >>>>> If you insist, I will add back all the extra check to make sure all >>>>> printed output is valid. >>>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> The task was to provide a test coverage for the JVMTI functions >>>>>>> allowed during the start phase. >>>>>>> It includes both enabling and disabling the >>>>>>> can_generate_early_vmstart >>>>>>> capability. >>>>>>> Testing the JVMTI functions allowed in any case has not been >>>>>>> targeted >>>>>>> by this fix. >>>>>>> >>>>>>> Testing: >>>>>>> New test is passed. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>> >>>> >> > From david.holmes at oracle.com Tue May 2 09:11:48 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 2 May 2017 19:11:48 +1000 Subject: RFR(XXS) : 8179516 : add Utils.COMPILE_JDK constant In-Reply-To: <6C871C87-5160-4C3D-A735-9EACD5667427@oracle.com> References: <6C871C87-5160-4C3D-A735-9EACD5667427@oracle.com> Message-ID: Hi Igor, Looks fine to me. Thanks, David On 2/05/2017 5:54 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html >> 5 lines changed: 5 ins; 0 del; 0 mod > > Hi all, > > could you please review this tiny fix which adds jdk.test.lib.Utils.COMPILE_JDK constant, which points to the JDK specified by '-compilejdk' jtreg flag or the JDK under test? 
> > webrev: http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8179516 > > Thanks, > -- Igor > > From george.triantafillou at oracle.com Tue May 2 10:44:11 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 2 May 2017 06:44:11 -0400 Subject: RFR(XXS) : 8179516 : add Utils.COMPILE_JDK constant In-Reply-To: <6C871C87-5160-4C3D-A735-9EACD5667427@oracle.com> References: <6C871C87-5160-4C3D-A735-9EACD5667427@oracle.com> Message-ID: <55845f59-1301-3578-3437-8b5fdb261a41@oracle.com> Hi Igor, Looks good. -George On 5/2/2017 3:54 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html >> 5 lines changed: 5 ins; 0 del; 0 mod > Hi all, > > could you please review this tiny fix which adds jdk.test.lib.Utils.COMPILE_JDK constant, which points to the JDK specified by '-compilejdk' jtreg flag or the JDK under test? > > webrev: http://cr.openjdk.java.net/~iignatyev//8179516/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8179516 > > Thanks, > -- Igor > From HORIE at jp.ibm.com Tue May 2 14:47:01 2017 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Tue, 2 May 2017 23:47:01 +0900 Subject: 8179527: Implement intrinsic code for reverseBytes with load/store Message-ID: Dear all, Would you please review the following change? 
Bug: https://bugs.openjdk.java.net/browse/JDK-8179527 Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/ I added new intrinsic code for reverseBytes() in ppc.ad with * match(Set dst (ReverseBytesI/L/US/S (LoadI src))); * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src))); Best regards, -- Michihiro, IBM Research - Tokyo From gustavo.scalet at eldorado.org.br Tue May 2 15:05:09 2017 From: gustavo.scalet at eldorado.org.br (Gustavo Serra Scalet) Date: Tue, 2 May 2017 15:05:09 +0000 Subject: 8179527: Implement intrinsic code for reverseBytes with load/store In-Reply-To: References: Message-ID: <3507c10563a84106ac6c2e8d2554c053@serv030.corp.eldorado.org.br> Hi Michihiro, I wonder whether there is a vectorized approach for implementing your "bytes_reverse_long_Ex" instruct on ppc.ad. Or did you avoid one intentionally? > -----Original Message----- > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- > bounces at openjdk.java.net] On Behalf Of Michihiro Horie > Sent: terça-feira, 2 de maio de 2017 11:47 > To: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; > volker.simonis at sap.com; martin.doerr at sap.com > Subject: 8179527: Implement intrinsic code for reverseBytes with > load/store > > Dear all, > > Would you please review following change? 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8179527 > Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/ > > I added new intrinsic code for reverseBytes() in ppc.ad with > * match(Set dst (ReverseBytesI/L/US/S (LoadI src))); > * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src))); > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo From martin.doerr at sap.com Tue May 2 17:23:29 2017 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 May 2017 17:23:29 +0000 Subject: 8179527: Implement intrinsic code for reverseBytes with load/store In-Reply-To: <3507c10563a84106ac6c2e8d2554c053@serv030.corp.eldorado.org.br> References: <3507c10563a84106ac6c2e8d2554c053@serv030.corp.eldorado.org.br> Message-ID: <7827421e2c6447f4ae406434f5bb3d25@sap.com> Hi Michihiro and Gustavo, thank you very much for implementing this change. @Gustavo: Thanks for taking a look. I think that the direct match rules are just there to satisfy match_rule_supported. They don't need to be fast; they are just a fallback solution. The goal is to exploit the byte-reverse load and store instructions, which should match in the more performance-critical cases. Now my review: assembler_ppc.hpp: Looks good except for a minor formatting request: LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532 << 1), should be LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532u << 1), to be consistent. The comments // X-FORM should be aligned with the other ones. assembler_ppc.inline.hpp: Good. ppc.ad: I'm concerned about the additional match rules which are only used for the expand step. They could match directly, leading to incorrect code: what they match is not what they do. I suggest implementing the code directly in the ins_encode. This would make the new code significantly shorter and less error-prone. I think we don't need to optimize for Power6 anymore, and newer processors shouldn't really suffer from slightly less optimized instruction scheduling. Would you agree? 
Displacements may be too large for "li" so I suggest to use the "indirect" memory operand and let the compiler handle it. I know that it may increase latency because the compiler will need to insert an addition which could better be matched into the memory operand of the load which is harder to implement (it is possible to match an addition in an operand). Best regards, Martin -----Original Message----- From: Gustavo Serra Scalet [mailto:gustavo.scalet at eldorado.org.br] Sent: Dienstag, 2. Mai 2017 17:05 To: Michihiro Horie Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; Simonis, Volker ; Doerr, Martin Subject: RE: 8179527: Implement intrinsic code for reverseBytes with load/store Hi Michihiro, I wonder if there is no vectorized approach for implementing your "bytes_reverse_long_Ex" instruct on ppc.ad. Or did you avoid doing it so intentionally? > -----Original Message----- > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev- > bounces at openjdk.java.net] On Behalf Of Michihiro Horie > Sent: ter?a-feira, 2 de maio de 2017 11:47 > To: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; > volker.simonis at sap.com; martin.doerr at sap.com > Subject: 8179527: Implement intrinsic code for reverseBytes with > load/store > > Dear all, > > Would you please review following change? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8179527 > Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/ > > I added new intrinsic code for reverseBytes() in ppc.ad with > * match(Set dst (ReverseBytesI/L/US/S (LoadI src))); > * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src))); > > > Best regards, > -- > Michihiro, > IBM Research - Tokyo From daniel.daugherty at oracle.com Tue May 2 18:09:10 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 2 May 2017 12:09:10 -0600 Subject: RFR(M): 8172970: TESTBUG: need test coverage for the JVMTI functions allowed in the start phase In-Reply-To: <96edf450-1d0d-c21d-436d-bdbd378b4137@oracle.com> References: <88938aee-0a83-99e9-1b95-6875173807a5@oracle.com> <5ea66d4b-f1bb-1729-ba51-5bdcd5b3c470@oracle.com> <1e443485-63d6-4473-8809-76f47a380436@oracle.com> <96edf450-1d0d-c21d-436d-bdbd378b4137@oracle.com> Message-ID: <3f9716a4-ad29-4e71-21b5-802f6019b796@oracle.com> > New webrev version is: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ make/test/JtregNative.gmk No comments. test/serviceability/jvmti/StartPhase/AllowedFunctions/AllowedFunctions.java L27: * @summary Verify the functions that allowed to operate in the start phase Typo: 'that allowed' -> 'that are allowed' L28: * with and without can_generate_early_vmstart capability Please add '.' to the end. test/serviceability/jvmti/StartPhase/AllowedFunctions/libAllowedFunctions.c L27: #include Should this include be in "alpha" order? L115: printf(" ## Error: unexpected class status: 0x%02x\n", status); L117: printf(" Class status: 0x%08x\n", status); Why the different format specifications? "02x" versus "08x"? L126: printf(" class: %s\n", name); L137: printf(" Class source file name: %s\n", name); Please consider adding single-quotes around the %s. L175: check_jvmti_error(jvmti, "GetClassMethods", err); Typo: "GetClassMethods" -> "GetClassFields" L229: err = (*jvmti)->IsMethodObsolete(jvmti, method, & is_obsolete); Please delete space after '&'. L265: check_jvmti_error(jvmti, "GetMethodModifiers", err); Typo: "GetMethodModifiers" -> "GetFieldModifiers" L301: if (methods != NULL) { Typo: 'methods' -> 'fields' This one can result in a memory leak. L308: jvmtiError err; L322: jvmtiError err; 'err' is unused. Please delete it. L396: check_jvmti_error(jvmti, "AddCapabilites", err); Other errors in here include "## Agent_Initialize: "; why not this one? 
L398: size = (jint)sizeof(callbacks); L399: memset(&callbacks, 0, sizeof(callbacks)); Perhaps use 'size' instead of 'sizeof(callbacks)' since you have it. You have a nice list of functions in the bug report. You might want to include the list of functions that are NOT covered by this test along with a brief comment about why that is okay. Dan On 5/2/17 3:02 AM, serguei.spitsyn at oracle.com wrote: > PING: > I've got a thumbs up from David Holmes. > One more review is needed for this jdk 10 test enhancement. > > Thanks! > Serguei > > > > On 4/28/17 17:13, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> >> On 4/28/17 10:34, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> >>> On 4/28/17 04:42, David Holmes wrote: >>>> Hi Serguei, >>>> >>>> On 28/04/2017 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>>> The updated webrev is: >>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.2/ >>>>> >>>> >>>> Thanks for the updates (the issue with long is that on win64 it is >>>> only 32-bit while void* is 64-bit). >>> >>> Ok, thanks. >>> Than you are right, using long on win64 is not compatible. >>> >>>> >>>> I prefer to see fast-fail rather than potentially triggering >>>> cascading failures (check_jvmti_error could even call exit() I >>>> think). But let's see what others think - it's only a preference >>>> not a necessity. >>> >>> Ok, I'll consider call exit() as it would keep it simple. >>> >> >> New webrev version is: >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ >> >> >> >> Thanks, >> Serguei >> >> >>> Thanks, >>> Serguei >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> >>>>> I've re-arranged a little bit code in the ClassPrepare callback >>>>> and the >>>>> function test_class_functions(). >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 4/28/17 00:47, serguei.spitsyn at oracle.com wrote: >>>>>> Hi David, >>>>>> >>>>>> Thank you for looking at the test! 
>>>>>> >>>>>> >>>>>> On 4/27/17 23:11, David Holmes wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> On 28/04/2017 3:14 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Please, review the jdk 10 fix for the test enhancement: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8172970 >>>>>>>> >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.1/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Sorry but I can't quite figure out exactly what this test is doing. >>>>>>> What is the overall call structure here? >>>>>> >>>>>> This is to make sure the functions allowed in the start and live >>>>>> phases work Ok. >>>>>> As the list of functions is pretty big the test does sanity checks >>>>>> that the functions do not crash nor return errors. >>>>>> >>>>>> >>>>>>> I was expecting to see a difference between things that can be >>>>>>> called >>>>>>> at early-start and those that can not - or are these all >>>>>>> expected to >>>>>>> work okay in either case? >>>>>> >>>>>> All these functions are expected to work okay in both cases. >>>>>> Of course, the main concern is the early start. >>>>>> But we have never had such coverage in the past so that the normal >>>>>> start phase needs to be covered too. >>>>>> >>>>>> >>>>>>> >>>>>>> A few comments: >>>>>>> >>>>>>> 44 #define TranslateError(err) "JVMTI error" >>>>>>> >>>>>>> I don't see the point of the above. >>>>>> >>>>>> Good catch - removed. >>>>>> It is a left over from another test that I used as initial template. >>>>>> >>>>>> >>>>>>> --- >>>>>>> >>>>>>> 99 static long get_thread_local(jvmtiEnv *jvmti, jthread thread) { >>>>>>> >>>>>>> The thread local functions use "long" as the datatype but that will >>>>>>> only be 32-bit on 64-bit Windows. I think you need to use intptr_t >>>>>>> for complete portability. >>>>>> >>>>>> The type long has the same format as the type void* which has to be >>>>>> portable even on win-32. >>>>>> But maybe I'm missing something. 
>>>>>> Anyway, I've replaced it with the intptr_t. >>>>>> >>>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> 277 printf(" Filed declaring"); >>>>>>> >>>>>>> typo: filed -> field >>>>>> >>>>>> >>>>>> Good catch - fixed. >>>>>> >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> All your little wrapper functions make the JVMTI call and then call >>>>>>> check_jvmti_error - but all that does is record if it passed or >>>>>>> failed. If it failed you still continue with the next operation >>>>>>> even >>>>>>> if it relies on the success of the first one eg: >>>>>>> >>>>>>> 378 set_thread_local(jvmti, thread, exp_val); >>>>>>> 379 act_val = get_thread_local(jvmti, cur_thread); >>>>>>> >>>>>>> and the sequences in print_method_info: >>>>>>> >>>>>>> 228 err = (*jvmti)->IsMethodNative(jvmti, method, &is_native); >>>>>>> 229 check_jvmti_error(jvmti, "IsMethodNative", err); >>>>>>> 230 printf(" Method is native: %d\n", is_native); >>>>>>> 231 >>>>>>> 232 if (is_native == JNI_FALSE) { >>>>>>> 233 err = (*jvmti)->GetMaxLocals(jvmti, method, >>>>>>> &locals_max); >>>>>>> >>>>>>> The call at #233 may not be valid because the method actually is >>>>>>> native but the IsMethodNative call failed for some reason. >>>>>>> >>>>>> >>>>>> It is intentional. I have done it as a last cleanup. >>>>>> The point is to simplify code by skipping all the extra checks if it >>>>>> does not lead to any fatal errors. >>>>>> The most important in such a case is that the static variable result >>>>>> is set to FAILED. >>>>>> It will cause the test to fail. >>>>>> Then there is no point to analyze the printed results if a JVMTI >>>>>> error >>>>>> reported before. >>>>>> >>>>>> If you insist, I will add back all the extra check to make sure all >>>>>> printed output is valid. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> The task was to provide a test coverage for the JVMTI functions >>>>>>>> allowed during the start phase. 
>>>>>>>> It includes both enabling and disabling the >>>>>>>> can_generate_early_vmstart >>>>>>>> capability. >>>>>>>> Testing the JVMTI functions allowed in any case has not been >>>>>>>> targeted >>>>>>>> by this fix. >>>>>>>> >>>>>>>> Testing: >>>>>>>> New test is passed. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>> >>>>> >>> >> > From serguei.spitsyn at oracle.com Tue May 2 18:40:48 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 2 May 2017 11:40:48 -0700 Subject: RFR(M): 8172970: TESTBUG: need test coverage for the JVMTI functions allowed in the start phase In-Reply-To: <3f9716a4-ad29-4e71-21b5-802f6019b796@oracle.com> References: <88938aee-0a83-99e9-1b95-6875173807a5@oracle.com> <5ea66d4b-f1bb-1729-ba51-5bdcd5b3c470@oracle.com> <1e443485-63d6-4473-8809-76f47a380436@oracle.com> <96edf450-1d0d-c21d-436d-bdbd378b4137@oracle.com> <3f9716a4-ad29-4e71-21b5-802f6019b796@oracle.com> Message-ID: Hi Dan, Thank you a lot for the comments! All are nice catches. I have to admit I've done too many typos in new test. Some of them is a result of the 'last minute' changes. The updated webrev is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.4/ Thanks, Serguei On 5/2/17 11:09, Daniel D. Daugherty wrote: > > New webrev version is: > > > http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ > > make/test/JtregNative.gmk > No comments. > > test/serviceability/jvmti/StartPhase/AllowedFunctions/AllowedFunctions.java > > L27: * @summary Verify the functions that allowed to operate in > the start phase > Typo: 'that allowed' -> 'that are allowed' > > L28: * with and without can_generate_early_vmstart capability > Please add '.' to the end. > > test/serviceability/jvmti/StartPhase/AllowedFunctions/libAllowedFunctions.c > > L27: #include > Should this include be in "alpha" order? 
> > L115: printf(" ## Error: unexpected class status: > 0x%02x\n", status); > L117: printf(" Class status: 0x%08x\n", status); > Why the different format specifications? "02x" versus "08x"? > > L126: printf(" class: %s\n", name); > L137: printf(" Class source file name: %s\n", name); > Please consider adding single-quotes around the %s. > > L175: check_jvmti_error(jvmti, "GetClassMethods", err); > Typo: "GetClassMethods" -> "GetClassFields" > > L229: err = (*jvmti)->IsMethodObsolete(jvmti, method, & > is_obsolete); > Please delete space after '&'. > > L265: check_jvmti_error(jvmti, "GetMethodModifiers", err); > Typo: "GetMethodModifiers" -> "GetFieldModifiers" > > L301: if (methods != NULL) { > Typo: 'methods' -> 'fields' > > This one can result in a memory leak. > > L308: jvmtiError err; > L322: jvmtiError err; > 'err' is unused. Please delete it. > > L396: check_jvmti_error(jvmti, "AddCapabilites", err); > Other errors in here include "## Agent_Initialize: "; why not > this one? > > L398: size = (jint)sizeof(callbacks); > L399: memset(&callbacks, 0, sizeof(callbacks)); > Perhaps use 'size' instead of 'sizeof(callbacks)' since you > have it. > > > You have a nice list of functions in the bug report. > You might want to include the list of functions that > are NOT covered by this test along with a brief comment > about why that is okay. > > Dan > > > On 5/2/17 3:02 AM, serguei.spitsyn at oracle.com wrote: >> PING: >> I've got a thumbs up from David Holmes. >> One more review is needed for this jdk 10 test enhancement. >> >> Thanks! 
>> Serguei >> >> >> >> On 4/28/17 17:13, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> >>> On 4/28/17 10:34, serguei.spitsyn at oracle.com wrote: >>>> Hi David, >>>> >>>> >>>> On 4/28/17 04:42, David Holmes wrote: >>>>> Hi Serguei, >>>>> >>>>> On 28/04/2017 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>> The updated webrev is: >>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.2/ >>>>>> >>>>> >>>>> Thanks for the updates (the issue with long is that on win64 it is >>>>> only 32-bit while void* is 64-bit). >>>> >>>> Ok, thanks. >>>> Than you are right, using long on win64 is not compatible. >>>> >>>>> >>>>> I prefer to see fast-fail rather than potentially triggering >>>>> cascading failures (check_jvmti_error could even call exit() I >>>>> think). But let's see what others think - it's only a preference >>>>> not a necessity. >>>> >>>> Ok, I'll consider call exit() as it would keep it simple. >>>> >>> >>> New webrev version is: >>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ >>> >>> >>> >>> Thanks, >>> Serguei >>> >>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> >>>>>> I've re-arranged a little bit code in the ClassPrepare callback >>>>>> and the >>>>>> function test_class_functions(). >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 4/28/17 00:47, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Thank you for looking at the test! 
>>>>>>> >>>>>>> >>>>>>> On 4/27/17 23:11, David Holmes wrote: >>>>>>>> Hi Serguei, >>>>>>>> >>>>>>>> On 28/04/2017 3:14 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Please, review the jdk 10 fix for the test enhancement: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8172970 >>>>>>>>> >>>>>>>>> >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.1/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> Sorry but I can't quite figure out exactly what this test is >>>>>>>> doing. >>>>>>>> What is the overall call structure here? >>>>>>> >>>>>>> This is to make sure the functions allowed in the start and live >>>>>>> phases work Ok. >>>>>>> As the list of functions is pretty big the test does sanity checks >>>>>>> that the functions do not crash nor return errors. >>>>>>> >>>>>>> >>>>>>>> I was expecting to see a difference between things that can be >>>>>>>> called >>>>>>>> at early-start and those that can not - or are these all >>>>>>>> expected to >>>>>>>> work okay in either case? >>>>>>> >>>>>>> All these functions are expected to work okay in both cases. >>>>>>> Of course, the main concern is the early start. >>>>>>> But we have never had such coverage in the past so that the normal >>>>>>> start phase needs to be covered too. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> A few comments: >>>>>>>> >>>>>>>> 44 #define TranslateError(err) "JVMTI error" >>>>>>>> >>>>>>>> I don't see the point of the above. >>>>>>> >>>>>>> Good catch - removed. >>>>>>> It is a left over from another test that I used as initial >>>>>>> template. >>>>>>> >>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> 99 static long get_thread_local(jvmtiEnv *jvmti, jthread >>>>>>>> thread) { >>>>>>>> >>>>>>>> The thread local functions use "long" as the datatype but that >>>>>>>> will >>>>>>>> only be 32-bit on 64-bit Windows. I think you need to use intptr_t >>>>>>>> for complete portability. 
>>>>>>> >>>>>>> The type long has the same format as the type void* which has to be >>>>>>> portable even on win-32. >>>>>>> But maybe I'm missing something. >>>>>>> Anyway, I've replaced it with the intptr_t. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> 277 printf(" Filed declaring"); >>>>>>>> >>>>>>>> typo: filed -> field >>>>>>> >>>>>>> >>>>>>> Good catch - fixed. >>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> >>>>>>>> All your little wrapper functions make the JVMTI call and then >>>>>>>> call >>>>>>>> check_jvmti_error - but all that does is record if it passed or >>>>>>>> failed. If it failed you still continue with the next operation >>>>>>>> even >>>>>>>> if it relies on the success of the first one eg: >>>>>>>> >>>>>>>> 378 set_thread_local(jvmti, thread, exp_val); >>>>>>>> 379 act_val = get_thread_local(jvmti, cur_thread); >>>>>>>> >>>>>>>> and the sequences in print_method_info: >>>>>>>> >>>>>>>> 228 err = (*jvmti)->IsMethodNative(jvmti, method, >>>>>>>> &is_native); >>>>>>>> 229 check_jvmti_error(jvmti, "IsMethodNative", err); >>>>>>>> 230 printf(" Method is native: %d\n", is_native); >>>>>>>> 231 >>>>>>>> 232 if (is_native == JNI_FALSE) { >>>>>>>> 233 err = (*jvmti)->GetMaxLocals(jvmti, method, >>>>>>>> &locals_max); >>>>>>>> >>>>>>>> The call at #233 may not be valid because the method actually is >>>>>>>> native but the IsMethodNative call failed for some reason. >>>>>>>> >>>>>>> >>>>>>> It is intentional. I have done it as a last cleanup. >>>>>>> The point is to simplify code by skipping all the extra checks >>>>>>> if it >>>>>>> does not lead to any fatal errors. >>>>>>> The most important in such a case is that the static variable >>>>>>> result >>>>>>> is set to FAILED. >>>>>>> It will cause the test to fail. >>>>>>> Then there is no point to analyze the printed results if a JVMTI >>>>>>> error >>>>>>> reported before. >>>>>>> >>>>>>> If you insist, I will add back all the extra check to make sure all >>>>>>> printed output is valid. 
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> The task was to provide a test coverage for the JVMTI functions >>>>>>>>> allowed during the start phase. >>>>>>>>> It includes both enabling and disabling the >>>>>>>>> can_generate_early_vmstart >>>>>>>>> capability. >>>>>>>>> Testing the JVMTI functions allowed in any case has not been >>>>>>>>> targeted >>>>>>>>> by this fix. >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> New test is passed. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>> >>>>>> >>>> >>> >> > From daniel.daugherty at oracle.com Tue May 2 19:23:10 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 2 May 2017 13:23:10 -0600 Subject: RFR(M): 8172970: TESTBUG: need test coverage for the JVMTI functions allowed in the start phase In-Reply-To: References: <88938aee-0a83-99e9-1b95-6875173807a5@oracle.com> <5ea66d4b-f1bb-1729-ba51-5bdcd5b3c470@oracle.com> <1e443485-63d6-4473-8809-76f47a380436@oracle.com> <96edf450-1d0d-c21d-436d-bdbd378b4137@oracle.com> <3f9716a4-ad29-4e71-21b5-802f6019b796@oracle.com> Message-ID: <6f4f5fec-6733-85c4-a741-20acb469f0e7@oracle.com> On 5/2/17 12:40 PM, serguei.spitsyn at oracle.com wrote: > Hi Dan, > > Thank you a lot for the comments! > All are nice catches. > I have to admit I've done too many typos in new test. > Some of them is a result of the 'last minute' changes. > > The updated webrev is: > http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.4/ > make/test/JtregNative.gmk No comments. test/serviceability/jvmti/StartPhase/AllowedFunctions/AllowedFunctions.java No comments. test/serviceability/jvmti/StartPhase/AllowedFunctions/libAllowedFunctions.c L126: printf(" class: \'%s\'\n", name); L137: printf(" Class source file name: \'%s\'\n", name); You don't need to escape the single-quotes with backslash here. Thumbs up! 
Dan > > Thanks, > Serguei > > > On 5/2/17 11:09, Daniel D. Daugherty wrote: >> > New webrev version is: >> > >> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ >> >> make/test/JtregNative.gmk >> No comments. >> >> test/serviceability/jvmti/StartPhase/AllowedFunctions/AllowedFunctions.java >> >> L27: * @summary Verify the functions that allowed to operate in >> the start phase >> Typo: 'that allowed' -> 'that are allowed' >> >> L28: * with and without can_generate_early_vmstart capability >> Please add '.' to the end. >> >> test/serviceability/jvmti/StartPhase/AllowedFunctions/libAllowedFunctions.c >> >> L27: #include >> Should this include be in "alpha" order? >> >> L115: printf(" ## Error: unexpected class status: >> 0x%02x\n", status); >> L117: printf(" Class status: 0x%08x\n", status); >> Why the different format specifications? "02x" versus "08x"? >> >> L126: printf(" class: %s\n", name); >> L137: printf(" Class source file name: %s\n", name); >> Please consider adding single-quotes around the %s. >> >> L175: check_jvmti_error(jvmti, "GetClassMethods", err); >> Typo: "GetClassMethods" -> "GetClassFields" >> >> L229: err = (*jvmti)->IsMethodObsolete(jvmti, method, & >> is_obsolete); >> Please delete space after '&'. >> >> L265: check_jvmti_error(jvmti, "GetMethodModifiers", err); >> Typo: "GetMethodModifiers" -> "GetFieldModifiers" >> >> L301: if (methods != NULL) { >> Typo: 'methods' -> 'fields' >> >> This one can result in a memory leak. >> >> L308: jvmtiError err; >> L322: jvmtiError err; >> 'err' is unused. Please delete it. >> >> L396: check_jvmti_error(jvmti, "AddCapabilites", err); >> Other errors in here include "## Agent_Initialize: "; why not >> this one? >> >> L398: size = (jint)sizeof(callbacks); >> L399: memset(&callbacks, 0, sizeof(callbacks)); >> Perhaps use 'size' instead of 'sizeof(callbacks)' since you >> have it. >> >> >> You have a nice list of functions in the bug report. 
>> You might want to include the list of functions that >> are NOT covered by this test along with a brief comment >> about why that is okay. >> >> Dan >> >> >> On 5/2/17 3:02 AM, serguei.spitsyn at oracle.com wrote: >>> PING: >>> I've got a thumbs up from David Holmes. >>> One more review is needed for this jdk 10 test enhancement. >>> >>> Thanks! >>> Serguei >>> >>> >>> >>> On 4/28/17 17:13, serguei.spitsyn at oracle.com wrote: >>>> Hi David, >>>> >>>> >>>> On 4/28/17 10:34, serguei.spitsyn at oracle.com wrote: >>>>> Hi David, >>>>> >>>>> >>>>> On 4/28/17 04:42, David Holmes wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> On 28/04/2017 6:07 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> The updated webrev is: >>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.2/ >>>>>>> >>>>>> >>>>>> Thanks for the updates (the issue with long is that on win64 it >>>>>> is only 32-bit while void* is 64-bit). >>>>> >>>>> Ok, thanks. >>>>> Than you are right, using long on win64 is not compatible. >>>>> >>>>>> >>>>>> I prefer to see fast-fail rather than potentially triggering >>>>>> cascading failures (check_jvmti_error could even call exit() I >>>>>> think). But let's see what others think - it's only a preference >>>>>> not a necessity. >>>>> >>>>> Ok, I'll consider call exit() as it would keep it simple. >>>>> >>>> >>>> New webrev version is: >>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.3/ >>>> >>>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> >>>>>>> I've re-arranged a little bit code in the ClassPrepare callback >>>>>>> and the >>>>>>> function test_class_functions(). >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 4/28/17 00:47, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Thank you for looking at the test! 
>>>>>>>> >>>>>>>> >>>>>>>> On 4/27/17 23:11, David Holmes wrote: >>>>>>>>> Hi Serguei, >>>>>>>>> >>>>>>>>> On 28/04/2017 3:14 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>> Please, review the jdk 10 fix for the test enhancement: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8172970 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8172970-start-phase.1/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> Sorry but I can't quite figure out exactly what this test is >>>>>>>>> doing. >>>>>>>>> What is the overall call structure here? >>>>>>>> >>>>>>>> This is to make sure the functions allowed in the start and live >>>>>>>> phases work Ok. >>>>>>>> As the list of functions is pretty big the test does sanity checks >>>>>>>> that the functions do not crash nor return errors. >>>>>>>> >>>>>>>> >>>>>>>>> I was expecting to see a difference between things that can be >>>>>>>>> called >>>>>>>>> at early-start and those that can not - or are these all >>>>>>>>> expected to >>>>>>>>> work okay in either case? >>>>>>>> >>>>>>>> All these functions are expected to work okay in both cases. >>>>>>>> Of course, the main concern is the early start. >>>>>>>> But we have never had such coverage in the past so that the normal >>>>>>>> start phase needs to be covered too. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> A few comments: >>>>>>>>> >>>>>>>>> 44 #define TranslateError(err) "JVMTI error" >>>>>>>>> >>>>>>>>> I don't see the point of the above. >>>>>>>> >>>>>>>> Good catch - removed. >>>>>>>> It is a left over from another test that I used as initial >>>>>>>> template. >>>>>>>> >>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> 99 static long get_thread_local(jvmtiEnv *jvmti, jthread >>>>>>>>> thread) { >>>>>>>>> >>>>>>>>> The thread local functions use "long" as the datatype but that >>>>>>>>> will >>>>>>>>> only be 32-bit on 64-bit Windows. I think you need to use >>>>>>>>> intptr_t >>>>>>>>> for complete portability. 
>>>>>>>> >>>>>>>> The type long has the same format as the type void* which has >>>>>>>> to be >>>>>>>> portable even on win-32. >>>>>>>> But maybe I'm missing something. >>>>>>>> Anyway, I've replaced it with the intptr_t. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> 277 printf(" Filed declaring"); >>>>>>>>> >>>>>>>>> typo: filed -> field >>>>>>>> >>>>>>>> >>>>>>>> Good catch - fixed. >>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> All your little wrapper functions make the JVMTI call and then >>>>>>>>> call >>>>>>>>> check_jvmti_error - but all that does is record if it passed or >>>>>>>>> failed. If it failed you still continue with the next >>>>>>>>> operation even >>>>>>>>> if it relies on the success of the first one eg: >>>>>>>>> >>>>>>>>> 378 set_thread_local(jvmti, thread, exp_val); >>>>>>>>> 379 act_val = get_thread_local(jvmti, cur_thread); >>>>>>>>> >>>>>>>>> and the sequences in print_method_info: >>>>>>>>> >>>>>>>>> 228 err = (*jvmti)->IsMethodNative(jvmti, method, >>>>>>>>> &is_native); >>>>>>>>> 229 check_jvmti_error(jvmti, "IsMethodNative", err); >>>>>>>>> 230 printf(" Method is native: %d\n", is_native); >>>>>>>>> 231 >>>>>>>>> 232 if (is_native == JNI_FALSE) { >>>>>>>>> 233 err = (*jvmti)->GetMaxLocals(jvmti, method, >>>>>>>>> &locals_max); >>>>>>>>> >>>>>>>>> The call at #233 may not be valid because the method actually is >>>>>>>>> native but the IsMethodNative call failed for some reason. >>>>>>>>> >>>>>>>> >>>>>>>> It is intentional. I have done it as a last cleanup. >>>>>>>> The point is to simplify code by skipping all the extra checks >>>>>>>> if it >>>>>>>> does not lead to any fatal errors. >>>>>>>> The most important in such a case is that the static variable >>>>>>>> result >>>>>>>> is set to FAILED. >>>>>>>> It will cause the test to fail. >>>>>>>> Then there is no point to analyze the printed results if a >>>>>>>> JVMTI error >>>>>>>> reported before. 
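[Archive note: the long-vs-intptr_t point discussed above can be illustrated with a small, self-contained C sketch. This is not the test's actual code; the static slot below merely stands in for JVMTI's thread-local storage, and the function names are invented for the example. On LLP64 platforms such as 64-bit Windows, long is only 32 bits while void* is 64 bits, so intptr_t is the portable integer type for round-tripping a pointer-sized value.]

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for JVMTI thread-local storage: SetThreadLocalStorage keeps
 * a void*, so an integer value round-tripped through it must be
 * pointer-sized.  A long would silently truncate on win64 (LLP64);
 * intptr_t is defined to survive the conversion in both directions. */
static void *tls_slot;

static void set_thread_local_sketch(intptr_t value) {
  tls_slot = (void *) value;    /* what the JVMTI call would store */
}

static intptr_t get_thread_local_sketch(void) {
  return (intptr_t) tls_slot;   /* what the JVMTI call would return */
}
```

[Storing a value and reading it back preserves it exactly because intptr_t can represent any object pointer; with long the upper 32 bits would be lost on LLP64 targets.]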
>>>>>>>> 
>>>>>>>> If you insist, I will add back all the extra check to make sure 
>>>>>>>> all
>>>>>>>> printed output is valid. 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks, 
>>>>>>>> Serguei 
>>>>>>>> 
>>>>>>>>> Thanks, 
>>>>>>>>> David 
>>>>>>>>> ----- 
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Summary: 
>>>>>>>>>> The task was to provide a test coverage for the JVMTI 
>>>>>>>>>> functions 
>>>>>>>>>> allowed during the start phase. 
>>>>>>>>>> It includes both enabling and disabling the 
>>>>>>>>>> can_generate_early_vmstart 
>>>>>>>>>> capability. 
>>>>>>>>>> Testing the JVMTI functions allowed in any case has not 
>>>>>>>>>> been targeted 
>>>>>>>>>> by this fix. 
>>>>>>>>>> 
>>>>>>>>>> Testing: 
>>>>>>>>>> New test is passed. 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks, 
>>>>>>>>>> Serguei 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 

From chris.plummer at oracle.com  Wed May  3 04:42:45 2017
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 2 May 2017 21:42:45 -0700
Subject: RFR: 8178352: BitMap::get_next_zero_offset may give wrong result
 on Mac
In-Reply-To: 
References: <4a3f5ef6-0ce0-19da-1288-01271ddf9041@oracle.com>
 <10D07D0F-0050-49E4-9C74-9803727EFF73@oracle.com>
 <1213C931-FABD-4BF3-994C-3DA9E9F5D9EA@oracle.com>
Message-ID: <616f6a0b-1b48-33e9-fb94-d4b16b08b246@oracle.com>

On 4/18/17 11:40 PM, Kim Barrett wrote:
>> On Apr 11, 2017, at 1:45 PM, Kim Barrett wrote:
>>
>>> On Apr 10, 2017, at 2:21 PM, Kim Barrett wrote:
>>>
>>>> On Apr 10, 2017, at 12:48 PM, Stefan Karlsson wrote:
>>>> To be consistent with the other code it in that file (res & 1) == 0 should be !(res & 1).
>>> That would be contrary to hotspot style guide; I think that should take precedence.
>> Stefan and I discussed this privately, and I've agreed to go with the locally consistent style here,
>> even though it's contrary to the hotspot style guide. Part of my decision process is related to
>> the next part of this reply. 
> Here's the updated webrev:
> http://cr.openjdk.java.net/~kbarrett/8178352/hotspot.02/
>
> Need a second reviewer, in addition to waiting for Stefan.
>
Hi Kim,

Looks good.

Chris

From kim.barrett at oracle.com  Wed May  3 06:52:52 2017
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 3 May 2017 02:52:52 -0400
Subject: RFR: 8178352: BitMap::get_next_zero_offset may give wrong result
 on Mac
In-Reply-To: <616f6a0b-1b48-33e9-fb94-d4b16b08b246@oracle.com>
References: <4a3f5ef6-0ce0-19da-1288-01271ddf9041@oracle.com>
 <10D07D0F-0050-49E4-9C74-9803727EFF73@oracle.com>
 <1213C931-FABD-4BF3-994C-3DA9E9F5D9EA@oracle.com>
 <616f6a0b-1b48-33e9-fb94-d4b16b08b246@oracle.com>
Message-ID: <19A9F4FC-DC13-41C0-996D-C6FAB34102C3@oracle.com>

> On May 3, 2017, at 12:42 AM, Chris Plummer wrote:
>
> On 4/18/17 11:40 PM, Kim Barrett wrote:
>>> On Apr 11, 2017, at 1:45 PM, Kim Barrett wrote:
>>>
>>>> On Apr 10, 2017, at 2:21 PM, Kim Barrett wrote:
>>>>
>>>>> On Apr 10, 2017, at 12:48 PM, Stefan Karlsson wrote:
>>>>> To be consistent with the other code it in that file (res & 1) == 0 should be !(res & 1).
>>>> That would be contrary to hotspot style guide; I think that should take precedence.
>>> Stefan and I discussed this privately, and I've agreed to go with the locally consistent style here,
>>> even though it's contrary to the hotspot style guide. Part of my decision process is related to
>>> the next part of this reply.
>> Here's the updated webrev:
>> http://cr.openjdk.java.net/~kbarrett/8178352/hotspot.02/
>>
>> Need a second reviewer, in addition to waiting for Stefan.
>>
> Hi Kim,
>
> Looks good.
>
> Chris

Thanks! 
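[Archive note: as context for the review above, get_next_zero_offset searches a bitmap for the first clear bit at or above a starting index. The following single-word sketch is illustrative only, not the HotSpot implementation; the function name next_zero_bit is invented here. It uses the locally consistent (res & 1) == 0 style that the thread settled on.]

```cpp
#include <climits>
#include <cstddef>

typedef size_t bm_word_t;
static const size_t BitsPerWord = sizeof(bm_word_t) * CHAR_BIT;

// Return the index of the first zero bit in 'word' at or after
// 'start_bit', or BitsPerWord if every remaining bit is set.
size_t next_zero_bit(bm_word_t word, size_t start_bit) {
  if (start_bit >= BitsPerWord) return BitsPerWord;  // avoid an out-of-range shift
  bm_word_t res = word >> start_bit;
  for (size_t pos = start_bit; pos < BitsPerWord; ++pos) {
    if ((res & 1) == 0) {  // locally consistent style, per the review thread
      return pos;
    }
    res >>= 1;  // examine the next higher bit
  }
  return BitsPerWord;
}
```

[At each step res holds word shifted right by pos bits, so (res & 1) tests bit pos of the original word.]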
From ioi.lam at oracle.com  Wed May  3 13:02:20 2017
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 03 May 2017 06:02:20 -0700
Subject: RFR (L) 8171392 Move Klass pointers outside of ConstantPool
 entries so ConstantPool can be read-only
In-Reply-To: <5904B17E.9090209@oracle.com>
References: <58EC771B.9020202@oracle.com>
 <35e6276a-ddf1-9149-8588-acb4e13191f5@oracle.com>
 <58EF3D3A.6020903@oracle.com> <58F05EB5.10009@oracle.com>
 <58F0EB0E.60904@oracle.com> <58F33A4F.70104@oracle.com>
 <58FDECFE.5060105@oracle.com> <590327F1.7070200@oracle.com>
 <5904B17E.9090209@oracle.com>
Message-ID: <5909D4DC.8000000@oracle.com>

Andrew replied to me off-list that he tested the aarch64 part and was
happy about it. Thanks Andrew.

So if there is no further comment, I will push the code as is.

Thanks
- Ioi

On 4/29/17 8:30 AM, Ioi Lam wrote:
> I've updated the patch to include Volker's ppc/s390 port as well as
> his comments. I've also included an updated patch (untested) for
> aarch64 for Andrew Haley to test:
>
> Full patch
> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04/
>
>
> Delta from the previous version
> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04.delta/
>
>
> Thanks
> - Ioi
>
> On 4/28/17 4:30 AM, Ioi Lam wrote:
>>
>>
>> On 4/25/17 8:06 AM, Volker Simonis wrote:
>>> On Mon, Apr 24, 2017 at 2:18 PM, Ioi Lam wrote:
>>>> Hi Volker,
>>>>
>>>>
>>>> On 4/21/17 12:02 AM, Volker Simonis wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> thanks once again for considering our ports! Please find the required
>>>>> additions for ppc64/s390x in the following webrew (which is based
>>>>> upon
>>>>> your latest v03 patch):
>>>>>
>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8171392_ppc64_s390x/
>>>> Thanks for the patch. I will integrate it and post an updated webrev.
>>>>> @Martin/@Lucy: could you please have a look at my ppc64/s390x
>>>>> assembly
>>>>> code. 
I did some tests and I think it should be correct, but maybe >>>>> you >>>>> still find some improvements :) >>>>> >>>>> Besides that, I have some general questions/comments regarding your >>>>> change: >>>>> >>>>> 1. In constantPool.hpp, why don't you declare the '_name_index' and >>>>> '_resolved_klass_index' fields with type 'jushort'? As far as I can >>>>> see, they can only hold 16-bit values anyway. It would also save you >>>>> some space and several asserts (e.g. in unresolved_klass_at_put(): >>>>> >>>>> >>>>> 274 assert((name_index & 0xffff0000) == 0, "must be"); >>>>> 275 assert((resolved_klass_index & 0xffff0000) == 0, "must >>>>> be"); >>>> >>>> I think the HotSpot convention is to use ints as parameter and >>>> return types, >>>> for values that are actually 16-bits or less, like here in >>>> constantPool.hpp: >>>> >>>> void field_at_put(int which, int class_index, int >>>> name_and_type_index) { >>>> tag_at_put(which, JVM_CONSTANT_Fieldref); >>>> *int_at_addr(which) = ((jint) name_and_type_index<<16) | >>>> class_index; >>>> } >>>> >>>> I am not sure what the reasons are. It could be that the parameters >>>> usually >>>> need to be computed arithmetically, and it's much easier for the >>>> caller of >>>> the method to use ints -- otherwise you will get lots of compiler >>>> warnings >>>> which would force you to use lots of casting, resulting in code >>>> that's hard >>>> to read and probably incorrect. >>>> >>> OK, but you could still use shorts in the the object to save space, >>> although I'm not sure how much that will save in total. But if nobody >>> else cares, I'm fine with the current version. >> >> The CPKlassSlot objects are stored only on the stack, so the savings >> is not worth the trouble of adding extract type casts. >> >> Inside the ConstantPool itself, the name_index and >> resolved_klass_index are stored as a pair of 16-bit values. >> >>>>> 2. What do you mean by: >>>>> >>>>> 106 // ... 
will be changed to support compressed pointers >>>>> 107 Array* _resolved_klasses; >>>> >>>> Sorry the comment isn't very clear. How about this? >>>> >>>> 106 // Consider using an array of compressed klass pointers to >>>> // save space on 64-bit platforms. >>>> 107 Array* _resolved_klasses; >>>> >>> Sorry I still didn't get it? Do you mean you want to use array of >>> "narrowKlass" (i.e. unsigned int)? But using compressed class pointers >>> is a runtime decision while this is a compile time decision. >> >> I haven't figured out how to do it yet :-) >> >> Most likely, it will be something like: >> >> union { >> Array* X; >> Array* Y; >> } _resolved_klasses; >> >> and you need to decide at run time whether to use X or Y. >> >> - Ioi >>>>> 3. Why don't we need the call to "release_tag_at_put()" in >>>>> "klass_at_put(int class_index, Klass* k)"? "klass_at_put(int >>>>> class_index, Klass* k)" is used from >>>>> "ClassFileParser::fill_instance_klass() and before your change that >>>>> function used the previous version of "klass_at_put(int class_index, >>>>> Klass* k)" which did call "release_tag_at_put()". >>>> >>>> Good catch. I'll add the following, because the class is now resolved. >>>> >>>> release_tag_at_put(class_index, JVM_CONSTANT_UnresolvedClass); >>>>> 4. In ConstantPool::copy_entry_to() you've changed the behavior for >>>>> tags JVM_CONSTANT_Class, JVM_CONSTANT_UnresolvedClass, >>>>> JVM_CONSTANT_UnresolvedClassInError. Before, the resolved klass was >>>>> copied to the new constant pool if one existed but now you always >>>>> only >>>>> copy a class_index to the new constant pool (even if a resolved klass >>>>> existed). Is that OK? E.g. won't this lead to a new resolving for the >>>>> new constant pool and will this have performance impacts or other >>>>> side >>>>> effects? 
>>>> I think Coleen has answered this in a separate mail :-) >>>> >>>> Thanks >>>> - Ioi >>>> >>>>> Thanks again for doing this nice change and best regards, >>>>> Volker >>>>> >>>>> >>>>> On Sun, Apr 16, 2017 at 11:33 AM, Ioi Lam wrote: >>>>>> Hi Lois, >>>>>> >>>>>> I have updated the patch to include your comments, and fixes the >>>>>> handling >>>>>> of >>>>>> anonymous classes. I also added some more comments regarding the >>>>>> _temp_resolved_klass_index: >>>>>> >>>>>> (delta from last webrev) >>>>>> >>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03.delta/ >>>>>> >>>>>> >>>>>> (full webrev) >>>>>> >>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03/ >>>>>> >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>>>> >>>>>> On 4/15/17 2:31 AM, Lois Foltan wrote: >>>>>>> On 4/14/2017 11:30 AM, Ioi Lam wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 4/14/17 1:31 PM, Ioi Lam wrote: >>>>>>>>> HI Lois, >>>>>>>>> >>>>>>>>> Thanks for the review. Please see my comments in-line. >>>>>>>>> >>>>>>>>> On 4/14/17 4:32 AM, Lois Foltan wrote: >>>>>>>>>> Hi Ioi, >>>>>>>>>> >>>>>>>>>> Looks really good. A couple of comments: >>>>>>>>>> >>>>>>>>>> src/share/vm/classfile/classFileParser.cpp: >>>>>>>>>> * line #5676 - I'm not sure I completely understand the logic >>>>>>>>>> surrounding anonymous classes. Coleen and I discussed >>>>>>>>>> earlier today >>>>>>>>>> and I >>>>>>>>>> came away from that discussion with the idea that the only >>>>>>>>>> classes >>>>>>>>>> being >>>>>>>>>> patched currently are anonymous classes. >>>>>>>>> Line 5676 ... 
>>>>>>>>> >>>>>>>>> 5676 if (is_anonymous()) { >>>>>>>>> 5677 _max_num_patched_klasses ++; // for patching the >>>>>>>>> class >>>>>>>>> index >>>>>>>>> 5678 } >>>>>>>>> >>>>>>>>> corresponds to >>>>>>>>> >>>>>>>>> 5361 ik->set_name(_class_name); >>>>>>>>> 5362 >>>>>>>>> 5363 if (is_anonymous()) { >>>>>>>>> 5364 // I am well known to myself >>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>> 5366 } >>>>>>>>> >>>>>>>>> Even though the class is "anonymous", it actually has a name. >>>>>>>>> ik->name() >>>>>>>>> probably is part of the constant pool, but I am not 100% sure. >>>>>>>>> Also, I >>>>>>>>> would >>>>>>>>> need to search the constant pool to find the index for >>>>>>>>> ik->name(). So >>>>>>>>> I just >>>>>>>>> got lazy here and use the same logic in >>>>>>>>> ConstantPool::patch_class() to >>>>>>>>> append ik->name() to the end of the constant pool. >>>>>>>>> >>>>>>>>> "Anonymous" actually means "the class cannot be looked up by >>>>>>>>> name in >>>>>>>>> the >>>>>>>>> SystemDictionary". I think we need a better terminology :-) >>>>>>>>> >>>>>>>> I finally realized why we need the "eagerly resolve" on line >>>>>>>> 5365. I'll >>>>>>>> modify the comments to the following: >>>>>>>> >>>>>>>> // _this_class_index is a CONSTANT_Class entry that >>>>>>>> refers to this >>>>>>>> // anonymous class itself. If this class needs to refer >>>>>>>> to its own >>>>>>>> methods or >>>>>>>> // fields, it would use a CONSTANT_MethodRef, etc, which >>>>>>>> would >>>>>>>> reference >>>>>>>> // _this_class_index. However, because this class is >>>>>>>> anonymous >>>>>>>> (it's >>>>>>>> // not stored in SystemDictionary), _this_class_index >>>>>>>> cannot be >>>>>>>> resolved >>>>>>>> // with ConstantPool::klass_at_impl, which does a >>>>>>>> SystemDictionary >>>>>>>> lookup. >>>>>>>> // Therefore, we must eagerly resolve _this_class_index now. >>>>>>>> >>>>>>>> So, Lois is right. Line 5676 is not necessary. 
I will revise >>>>>>>> the code >>>>>>>> to >>>>>>>> do the "eager resolution" without using >>>>>>>> ClassFileParser::patch_class. >>>>>>>> I'll >>>>>>>> post the updated code later. >>>>>>> >>>>>>> Thanks Ioi for studying this and explaining! Look forward to >>>>>>> seeing the >>>>>>> updated webrev. >>>>>>> Lois >>>>>>> >>>>>>>> Thanks >>>>>>>> - Ioi >>>>>>>> >>>>>>>>>> So a bit confused as why the check on line #5676 and a check >>>>>>>>>> for a >>>>>>>>>> java/lang/Class on line #5684. >>>>>>>>> 5683 Handle patch = cp_patch_at(i); >>>>>>>>> 5684 if (java_lang_String::is_instance(patch()) || >>>>>>>>> java_lang_Class::is_instance(patch())) { >>>>>>>>> 5685 // We need to append the names of the patched >>>>>>>>> classes >>>>>>>>> to >>>>>>>>> the end of the constant pool, >>>>>>>>> 5686 // because a patched class may have a Utf8 name >>>>>>>>> that's >>>>>>>>> not already included in the >>>>>>>>> 5687 // original constant pool. >>>>>>>>> 5688 // >>>>>>>>> 5689 // Note that a String in cp_patch_at(i) may be >>>>>>>>> used to >>>>>>>>> patch a Utf8, a String, or a Class. >>>>>>>>> 5690 // At this point, we don't know the tag for >>>>>>>>> index i >>>>>>>>> yet, >>>>>>>>> because we haven't parsed the >>>>>>>>> 5691 // constant pool. So we can only assume the >>>>>>>>> worst -- >>>>>>>>> every String is used to patch a Class. >>>>>>>>> 5692 _max_num_patched_klasses ++; >>>>>>>>> >>>>>>>>> Line 5684 checks for all objects in the cp_patch array. Later, >>>>>>>>> when >>>>>>>>> ClassFileParser::patch_constant_pool() is called, any objects >>>>>>>>> that are >>>>>>>>> either Class or String could be treated as a Klass: >>>>>>>>> >>>>>>>>> 724 void ClassFileParser::patch_constant_pool(ConstantPool* >>>>>>>>> cp, >>>>>>>>> 725 int index, >>>>>>>>> 726 Handle patch, >>>>>>>>> 727 TRAPS) { >>>>>>>>> ... 
>>>>>>>>> 732 switch (cp->tag_at(index).value()) { >>>>>>>>> 733 >>>>>>>>> 734 case JVM_CONSTANT_UnresolvedClass: { >>>>>>>>> 735 // Patching a class means pre-resolving it. >>>>>>>>> 736 // The name in the constant pool is ignored. >>>>>>>>> 737 if (java_lang_Class::is_instance(patch())) { >>>>>>>>> 738 >>>>>>>>> guarantee_property(!java_lang_Class::is_primitive(patch()), >>>>>>>>> 739 "Illegal class patch at %d >>>>>>>>> in class >>>>>>>>> file >>>>>>>>> %s", >>>>>>>>> 740 index, CHECK); >>>>>>>>> 741 Klass* k = java_lang_Class::as_Klass(patch()); >>>>>>>>> 742 patch_class(cp, index, k, k->name()); >>>>>>>>> 743 } else { >>>>>>>>> 744 guarantee_property(java_lang_String::is_instance(patch()), >>>>>>>>> 745 "Illegal class patch at %d >>>>>>>>> in class >>>>>>>>> file >>>>>>>>> %s", >>>>>>>>> 746 index, CHECK); >>>>>>>>> 747 Symbol* const name = >>>>>>>>> java_lang_String::as_symbol(patch(), >>>>>>>>> CHECK); >>>>>>>>> 748 patch_class(cp, index, NULL, name); >>>>>>>>> 749 } >>>>>>>>> 750 break; >>>>>>>>> 751 } >>>>>>>>> >>>>>>>>>> Could the is_anonymous() if statement be combined into the loop? >>>>>>>>> >>>>>>>>> I think the answer is no. At line 5365, there is no guarantee >>>>>>>>> that >>>>>>>>> ik->name() is in the cp_patch array. >>>>>>>>> >>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>> >>>>>>>>>> Also why not do this calculation in the rewriter or is >>>>>>>>>> that too >>>>>>>>>> late? >>>>>>>>>> >>>>>>>>> Line 5676 and 5684 need to be executed BEFORE the constant >>>>>>>>> pool and >>>>>>>>> the >>>>>>>>> associated tags array is allocated (both of which are fixed >>>>>>>>> size, and >>>>>>>>> cannot >>>>>>>>> be expanded), which is way before the rewriter is run. At this >>>>>>>>> point, >>>>>>>>> we >>>>>>>>> don't know what cp->tag_at(index) is (line #732), so the code >>>>>>>>> needs to >>>>>>>>> make >>>>>>>>> a worst-case estimate on how long the CP/tags should be. 
>>>>>>>>> >>>>>>>>>> * line #5677, 5692 - a nit but I think the convention is to >>>>>>>>>> not have >>>>>>>>>> a >>>>>>>>>> space after the variable name and between the post increment >>>>>>>>>> operator. >>>>>>>>>> >>>>>>>>> Fixed. >>>>>>>>>> src/share/vm/classfile/constantPool.hpp: >>>>>>>>>> I understand the concept behind >>>>>>>>>> _invalid_resolved_klass_index, but it >>>>>>>>>> really is not so much invalid as temporary for class >>>>>>>>>> redefinition >>>>>>>>>> purposes, >>>>>>>>>> as you explain in ConstantPool::allocate_resolved_klasses. >>>>>>>>>> Please >>>>>>>>>> consider >>>>>>>>>> renaming to _temp_unresolved_klass_index. And whether you >>>>>>>>>> choose to >>>>>>>>>> rename >>>>>>>>>> the field or not, please add a one line comment ahead of >>>>>>>>>> ConstantPool::temp_unresolved_klass_at_put that only class >>>>>>>>>> redefinition uses >>>>>>>>>> this currently. >>>>>>>>>> >>>>>>>>> Good idea. Will do. >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>>> Great change, thanks! >>>>>>>>>> Lois >>>>>>>>>> >>>>>>>>>> On 4/13/2017 4:56 AM, Ioi Lam wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> Thanks for the comments. Here's a delta from the last patch >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v02/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> In addition to your requests, I made these changes: >>>>>>>>>>> >>>>>>>>>>> [1] To consolidate the multiple extract_high/low code, I've >>>>>>>>>>> added >>>>>>>>>>> CPKlassSlot, so the code is cleaner: >>>>>>>>>>> >>>>>>>>>>> CPKlassSlot kslot = this_cp->klass_slot_at(which); >>>>>>>>>>> int resolved_klass_index = kslot.resolved_klass_index(); >>>>>>>>>>> int name_index = kslot.name_index(); >>>>>>>>>>> >>>>>>>>>>> [2] Renamed ConstantPool::is_shared_quick() to >>>>>>>>>>> ConstantPool::is_shared(). The C++ compiler should be able >>>>>>>>>>> to pick >>>>>>>>>>> this >>>>>>>>>>> function over MetaspaceObj::is_shared(). 
>>>>>>>>>>> >>>>>>>>>>> [3] Massaged the CDS region size set-up code a little to pass >>>>>>>>>>> internal >>>>>>>>>>> tests, because RO/RW ratio has changed. I didn't spend too >>>>>>>>>>> much time >>>>>>>>>>> picking >>>>>>>>>>> the "right" sizes, as this code will be obsoleted soon with >>>>>>>>>>> JDK-8072061 >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> - Ioi >>>>>>>>>>> >>>>>>>>>>> On 4/13/17 6:40 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> >>>>>>>>>>>> This looks really good! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/src/share/vm/oops/constantPool.cpp.udiff.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> + // Add one extra element to tags for storing >>>>>>>>>>>> ConstantPool::flags(). >>>>>>>>>>>> + Array* tags = >>>>>>>>>>>> MetadataFactory::new_writeable_array(loader_data, >>>>>>>>>>>> length+1, 0, >>>>>>>>>>>> CHECK_NULL); ... + assert(tags->length()-1 == _length, >>>>>>>>>>>> "invariant"); // >>>>>>>>>>>> tags->at(_length) is flags() >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I think this is left over, since _flags didn't get moved >>>>>>>>>>>> after all. >>>>>>>>>>>> >>>>>>>>>>>> + Klass** adr = >>>>>>>>>>>> this_cp->resolved_klasses()->adr_at(resolved_klass_index); >>>>>>>>>>>> + OrderAccess::release_store_ptr((Klass* volatile *)adr, k); >>>>>>>>>>>> + // The interpreter assumes when the tag is stored, the >>>>>>>>>>>> klass is >>>>>>>>>>>> resolved >>>>>>>>>>>> + // and the Klass* is a klass rather than a Symbol*, so we >>>>>>>>>>>> need >>>>>>>>>>>> + // hardware store ordering here. >>>>>>>>>>>> + this_cp->release_tag_at_put(which, JVM_CONSTANT_Class); >>>>>>>>>>>> + return k; >>>>>>>>>>>> >>>>>>>>>>>> The comment still refers to the switch between Symbol* and >>>>>>>>>>>> Klass* >>>>>>>>>>>> in >>>>>>>>>>>> the constant pool. The entry in the Klass array should be >>>>>>>>>>>> NULL. 
>>>>>>>>>>>> >>>>>>>>>>>> + int name_index = >>>>>>>>>>>> extract_high_short_from_int(*int_at_addr(which)); >>>>>>>>>>>> >>>>>>>>>>>> Can you put klass_name_index_at() in the constantPool.hpp >>>>>>>>>>>> header >>>>>>>>>>>> file >>>>>>>>>>>> (so it's inlined) and have all the places where you get >>>>>>>>>>>> name_index >>>>>>>>>>>> use this >>>>>>>>>>>> function? So we don't have to know in multiple places that >>>>>>>>>>>> extract_high_short_from_int() is where the name index is. >>>>>>>>>>>> And in >>>>>>>>>>>> constantPool.hpp, for unresolved_klass_at_put() add a >>>>>>>>>>>> comment about >>>>>>>>>>>> what the >>>>>>>>>>>> new format of the constant pool entry is {name_index, >>>>>>>>>>>> resolved_klass_index}. >>>>>>>>>>>> I'm happy to see this work nearing completion! The >>>>>>>>>>>> "constant" pool >>>>>>>>>>>> should >>>>>>>>>>>> be constant! thanks, Coleen >>>>>>>>>>>> On 4/11/17 2:26 AM, Ioi Lam wrote: >>>>>>>>>>>>> Hi,please review the following change >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8171392 >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/ >>>>>>>>>>>>> >>>>>>>>>>>>> *Summary:** * Before: + ConstantPool::klass_at(i) >>>>>>>>>>>>> finds the >>>>>>>>>>>>> Klass from >>>>>>>>>>>>> the i-th slot of ConstantPool. + When a klass is resolved, >>>>>>>>>>>>> the >>>>>>>>>>>>> ConstantPool >>>>>>>>>>>>> is modified to store the Klass pointer. After: + >>>>>>>>>>>>> ConstantPool::klass_at(i) finds the at >>>>>>>>>>>>> this->_resolved_klasses->at(i) + >>>>>>>>>>>>> When a klass is resolved, _resolved_klasses->at(i) is >>>>>>>>>>>>> modified. >>>>>>>>>>>>> In >>>>>>>>>>>>> addition: + I moved _resolved_references and >>>>>>>>>>>>> _reference_map >>>>>>>>>>>>> from >>>>>>>>>>>>> ConstantPool to ConstantPoolCache. + Now _flags is no >>>>>>>>>>>>> longer >>>>>>>>>>>>> modified for shared ConstantPools. 
As a result, none of the >>>>>>>>>>>>> fields in >>>>>>>>>>>>> shared ConstantPools are modified at run time, so we can >>>>>>>>>>>>> move them >>>>>>>>>>>>> into the >>>>>>>>>>>>> RO region in the CDS archive. *Testing:** * - Benchmark >>>>>>>>>>>>> results >>>>>>>>>>>>> show no >>>>>>>>>>>>> performance regression, despite the extra level of >>>>>>>>>>>>> indirection, >>>>>>>>>>>>> which has a >>>>>>>>>>>>> negligible overhead for the interpreter. - RBT testing for >>>>>>>>>>>>> tier2 >>>>>>>>>>>>> and >>>>>>>>>>>>> tier3. *Ports:** * - I have tested only the Oracle-support >>>>>>>>>>>>> ports. For the >>>>>>>>>>>>> aarch64, ppc and s390 ports, I have added some code without >>>>>>>>>>>>> testing (it's >>>>>>>>>>>>> probably incomplete) - Port owners, please check if my >>>>>>>>>>>>> patch >>>>>>>>>>>>> work for you, >>>>>>>>>>>>> and I can incorporate your changes in my push. >>>>>>>>>>>>> Alternatively, you >>>>>>>>>>>>> can wait >>>>>>>>>>>>> for my push and provide fixes (if necessary) in a new >>>>>>>>>>>>> changeset, >>>>>>>>>>>>> and I will >>>>>>>>>>>>> be happy to sponsor it. Thanks - Ioi >>>>>>>>>>> >> > From coleen.phillimore at oracle.com Wed May 3 13:09:33 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 May 2017 09:09:33 -0400 Subject: RFR (L) 8171392 Move Klass pointers outside of ConstantPool entries so ConstantPool can be read-only In-Reply-To: <5909D4DC.8000000@oracle.com> References: <58EC771B.9020202@oracle.com> <35e6276a-ddf1-9149-8588-acb4e13191f5@oracle.com> <58EF3D3A.6020903@oracle.com> <58F05EB5.10009@oracle.com> <58F0EB0E.60904@oracle.com> <58F33A4F.70104@oracle.com> <58FDECFE.5060105@oracle.com> <590327F1.7070200@oracle.com> <5904B17E.9090209@oracle.com> <5909D4DC.8000000@oracle.com> Message-ID: <1642a434-22bc-07be-9ece-7e6bfc69167a@oracle.com> On 5/3/17 9:02 AM, Ioi Lam wrote: > Andrew replied me off-list that he tested the aarch64 part and was > happy about it. Thanks Andrew. 
> > So if there if no further comment, I will push the code as is. thumbs up! Coleen > > Thanks > - Ioi > > On 4/29/17 8:30 AM, Ioi Lam wrote: >> I've updated the patch to include Volker's ppc/s390 port as well as >> his comments. I've also included an updated patch (untested) for >> aarch64 for Andrew Haley to test: >> >> Full patch >> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04/ >> >> >> Delta from the previous version >> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04.delta/ >> >> >> Thanks >> - Ioi >> >> On 4/28/17 4:30 AM, Ioi Lam wrote: >>> >>> >>> On 4/25/17 8:06 AM, Volker Simonis wrote: >>>> On Mon, Apr 24, 2017 at 2:18 PM, Ioi Lam wrote: >>>>> Hi Volker, >>>>> >>>>> >>>>> On 4/21/17 12:02 AM, Volker Simonis wrote: >>>>>> Hi Ioi, >>>>>> >>>>>> thanks once again for considering our ports! Please find the >>>>>> required >>>>>> additions for ppc64/s390x in the following webrew (which is based >>>>>> upon >>>>>> your latest v03 patch): >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8171392_ppc64_s390x/ >>>>>> >>>>> Thanks for the patch. I will integrate it and post an updated webrev. >>>>>> @Martin/@Lucy: could you please have a look at my ppc64/s390x >>>>>> assembly >>>>>> code. I did some tests and I think it should be correct, but >>>>>> maybe you >>>>>> still find some improvements :) >>>>>> >>>>>> Besides that, I have some general questions/comments regarding your >>>>>> change: >>>>>> >>>>>> 1. In constantPool.hpp, why don't you declare the '_name_index' and >>>>>> '_resolved_klass_index' fields with type 'jushort'? As far as I can >>>>>> see, they can only hold 16-bit values anyway. It would also save you >>>>>> some space and several asserts (e.g. 
in unresolved_klass_at_put(): >>>>>> >>>>>> >>>>>> 274 assert((name_index & 0xffff0000) == 0, "must be"); >>>>>> 275 assert((resolved_klass_index & 0xffff0000) == 0, "must >>>>>> be"); >>>>> >>>>> I think the HotSpot convention is to use ints as parameter and >>>>> return types, >>>>> for values that are actually 16-bits or less, like here in >>>>> constantPool.hpp: >>>>> >>>>> void field_at_put(int which, int class_index, int >>>>> name_and_type_index) { >>>>> tag_at_put(which, JVM_CONSTANT_Fieldref); >>>>> *int_at_addr(which) = ((jint) name_and_type_index<<16) | >>>>> class_index; >>>>> } >>>>> >>>>> I am not sure what the reasons are. It could be that the >>>>> parameters usually >>>>> need to be computed arithmetically, and it's much easier for the >>>>> caller of >>>>> the method to use ints -- otherwise you will get lots of compiler >>>>> warnings >>>>> which would force you to use lots of casting, resulting in code >>>>> that's hard >>>>> to read and probably incorrect. >>>>> >>>> OK, but you could still use shorts in the the object to save space, >>>> although I'm not sure how much that will save in total. But if nobody >>>> else cares, I'm fine with the current version. >>> >>> The CPKlassSlot objects are stored only on the stack, so the savings >>> is not worth the trouble of adding extract type casts. >>> >>> Inside the ConstantPool itself, the name_index and >>> resolved_klass_index are stored as a pair of 16-bit values. >>> >>>>>> 2. What do you mean by: >>>>>> >>>>>> 106 // ... will be changed to support compressed pointers >>>>>> 107 Array* _resolved_klasses; >>>>> >>>>> Sorry the comment isn't very clear. How about this? >>>>> >>>>> 106 // Consider using an array of compressed klass pointers to >>>>> // save space on 64-bit platforms. >>>>> 107 Array* _resolved_klasses; >>>>> >>>> Sorry I still didn't get it? Do you mean you want to use array of >>>> "narrowKlass" (i.e. unsigned int)? 
But using compressed class pointers >>>> is a runtime decision while this is a compile time decision. >>> >>> I haven't figured out how to do it yet :-) >>> >>> Most likely, it will be something like: >>> >>> union { >>> Array* X; >>> Array* Y; >>> } _resolved_klasses; >>> >>> and you need to decide at run time whether to use X or Y. >>> >>> - Ioi >>>>>> 3. Why don't we need the call to "release_tag_at_put()" in >>>>>> "klass_at_put(int class_index, Klass* k)"? "klass_at_put(int >>>>>> class_index, Klass* k)" is used from >>>>>> "ClassFileParser::fill_instance_klass() and before your change that >>>>>> function used the previous version of "klass_at_put(int class_index, >>>>>> Klass* k)" which did call "release_tag_at_put()". >>>>> >>>>> Good catch. I'll add the following, because the class is now >>>>> resolved. >>>>> >>>>> release_tag_at_put(class_index, JVM_CONSTANT_UnresolvedClass); >>>>>> 4. In ConstantPool::copy_entry_to() you've changed the behavior for >>>>>> tags JVM_CONSTANT_Class, JVM_CONSTANT_UnresolvedClass, >>>>>> JVM_CONSTANT_UnresolvedClassInError. Before, the resolved klass was >>>>>> copied to the new constant pool if one existed but now you always >>>>>> only >>>>>> copy a class_index to the new constant pool (even if a resolved >>>>>> klass >>>>>> existed). Is that OK? E.g. won't this lead to a new resolving for >>>>>> the >>>>>> new constant pool and will this have performance impacts or other >>>>>> side >>>>>> effects? >>>>> I think Coleen has answered this in a separate mail :-) >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>>> Thanks again for doing this nice change and best regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Sun, Apr 16, 2017 at 11:33 AM, Ioi Lam >>>>>> wrote: >>>>>>> Hi Lois, >>>>>>> >>>>>>> I have updated the patch to include your comments, and fixes the >>>>>>> handling >>>>>>> of >>>>>>> anonymous classes. 
I also added some more comments regarding the >>>>>>> _temp_resolved_klass_index: >>>>>>> >>>>>>> (delta from last webrev) >>>>>>> >>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03.delta/ >>>>>>> >>>>>>> >>>>>>> (full webrev) >>>>>>> >>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03/ >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>> On 4/15/17 2:31 AM, Lois Foltan wrote: >>>>>>>> On 4/14/2017 11:30 AM, Ioi Lam wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 4/14/17 1:31 PM, Ioi Lam wrote: >>>>>>>>>> HI Lois, >>>>>>>>>> >>>>>>>>>> Thanks for the review. Please see my comments in-line. >>>>>>>>>> >>>>>>>>>> On 4/14/17 4:32 AM, Lois Foltan wrote: >>>>>>>>>>> Hi Ioi, >>>>>>>>>>> >>>>>>>>>>> Looks really good. A couple of comments: >>>>>>>>>>> >>>>>>>>>>> src/share/vm/classfile/classFileParser.cpp: >>>>>>>>>>> * line #5676 - I'm not sure I completely understand the logic >>>>>>>>>>> surrounding anonymous classes. Coleen and I discussed >>>>>>>>>>> earlier today >>>>>>>>>>> and I >>>>>>>>>>> came away from that discussion with the idea that the only >>>>>>>>>>> classes >>>>>>>>>>> being >>>>>>>>>>> patched currently are anonymous classes. >>>>>>>>>> Line 5676 ... >>>>>>>>>> >>>>>>>>>> 5676 if (is_anonymous()) { >>>>>>>>>> 5677 _max_num_patched_klasses ++; // for patching the >>>>>>>>>> class >>>>>>>>>> index >>>>>>>>>> 5678 } >>>>>>>>>> >>>>>>>>>> corresponds to >>>>>>>>>> >>>>>>>>>> 5361 ik->set_name(_class_name); >>>>>>>>>> 5362 >>>>>>>>>> 5363 if (is_anonymous()) { >>>>>>>>>> 5364 // I am well known to myself >>>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>>> 5366 } >>>>>>>>>> >>>>>>>>>> Even though the class is "anonymous", it actually has a name. >>>>>>>>>> ik->name() >>>>>>>>>> probably is part of the constant pool, but I am not 100% >>>>>>>>>> sure. 
Also, I >>>>>>>>>> would >>>>>>>>>> need to search the constant pool to find the index for >>>>>>>>>> ik->name(). So >>>>>>>>>> I just >>>>>>>>>> got lazy here and use the same logic in >>>>>>>>>> ConstantPool::patch_class() to >>>>>>>>>> append ik->name() to the end of the constant pool. >>>>>>>>>> >>>>>>>>>> "Anonymous" actually means "the class cannot be looked up by >>>>>>>>>> name in >>>>>>>>>> the >>>>>>>>>> SystemDictionary". I think we need a better terminology :-) >>>>>>>>>> >>>>>>>>> I finally realized why we need the "eagerly resolve" on line >>>>>>>>> 5365. I'll >>>>>>>>> modify the comments to the following: >>>>>>>>> >>>>>>>>> // _this_class_index is a CONSTANT_Class entry that >>>>>>>>> refers to this >>>>>>>>> // anonymous class itself. If this class needs to refer >>>>>>>>> to its own >>>>>>>>> methods or >>>>>>>>> // fields, it would use a CONSTANT_MethodRef, etc, which >>>>>>>>> would >>>>>>>>> reference >>>>>>>>> // _this_class_index. However, because this class is >>>>>>>>> anonymous >>>>>>>>> (it's >>>>>>>>> // not stored in SystemDictionary), _this_class_index >>>>>>>>> cannot be >>>>>>>>> resolved >>>>>>>>> // with ConstantPool::klass_at_impl, which does a >>>>>>>>> SystemDictionary >>>>>>>>> lookup. >>>>>>>>> // Therefore, we must eagerly resolve _this_class_index >>>>>>>>> now. >>>>>>>>> >>>>>>>>> So, Lois is right. Line 5676 is not necessary. I will revise >>>>>>>>> the code >>>>>>>>> to >>>>>>>>> do the "eager resolution" without using >>>>>>>>> ClassFileParser::patch_class. >>>>>>>>> I'll >>>>>>>>> post the updated code later. >>>>>>>> >>>>>>>> Thanks Ioi for studying this and explaining! Look forward to >>>>>>>> seeing the >>>>>>>> updated webrev. >>>>>>>> Lois >>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>>>> So a bit confused as why the check on line #5676 and a check >>>>>>>>>>> for a >>>>>>>>>>> java/lang/Class on line #5684. 
>>>>>>>>>> 5683 Handle patch = cp_patch_at(i); >>>>>>>>>> 5684 if (java_lang_String::is_instance(patch()) || >>>>>>>>>> java_lang_Class::is_instance(patch())) { >>>>>>>>>> 5685 // We need to append the names of the patched >>>>>>>>>> classes >>>>>>>>>> to >>>>>>>>>> the end of the constant pool, >>>>>>>>>> 5686 // because a patched class may have a Utf8 >>>>>>>>>> name that's >>>>>>>>>> not already included in the >>>>>>>>>> 5687 // original constant pool. >>>>>>>>>> 5688 // >>>>>>>>>> 5689 // Note that a String in cp_patch_at(i) may be >>>>>>>>>> used to >>>>>>>>>> patch a Utf8, a String, or a Class. >>>>>>>>>> 5690 // At this point, we don't know the tag for >>>>>>>>>> index i >>>>>>>>>> yet, >>>>>>>>>> because we haven't parsed the >>>>>>>>>> 5691 // constant pool. So we can only assume the >>>>>>>>>> worst -- >>>>>>>>>> every String is used to patch a Class. >>>>>>>>>> 5692 _max_num_patched_klasses ++; >>>>>>>>>> >>>>>>>>>> Line 5684 checks for all objects in the cp_patch array. >>>>>>>>>> Later, when >>>>>>>>>> ClassFileParser::patch_constant_pool() is called, any objects >>>>>>>>>> that are >>>>>>>>>> either Class or String could be treated as a Klass: >>>>>>>>>> >>>>>>>>>> 724 void >>>>>>>>>> ClassFileParser::patch_constant_pool(ConstantPool* cp, >>>>>>>>>> 725 int index, >>>>>>>>>> 726 Handle patch, >>>>>>>>>> 727 TRAPS) { >>>>>>>>>> ... >>>>>>>>>> 732 switch (cp->tag_at(index).value()) { >>>>>>>>>> 733 >>>>>>>>>> 734 case JVM_CONSTANT_UnresolvedClass: { >>>>>>>>>> 735 // Patching a class means pre-resolving it. >>>>>>>>>> 736 // The name in the constant pool is ignored. 
>>>>>>>>>> 737 if (java_lang_Class::is_instance(patch())) { >>>>>>>>>> 738 >>>>>>>>>> guarantee_property(!java_lang_Class::is_primitive(patch()), >>>>>>>>>> 739 "Illegal class patch at %d >>>>>>>>>> in class >>>>>>>>>> file >>>>>>>>>> %s", >>>>>>>>>> 740 index, CHECK); >>>>>>>>>> 741 Klass* k = java_lang_Class::as_Klass(patch()); >>>>>>>>>> 742 patch_class(cp, index, k, k->name()); >>>>>>>>>> 743 } else { >>>>>>>>>> 744 >>>>>>>>>> guarantee_property(java_lang_String::is_instance(patch()), >>>>>>>>>> 745 "Illegal class patch at %d >>>>>>>>>> in class >>>>>>>>>> file >>>>>>>>>> %s", >>>>>>>>>> 746 index, CHECK); >>>>>>>>>> 747 Symbol* const name = >>>>>>>>>> java_lang_String::as_symbol(patch(), >>>>>>>>>> CHECK); >>>>>>>>>> 748 patch_class(cp, index, NULL, name); >>>>>>>>>> 749 } >>>>>>>>>> 750 break; >>>>>>>>>> 751 } >>>>>>>>>> >>>>>>>>>>> Could the is_anonymous() if statement be combined into the >>>>>>>>>>> loop? >>>>>>>>>> >>>>>>>>>> I think the answer is no. At line 5365, there is no guarantee >>>>>>>>>> that >>>>>>>>>> ik->name() is in the cp_patch array. >>>>>>>>>> >>>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>>> >>>>>>>>>>> Also why not do this calculation in the rewriter or is >>>>>>>>>>> that too >>>>>>>>>>> late? >>>>>>>>>>> >>>>>>>>>> Line 5676 and 5684 need to be executed BEFORE the constant >>>>>>>>>> pool and >>>>>>>>>> the >>>>>>>>>> associated tags array is allocated (both of which are fixed >>>>>>>>>> size, and >>>>>>>>>> cannot >>>>>>>>>> be expanded), which is way before the rewriter is run. At >>>>>>>>>> this point, >>>>>>>>>> we >>>>>>>>>> don't know what cp->tag_at(index) is (line #732), so the code >>>>>>>>>> needs to >>>>>>>>>> make >>>>>>>>>> a worst-case estimate on how long the CP/tags should be. 
>>>>>>>>>> >>>>>>>>>>> * line #5677, 5692 - a nit but I think the convention is to >>>>>>>>>>> not have >>>>>>>>>>> a >>>>>>>>>>> space after the variable name and between the post increment >>>>>>>>>>> operator. >>>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>>> src/share/vm/classfile/constantPool.hpp: >>>>>>>>>>> I understand the concept behind >>>>>>>>>>> _invalid_resolved_klass_index, but it >>>>>>>>>>> really is not so much invalid as temporary for class >>>>>>>>>>> redefinition >>>>>>>>>>> purposes, >>>>>>>>>>> as you explain in ConstantPool::allocate_resolved_klasses. >>>>>>>>>>> Please >>>>>>>>>>> consider >>>>>>>>>>> renaming to _temp_unresolved_klass_index. And whether you >>>>>>>>>>> choose to >>>>>>>>>>> rename >>>>>>>>>>> the field or not, please add a one line comment ahead of >>>>>>>>>>> ConstantPool::temp_unresolved_klass_at_put that only class >>>>>>>>>>> redefinition uses >>>>>>>>>>> this currently. >>>>>>>>>>> >>>>>>>>>> Good idea. Will do. >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> - Ioi >>>>>>>>>> >>>>>>>>>>> Great change, thanks! >>>>>>>>>>> Lois >>>>>>>>>>> >>>>>>>>>>> On 4/13/2017 4:56 AM, Ioi Lam wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the comments. Here's a delta from the last patch >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v02/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> In addition to your requests, I made these changes: >>>>>>>>>>>> >>>>>>>>>>>> [1] To consolidate the multiple extract_high/low code, I've >>>>>>>>>>>> added >>>>>>>>>>>> CPKlassSlot, so the code is cleaner: >>>>>>>>>>>> >>>>>>>>>>>> CPKlassSlot kslot = this_cp->klass_slot_at(which); >>>>>>>>>>>> int resolved_klass_index = kslot.resolved_klass_index(); >>>>>>>>>>>> int name_index = kslot.name_index(); >>>>>>>>>>>> >>>>>>>>>>>> [2] Renamed ConstantPool::is_shared_quick() to >>>>>>>>>>>> ConstantPool::is_shared(). 
The C++ compiler should be able >>>>>>>>>>>> to pick >>>>>>>>>>>> this >>>>>>>>>>>> function over MetaspaceObj::is_shared(). >>>>>>>>>>>> >>>>>>>>>>>> [3] Massaged the CDS region size set-up code a little to pass >>>>>>>>>>>> internal >>>>>>>>>>>> tests, because RO/RW ratio has changed. I didn't spend too >>>>>>>>>>>> much time >>>>>>>>>>>> picking >>>>>>>>>>>> the "right" sizes, as this code will be obsoleted soon with >>>>>>>>>>>> JDK-8072061 >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> - Ioi >>>>>>>>>>>> >>>>>>>>>>>> On 4/13/17 6:40 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> This looks really good! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/src/share/vm/oops/constantPool.cpp.udiff.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> + // Add one extra element to tags for storing >>>>>>>>>>>>> ConstantPool::flags(). >>>>>>>>>>>>> + Array* tags = >>>>>>>>>>>>> MetadataFactory::new_writeable_array(loader_data, >>>>>>>>>>>>> length+1, 0, >>>>>>>>>>>>> CHECK_NULL); ... + assert(tags->length()-1 == _length, >>>>>>>>>>>>> "invariant"); // >>>>>>>>>>>>> tags->at(_length) is flags() >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I think this is left over, since _flags didn't get moved >>>>>>>>>>>>> after all. >>>>>>>>>>>>> >>>>>>>>>>>>> + Klass** adr = >>>>>>>>>>>>> this_cp->resolved_klasses()->adr_at(resolved_klass_index); >>>>>>>>>>>>> + OrderAccess::release_store_ptr((Klass* volatile *)adr, k); >>>>>>>>>>>>> + // The interpreter assumes when the tag is stored, the >>>>>>>>>>>>> klass is >>>>>>>>>>>>> resolved >>>>>>>>>>>>> + // and the Klass* is a klass rather than a Symbol*, so >>>>>>>>>>>>> we need >>>>>>>>>>>>> + // hardware store ordering here. 
>>>>>>>>>>>>> + this_cp->release_tag_at_put(which, JVM_CONSTANT_Class); >>>>>>>>>>>>> + return k; >>>>>>>>>>>>> >>>>>>>>>>>>> The comment still refers to the switch between Symbol* and >>>>>>>>>>>>> Klass* >>>>>>>>>>>>> in >>>>>>>>>>>>> the constant pool. The entry in the Klass array should be >>>>>>>>>>>>> NULL. >>>>>>>>>>>>> >>>>>>>>>>>>> + int name_index = >>>>>>>>>>>>> extract_high_short_from_int(*int_at_addr(which)); >>>>>>>>>>>>> >>>>>>>>>>>>> Can you put klass_name_index_at() in the constantPool.hpp >>>>>>>>>>>>> header >>>>>>>>>>>>> file >>>>>>>>>>>>> (so it's inlined) and have all the places where you get >>>>>>>>>>>>> name_index >>>>>>>>>>>>> use this >>>>>>>>>>>>> function? So we don't have to know in multiple places that >>>>>>>>>>>>> extract_high_short_from_int() is where the name index is. >>>>>>>>>>>>> And in >>>>>>>>>>>>> constantPool.hpp, for unresolved_klass_at_put() add a >>>>>>>>>>>>> comment about >>>>>>>>>>>>> what the >>>>>>>>>>>>> new format of the constant pool entry is {name_index, >>>>>>>>>>>>> resolved_klass_index}. >>>>>>>>>>>>> I'm happy to see this work nearing completion! The >>>>>>>>>>>>> "constant" pool >>>>>>>>>>>>> should >>>>>>>>>>>>> be constant! thanks, Coleen >>>>>>>>>>>>> On 4/11/17 2:26 AM, Ioi Lam wrote: >>>>>>>>>>>>>> Hi,please review the following change >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8171392 >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> *Summary:** * Before: + ConstantPool::klass_at(i) >>>>>>>>>>>>>> finds the >>>>>>>>>>>>>> Klass from >>>>>>>>>>>>>> the i-th slot of ConstantPool. + When a klass is >>>>>>>>>>>>>> resolved, the >>>>>>>>>>>>>> ConstantPool >>>>>>>>>>>>>> is modified to store the Klass pointer. 
After: + >>>>>>>>>>>>>> ConstantPool::klass_at(i) finds the at >>>>>>>>>>>>>> this->_resolved_klasses->at(i) + >>>>>>>>>>>>>> When a klass is resolved, _resolved_klasses->at(i) is >>>>>>>>>>>>>> modified. >>>>>>>>>>>>>> In >>>>>>>>>>>>>> addition: + I moved _resolved_references and >>>>>>>>>>>>>> _reference_map >>>>>>>>>>>>>> from >>>>>>>>>>>>>> ConstantPool to ConstantPoolCache. + Now _flags is no >>>>>>>>>>>>>> longer >>>>>>>>>>>>>> modified for shared ConstantPools. As a result, none of >>>>>>>>>>>>>> the >>>>>>>>>>>>>> fields in >>>>>>>>>>>>>> shared ConstantPools are modified at run time, so we can >>>>>>>>>>>>>> move them >>>>>>>>>>>>>> into the >>>>>>>>>>>>>> RO region in the CDS archive. *Testing:** * - Benchmark >>>>>>>>>>>>>> results >>>>>>>>>>>>>> show no >>>>>>>>>>>>>> performance regression, despite the extra level of >>>>>>>>>>>>>> indirection, >>>>>>>>>>>>>> which has a >>>>>>>>>>>>>> negligible overhead for the interpreter. - RBT testing >>>>>>>>>>>>>> for tier2 >>>>>>>>>>>>>> and >>>>>>>>>>>>>> tier3. *Ports:** * - I have tested only the Oracle-support >>>>>>>>>>>>>> ports. For the >>>>>>>>>>>>>> aarch64, ppc and s390 ports, I have added some code without >>>>>>>>>>>>>> testing (it's >>>>>>>>>>>>>> probably incomplete) - Port owners, please check if my >>>>>>>>>>>>>> patch >>>>>>>>>>>>>> work for you, >>>>>>>>>>>>>> and I can incorporate your changes in my push. >>>>>>>>>>>>>> Alternatively, you >>>>>>>>>>>>>> can wait >>>>>>>>>>>>>> for my push and provide fixes (if necessary) in a new >>>>>>>>>>>>>> changeset, >>>>>>>>>>>>>> and I will >>>>>>>>>>>>>> be happy to sponsor it. 
Thanks - Ioi >>>>>>>>>>>> >>> >> > From gromero at linux.vnet.ibm.com Wed May 3 13:27:08 2017 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 3 May 2017 10:27:08 -0300 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <59000AC0.7050507@linux.vnet.ibm.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> Message-ID: <5909DAAC.3070202@linux.vnet.ibm.com> Hi community, I understand that there is nothing that can be done additionally regarding this issue, at this point, on the PPC64 side. It's a change in the shared code - but that in effect does not change anything in the numa detection mechanism for other platforms - and hence it's necessary a conjoint community effort to review the change and a sponsor to run it against the JPRT. I know it's a stabilizing moment of OpenJDK 9, but since that issue is of great concern on PPC64 (specially on POWER8 machines) I would be very glad if the community could point out directions on how that change could move on. Thank you! Best regards, Gustavo On 25-04-2017 23:49, Gustavo Romero wrote: > Dear Volker, > > On 24-04-2017 14:08, Volker Simonis wrote: >> Hi Gustavo, >> >> thanks for addressing this problem and sorry for my late reply. I >> think this is a good change which definitely improves the situation >> for uncommon NUMA configurations without changing the handling for >> common topologies. > > Thanks a lot for reviewing the change! > > >> It would be great if somebody could run this trough JPRT, but as >> Gustavo mentioned, I don't expect any regressions. >> >> @Igor: I think you've been the original author of the NUMA-aware >> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >> linux"). 
If you could find some spare minutes to take a look at this >> change, your comment would be very much appreciated :) >> >> Following some minor comments from me: >> >> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >> to get the actual number of configured nodes. This is good and >> certainly an improvement over the previous implementation. However, >> the man page for numa_num_configured_nodes() mentions that the >> returned count may contain currently disabled nodes. Do we currently >> handle disabled nodes? What will be the consequence if we would use >> such a disabled node (e.g. mbind() warnings)? > > In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in > found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just > returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the > number of nodes with memory in the system. To the best of my knowledge there is > no system configuration on Linux/PPC64 that could match such a notion of > "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in > that dir and just the ones with memory will be taken into account. If it's > disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no > mbind() tried against it). > > On Power it's possible to have a numa node without memory (memory-less node, a > case covered in this change), a numa node without cpus at all but with memory > (a configured node anyway, so a case already covered) but to disable a specific > numa node so it does not appear in /sys/devices/system/node/* it's only possible > from the inners of the control module. Or other rare condition not invisible / > adjustable from the OS. Also I'm not aware of a case where a node is in this > dir but is at the same time flagged as something like "disabled". There are > cpu/memory hotplugs, but that does not change numa nodes status AFAIK. 
> > [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 > [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 > > >> - the same question applies to the usage of >> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >> Does isnode_in_configured_nodes() (i.e. the node set defined by >> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >> this be a potential problem (i.e. if we use a disabled node). > > On the meaning of "disabled nodes", it's the same case as above, so to the > best of knowledge it's not a potential problem. > > Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), > i.e. "all nodes on which the calling task may allocate memory". It's exactly > the same pointer returned by numa_get_membind() v2 [3] which: > > "returns the mask of nodes from which memory can currently be allocated" > > and that is used, for example, in "numactl --show" to show nodes from where > memory can be allocated [4, 5]. > > [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 > [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 > [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 > > >> - I'd like to suggest renaming the 'index' part of the following >> variables and functions to 'nindex' ('node_index' is probably to long) >> in the following code, to emphasize that we have node indexes pointing >> to actual, not always consecutive node numbers: >> >> 2879 // Create an index -> node mapping, since nodes are not >> always consecutive >> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >> GrowableArray(0, true); >> 2881 rebuild_index_to_node_map(); > > Simple change but much better to read indeed. Done. 
> > >> - can you please wrap the following one-line else statement into curly >> braces (it's more readable and we usually do it that way in HotSpot >> although there are no formal style guidelines :) >> >> 2953 } else >> 2954 // Current node is already a configured node. >> 2955 closest_node = index_to_node()->at(i); > > Done. > > >> - in os::Linux::rebuild_cpu_to_node_map(), if you set >> 'closest_distance' to INT_MAX at the beginning of the loop, you can >> later avoid the check for '|| !closest_distance'. Also, according to >> the man page, numa_distance() returns 0 if it can not determine the >> distance. So with the above change, the condition on line 2974 should >> read: >> >> 2947 if (distance && distance < closest_distance) { >> > > Sure, much better to set the initial condition as distant as possible and > adjust to a closer one bit by bit improving the if condition. Done. > > >> Finally, and not directly related to your change, I'd suggest the >> following clean-ups: >> >> - remove the usage of 'NCPUS = 32768' in >> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >> unclear to me and probably related to an older version/problem of >> libnuma? I think we should simply use >> numa_allocate_cpumask()/numa_free_cpumask() instead. >> >> - we still use the NUMA version 1 function prototypes (e.g. >> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >> also "numa_interleave_memory()" and maybe others). I think we should >> switch all prototypes to the new NUMA version 2 interface which you've >> already used for the new functions which you've added. > > I agree. Could I open a new bug to address these clean-ups? > > >> That said, I think these changes all require libnuma 2.0 (see >> os::Linux::libnuma_dlsym). So before starting this, you should make >> sure that libnuma 2.0 is available on all platforms to which you'd >> like to down-port this change. 
For jdk10 we could definitely do it, >> for jdk9 probably also, for jdk8 I'm not so sure. > > libnuma v1 last release dates back to 2008, but any idea how could I check that > for sure since it's on shared code? > > new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ > > Thank you! > > Best regards, > Gustavo > > >> Regards, >> Volker >> >> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >> wrote: >>> Hi, >>> >>> Any update on it? >>> >>> Thank you. >>> >>> Regards, >>> Gustavo >>> >>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>> Hi, >>>> >>>> Could the following webrev be reviewed please? >>>> >>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>> exist in the system. >>>> >>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>> >>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>> consecutive and have memory, for example in a numa topology like: >>>> >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 8 16 24 32 >>>> node 0 size: 65258 MB >>>> node 0 free: 34 MB >>>> node 1 cpus: 40 48 56 64 72 >>>> node 1 size: 65320 MB >>>> node 1 free: 150 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 20 >>>> 1: 20 10, >>>> >>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>> topology like: >>>> >>>> available: 4 nodes (0-1,16-17) >>>> node 0 cpus: 0 8 16 24 32 >>>> node 0 size: 130706 MB >>>> node 0 free: 7729 MB >>>> node 1 cpus: 40 48 56 64 72 >>>> node 1 size: 0 MB >>>> node 1 free: 0 MB >>>> node 16 cpus: 80 88 96 104 112 >>>> node 16 size: 130630 MB >>>> node 16 free: 5282 MB >>>> node 17 cpus: 120 128 136 144 152 >>>> node 17 size: 0 MB >>>> node 17 free: 0 MB >>>> node distances: >>>> node 0 1 16 17 >>>> 0: 10 20 40 40 >>>> 1: 20 10 40 40 >>>> 16: 40 40 10 20 >>>> 17: 40 40 20 10, >>>> >>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>> no memory. 
>>>> >>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>> id as a hint that is not available in the system to be bound (it will receive >>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>> messages: >>>> >>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>> >>>> That change improves the detection by making the JVM numa API aware of the >>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>> be available: >>>> >>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>> >>>> The change has no effect on numa topologies were the problem does not occur, >>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>> numa topologies where memory-less nodes exist (like in the last example above), >>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>> to the closest node, otherwise they would be not associate to any node and >>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising the >>>> performance. >>>> >>>> I found no regressions on x64 for the following numa topology: >>>> >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>> node 0 size: 24102 MB >>>> node 0 free: 19806 MB >>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>> node 1 size: 24190 MB >>>> node 1 free: 21951 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 21 >>>> 1: 21 10 >>>> >>>> I understand that fixing the current numa detection is a prerequisite to enable >>>> UseNUMA by the default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>> >>>> Thank you. 
>>>> >>>> >>>> Best regards, >>>> Gustavo >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>> >>> >> > From volker.simonis at gmail.com Wed May 3 14:33:25 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 3 May 2017 16:33:25 +0200 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <59000AC0.7050507@linux.vnet.ibm.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> Message-ID: Hi Gustavo, thanks for the latest corrections. I think your change looks good now. On Wed, Apr 26, 2017 at 4:49 AM, Gustavo Romero wrote: > Dear Volker, > > On 24-04-2017 14:08, Volker Simonis wrote: >> Hi Gustavo, >> >> thanks for addressing this problem and sorry for my late reply. I >> think this is a good change which definitely improves the situation >> for uncommon NUMA configurations without changing the handling for >> common topologies. > > Thanks a lot for reviewing the change! > > >> It would be great if somebody could run this trough JPRT, but as >> Gustavo mentioned, I don't expect any regressions. >> >> @Igor: I think you've been the original author of the NUMA-aware >> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >> linux"). If you could find some spare minutes to take a look at this >> change, your comment would be very much appreciated :) >> >> Following some minor comments from me: >> >> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >> to get the actual number of configured nodes. This is good and >> certainly an improvement over the previous implementation. However, >> the man page for numa_num_configured_nodes() mentions that the >> returned count may contain currently disabled nodes. 
Do we currently >> handle disabled nodes? What will be the consequence if we would use >> such a disabled node (e.g. mbind() warnings)? > > In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory* > found in /sys/devices/system/node/*. Hence numa_num_configured_nodes() just > returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the > number of nodes with memory in the system. To the best of my knowledge there is > no system configuration on Linux/PPC64 that could match such a notion of > "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in > that dir and just the ones with memory will be taken into account. If it's > disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no > mbind() tried against it). > > On Power it's possible to have a numa node without memory (memory-less node, a > case covered in this change), a numa node without cpus at all but with memory > (a configured node anyway, so a case already covered) but to disable a specific > numa node so it does not appear in /sys/devices/system/node/* it's only possible > from the internals of the control module. Or some other rare condition not visible / > adjustable from the OS. Also I'm not aware of a case where a node is in this > dir but is at the same time flagged as something like "disabled". There are > cpu/memory hotplugs, but that does not change numa nodes status AFAIK. > > [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 > [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 > > >> - the same question applies to the usage of >> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >> Does isnode_in_configured_nodes() (i.e. the node set defined by >> 'numa_all_nodes_ptr') take into account the disabled nodes or not? Can >> this be a potential problem (i.e. if we use a disabled node)? 
> > On the meaning of "disabled nodes", it's the same case as above, so to the > best of my knowledge it's not a potential problem. > > Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), > i.e. "all nodes on which the calling task may allocate memory". It's exactly > the same pointer returned by numa_get_membind() v2 [3] which: > > "returns the mask of nodes from which memory can currently be allocated" > > and that is used, for example, in "numactl --show" to show nodes from where > memory can be allocated [4, 5]. > > [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 > [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 > [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 > > >> - I'd like to suggest renaming the 'index' part of the following >> variables and functions to 'nindex' ('node_index' is probably too long) >> in the following code, to emphasize that we have node indexes pointing >> to actual, not always consecutive node numbers: >> >> 2879 // Create an index -> node mapping, since nodes are not >> always consecutive >> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >> GrowableArray<int>(0, true); >> 2881 rebuild_index_to_node_map(); > > Simple change but much better to read indeed. Done. > > >> - can you please wrap the following one-line else statement into curly >> braces (it's more readable and we usually do it that way in HotSpot >> although there are no formal style guidelines :) >> >> 2953 } else >> 2954 // Current node is already a configured node. >> 2955 closest_node = index_to_node()->at(i); > > Done. > > >> - in os::Linux::rebuild_cpu_to_node_map(), if you set >> 'closest_distance' to INT_MAX at the beginning of the loop, you can >> later avoid the check for '|| !closest_distance'. Also, according to >> the man page, numa_distance() returns 0 if it cannot determine the >> distance. 
So with the above change, the condition on line 2974 should >> read: >> >> 2947 if (distance && distance < closest_distance) { >> > > Sure, much better to set the initial condition as distant as possible and > adjust to a closer one bit by bit improving the if condition. Done. > > >> Finally, and not directly related to your change, I'd suggest the >> following clean-ups: >> >> - remove the usage of 'NCPUS = 32768' in >> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >> unclear to me and probably related to an older version/problem of >> libnuma? I think we should simply use >> numa_allocate_cpumask()/numa_free_cpumask() instead. >> >> - we still use the NUMA version 1 function prototypes (e.g. >> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >> also "numa_interleave_memory()" and maybe others). I think we should >> switch all prototypes to the new NUMA version 2 interface which you've >> already used for the new functions which you've added. > > I agree. Could I open a new bug to address these clean-ups? > Yes, would be great to track that. Maybe you could add another topic for implementing os::get_page_info() on Linux. The information will be used for -XX:+NUMAStats. Not sure if you can easily gather it on Linux. As far as I can see it is currently only implemented on Solaris. > >> That said, I think these changes all require libnuma 2.0 (see >> os::Linux::libnuma_dlsym). So before starting this, you should make >> sure that libnuma 2.0 is available on all platforms to which you'd >> like to down-port this change. For jdk10 we could definitely do it, >> for jdk9 probably also, for jdk8 I'm not so sure. > > libnuma v1's last release dates back to 2008, but any idea how I could check that > for sure since it's in shared code? > > new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ > > Thank you! 
> > Best regards, > Gustavo > > >> Regards, >> Volker >> >> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >> wrote: >>> Hi, >>> >>> Any update on it? >>> >>> Thank you. >>> >>> Regards, >>> Gustavo >>> >>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>> Hi, >>>> >>>> Could the following webrev be reviewed please? >>>> >>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>> exist in the system. >>>> >>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>> >>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>> consecutive and have memory, for example in a numa topology like: >>>> >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 8 16 24 32 >>>> node 0 size: 65258 MB >>>> node 0 free: 34 MB >>>> node 1 cpus: 40 48 56 64 72 >>>> node 1 size: 65320 MB >>>> node 1 free: 150 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 20 >>>> 1: 20 10, >>>> >>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>> topology like: >>>> >>>> available: 4 nodes (0-1,16-17) >>>> node 0 cpus: 0 8 16 24 32 >>>> node 0 size: 130706 MB >>>> node 0 free: 7729 MB >>>> node 1 cpus: 40 48 56 64 72 >>>> node 1 size: 0 MB >>>> node 1 free: 0 MB >>>> node 16 cpus: 80 88 96 104 112 >>>> node 16 size: 130630 MB >>>> node 16 free: 5282 MB >>>> node 17 cpus: 120 128 136 144 152 >>>> node 17 size: 0 MB >>>> node 17 free: 0 MB >>>> node distances: >>>> node 0 1 16 17 >>>> 0: 10 20 40 40 >>>> 1: 20 10 40 40 >>>> 16: 40 40 10 20 >>>> 17: 40 40 20 10, >>>> >>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>> no memory. 
>>>> >>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>> id as a hint that is not available in the system to be bound (it will receive >>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>> messages: >>>> >>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>> >>>> That change improves the detection by making the JVM numa API aware of the >>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>> be available: >>>> >>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>> >>>> The change has no effect on numa topologies where the problem does not occur, >>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>> numa topologies where memory-less nodes exist (like in the last example above), >>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>> to the closest node, otherwise they would not be associated with any node and >>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising >>>> performance. >>>> >>>> I found no regressions on x64 for the following numa topology: >>>> >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>> node 0 size: 24102 MB >>>> node 0 free: 19806 MB >>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>> node 1 size: 24190 MB >>>> node 1 free: 21951 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 21 >>>> 1: 21 10 >>>> >>>> I understand that fixing the current numa detection is a prerequisite to enable >>>> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>> >>>> Thank you. 
>>>> >>>> >>>> Best regards, >>>> Gustavo >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>> >>> >> > From volker.simonis at gmail.com Wed May 3 14:34:16 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 3 May 2017 16:34:16 +0200 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <5909DAAC.3070202@linux.vnet.ibm.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> Message-ID: Hi, I've reviewed Gustavo's change and I'm fine with the latest version at: http://cr.openjdk.java.net/~gromero/8175813/v3/ Can somebody please sponsor the change? Thank you and best regards, Volker On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero wrote: > Hi community, > > I understand that there is nothing that can be done additionally regarding this > issue, at this point, on the PPC64 side. > > It's a change in the shared code - but one that in effect does not change anything in > the numa detection mechanism for other platforms - and hence a joint community > effort is needed to review the change, and a sponsor to run it against > the JPRT. > > I know it's a stabilizing moment of OpenJDK 9, but since that issue is of > great concern on PPC64 (especially on POWER8 machines) I would be very glad if > the community could point out directions on how that change could move on. > > Thank you! > > Best regards, > Gustavo > > On 25-04-2017 23:49, Gustavo Romero wrote: >> Dear Volker, >> >> On 24-04-2017 14:08, Volker Simonis wrote: >>> Hi Gustavo, >>> >>> thanks for addressing this problem and sorry for my late reply. 
I >>> think this is a good change which definitely improves the situation >>> for uncommon NUMA configurations without changing the handling for >>> common topologies. >> >> Thanks a lot for reviewing the change! >> >> >>> It would be great if somebody could run this through JPRT, but as >>> Gustavo mentioned, I don't expect any regressions. >>> >>> @Igor: I think you've been the original author of the NUMA-aware >>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>> linux"). If you could find some spare minutes to take a look at this >>> change, your comment would be very much appreciated :) >>> >>> Following some minor comments from me: >>> >>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>> to get the actual number of configured nodes. This is good and >>> certainly an improvement over the previous implementation. However, >>> the man page for numa_num_configured_nodes() mentions that the >>> returned count may contain currently disabled nodes. Do we currently >>> handle disabled nodes? What will be the consequence if we would use >>> such a disabled node (e.g. mbind() warnings)? >> >> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory* >> found in /sys/devices/system/node/*. Hence numa_num_configured_nodes() just >> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >> number of nodes with memory in the system. To the best of my knowledge there is >> no system configuration on Linux/PPC64 that could match such a notion of >> "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in >> that dir and just the ones with memory will be taken into account. If it's >> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >> mbind() tried against it). 
>> >> On Power it's possible to have a numa node without memory (memory-less node, a >> case covered in this change), a numa node without cpus at all but with memory >> (a configured node anyway, so a case already covered) but to disable a specific >> numa node so it does not appear in /sys/devices/system/node/* it's only possible >> from the internals of the control module. Or some other rare condition not visible / >> adjustable from the OS. Also I'm not aware of a case where a node is in this >> dir but is at the same time flagged as something like "disabled". There are >> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >> >> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >> >> >>> - the same question applies to the usage of >>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>> Does isnode_in_configured_nodes() (i.e. the node set defined by >>> 'numa_all_nodes_ptr') take into account the disabled nodes or not? Can >>> this be a potential problem (i.e. if we use a disabled node)? >> >> On the meaning of "disabled nodes", it's the same case as above, so to the >> best of my knowledge it's not a potential problem. >> >> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >> i.e. "all nodes on which the calling task may allocate memory". It's exactly >> the same pointer returned by numa_get_membind() v2 [3] which: >> >> "returns the mask of nodes from which memory can currently be allocated" >> >> and that is used, for example, in "numactl --show" to show nodes from where >> memory can be allocated. 
>> >> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >> >> >>> - I'd like to suggest renaming the 'index' part of the following >>> variables and functions to 'nindex' ('node_index' is probably too long) >>> in the following code, to emphasize that we have node indexes pointing >>> to actual, not always consecutive node numbers: >>> >>> 2879 // Create an index -> node mapping, since nodes are not >>> always consecutive >>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>> GrowableArray<int>(0, true); >>> 2881 rebuild_index_to_node_map(); >> >> Simple change but much better to read indeed. Done. >> >> >>> - can you please wrap the following one-line else statement into curly >>> braces (it's more readable and we usually do it that way in HotSpot >>> although there are no formal style guidelines :) >>> >>> 2953 } else >>> 2954 // Current node is already a configured node. >>> 2955 closest_node = index_to_node()->at(i); >> >> Done. >> >> >>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>> later avoid the check for '|| !closest_distance'. Also, according to >>> the man page, numa_distance() returns 0 if it cannot determine the >>> distance. So with the above change, the condition on line 2974 should >>> read: >>> >>> 2947 if (distance && distance < closest_distance) { >>> >> >> Sure, much better to set the initial condition as distant as possible and >> adjust to a closer one bit by bit improving the if condition. Done. >> >> >>> Finally, and not directly related to your change, I'd suggest the >>> following clean-ups: >>> >>> - remove the usage of 'NCPUS = 32768' in >>> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >>> unclear to me and probably related to an older version/problem of >>> libnuma? 
I think we should simply use >>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>> >>> - we still use the NUMA version 1 function prototypes (e.g. >>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>> also "numa_interleave_memory()" and maybe others). I think we should >>> switch all prototypes to the new NUMA version 2 interface which you've >>> already used for the new functions which you've added. >> >> I agree. Could I open a new bug to address these clean-ups? >> >> >>> That said, I think these changes all require libnuma 2.0 (see >>> os::Linux::libnuma_dlsym). So before starting this, you should make >>> sure that libnuma 2.0 is available on all platforms to which you'd >>> like to down-port this change. For jdk10 we could definitely do it, >>> for jdk9 probably also, for jdk8 I'm not so sure. >> >> libnuma v1's last release dates back to 2008, but any idea how I could check that >> for sure since it's in shared code? >> >> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ >> >> Thank you! >> >> Best regards, >> Gustavo >> >> >>> Regards, >>> Volker >>> >>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >>> wrote: >>>> Hi, >>>> >>>> Any update on it? >>>> >>>> Thank you. >>>> >>>> Regards, >>>> Gustavo >>>> >>>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>>> Hi, >>>>> >>>>> Could the following webrev be reviewed please? >>>>> >>>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>>> exist in the system. 
>>>>> >>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>>> >>>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>>> consecutive and have memory, for example in a numa topology like: >>>>> >>>>> available: 2 nodes (0-1) >>>>> node 0 cpus: 0 8 16 24 32 >>>>> node 0 size: 65258 MB >>>>> node 0 free: 34 MB >>>>> node 1 cpus: 40 48 56 64 72 >>>>> node 1 size: 65320 MB >>>>> node 1 free: 150 MB >>>>> node distances: >>>>> node 0 1 >>>>> 0: 10 20 >>>>> 1: 20 10, >>>>> >>>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>>> topology like: >>>>> >>>>> available: 4 nodes (0-1,16-17) >>>>> node 0 cpus: 0 8 16 24 32 >>>>> node 0 size: 130706 MB >>>>> node 0 free: 7729 MB >>>>> node 1 cpus: 40 48 56 64 72 >>>>> node 1 size: 0 MB >>>>> node 1 free: 0 MB >>>>> node 16 cpus: 80 88 96 104 112 >>>>> node 16 size: 130630 MB >>>>> node 16 free: 5282 MB >>>>> node 17 cpus: 120 128 136 144 152 >>>>> node 17 size: 0 MB >>>>> node 17 free: 0 MB >>>>> node distances: >>>>> node 0 1 16 17 >>>>> 0: 10 20 40 40 >>>>> 1: 20 10 40 40 >>>>> 16: 40 40 10 20 >>>>> 17: 40 40 20 10, >>>>> >>>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>>> no memory. >>>>> >>>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>>> id as a hint that is not available in the system to be bound (it will receive >>>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>>> messages: >>>>> >>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>>> >>>>> That change improves the detection by making the JVM numa API aware of the >>>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>>> be, in libnuma terms, a configured node. 
Hence just the configured nodes will >>>>> be available: >>>>> >>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>>> >>>>> The change has no effect on numa topologies where the problem does not occur, >>>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>>> numa topologies where memory-less nodes exist (like in the last example above), >>>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>>> to the closest node, otherwise they would not be associated with any node and >>>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising >>>>> performance. >>>>> >>>>> I found no regressions on x64 for the following numa topology: >>>>> >>>>> available: 2 nodes (0-1) >>>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>>> node 0 size: 24102 MB >>>>> node 0 free: 19806 MB >>>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>>> node 1 size: 24190 MB >>>>> node 1 free: 21951 MB >>>>> node distances: >>>>> node 0 1 >>>>> 0: 10 21 >>>>> 1: 21 10 >>>>> >>>>> I understand that fixing the current numa detection is a prerequisite to enable >>>>> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>>> >>>>> Thank you. 
>>>>> >>>>> >>>>> Best regards, >>>>> Gustavo >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>>> >>>> >>> >> > From volker.simonis at gmail.com Wed May 3 14:39:29 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 3 May 2017 16:39:29 +0200 Subject: RFR (L) 8171392 Move Klass pointers outside of ConstantPool entries so ConstantPool can be read-only In-Reply-To: <5909D4DC.8000000@oracle.com> References: <58EC771B.9020202@oracle.com> <35e6276a-ddf1-9149-8588-acb4e13191f5@oracle.com> <58EF3D3A.6020903@oracle.com> <58F05EB5.10009@oracle.com> <58F0EB0E.60904@oracle.com> <58F33A4F.70104@oracle.com> <58FDECFE.5060105@oracle.com> <590327F1.7070200@oracle.com> <5904B17E.9090209@oracle.com> <5909D4DC.8000000@oracle.com> Message-ID: Fine with me! Regards, Volker On Wed, May 3, 2017 at 3:02 PM, Ioi Lam wrote: > Andrew replied to me off-list that he tested the aarch64 part and was happy > with it. Thanks Andrew. > > So if there is no further comment, I will push the code as is. > > Thanks > - Ioi > > > On 4/29/17 8:30 AM, Ioi Lam wrote: >> >> I've updated the patch to include Volker's ppc/s390 port as well as his >> comments. I've also included an updated patch (untested) for aarch64 for >> Andrew Haley to test: >> >> Full patch >> >> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04/ >> >> Delta from the previous version >> >> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v04.delta/ >> >> Thanks >> - Ioi >> >> On 4/28/17 4:30 AM, Ioi Lam wrote: >>> >>> >>> >>> On 4/25/17 8:06 AM, Volker Simonis wrote: >>>> >>>> On Mon, Apr 24, 2017 at 2:18 PM, Ioi Lam wrote: >>>>> >>>>> Hi Volker, >>>>> >>>>> >>>>> On 4/21/17 12:02 AM, Volker Simonis wrote: >>>>>> >>>>>> Hi Ioi, >>>>>> >>>>>> thanks once again for considering our ports! 
Please find the required >>>>>> additions for ppc64/s390x in the following webrev (which is based upon >>>>>> your latest v03 patch): >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8171392_ppc64_s390x/ >>>>> >>>>> Thanks for the patch. I will integrate it and post an updated webrev. >>>>>> >>>>>> @Martin/@Lucy: could you please have a look at my ppc64/s390x assembly >>>>>> code. I did some tests and I think it should be correct, but maybe you >>>>>> still find some improvements :) >>>>>> >>>>>> Besides that, I have some general questions/comments regarding your >>>>>> change: >>>>>> >>>>>> 1. In constantPool.hpp, why don't you declare the '_name_index' and >>>>>> '_resolved_klass_index' fields with type 'jushort'? As far as I can >>>>>> see, they can only hold 16-bit values anyway. It would also save you >>>>>> some space and several asserts (e.g. in unresolved_klass_at_put(): >>>>>> >>>>>> >>>>>> 274 assert((name_index & 0xffff0000) == 0, "must be"); >>>>>> 275 assert((resolved_klass_index & 0xffff0000) == 0, "must >>>>>> be"); >>>>> >>>>> >>>>> I think the HotSpot convention is to use ints as parameter and return >>>>> types, >>>>> for values that are actually 16-bits or less, like here in >>>>> constantPool.hpp: >>>>> >>>>> void field_at_put(int which, int class_index, int >>>>> name_and_type_index) { >>>>> tag_at_put(which, JVM_CONSTANT_Fieldref); >>>>> *int_at_addr(which) = ((jint) name_and_type_index<<16) | >>>>> class_index; >>>>> } >>>>> >>>>> I am not sure what the reasons are. It could be that the parameters >>>>> usually >>>>> need to be computed arithmetically, and it's much easier for the caller >>>>> of >>>>> the method to use ints -- otherwise you will get lots of compiler >>>>> warnings >>>>> which would force you to use lots of casting, resulting in code that's >>>>> hard >>>>> to read and probably incorrect. 
>>>>> >>>> OK, but you could still use shorts in the object to save space, >>>> although I'm not sure how much that will save in total. But if nobody >>>> else cares, I'm fine with the current version. >>> >>> >>> The CPKlassSlot objects are stored only on the stack, so the savings is >>> not worth the trouble of adding extra type casts. >>> >>> Inside the ConstantPool itself, the name_index and resolved_klass_index >>> are stored as a pair of 16-bit values. >>> >>>>>> 2. What do you mean by: >>>>>> >>>>>> 106 // ... will be changed to support compressed pointers >>>>>> 107 Array<Klass*>* _resolved_klasses; >>>>> >>>>> >>>>> Sorry the comment isn't very clear. How about this? >>>>> >>>>> 106 // Consider using an array of compressed klass pointers to >>>>> // save space on 64-bit platforms. >>>>> 107 Array<Klass*>* _resolved_klasses; >>>>> >>>> Sorry I still didn't get it? Do you mean you want to use an array of >>>> "narrowKlass" (i.e. unsigned int)? But using compressed class pointers >>>> is a runtime decision while this is a compile time decision. >>> >>> >>> I haven't figured out how to do it yet :-) >>> >>> Most likely, it will be something like: >>> >>> union { >>> Array<Klass*>* X; >>> Array<narrowKlass>* Y; >>> } _resolved_klasses; >>> >>> and you need to decide at run time whether to use X or Y. >>> >>> - Ioi >>>>>> >>>>>> 3. Why don't we need the call to "release_tag_at_put()" in >>>>>> "klass_at_put(int class_index, Klass* k)"? "klass_at_put(int >>>>>> class_index, Klass* k)" is used from >>>>>> "ClassFileParser::fill_instance_klass() and before your change that >>>>>> function used the previous version of "klass_at_put(int class_index, >>>>>> Klass* k)" which did call "release_tag_at_put()". >>>>> >>>>> >>>>> Good catch. I'll add the following, because the class is now resolved. >>>>> >>>>> release_tag_at_put(class_index, JVM_CONSTANT_UnresolvedClass); >>>>>> >>>>>> 4. 
In ConstantPool::copy_entry_to() you've changed the behavior for >>>>>> tags JVM_CONSTANT_Class, JVM_CONSTANT_UnresolvedClass, >>>>>> JVM_CONSTANT_UnresolvedClassInError. Before, the resolved klass was >>>>>> copied to the new constant pool if one existed but now you always only >>>>>> copy a class_index to the new constant pool (even if a resolved klass >>>>>> existed). Is that OK? E.g. won't this lead to a new resolving for the >>>>>> new constant pool and will this have performance impacts or other side >>>>>> effects? >>>>> >>>>> I think Coleen has answered this in a separate mail :-) >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>>>>> Thanks again for doing this nice change and best regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Sun, Apr 16, 2017 at 11:33 AM, Ioi Lam wrote: >>>>>>> >>>>>>> Hi Lois, >>>>>>> >>>>>>> I have updated the patch to include your comments, and fixes the >>>>>>> handling >>>>>>> of >>>>>>> anonymous classes. I also added some more comments regarding the >>>>>>> _temp_resolved_klass_index: >>>>>>> >>>>>>> (delta from last webrev) >>>>>>> >>>>>>> >>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03.delta/ >>>>>>> >>>>>>> (full webrev) >>>>>>> >>>>>>> >>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v03/ >>>>>>> >>>>>>> Thanks >>>>>>> - Ioi >>>>>>> >>>>>>> >>>>>>> On 4/15/17 2:31 AM, Lois Foltan wrote: >>>>>>>> >>>>>>>> On 4/14/2017 11:30 AM, Ioi Lam wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 4/14/17 1:31 PM, Ioi Lam wrote: >>>>>>>>>> >>>>>>>>>> HI Lois, >>>>>>>>>> >>>>>>>>>> Thanks for the review. Please see my comments in-line. >>>>>>>>>> >>>>>>>>>> On 4/14/17 4:32 AM, Lois Foltan wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Ioi, >>>>>>>>>>> >>>>>>>>>>> Looks really good. A couple of comments: >>>>>>>>>>> >>>>>>>>>>> src/share/vm/classfile/classFileParser.cpp: >>>>>>>>>>> * line #5676 - I'm not sure I completely understand the logic >>>>>>>>>>> surrounding anonymous classes. 
Coleen and I discussed earlier >>>>>>>>>>> today >>>>>>>>>>> and I >>>>>>>>>>> came away from that discussion with the idea that the only >>>>>>>>>>> classes >>>>>>>>>>> being >>>>>>>>>>> patched currently are anonymous classes. >>>>>>>>>> >>>>>>>>>> Line 5676 ... >>>>>>>>>> >>>>>>>>>> 5676 if (is_anonymous()) { >>>>>>>>>> 5677 _max_num_patched_klasses ++; // for patching the >>>>>>>>>> class >>>>>>>>>> index >>>>>>>>>> 5678 } >>>>>>>>>> >>>>>>>>>> corresponds to >>>>>>>>>> >>>>>>>>>> 5361 ik->set_name(_class_name); >>>>>>>>>> 5362 >>>>>>>>>> 5363 if (is_anonymous()) { >>>>>>>>>> 5364 // I am well known to myself >>>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>>> 5366 } >>>>>>>>>> >>>>>>>>>> Even though the class is "anonymous", it actually has a name. >>>>>>>>>> ik->name() >>>>>>>>>> probably is part of the constant pool, but I am not 100% sure. >>>>>>>>>> Also, I >>>>>>>>>> would >>>>>>>>>> need to search the constant pool to find the index for ik->name(). >>>>>>>>>> So >>>>>>>>>> I just >>>>>>>>>> got lazy here and use the same logic in >>>>>>>>>> ConstantPool::patch_class() to >>>>>>>>>> append ik->name() to the end of the constant pool. >>>>>>>>>> >>>>>>>>>> "Anonymous" actually means "the class cannot be looked up by name >>>>>>>>>> in >>>>>>>>>> the >>>>>>>>>> SystemDictionary". I think we need a better terminology :-) >>>>>>>>>> >>>>>>>>> I finally realized why we need the "eagerly resolve" on line 5365. >>>>>>>>> I'll >>>>>>>>> modify the comments to the following: >>>>>>>>> >>>>>>>>> // _this_class_index is a CONSTANT_Class entry that refers to >>>>>>>>> this >>>>>>>>> // anonymous class itself. If this class needs to refer to >>>>>>>>> its own >>>>>>>>> methods or >>>>>>>>> // fields, it would use a CONSTANT_MethodRef, etc, which >>>>>>>>> would >>>>>>>>> reference >>>>>>>>> // _this_class_index. 
However, because this class is >>>>>>>>> anonymous >>>>>>>>> (it's >>>>>>>>> // not stored in SystemDictionary), _this_class_index cannot >>>>>>>>> be >>>>>>>>> resolved >>>>>>>>> // with ConstantPool::klass_at_impl, which does a >>>>>>>>> SystemDictionary >>>>>>>>> lookup. >>>>>>>>> // Therefore, we must eagerly resolve _this_class_index now. >>>>>>>>> >>>>>>>>> So, Lois is right. Line 5676 is not necessary. I will revise the >>>>>>>>> code >>>>>>>>> to >>>>>>>>> do the "eager resolution" without using >>>>>>>>> ClassFileParser::patch_class. >>>>>>>>> I'll >>>>>>>>> post the updated code later. >>>>>>>> >>>>>>>> >>>>>>>> Thanks Ioi for studying this and explaining! Look forward to seeing >>>>>>>> the >>>>>>>> updated webrev. >>>>>>>> Lois >>>>>>>> >>>>>>>>> Thanks >>>>>>>>> - Ioi >>>>>>>>> >>>>>>>>>>> So a bit confused as why the check on line #5676 and a check for >>>>>>>>>>> a >>>>>>>>>>> java/lang/Class on line #5684. >>>>>>>>>> >>>>>>>>>> 5683 Handle patch = cp_patch_at(i); >>>>>>>>>> 5684 if (java_lang_String::is_instance(patch()) || >>>>>>>>>> java_lang_Class::is_instance(patch())) { >>>>>>>>>> 5685 // We need to append the names of the patched >>>>>>>>>> classes >>>>>>>>>> to >>>>>>>>>> the end of the constant pool, >>>>>>>>>> 5686 // because a patched class may have a Utf8 name >>>>>>>>>> that's >>>>>>>>>> not already included in the >>>>>>>>>> 5687 // original constant pool. >>>>>>>>>> 5688 // >>>>>>>>>> 5689 // Note that a String in cp_patch_at(i) may be used >>>>>>>>>> to >>>>>>>>>> patch a Utf8, a String, or a Class. >>>>>>>>>> 5690 // At this point, we don't know the tag for index i >>>>>>>>>> yet, >>>>>>>>>> because we haven't parsed the >>>>>>>>>> 5691 // constant pool. So we can only assume the worst >>>>>>>>>> -- >>>>>>>>>> every String is used to patch a Class. >>>>>>>>>> 5692 _max_num_patched_klasses ++; >>>>>>>>>> >>>>>>>>>> Line 5684 checks for all objects in the cp_patch array. 
Later, >>>>>>>>>> when >>>>>>>>>> ClassFileParser::patch_constant_pool() is called, any objects that >>>>>>>>>> are >>>>>>>>>> either Class or String could be treated as a Klass: >>>>>>>>>> >>>>>>>>>> 724 void ClassFileParser::patch_constant_pool(ConstantPool* cp, >>>>>>>>>> 725 int index, >>>>>>>>>> 726 Handle patch, >>>>>>>>>> 727 TRAPS) { >>>>>>>>>> ... >>>>>>>>>> 732 switch (cp->tag_at(index).value()) { >>>>>>>>>> 733 >>>>>>>>>> 734 case JVM_CONSTANT_UnresolvedClass: { >>>>>>>>>> 735 // Patching a class means pre-resolving it. >>>>>>>>>> 736 // The name in the constant pool is ignored. >>>>>>>>>> 737 if (java_lang_Class::is_instance(patch())) { >>>>>>>>>> 738 guarantee_property(!java_lang_Class::is_primitive(patch()), >>>>>>>>>> 739 "Illegal class patch at %d in >>>>>>>>>> class >>>>>>>>>> file >>>>>>>>>> %s", >>>>>>>>>> 740 index, CHECK); >>>>>>>>>> 741 Klass* k = java_lang_Class::as_Klass(patch()); >>>>>>>>>> 742 patch_class(cp, index, k, k->name()); >>>>>>>>>> 743 } else { >>>>>>>>>> 744 guarantee_property(java_lang_String::is_instance(patch()), >>>>>>>>>> 745 "Illegal class patch at %d in >>>>>>>>>> class >>>>>>>>>> file >>>>>>>>>> %s", >>>>>>>>>> 746 index, CHECK); >>>>>>>>>> 747 Symbol* const name = >>>>>>>>>> java_lang_String::as_symbol(patch(), >>>>>>>>>> CHECK); >>>>>>>>>> 748 patch_class(cp, index, NULL, name); >>>>>>>>>> 749 } >>>>>>>>>> 750 break; >>>>>>>>>> 751 } >>>>>>>>>> >>>>>>>>>>> Could the is_anonymous() if statement be combined into the loop? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I think the answer is no. At line 5365, there is no guarantee that >>>>>>>>>> ik->name() is in the cp_patch array. >>>>>>>>>> >>>>>>>>>> 5365 patch_class(ik->constants(), _this_class_index, ik, >>>>>>>>>> ik->name()); // eagerly resolve >>>>>>>>>> >>>>>>>>>>> Also why not do this calculation in the rewriter or is that >>>>>>>>>>> too >>>>>>>>>>> late? 
>>>>>>>>>>> >>>>>>>>>> Line 5676 and 5684 need to be executed BEFORE the constant pool >>>>>>>>>> and >>>>>>>>>> the >>>>>>>>>> associated tags array is allocated (both of which are fixed size, >>>>>>>>>> and >>>>>>>>>> cannot >>>>>>>>>> be expanded), which is way before the rewriter is run. At this >>>>>>>>>> point, >>>>>>>>>> we >>>>>>>>>> don't know what cp->tag_at(index) is (line #732), so the code >>>>>>>>>> needs to >>>>>>>>>> make >>>>>>>>>> a worst-case estimate on how long the CP/tags should be. >>>>>>>>>> >>>>>>>>>>> * line #5677, 5692 - a nit but I think the convention is to not >>>>>>>>>>> have >>>>>>>>>>> a >>>>>>>>>>> space after the variable name and between the post increment >>>>>>>>>>> operator. >>>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>>> >>>>>>>>>>> src/share/vm/classfile/constantPool.hpp: >>>>>>>>>>> I understand the concept behind _invalid_resolved_klass_index, >>>>>>>>>>> but it >>>>>>>>>>> really is not so much invalid as temporary for class redefinition >>>>>>>>>>> purposes, >>>>>>>>>>> as you explain in ConstantPool::allocate_resolved_klasses. >>>>>>>>>>> Please >>>>>>>>>>> consider >>>>>>>>>>> renaming to _temp_unresolved_klass_index. And whether you choose >>>>>>>>>>> to >>>>>>>>>>> rename >>>>>>>>>>> the field or not, please add a one line comment ahead of >>>>>>>>>>> ConstantPool::temp_unresolved_klass_at_put that only class >>>>>>>>>>> redefinition uses >>>>>>>>>>> this currently. >>>>>>>>>>> >>>>>>>>>> Good idea. Will do. >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> - Ioi >>>>>>>>>> >>>>>>>>>>> Great change, thanks! >>>>>>>>>>> Lois >>>>>>>>>>> >>>>>>>>>>> On 4/13/2017 4:56 AM, Ioi Lam wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the comments. 
Here's a delta from the last patch >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v02/ >>>>>>>>>>>> >>>>>>>>>>>> In addition to your requests, I made these changes: >>>>>>>>>>>> >>>>>>>>>>>> [1] To consolidate the multiple extract_high/low code, I've >>>>>>>>>>>> added >>>>>>>>>>>> CPKlassSlot, so the code is cleaner: >>>>>>>>>>>> >>>>>>>>>>>> CPKlassSlot kslot = this_cp->klass_slot_at(which); >>>>>>>>>>>> int resolved_klass_index = kslot.resolved_klass_index(); >>>>>>>>>>>> int name_index = kslot.name_index(); >>>>>>>>>>>> >>>>>>>>>>>> [2] Renamed ConstantPool::is_shared_quick() to >>>>>>>>>>>> ConstantPool::is_shared(). The C++ compiler should be able to >>>>>>>>>>>> pick >>>>>>>>>>>> this >>>>>>>>>>>> function over MetaspaceObj::is_shared(). >>>>>>>>>>>> >>>>>>>>>>>> [3] Massaged the CDS region size set-up code a little to pass >>>>>>>>>>>> internal >>>>>>>>>>>> tests, because RO/RW ratio has changed. I didn't spend too much >>>>>>>>>>>> time >>>>>>>>>>>> picking >>>>>>>>>>>> the "right" sizes, as this code will be obsoleted soon with >>>>>>>>>>>> JDK-8072061 >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> - Ioi >>>>>>>>>>>> >>>>>>>>>>>> On 4/13/17 6:40 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> This looks really good! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/src/share/vm/oops/constantPool.cpp.udiff.html >>>>>>>>>>>>> >>>>>>>>>>>>> + // Add one extra element to tags for storing >>>>>>>>>>>>> ConstantPool::flags(). >>>>>>>>>>>>> + Array* tags = >>>>>>>>>>>>> MetadataFactory::new_writeable_array(loader_data, length+1, >>>>>>>>>>>>> 0, >>>>>>>>>>>>> CHECK_NULL); ... 
+ assert(tags->length()-1 == _length, >>>>>>>>>>>>> "invariant"); // >>>>>>>>>>>>> tags->at(_length) is flags() >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I think this is left over, since _flags didn't get moved after >>>>>>>>>>>>> all. >>>>>>>>>>>>> >>>>>>>>>>>>> + Klass** adr = >>>>>>>>>>>>> this_cp->resolved_klasses()->adr_at(resolved_klass_index); >>>>>>>>>>>>> + OrderAccess::release_store_ptr((Klass* volatile *)adr, k); >>>>>>>>>>>>> + // The interpreter assumes when the tag is stored, the klass >>>>>>>>>>>>> is >>>>>>>>>>>>> resolved >>>>>>>>>>>>> + // and the Klass* is a klass rather than a Symbol*, so we >>>>>>>>>>>>> need >>>>>>>>>>>>> + // hardware store ordering here. >>>>>>>>>>>>> + this_cp->release_tag_at_put(which, JVM_CONSTANT_Class); >>>>>>>>>>>>> + return k; >>>>>>>>>>>>> >>>>>>>>>>>>> The comment still refers to the switch between Symbol* and >>>>>>>>>>>>> Klass* >>>>>>>>>>>>> in >>>>>>>>>>>>> the constant pool. The entry in the Klass array should be >>>>>>>>>>>>> NULL. >>>>>>>>>>>>> >>>>>>>>>>>>> + int name_index = >>>>>>>>>>>>> extract_high_short_from_int(*int_at_addr(which)); >>>>>>>>>>>>> >>>>>>>>>>>>> Can you put klass_name_index_at() in the constantPool.hpp >>>>>>>>>>>>> header >>>>>>>>>>>>> file >>>>>>>>>>>>> (so it's inlined) and have all the places where you get >>>>>>>>>>>>> name_index >>>>>>>>>>>>> use this >>>>>>>>>>>>> function? So we don't have to know in multiple places that >>>>>>>>>>>>> extract_high_short_from_int() is where the name index is. And >>>>>>>>>>>>> in >>>>>>>>>>>>> constantPool.hpp, for unresolved_klass_at_put() add a comment >>>>>>>>>>>>> about >>>>>>>>>>>>> what the >>>>>>>>>>>>> new format of the constant pool entry is {name_index, >>>>>>>>>>>>> resolved_klass_index}. >>>>>>>>>>>>> I'm happy to see this work nearing completion! The "constant" >>>>>>>>>>>>> pool >>>>>>>>>>>>> should >>>>>>>>>>>>> be constant! 
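The {name_index, resolved_klass_index} slot layout discussed in this review, with name_index in the high short (cf. the quoted extract_high_short_from_int() call), can be sketched as a standalone model. This is illustrative only and simplified from the actual ConstantPool code:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model, not the HotSpot code: a JVM_CONSTANT_Class slot
// packs two 16-bit halves, {name_index, resolved_klass_index}; the
// resolved Klass* lives in a side array indexed by resolved_klass_index.
struct CPKlassSlot {
  uint16_t _name_index;
  uint16_t _resolved_klass_index;
  uint16_t name_index() const { return _name_index; }
  uint16_t resolved_klass_index() const { return _resolved_klass_index; }
};

inline int32_t pack_slot(uint16_t name_index, uint16_t resolved_klass_index) {
  // name_index goes in the high short, resolved_klass_index in the low short
  return ((int32_t)name_index << 16) | resolved_klass_index;
}

inline CPKlassSlot unpack_slot(int32_t value) {
  CPKlassSlot s;
  s._name_index = (uint16_t)((uint32_t)value >> 16);  // extract_high_short_from_int
  s._resolved_klass_index = (uint16_t)(value & 0xffff);
  return s;
}
```

Keeping pack/unpack in one place is exactly the point of Coleen's request: only this helper needs to know which short holds the name index.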
thanks,
Coleen
>>>>>>>>>>>>> On 4/11/17 2:26 AM, Ioi Lam wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi, please review the following change
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8171392
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk10/8171392_make_constantpool_read_only.v01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Summary:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Before:
>>>>>>>>>>>>>> + ConstantPool::klass_at(i) finds the Klass from the i-th slot
>>>>>>>>>>>>>>   of ConstantPool.
>>>>>>>>>>>>>> + When a klass is resolved, the ConstantPool is modified to
>>>>>>>>>>>>>>   store the Klass pointer.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After:
>>>>>>>>>>>>>> + ConstantPool::klass_at(i) finds the Klass at
>>>>>>>>>>>>>>   this->_resolved_klasses->at(i)
>>>>>>>>>>>>>> + When a klass is resolved, _resolved_klasses->at(i) is
>>>>>>>>>>>>>>   modified.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition:
>>>>>>>>>>>>>> + I moved _resolved_references and _reference_map from
>>>>>>>>>>>>>>   ConstantPool to ConstantPoolCache.
>>>>>>>>>>>>>> + Now _flags is no longer modified for shared ConstantPools.
>>>>>>>>>>>>>>   As a result, none of the fields in shared ConstantPools are
>>>>>>>>>>>>>>   modified at run time, so we can move them into the RO region
>>>>>>>>>>>>>>   in the CDS archive.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Testing:
>>>>>>>>>>>>>> - Benchmark results show no performance regression, despite
>>>>>>>>>>>>>>   the extra level of indirection, which has a negligible
>>>>>>>>>>>>>>   overhead for the interpreter.
>>>>>>>>>>>>>> - RBT testing for tier2 and tier3.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ports:
>>>>>>>>>>>>>> - I have tested only the Oracle-supported ports. For the
>>>>>>>>>>>>>>   aarch64, ppc and s390 ports, I have added some code without
>>>>>>>>>>>>>>   testing (it's probably incomplete).
>>>>>>>>>>>>>> - Port owners, please check if my patch works for you, and I
>>>>>>>>>>>>>>   can incorporate your changes in my push.
>>>>>>>>>>>>>> Alternatively, you can wait for my push and provide fixes
>>>>>>>>>>>>>> (if necessary) in a new changeset, and I will be happy to
>>>>>>>>>>>>>> sponsor it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> - Ioi

From aph at redhat.com Wed May 3 17:05:10 2017
From: aph at redhat.com (Andrew Haley)
Date: Wed, 3 May 2017 18:05:10 +0100
Subject: RFR: 8179444: AArch64: Put zero_words on a diet
Message-ID:

New version, corrected:

The code we generate for ClearArray in C2 is much too verbose.  It
looks like this:

  0x000003ffad2213a4: cbz x11, 0x000003ffad22140c
  0x000003ffad2213a8: tbz w10, #3, 0x000003ffad2213b4
  0x000003ffad2213ac: str xzr, [x10],#8
  0x000003ffad2213b0: sub x11, x11, #0x1
  0x000003ffad2213b4: subs xscratch1, x11, #0x20
  0x000003ffad2213b8: b.lt 0x000003ffad2213c0
  0x000003ffad2213bc: bl Stub::zero_longs ; {external_word}
  0x000003ffad2213c0: and xscratch1, x11, #0xe
  0x000003ffad2213c4: sub x11, x11, xscratch1
  0x000003ffad2213c8: add x10, x10, xscratch1, lsl #3
  0x000003ffad2213cc: adr xscratch2, 0x000003ffad2213fc
  0x000003ffad2213d0: sub xscratch2, xscratch2, xscratch1, lsl #1
  0x000003ffad2213d4: br xscratch2
  0x000003ffad2213d8: add x10, x10, #0x80
  0x000003ffad2213dc: stp xzr, xzr, [x10,#-128]
  0x000003ffad2213e0: stp xzr, xzr, [x10,#-112]
  0x000003ffad2213e4: stp xzr, xzr, [x10,#-96]
  0x000003ffad2213e8: stp xzr, xzr, [x10,#-80]
  0x000003ffad2213ec: stp xzr, xzr, [x10,#-64]
  0x000003ffad2213f0: stp xzr, xzr, [x10,#-48]
  0x000003ffad2213f4: stp xzr, xzr, [x10,#-32]
  0x000003ffad2213f8: stp xzr, xzr, [x10,#-16]
  0x000003ffad2213fc: subs x11, x11, #0x10
  0x000003ffad221400: b.ge 0x000003ffad2213d8
  0x000003ffad221404: tbz w11, #0, 0x000003ffad22140c
  0x000003ffad221408: str xzr, [x10],#8

This patch takes much of this code and puts it into a stub.
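The shape of the replacement, at most seven words zeroed inline with everything larger delegated to a shared stub, can be modelled in plain C++. This is an illustrative sketch only, not the generated code; zero_blocks below merely stands in for the real Stub::zero_blocks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for Stub::zero_blocks: bulk-zeroes whole 8-word blocks.
static void zero_blocks(uint64_t* p, size_t count) {
  for (size_t i = 0; i < count; i++) {
    p[i] = 0;
  }
}

// Mirrors the new ClearArray shape: a single compare delegates large
// counts to the stub, then three bit tests clear the remaining 4, 2
// and 1 words inline (cf. the tbz #2 / #1 / #0 sequence).
void zero_words(uint64_t* p, size_t count) {
  if (count >= 8) {                       // cmp x11, #0x8 / b.cc
    size_t bulk = count & ~(size_t)7;     // whole 8-word blocks
    zero_blocks(p, bulk);                 // bl Stub::zero_blocks
    p += bulk;
    count &= 7;                           // 0..7 words remain
  }
  if (count & 4) { p[0] = p[1] = p[2] = p[3] = 0; p += 4; }
  if (count & 2) { p[0] = p[1] = 0; p += 2; }
  if (count & 1) { p[0] = 0; }
}
```

The inline tail touches at most four store sites, which is where the per-site code-cache saving comes from.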
The new version of ClearArray is:

  0x000003ff8022b7b0: cmp x11, #0x8
  0x000003ff8022b7b4: b.cc 0x000003ff8022b7bc
  0x000003ff8022b7b8: bl Stub::zero_blocks ; {runtime_call StubRoutines (2)}
  0x000003ff8022b7bc: tbz w11, #2, 0x000003ff8022b7c8
  0x000003ff8022b7c0: stp xzr, xzr, [x10],#16
  0x000003ff8022b7c4: stp xzr, xzr, [x10],#16
  0x000003ff8022b7c8: tbz w11, #1, 0x000003ff8022b7d0
  0x000003ff8022b7cc: stp xzr, xzr, [x10],#16
  0x000003ff8022b7d0: tbz w11, #0, 0x000003ff8022b7d8
  0x000003ff8022b7d4: str xzr, [x10]

... which I hope you'll agree is much better.

The idea is to handle array sizes of 0-7 words inline, so small arrays
are got out of the way very quickly, and handle anything larger in
Stub::zero_blocks.

I wanted to make sure that there is no significant loss of performance,
and I have attached the results of the benchmark I used, which does no
more than create an array of ints of various sizes.  There are winners
and losers, but nothing is changed by very much, and the code cache
usage of each ClearArray goes down from 104 to 40 bytes.

http://cr.openjdk.java.net/~aph/8179444-2/

OK?

Andrew.

Before:

Benchmark             (size)  Mode  Cnt    Score   Error  Units
CreateArray.newArray       5  avgt   10   48.273 ± 1.679  ns/op
CreateArray.newArray       7  avgt   10   48.915 ± 0.793  ns/op
CreateArray.newArray      10  avgt   10   49.826 ± 0.868  ns/op
CreateArray.newArray      15  avgt   10   52.582 ± 0.521  ns/op
CreateArray.newArray      23  avgt   10   57.589 ± 0.670  ns/op
CreateArray.newArray      34  avgt   10   67.233 ± 0.984  ns/op
CreateArray.newArray      51  avgt   10  120.652 ± 2.018  ns/op
CreateArray.newArray      77  avgt   10  102.745 ± 1.034  ns/op
CreateArray.newArray     115  avgt   10  136.703 ± 1.067  ns/op
CreateArray.newArray     173  avgt   10  182.247 ± 1.093  ns/op
CreateArray.newArray     259  avgt   10  163.168 ± 5.967  ns/op
CreateArray.newArray     389  avgt   10  233.874 ± 3.400  ns/op
CreateArray.newArray     584  avgt   10  251.286 ± 4.892  ns/op
CreateArray.newArray     876  avgt   10  242.510 ± 0.520  ns/op
CreateArray.newArray    1314  avgt   10  382.846 ± 0.624  ns/op
CreateArray.newArray    1971  avgt   10  487.590 ± 1.409  ns/op

After:

Benchmark             (size)  Mode  Cnt    Score   Error  Units
CreateArray.newArray       5  avgt   10   47.208 ± 0.656  ns/op
CreateArray.newArray       7  avgt   10   47.838 ± 0.608  ns/op
CreateArray.newArray      10  avgt   10   48.798 ± 0.797  ns/op
CreateArray.newArray      15  avgt   10   51.981 ± 0.424  ns/op
CreateArray.newArray      23  avgt   10   56.614 ± 1.064  ns/op
CreateArray.newArray      34  avgt   10   65.986 ± 1.114  ns/op
CreateArray.newArray      51  avgt   10  119.811 ± 0.857  ns/op
CreateArray.newArray      77  avgt   10  101.694 ± 1.192  ns/op
CreateArray.newArray     115  avgt   10  137.169 ± 2.159  ns/op
CreateArray.newArray     173  avgt   10  185.815 ± 0.754  ns/op
CreateArray.newArray     259  avgt   10  163.305 ± 2.107  ns/op
CreateArray.newArray     389  avgt   10  234.049 ± 3.162  ns/op
CreateArray.newArray     584  avgt   10  250.729 ± 1.714  ns/op
CreateArray.newArray     876  avgt   10  242.921 ± 0.577  ns/op
CreateArray.newArray    1314  avgt   10  384.337 ± 1.465  ns/op
CreateArray.newArray    1971  avgt   10  486.948 ± 5.303  ns/op

From david.holmes at oracle.com Thu May 4 01:50:27 2017
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 4 May 2017 11:50:27 +1000
Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used
In-Reply-To:
References: <58C1AE06.9060609@linux.vnet.ibm.com>
 <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com>
 <5909DAAC.3070202@linux.vnet.ibm.com>
Message-ID: <0e89961f-e5da-cb85-e30d-33e424b69a0b@oracle.com>

Hi Volker, Gustavo,

I will try to take a look at this again, but may be a day or two.

David

On 4/05/2017 12:34 AM, Volker Simonis wrote:
> Hi,
>
> I've reviewed Gustavo's change and I'm fine with the latest version at:
>
> http://cr.openjdk.java.net/~gromero/8175813/v3/
>
> Can somebody please sponsor the change?
> > Thank you and best regards, > Volker > > > On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero > wrote: >> Hi community, >> >> I understand that there is nothing that can be done additionally regarding this >> issue, at this point, on the PPC64 side. >> >> It's a change in the shared code - but that in effect does not change anything in >> the numa detection mechanism for other platforms - and hence it's necessary a >> conjoint community effort to review the change and a sponsor to run it against >> the JPRT. >> >> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of >> great concern on PPC64 (specially on POWER8 machines) I would be very glad if >> the community could point out directions on how that change could move on. >> >> Thank you! >> >> Best regards, >> Gustavo >> >> On 25-04-2017 23:49, Gustavo Romero wrote: >>> Dear Volker, >>> >>> On 24-04-2017 14:08, Volker Simonis wrote: >>>> Hi Gustavo, >>>> >>>> thanks for addressing this problem and sorry for my late reply. I >>>> think this is a good change which definitely improves the situation >>>> for uncommon NUMA configurations without changing the handling for >>>> common topologies. >>> >>> Thanks a lot for reviewing the change! >>> >>> >>>> It would be great if somebody could run this trough JPRT, but as >>>> Gustavo mentioned, I don't expect any regressions. >>>> >>>> @Igor: I think you've been the original author of the NUMA-aware >>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>> linux"). If you could find some spare minutes to take a look at this >>>> change, your comment would be very much appreciated :) >>>> >>>> Following some minor comments from me: >>>> >>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>> to get the actual number of configured nodes. This is good and >>>> certainly an improvement over the previous implementation. 
However, >>>> the man page for numa_num_configured_nodes() mentions that the >>>> returned count may contain currently disabled nodes. Do we currently >>>> handle disabled nodes? What will be the consequence if we would use >>>> such a disabled node (e.g. mbind() warnings)? >>> >>> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in >>> found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just >>> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >>> number of nodes with memory in the system. To the best of my knowledge there is >>> no system configuration on Linux/PPC64 that could match such a notion of >>> "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in >>> that dir and just the ones with memory will be taken into account. If it's >>> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >>> mbind() tried against it). >>> >>> On Power it's possible to have a numa node without memory (memory-less node, a >>> case covered in this change), a numa node without cpus at all but with memory >>> (a configured node anyway, so a case already covered) but to disable a specific >>> numa node so it does not appear in /sys/devices/system/node/* it's only possible >>> from the inners of the control module. Or other rare condition not invisible / >>> adjustable from the OS. Also I'm not aware of a case where a node is in this >>> dir but is at the same time flagged as something like "disabled". There are >>> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >>> >>> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >>> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >>> >>> >>>> - the same question applies to the usage of >>>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>>> Does isnode_in_configured_nodes() (i.e. 
the node set defined by >>>> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >>>> this be a potential problem (i.e. if we use a disabled node). >>> >>> On the meaning of "disabled nodes", it's the same case as above, so to the >>> best of knowledge it's not a potential problem. >>> >>> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >>> i.e. "all nodes on which the calling task may allocate memory". It's exactly >>> the same pointer returned by numa_get_membind() v2 [3] which: >>> >>> "returns the mask of nodes from which memory can currently be allocated" >>> >>> and that is used, for example, in "numactl --show" to show nodes from where >>> memory can be allocated [4, 5]. >>> >>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >>> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >>> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >>> >>> >>>> - I'd like to suggest renaming the 'index' part of the following >>>> variables and functions to 'nindex' ('node_index' is probably to long) >>>> in the following code, to emphasize that we have node indexes pointing >>>> to actual, not always consecutive node numbers: >>>> >>>> 2879 // Create an index -> node mapping, since nodes are not >>>> always consecutive >>>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>>> GrowableArray(0, true); >>>> 2881 rebuild_index_to_node_map(); >>> >>> Simple change but much better to read indeed. Done. >>> >>> >>>> - can you please wrap the following one-line else statement into curly >>>> braces (it's more readable and we usually do it that way in HotSpot >>>> although there are no formal style guidelines :) >>>> >>>> 2953 } else >>>> 2954 // Current node is already a configured node. >>>> 2955 closest_node = index_to_node()->at(i); >>> >>> Done. 
>>> >>> >>>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>>> later avoid the check for '|| !closest_distance'. Also, according to >>>> the man page, numa_distance() returns 0 if it can not determine the >>>> distance. So with the above change, the condition on line 2974 should >>>> read: >>>> >>>> 2947 if (distance && distance < closest_distance) { >>>> >>> >>> Sure, much better to set the initial condition as distant as possible and >>> adjust to a closer one bit by bit improving the if condition. Done. >>> >>> >>>> Finally, and not directly related to your change, I'd suggest the >>>> following clean-ups: >>>> >>>> - remove the usage of 'NCPUS = 32768' in >>>> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >>>> unclear to me and probably related to an older version/problem of >>>> libnuma? I think we should simply use >>>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>>> >>>> - we still use the NUMA version 1 function prototypes (e.g. >>>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>>> also "numa_interleave_memory()" and maybe others). I think we should >>>> switch all prototypes to the new NUMA version 2 interface which you've >>>> already used for the new functions which you've added. >>> >>> I agree. Could I open a new bug to address these clean-ups? >>> >>> >>>> That said, I think these changes all require libnuma 2.0 (see >>>> os::Linux::libnuma_dlsym). So before starting this, you should make >>>> sure that libnuma 2.0 is available on all platforms to which you'd >>>> like to down-port this change. For jdk10 we could definitely do it, >>>> for jdk9 probably also, for jdk8 I'm not so sure. >>> >>> libnuma v1 last release dates back to 2008, but any idea how could I check that >>> for sure since it's on shared code? 
>>>
>>> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/
>>>
>>> Thank you!
>>>
>>> Best regards,
>>> Gustavo
>>>
>>>> Regards,
>>>> Volker
>>>>
>>>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero wrote:
>>>>> Hi,
>>>>>
>>>>> Any update on it?
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Regards,
>>>>> Gustavo
>>>>>
>>>>> On 09-03-2017 16:33, Gustavo Romero wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Could the following webrev be reviewed please?
>>>>>>
>>>>>> It improves the numa node detection when non-consecutive or
>>>>>> memory-less nodes exist in the system.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/
>>>>>> bug   : https://bugs.openjdk.java.net/browse/JDK-8175813
>>>>>>
>>>>>> Currently, although no problem exists when the JVM detects numa nodes
>>>>>> that are consecutive and have memory, for example in a numa topology
>>>>>> like:
>>>>>>
>>>>>> available: 2 nodes (0-1)
>>>>>> node 0 cpus: 0 8 16 24 32
>>>>>> node 0 size: 65258 MB
>>>>>> node 0 free: 34 MB
>>>>>> node 1 cpus: 40 48 56 64 72
>>>>>> node 1 size: 65320 MB
>>>>>> node 1 free: 150 MB
>>>>>> node distances:
>>>>>> node   0   1
>>>>>>   0:  10  20
>>>>>>   1:  20  10,
>>>>>>
>>>>>> it fails on detecting numa nodes to be used in the Parallel GC in a
>>>>>> numa topology like:
>>>>>>
>>>>>> available: 4 nodes (0-1,16-17)
>>>>>> node 0 cpus: 0 8 16 24 32
>>>>>> node 0 size: 130706 MB
>>>>>> node 0 free: 7729 MB
>>>>>> node 1 cpus: 40 48 56 64 72
>>>>>> node 1 size: 0 MB
>>>>>> node 1 free: 0 MB
>>>>>> node 16 cpus: 80 88 96 104 112
>>>>>> node 16 size: 130630 MB
>>>>>> node 16 free: 5282 MB
>>>>>> node 17 cpus: 120 128 136 144 152
>>>>>> node 17 size: 0 MB
>>>>>> node 17 free: 0 MB
>>>>>> node distances:
>>>>>> node   0   1  16  17
>>>>>>   0:  10  20  40  40
>>>>>>   1:  20  10  40  40
>>>>>>  16:  40  40  10  20
>>>>>>  17:  40  40  20  10,
>>>>>>
>>>>>> where node 16 is not consecutive in relation to 1 and also nodes 1
>>>>>> and 17 have no memory.
>>>>>>
>>>>>> If a topology like that exists, os::numa_make_local() will receive a
>>>>>> local group id as a hint that is not available in the system to be
>>>>>> bound (it will receive all nodes from 0 to 17), causing a
>>>>>> proliferation of "mbind: Invalid argument" messages:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log
>>>>>>
>>>>>> That change improves the detection by making the JVM numa API aware
>>>>>> of the existence of numa nodes that are non-consecutive from 0 to the
>>>>>> highest node number and also of nodes that might be memory-less
>>>>>> nodes, i.e. that might not be, in libnuma terms, a configured node.
>>>>>> Hence just the configured nodes will be available:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log
>>>>>>
>>>>>> The change has no effect on numa topologies where the problem does
>>>>>> not occur, i.e. no change in the number of nodes and no change in the
>>>>>> cpu to node map. On numa topologies where memory-less nodes exist
>>>>>> (like in the last example above), cpus from a memory-less node won't
>>>>>> be able to bind locally so they are mapped to the closest node,
>>>>>> otherwise they would not be associated to any node and
>>>>>> MutableNUMASpace::cas_allocate() would pick a node randomly,
>>>>>> compromising the performance.
>>>>>>
>>>>>> I found no regressions on x64 for the following numa topology:
>>>>>>
>>>>>> available: 2 nodes (0-1)
>>>>>> node 0 cpus: 0 1 2 3 8 9 10 11
>>>>>> node 0 size: 24102 MB
>>>>>> node 0 free: 19806 MB
>>>>>> node 1 cpus: 4 5 6 7 12 13 14 15
>>>>>> node 1 size: 24190 MB
>>>>>> node 1 free: 21951 MB
>>>>>> node distances:
>>>>>> node   0   1
>>>>>>   0:  10  21
>>>>>>   1:  21  10
>>>>>>
>>>>>> I understand that fixing the current numa detection is a prerequisite
>>>>>> to enable UseNUMA by default [1] and to extend the numa-aware
>>>>>> allocation to the G1 GC [2].
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Best regards,
>>>>>> Gustavo
>>>>>>
>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate)
>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation)

From chris.plummer at oracle.com Thu May 4 22:38:45 2017
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 4 May 2017 15:38:45 -0700
Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5
Message-ID: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com>

Hello,

Please review the following changes:

http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/
https://bugs.openjdk.java.net/browse/JDK-8164563

There was an issue with CompiledMethodUnload events not getting
delivered. The short story is that the root cause was the
JvmtiDeferredEventQueue::_pending_list, which upon further review was
deemed unnecessary, so all code related to it has been pulled.

The _pending_list is a temporary list for CompiledMethodUnload events
that occur while at a safepoint. Putting them directly on the
JvmtiDeferredEventQueue was thought not to be allowed at a safepoint,
because doing so required acquiring the Service_lock, and it was
thought that you cannot do that while at a safepoint.

The issue with the _pending_list is that it only gets processed if
there is a subsequent JvmtiDeferredEventQueue::enqueue() call. For the
test in question, this was not always happening. The test sits in a
loop waiting for the unload events, but unless it triggers some
compilation during this time, which results in enqueue() being called
for the CompiledMethodLoad event, it will never see the
CompiledMethodUnload events. It eventually gives up and fails. Most
times, however, there is a compilation triggered while in the loop, so
the test passes.
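The failure mode described above, where events parked on the _pending_list are flushed only by a later enqueue() call, can be modelled with a small sketch. This is illustrative pseudologic, not the HotSpot code; the names merely mirror the description in the mail:

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Illustrative model of the old design, not the actual HotSpot code.
struct DeferredQueue {
  std::queue<int> queue;          // stands in for JvmtiDeferredEventQueue
  std::vector<int> pending_list;  // events posted while at a safepoint

  // Old behavior at a safepoint: the Service_lock was believed to be
  // off-limits, so the event was merely parked on the pending list.
  void post_at_safepoint(int event) {
    pending_list.push_back(event);
  }

  // The pending list was drained only here, i.e. only when some *other*
  // event was later enqueued outside a safepoint.
  void enqueue(int event) {
    for (int e : pending_list) {
      queue.push(e);
    }
    pending_list.clear();
    queue.push(event);
  }
};
```

If no compilation happens after the safepoint, enqueue() is never called and the parked unload events are never delivered, which is exactly the hang the test observed.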
The first attempted solution was to use a VM op that triggered
processing of the _pending_list. However, this also ran up against the
issue of not being able to grab the Service_lock while at a safepoint,
because even _concurrent VM ops can end up being run synchronously and
at a safepoint if you are already in a VMThread.

After further review of the safepoint concern with the Service_lock, it
was determined that it should be ok to grab it while at a safepoint,
thus removing the need for the _pending_list. So basically the fix is
to remove all _pending_list code, and have
nmethod::post_compiled_method_unload() always directly call
JvmtiDeferredEventQueue::enqueue().

I tested by running the failing test at least 100 times on all
supported platforms (it used to fail with a fairly high frequency). I
also ran our other CompiledMethodUnload and CompiledMethodLoad tests
about 100 times, and ran our full jvmti test suite a few times on each
platform, along with the jck/vm/jvmti tests.

thanks,

Chris

From david.holmes at oracle.com Thu May 4 23:33:27 2017
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 5 May 2017 09:33:27 +1000
Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5
In-Reply-To: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com>
References: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com>
Message-ID:

Looks good Chris. I think we finally ended up with the cleanest and
simplest solution.

Thanks,
David

On 5/05/2017 8:38 AM, Chris Plummer wrote:
> Hello,
>
> Please review the following changes:
>
> http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/
> https://bugs.openjdk.java.net/browse/JDK-8164563
>
> There was an issue with CompiledMethodUnload events not getting
> delivered.
Short story is the root cause was the > JvmtiDeferredEventQueue::_pending_list, which upon further review was > deemed unnecessary, so all code related to it has been pulled. > > The _pending_list is a temporary list for CompileMethodUnload events > that occur while at a safepoint. Putting them directly on the > JvmtiDeferredEventQueue was thought not to be allowed at a safepoint, > because doing so required acquiring the Service_lock, and it was thought > that you can no do that while at a safepoint. > > The issue with the _pending_list is that it only gets processed if there > is a subsequent JvmtiDeferredEventQueue::enqueue() call. For the test in > question, this was not always happening. The test sits in a loop waiting > for the unload events, but unless it triggers some compilation during > this time, which results in enqueue() being called for the > CompileMethodLoad event, it will never see the CompileMethodUnload > events. It eventually gives up and fails. Most times however there is a > compilation triggered while in the loop so the test passes. > > The first attempted solution was to use a VM op that triggered > processing of the _pending_list. However, this also ran up against the > issue of not being able to grab the Service_lock while at a safepoint > because even _concurrent VM ops can end up being run synchronously and > at a safepoint if you are already in a VMThread. > > After further review of the safepoint concern with the Service_lock, it > was determined that it should be ok to grab it while at a safepoint, > thus removing the need for the _pending_list. So basically the fix is to > remove all _pending_list code, and have > nmethod::post_compiled_method_unload() always directly call > JvmtiDeferredEventQueue::enqueue(). > > I tested by running the failing test at least 100 times on all supported > platforms (it used to fail with a fairly high frequency). 
I also ran our > other CompileMethodUnload and CompileMethodLoad tests about 100 times, > and ran our full jvmti test suite a few times on each platform, along > with the jck/vm/jvmti tests. > > thanks, > > Chris From david.holmes at oracle.com Fri May 5 00:32:17 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 May 2017 10:32:17 +1000 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> Message-ID: Hi Volker, Gustavo, On 4/05/2017 12:34 AM, Volker Simonis wrote: > Hi, > > I've reviewed Gustavo's change and I'm fine with the latest version at: > > http://cr.openjdk.java.net/~gromero/8175813/v3/ Nothing has really changed for me since I first looked at this - I don't know NUMA and I can't comment on any of the details. But no-one else has commented negatively so they are implicitly okay with this, or else they should have spoken up. So with Volker as the Reviewer and myself as a second reviewer, I will sponsor this. I'll run the current patch through JPRT while awaiting the final version. One thing I was unclear on with all this numa code is the expectation regarding all those dynamically looked up functions - is it expected that we will have them all or else have none? It wasn't at all obvious what would happen if we don't have those functions but still executed this code - assuming that is even possible. I guess I would have expected that no numa code would execute unless -XX:+UseNUMA was set, in which case the VM would abort if any of the libnuma functions could not be found. That way we wouldn't need the null checks for the function pointers. Style nits: - we should avoid implicit booleans, so the isnode_in_* functions should return bool not int; and check "distance != 0" - spaces around operators eg. 
node=0 should be node = 0 Thanks, David > Can somebody please sponsor the change? > > Thank you and best regards, > Volker > > > On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero > wrote: >> Hi community, >> >> I understand that there is nothing that can be done additionally regarding this >> issue, at this point, on the PPC64 side. >> >> It's a change in the shared code - but that in effect does not change anything in >> the numa detection mechanism for other platforms - and hence it's necessary a >> conjoint community effort to review the change and a sponsor to run it against >> the JPRT. >> >> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of >> great concern on PPC64 (specially on POWER8 machines) I would be very glad if >> the community could point out directions on how that change could move on. >> >> Thank you! >> >> Best regards, >> Gustavo >> >> On 25-04-2017 23:49, Gustavo Romero wrote: >>> Dear Volker, >>> >>> On 24-04-2017 14:08, Volker Simonis wrote: >>>> Hi Gustavo, >>>> >>>> thanks for addressing this problem and sorry for my late reply. I >>>> think this is a good change which definitely improves the situation >>>> for uncommon NUMA configurations without changing the handling for >>>> common topologies. >>> >>> Thanks a lot for reviewing the change! >>> >>> >>>> It would be great if somebody could run this trough JPRT, but as >>>> Gustavo mentioned, I don't expect any regressions. >>>> >>>> @Igor: I think you've been the original author of the NUMA-aware >>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>> linux"). If you could find some spare minutes to take a look at this >>>> change, your comment would be very much appreciated :) >>>> >>>> Following some minor comments from me: >>>> >>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>> to get the actual number of configured nodes. This is good and >>>> certainly an improvement over the previous implementation. 
However, >>>> the man page for numa_num_configured_nodes() mentions that the >>>> returned count may contain currently disabled nodes. Do we currently >>>> handle disabled nodes? What will be the consequence if we would use >>>> such a disabled node (e.g. mbind() warnings)? >>> >>> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in >>> found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just >>> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >>> number of nodes with memory in the system. To the best of my knowledge there is >>> no system configuration on Linux/PPC64 that could match such a notion of >>> "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in >>> that dir and just the ones with memory will be taken into account. If it's >>> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >>> mbind() tried against it). >>> >>> On Power it's possible to have a numa node without memory (memory-less node, a >>> case covered in this change), a numa node without cpus at all but with memory >>> (a configured node anyway, so a case already covered) but to disable a specific >>> numa node so it does not appear in /sys/devices/system/node/* it's only possible >>> from the inners of the control module. Or other rare condition not invisible / >>> adjustable from the OS. Also I'm not aware of a case where a node is in this >>> dir but is at the same time flagged as something like "disabled". There are >>> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >>> >>> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >>> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >>> >>> >>>> - the same question applies to the usage of >>>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>>> Does isnode_in_configured_nodes() (i.e. 
the node set defined by >>>> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >>>> this be a potential problem (i.e. if we use a disabled node). >>> >>> On the meaning of "disabled nodes", it's the same case as above, so to the >>> best of knowledge it's not a potential problem. >>> >>> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >>> i.e. "all nodes on which the calling task may allocate memory". It's exactly >>> the same pointer returned by numa_get_membind() v2 [3] which: >>> >>> "returns the mask of nodes from which memory can currently be allocated" >>> >>> and that is used, for example, in "numactl --show" to show nodes from where >>> memory can be allocated [4, 5]. >>> >>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >>> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >>> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >>> >>> >>>> - I'd like to suggest renaming the 'index' part of the following >>>> variables and functions to 'nindex' ('node_index' is probably to long) >>>> in the following code, to emphasize that we have node indexes pointing >>>> to actual, not always consecutive node numbers: >>>> >>>> 2879 // Create an index -> node mapping, since nodes are not >>>> always consecutive >>>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>>> GrowableArray(0, true); >>>> 2881 rebuild_index_to_node_map(); >>> >>> Simple change but much better to read indeed. Done. >>> >>> >>>> - can you please wrap the following one-line else statement into curly >>>> braces (it's more readable and we usually do it that way in HotSpot >>>> although there are no formal style guidelines :) >>>> >>>> 2953 } else >>>> 2954 // Current node is already a configured node. >>>> 2955 closest_node = index_to_node()->at(i); >>> >>> Done. 
>>> >>> >>>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>>> later avoid the check for '|| !closest_distance'. Also, according to >>>> the man page, numa_distance() returns 0 if it can not determine the >>>> distance. So with the above change, the condition on line 2974 should >>>> read: >>>> >>>> 2947 if (distance && distance < closest_distance) { >>>> >>> >>> Sure, much better to set the initial condition as distant as possible and >>> adjust to a closer one bit by bit improving the if condition. Done. >>> >>> >>>> Finally, and not directly related to your change, I'd suggest the >>>> following clean-ups: >>>> >>>> - remove the usage of 'NCPUS = 32768' in >>>> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >>>> unclear to me and probably related to an older version/problem of >>>> libnuma? I think we should simply use >>>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>>> >>>> - we still use the NUMA version 1 function prototypes (e.g. >>>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>>> also "numa_interleave_memory()" and maybe others). I think we should >>>> switch all prototypes to the new NUMA version 2 interface which you've >>>> already used for the new functions which you've added. >>> >>> I agree. Could I open a new bug to address these clean-ups? >>> >>> >>>> That said, I think these changes all require libnuma 2.0 (see >>>> os::Linux::libnuma_dlsym). So before starting this, you should make >>>> sure that libnuma 2.0 is available on all platforms to which you'd >>>> like to down-port this change. For jdk10 we could definitely do it, >>>> for jdk9 probably also, for jdk8 I'm not so sure. >>> >>> libnuma v1 last release dates back to 2008, but any idea how could I check that >>> for sure since it's on shared code? 
>>> >>> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ >>> >>> Thank you! >>> >>> Best regards, >>> Gustavo >>> >>> >>>> Regards, >>>> Volker >>>> >>>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >>>> wrote: >>>>> Hi, >>>>> >>>>> Any update on it? >>>>> >>>>> Thank you. >>>>> >>>>> Regards, >>>>> Gustavo >>>>> >>>>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>>>> Hi, >>>>>> >>>>>> Could the following webrev be reviewed please? >>>>>> >>>>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>>>> exist in the system. >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>>>> >>>>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>>>> consecutive and have memory, for example in a numa topology like: >>>>>> >>>>>> available: 2 nodes (0-1) >>>>>> node 0 cpus: 0 8 16 24 32 >>>>>> node 0 size: 65258 MB >>>>>> node 0 free: 34 MB >>>>>> node 1 cpus: 40 48 56 64 72 >>>>>> node 1 size: 65320 MB >>>>>> node 1 free: 150 MB >>>>>> node distances: >>>>>> node 0 1 >>>>>> 0: 10 20 >>>>>> 1: 20 10, >>>>>> >>>>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>>>> topology like: >>>>>> >>>>>> available: 4 nodes (0-1,16-17) >>>>>> node 0 cpus: 0 8 16 24 32 >>>>>> node 0 size: 130706 MB >>>>>> node 0 free: 7729 MB >>>>>> node 1 cpus: 40 48 56 64 72 >>>>>> node 1 size: 0 MB >>>>>> node 1 free: 0 MB >>>>>> node 16 cpus: 80 88 96 104 112 >>>>>> node 16 size: 130630 MB >>>>>> node 16 free: 5282 MB >>>>>> node 17 cpus: 120 128 136 144 152 >>>>>> node 17 size: 0 MB >>>>>> node 17 free: 0 MB >>>>>> node distances: >>>>>> node 0 1 16 17 >>>>>> 0: 10 20 40 40 >>>>>> 1: 20 10 40 40 >>>>>> 16: 40 40 10 20 >>>>>> 17: 40 40 20 10, >>>>>> >>>>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>>>> no memory. 
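Taking the memory-less-node topology above as input, the closest-node selection being discussed — initialize closest_distance to INT_MAX and skip a numa_distance() result of 0, which means the distance could not be determined — can be sketched self-contained. The distance table below transcribes the matrix shown above and stands in for numa_distance(); the function names are illustrative, not the actual HotSpot code.

```cpp
#include <cassert>
#include <climits>

// Distance matrix for the 4-node topology shown above; rows/columns
// correspond to nodes 0, 1, 16, 17. A real implementation would call
// libnuma's numa_distance() instead.
static const int NODES[4] = {0, 1, 16, 17};
static const int DIST[4][4] = {
  {10, 20, 40, 40},
  {20, 10, 40, 40},
  {40, 40, 10, 20},
  {40, 40, 20, 10},
};

static int idx_of(int node) {
  for (int i = 0; i < 4; i++) {
    if (NODES[i] == node) return i;
  }
  return -1;
}

// Map 'node' to the closest node that actually has memory, mirroring the
// loop shape discussed for os::Linux::rebuild_cpu_to_node_map().
int closest_configured_node(int node, const int* configured, int n) {
  int closest_distance = INT_MAX;  // start as distant as possible
  int closest_node = configured[0];
  for (int i = 0; i < n; i++) {
    int distance = DIST[idx_of(node)][idx_of(configured[i])];
    if (distance != 0 && distance < closest_distance) {  // 0 means "unknown"
      closest_distance = distance;
      closest_node = configured[i];
    }
  }
  return closest_node;
}
```

For the topology above, the configured (with-memory) nodes are {0, 16}; the memory-less node 1 maps to node 0 (distance 20 vs 40) and node 17 maps to node 16 (distance 40 vs 20), so their cpus still bind somewhere sensible instead of being picked randomly by cas_allocate().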
>>>>>> >>>>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>>>> id as a hint that is not available in the system to be bound (it will receive >>>>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>>>> messages: >>>>>> >>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>>>> >>>>>> That change improves the detection by making the JVM numa API aware of the >>>>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>>>> be available: >>>>>> >>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>>>> >>>>>> The change has no effect on numa topologies where the problem does not occur, >>>>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>>>> numa topologies where memory-less nodes exist (like in the last example above), >>>>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>>>> to the closest node, otherwise they would not be associated with any node and >>>>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising the >>>>>> performance. >>>>>> >>>>>> I found no regressions on x64 for the following numa topology: >>>>>> >>>>>> available: 2 nodes (0-1) >>>>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>>>> node 0 size: 24102 MB >>>>>> node 0 free: 19806 MB >>>>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>>>> node 1 size: 24190 MB >>>>>> node 1 free: 21951 MB >>>>>> node distances: >>>>>> node 0 1 >>>>>> 0: 10 21 >>>>>> 1: 21 10 >>>>>> >>>>>> I understand that fixing the current numa detection is a prerequisite to enable >>>>>> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>>>> >>>>>> Thank you.

>>>>>> >>>>>> >>>>>> Best regards, >>>>>> Gustavo >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>>>> >>>>> >>>> >>> >> From kim.barrett at oracle.com Fri May 5 05:43:41 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 5 May 2017 01:43:41 -0400 Subject: RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation In-Reply-To: <6B50E7BF-A0CC-4100-A332-9E0EC8054C39@oracle.com> References: <6B50E7BF-A0CC-4100-A332-9E0EC8054C39@oracle.com> Message-ID: Still looking for a Reviewer. > On Apr 20, 2017, at 2:17 AM, Kim Barrett wrote: > > Please review this addition of the count_trailing_zeros function. > > Unfortunately, there isn't an obvious direct and portable way to write > such a function. But supported hardware architectures generally > provide an efficient implementation, e.g. a single instruction or a > short sequence. Compilers often provide "built-in" or "intrinsic" > access to that hardware implementation, or one can use inline > assembler. > > If a platform doesn't have such a built-in efficient implementation, > the function can be implemented using de Bruijn sequences as discussed > in the literature. But all of the OpenJDK-supported platforms provide > an efficient implementation, so we aren't providing such a fallback. > > As part of reviewing this change, please feel free to suggest > alternative ways to structure the code. I'm not completely happy with > the way I've done it, but alternatives I've tried seemed worse. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8179004 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8179004/hotspot.00/ > > Testing: > Unit test for new function. 
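The general shape of such a function — dispatching to a per-toolchain intrinsic — and the bitmap-search use that motivated it can be sketched as follows. This is an illustrative sketch, not the contents of the webrev: the GCC/Clang path uses the __builtin_ctzll intrinsic and the MSVC path uses _BitScanForward64, and the non-zero precondition matters because CTZ of zero is undefined for these builtins.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Illustrative count_trailing_zeros in the spirit of JDK-8179004.
// Precondition: x != 0 (CTZ of zero is undefined for the builtins).
inline unsigned count_trailing_zeros(uint64_t x) {
#if defined(_MSC_VER)
  unsigned long index;
  _BitScanForward64(&index, x);
  return static_cast<unsigned>(index);
#else
  return static_cast<unsigned>(__builtin_ctzll(x));
#endif
}

// The kind of bitmap search that motivated the change: instead of testing
// bits one at a time in a loop, mask off the bits below 'start' and jump
// straight to the next set bit with a single CTZ. Returns -1 if no set
// bit at or above 'start' exists in the word.
inline int next_one_offset(uint64_t word, unsigned start) {
  uint64_t rest = word & (~uint64_t(0) << start);
  return rest == 0 ? -1 : static_cast<int>(count_trailing_zeros(rest));
}
```

The speedup Kim measured comes from this replacement of a per-bit loop by one instruction, which is why it shows most at moderate bit densities where the per-bit scan dominated.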
> > Experimented with modifying BitMap::get_next_one_offset and the like > to replace the existing for-loop with a call to this new function, and > measured substantial speedups on all tested platforms for a test > program that counts the bits in a bitmap by iterating calls to that > search function. The speedup varies with the density of set bits, with > very sparse or nearly full being similar to the existing code, since > the bit counting is not the dominant factor in those situations. But > in between give speedups of factors of x1.5 to x5 or more, depending > on the density and the platform. From serguei.spitsyn at oracle.com Fri May 5 06:29:37 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 4 May 2017 23:29:37 -0700 Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5 In-Reply-To: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> References: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> Message-ID: <3604d39f-d18c-3800-a93d-a10629059f0a@oracle.com> Hi Chris, The fix looks good. I like the simplification. Great bug hunt, Chris! Thanks, Serguei On 5/4/17 15:38, Chris Plummer wrote: > Hello, > > Please review the following changes: > > http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/ > https://bugs.openjdk.java.net/browse/JDK-8164563 > > There was an issue with CompileMethodUnload events not getting > delivered. Short story is the root cause was the > JvmtiDeferredEventQueue::_pending_list, which upon further review was > deemed unnecessary, so all code related to it has been pulled. > > The _pending_list is a temporary list for CompileMethodUnload events > that occur while at a safepoint. Putting them directly on the > JvmtiDeferredEventQueue was thought not to be allowed at a safepoint, > because doing so required acquiring the Service_lock, and it was > thought that you cannot do that while at a safepoint.
> > The issue with the _pending_list is that it only gets processed if > there is a subsequent JvmtiDeferredEventQueue::enqueue() call. For the > test in question, this was not always happening. The test sits in a > loop waiting for the unload events, but unless it triggers some > compilation during this time, which results in enqueue() being called > for the CompileMethodLoad event, it will never see the > CompileMethodUnload events. It eventually gives up and fails. Most > times however there is a compilation triggered while in the loop so > the test passes. > > The first attempted solution was to use a VM op that triggered > processing of the _pending_list. However, this also ran up against the > issue of not being able to grab the Service_lock while at a safepoint > because even _concurrent VM ops can end up being run synchronously and > at a safepoint if you are already in a VMThread. > > After further review of the safepoint concern with the Service_lock, > it was determined that it should be ok to grab it while at a > safepoint, thus removing the need for the _pending_list. So basically > the fix is to remove all _pending_list code, and have > nmethod::post_compiled_method_unload() always directly call > JvmtiDeferredEventQueue::enqueue(). > > I tested by running the failing test at least 100 times on all > supported platforms (it used to fail with a fairly high frequency). I > also ran our other CompileMethodUnload and CompileMethodLoad tests > about 100 times, and ran our full jvmti test suite a few times on each > platform, along with the jck/vm/jvmti tests. 
> > thanks, > > Chris From david.holmes at oracle.com Fri May 5 08:02:53 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 May 2017 18:02:53 +1000 Subject: RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation In-Reply-To: References: <6B50E7BF-A0CC-4100-A332-9E0EC8054C39@oracle.com> Message-ID: <260360aa-9eb1-4320-f000-4b2a46cf5e86@oracle.com> On 5/05/2017 3:43 PM, Kim Barrett wrote: > Still looking for a Reviewer. Reviewed. I don't see an obviously better way to deal with the structure. Thanks, David >> On Apr 20, 2017, at 2:17 AM, Kim Barrett wrote: >> >> Please review this addition of the count_trailing_zeros function. >> >> Unfortunately, there isn't an obvious direct and portable way to write >> such a function. But supported hardware architectures generally >> provide an efficient implementation, e.g. a single instruction or a >> short sequence. Compilers often provide "built-in" or "intrinsic" >> access to that hardware implementation, or one can use inline >> assembler. >> >> If a platform doesn't have such a built-in efficient implementation, >> the function can be implemented using de Bruijn sequences as discussed >> in the literature. But all of the OpenJDK-supported platforms provide >> an efficient implementation, so we aren't providing such a fallback. >> >> As part of reviewing this change, please feel free to suggest >> alternative ways to structure the code. I'm not completely happy with >> the way I've done it, but alternatives I've tried seemed worse. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8179004 >> >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8179004/hotspot.00/ >> >> Testing: >> Unit test for new function. 
>> >> Experimented with modifying BitMap::get_next_one_offset and the like >> to replace the existing for-loop with a call to this new function, and >> measured substantial speedups on all tested platforms for a test >> program that counts the bits in a bitmap by iterating calls to that >> search function. The speedup varies with the density of set bits, with >> very sparse or nearly full being similar to the existing code, since >> the bit counting is not the dominant factor in those situations. But >> in between give speedups of factors of x1.5 to x5 or more, depending >> on the density and the platform. > > From adinn at redhat.com Fri May 5 09:47:48 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 5 May 2017 10:47:48 +0100 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: On 03/05/17 18:05, Andrew Haley wrote: > New version, corrected: > . . . > http://cr.openjdk.java.net/~aph/8179444-2/ > > OK? The patch looks good (not an official review) except that I don't understand one detail. Why does MacroAssembler::zero_words include this? + RuntimeAddress zero_blocks = RuntimeAddress(StubRoutines::aarch64::zero_blocks()); + assert(zero_blocks.target() != NULL, "zero_blocks stub has not been generated"); + if (StubRoutines::aarch64::complete()) { + trampoline_call(zero_blocks); + } else { + bl(zero_blocks); + } regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From coleen.phillimore at oracle.com Fri May 5 13:05:59 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 5 May 2017 09:05:59 -0400 Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5 In-Reply-To: <3604d39f-d18c-3800-a93d-a10629059f0a@oracle.com> References: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> <3604d39f-d18c-3800-a93d-a10629059f0a@oracle.com> Message-ID: <960322aa-ea40-7b42-36ac-d278f0daa724@oracle.com> On 5/5/17 2:29 AM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > The fix looks good. > I like the simplification. > Great bug hunt, Chris! +1. Great job figuring out this bug! Coleen > > Thanks, > Serguei > > > On 5/4/17 15:38, Chris Plummer wrote: >> Hello, >> >> Please review the following changes: >> >> http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/ >> https://bugs.openjdk.java.net/browse/JDK-8164563 >> >> There was an issue with CompileMethodUnload events not getting >> delivered. Short story is the root cause was the >> JvmtiDeferredEventQueue::_pending_list, which upon further review was >> deemed unnecessary, so all code related to it has been pulled. >> >> The _pending_list is a temporary list for CompileMethodUnload events >> that occur while at a safepoint. Putting them directly on the >> JvmtiDeferredEventQueue was thought not to be allowed at a safepoint, >> because doing so required acquiring the Service_lock, and it was >> thought that you can no do that while at a safepoint. >> >> The issue with the _pending_list is that it only gets processed if >> there is a subsequent JvmtiDeferredEventQueue::enqueue() call. For >> the test in question, this was not always happening. 
The test sits in >> a loop waiting for the unload events, but unless it triggers some >> compilation during this time, which results in enqueue() being called >> for the CompileMethodLoad event, it will never see the >> CompileMethodUnload events. It eventually gives up and fails. Most >> times however there is a compilation triggered while in the loop so >> the test passes. >> >> The first attempted solution was to use a VM op that triggered >> processing of the _pending_list. However, this also ran up against >> the issue of not being able to grab the Service_lock while at a >> safepoint because even _concurrent VM ops can end up being run >> synchronously and at a safepoint if you are already in a VMThread. >> >> After further review of the safepoint concern with the Service_lock, >> it was determined that it should be ok to grab it while at a >> safepoint, thus removing the need for the _pending_list. So basically >> the fix is to remove all _pending_list code, and have >> nmethod::post_compiled_method_unload() always directly call >> JvmtiDeferredEventQueue::enqueue(). >> >> I tested by running the failing test at least 100 times on all >> supported platforms (it used to fail with a fairly high frequency). I >> also ran our other CompileMethodUnload and CompileMethodLoad tests >> about 100 times, and ran our full jvmti test suite a few times on >> each platform, along with the jck/vm/jvmti tests. >> >> thanks, >> >> Chris > From aph at redhat.com Fri May 5 15:06:51 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 5 May 2017 16:06:51 +0100 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: On 05/05/17 10:47, Andrew Dinn wrote: > On 03/05/17 18:05, Andrew Haley wrote: >> New version, corrected: >> . . . >> http://cr.openjdk.java.net/~aph/8179444-2/ >> >> OK? > > The patch looks good (not an official review) except that I don't > understand one detail. Why does MacroAssembler::zero_words include this? 
> > + RuntimeAddress zero_blocks = > RuntimeAddress(StubRoutines::aarch64::zero_blocks()); > + assert(zero_blocks.target() != NULL, "zero_blocks stub has not been > generated"); > + if (StubRoutines::aarch64::complete()) { > + trampoline_call(zero_blocks); > + } else { > + bl(zero_blocks); > + } Trampoline calls only work from compiler-generated code, so we have to do something different when we're generating the stubs. I suppose I could have had two versions of MacroAssembler::zero_words or added a parameter to say we're generating stubs. Would that be clearer? Andrew. From adinn at redhat.com Fri May 5 15:20:31 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 5 May 2017 16:20:31 +0100 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: On 05/05/17 16:06, Andrew Haley wrote: > On 05/05/17 10:47, Andrew Dinn wrote: >> On 03/05/17 18:05, Andrew Haley wrote: >>> New version, corrected: >>> . . . >>> http://cr.openjdk.java.net/~aph/8179444-2/ >>> >>> OK? >> >> The patch looks good (not an official review) except that I don't >> understand one detail. Why does MacroAssembler::zero_words include this? >> >> + RuntimeAddress zero_blocks = >> RuntimeAddress(StubRoutines::aarch64::zero_blocks()); >> + assert(zero_blocks.target() != NULL, "zero_blocks stub has not been >> generated"); >> + if (StubRoutines::aarch64::complete()) { >> + trampoline_call(zero_blocks); >> + } else { >> + bl(zero_blocks); >> + } > > Trampoline calls only work from compiler-generated code, so we have to > do something different when we're generating the stubs. I suppose I > could have had two versions of MacroAssembler::zero_words or added a > parameter to say we're generating stubs. Would that be clearer? Ok, I see now. Two versions of the methods seems like overkill. A comment before the if explaining what is going on is probably all that is needed. 
For example: + // if stubs are complete then we are generating under + // the compiler so we need to use a trampoline_call + // otherwise we have to use a normal call + if (StubRoutines::aarch64::complete()) { + trampoline_call(zero_blocks); + } else { I have not yet tested the patch. I will do so and report back. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From daniel.daugherty at oracle.com Fri May 5 15:41:52 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 5 May 2017 09:41:52 -0600 Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5 In-Reply-To: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> References: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> Message-ID: On 5/4/17 4:38 PM, Chris Plummer wrote: > Hello, > > Please review the following changes: > > http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/ > https://bugs.openjdk.java.net/browse/JDK-8164563 src/share/vm/prims/jvmtiImpl.hpp No comments. src/share/vm/prims/jvmtiImpl.cpp No comments. src/share/vm/code/nmethod.cpp No comments. Removal of pending_list support looks clean. Thumbs up! Dan > > There was an issue with CompileMethodUnload events not getting > delivered. Short story is the root cause was the > JvmtiDeferredEventQueue::_pending_list, which upon further review was > deemed unnecessary, so all code related to it has been pulled. > > The _pending_list is a temporary list for CompileMethodUnload events > that occur while at a safepoint. Putting them directly on the > JvmtiDeferredEventQueue was thought not to be allowed at a safepoint, > because doing so required acquiring the Service_lock, and it was > thought that you can no do that while at a safepoint. 
> > The issue with the _pending_list is that it only gets processed if > there is a subsequent JvmtiDeferredEventQueue::enqueue() call. For the > test in question, this was not always happening. The test sits in a > loop waiting for the unload events, but unless it triggers some > compilation during this time, which results in enqueue() being called > for the CompileMethodLoad event, it will never see the > CompileMethodUnload events. It eventually gives up and fails. Most > times however there is a compilation triggered while in the loop so > the test passes. > > The first attempted solution was to use a VM op that triggered > processing of the _pending_list. However, this also ran up against the > issue of not being able to grab the Service_lock while at a safepoint > because even _concurrent VM ops can end up being run synchronously and > at a safepoint if you are already in a VMThread. > > After further review of the safepoint concern with the Service_lock, > it was determined that it should be ok to grab it while at a > safepoint, thus removing the need for the _pending_list. So basically > the fix is to remove all _pending_list code, and have > nmethod::post_compiled_method_unload() always directly call > JvmtiDeferredEventQueue::enqueue(). > > I tested by running the failing test at least 100 times on all > supported platforms (it used to fail with a fairly high frequency). I > also ran our other CompileMethodUnload and CompileMethodLoad tests > about 100 times, and ran our full jvmti test suite a few times on each > platform, along with the jck/vm/jvmti tests. 
> > thanks, > > Chris From adinn at redhat.com Fri May 5 15:52:44 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 5 May 2017 16:52:44 +0100 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: Ok, this ought not strictly to count as an official review (as I am not a jdk9/10 reviewer) but I have i) eyeballed the code to my satisfaction and ii) successfully built and tested a patched jdk10 (by running java Hello, javac Hello.java and netbeans). Ship it! regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From rwestrel at redhat.com Fri May 5 15:58:34 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 05 May 2017 17:58:34 +0200 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~aph/8179444-2/ Looks ok to me. Roland. From chris.plummer at oracle.com Fri May 5 18:25:16 2017 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 5 May 2017 11:25:16 -0700 Subject: RFR(10)(M): JDK-8164563: Test nsk/jvmti/CompiledMethodUnload/compmethunload001 keeps reporting: PRODUCT BUG: class was not unloaded in 5 In-Reply-To: References: <2ad9e3a1-0b11-5cea-4ba8-e411e3f91a6f@oracle.com> Message-ID: <104698f5-233f-e85c-5b0d-8b627c0f2f54@oracle.com> On 5/5/17 8:41 AM, Daniel D. Daugherty wrote: > On 5/4/17 4:38 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following changes: >> >> http://cr.openjdk.java.net/~cjplummer/8164563/webrev.00/webrev.hotspot/ >> https://bugs.openjdk.java.net/browse/JDK-8164563 > > src/share/vm/prims/jvmtiImpl.hpp > No comments. > > src/share/vm/prims/jvmtiImpl.cpp > No comments. > > src/share/vm/code/nmethod.cpp > No comments. > > Removal of pending_list support looks clean. Thumbs up! Thanks Dan! 
Chris > > Dan > > > > >> >> There was an issue with CompileMethodUnload events not getting >> delivered. Short story is the root cause was the >> JvmtiDeferredEventQueue::_pending_list, which upon further review was >> deemed unnecessary, so all code related to it has been pulled. >> >> The _pending_list is a temporary list for CompileMethodUnload events >> that occur while at a safepoint. Putting them directly on the >> JvmtiDeferredEventQueue was thought not to be allowed at a safepoint, >> because doing so required acquiring the Service_lock, and it was >> thought that you can not do that while at a safepoint. >> >> The issue with the _pending_list is that it only gets processed if >> there is a subsequent JvmtiDeferredEventQueue::enqueue() call. For >> the test in question, this was not always happening. The test sits in >> a loop waiting for the unload events, but unless it triggers some >> compilation during this time, which results in enqueue() being called >> for the CompileMethodLoad event, it will never see the >> CompileMethodUnload events. It eventually gives up and fails. Most >> times, however, there is a compilation triggered while in the loop, so >> the test passes. >> >> The first attempted solution was to use a VM op that triggered >> processing of the _pending_list. However, this also ran up against >> the issue of not being able to grab the Service_lock while at a >> safepoint because even _concurrent VM ops can end up being run >> synchronously and at a safepoint if you are already in a VMThread. >> >> After further review of the safepoint concern with the Service_lock, >> it was determined that it should be ok to grab it while at a >> safepoint, thus removing the need for the _pending_list. So basically >> the fix is to remove all _pending_list code, and have >> nmethod::post_compiled_method_unload() always directly call >> JvmtiDeferredEventQueue::enqueue().
>> >> I tested by running the failing test at least 100 times on all >> supported platforms (it used to fail with a fairly high frequency). I >> also ran our other CompileMethodUnload and CompileMethodLoad tests >> about 100 times, and ran our full jvmti test suite a few times on >> each platform, along with the jck/vm/jvmti tests. >> >> thanks, >> >> Chris > From gromero at linux.vnet.ibm.com Fri May 5 19:43:35 2017 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Fri, 5 May 2017 16:43:35 -0300 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> Message-ID: <590CD5E7.10809@linux.vnet.ibm.com> Hi David, On 04-05-2017 21:32, David Holmes wrote: > Hi Volker, Gustavo, > > On 4/05/2017 12:34 AM, Volker Simonis wrote: >> Hi, >> >> I've reviewed Gustavo's change and I'm fine with the latest version at: >> >> http://cr.openjdk.java.net/~gromero/8175813/v3/ > > Nothing has really changed for me since I first looked at this - I don't know NUMA and I can't comment on any of the details. But no-one else has commented negatively so they are implicitly okay with > this, or else they should have spoken up. So with Volker as the Reviewer and myself as a second reviewer, I will sponsor this. I'll run the current patch through JPRT while awaiting the final version. Thanks a lot for reviewing and sponsoring the change. > One thing I was unclear on with all this numa code is the expectation regarding all those dynamically looked up functions - is it expected that we will have them all or else have none? It wasn't at > all obvious what would happen if we don't have those functions but still executed this code - assuming that is even possible. 
I guess I would have expected that no numa code would execute unless > -XX:+UseNUMA was set, in which case the VM would abort if any of the libnuma functions could not be found. That way we wouldn't need the null checks for the function pointers. If libnuma is not available in the system, os::Linux::libnuma_init() will return false and the JVM will refuse to enable the UseNUMA features instead of aborting: 4904 if (UseNUMA) { 4905 if (!Linux::libnuma_init()) { 4906 UseNUMA = false; 4907 } else { I understand those null checks as part of the initial design of the JVM numa api to enforce protection against the usage of its methods in other parts of the code when the JVM api failed to initialize properly, even though it's expected that UseNUMA = false should suffice to protect such usages. That said, I could not find any recent Linux distribution that does not support the libnuma v2 api (and so also the v1 api). On Ubuntu it will be installed as a dependency of metapackage ubuntu-standard and because that requires "irqbalance" it also requires libnuma. Libnuma was updated from v1 to v2 around mid 2008: numactl (2.0.1-1) unstable; urgency=low * New upstream * patches/static-lib.patch: update * debian/watch: update to new SGI location -- Ian Wienand Sat, 07 Jun 2008 14:18:22 -0700 numactl (1.0.2-1) unstable; urgency=low * New upstream * Closes: #442690 -- Add to rules a hack to remove libnuma.a after unpatching * Update README.debian -- Ian Wienand Wed, 03 Oct 2007 21:49:27 +1000 It's similar on RHEL, where "irqbalance" is in the core group.
Regarding the libnuma version it was also updated in 2008 to v2, so since Fedora 11 contains v2, hence RHEL 6 and RHEL 7 contains it: * Wed Feb 25 2009 Fedora Release Engineering - 2.0.2-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild * Mon Sep 29 2008 Neil Horman - 2.0.2-2 - Fix build break due to register selection in asm * Mon Sep 29 2008 Neil Horman - 2.0.2-1 - Update rawhide to version 2.0.2 of numactl * Fri Apr 25 2008 Neil Horman - 1.0.2-6 - Fix buffer size passing and arg sanity check for physcpubind (bz 442521) Also, the last release of libnuma v1 dates back to 2008: https://github.com/numactl/numactl/releases/tag/v1.0.2 So it looks like libnuma v2 absence on Linux is by now uncommon. > Style nits: > - we should avoid implicit booleans, so the isnode_in_* functions should return bool not int; and check "distance != 0" > - spaces around operators eg. node=0 should be node = 0 new webrev: http://cr.openjdk.java.net/~gromero/8175813/v4/ Thank you and best regards, Gustavo > Thanks, > David > >> Can somebody please sponsor the change? >> >> Thank you and best regards, >> Volker >> >> >> On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero >> wrote: >>> Hi community, >>> >>> I understand that there is nothing that can be done additionally regarding this >>> issue, at this point, on the PPC64 side. >>> >>> It's a change in the shared code - but that in effect does not change anything in >>> the numa detection mechanism for other platforms - and hence it's necessary a >>> conjoint community effort to review the change and a sponsor to run it against >>> the JPRT. >>> >>> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of >>> great concern on PPC64 (specially on POWER8 machines) I would be very glad if >>> the community could point out directions on how that change could move on. >>> >>> Thank you! 
>>> >>> Best regards, >>> Gustavo >>> >>> On 25-04-2017 23:49, Gustavo Romero wrote: >>>> Dear Volker, >>>> >>>> On 24-04-2017 14:08, Volker Simonis wrote: >>>>> Hi Gustavo, >>>>> >>>>> thanks for addressing this problem and sorry for my late reply. I >>>>> think this is a good change which definitely improves the situation >>>>> for uncommon NUMA configurations without changing the handling for >>>>> common topologies. >>>> >>>> Thanks a lot for reviewing the change! >>>> >>>> >>>>> It would be great if somebody could run this trough JPRT, but as >>>>> Gustavo mentioned, I don't expect any regressions. >>>>> >>>>> @Igor: I think you've been the original author of the NUMA-aware >>>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>>> linux"). If you could find some spare minutes to take a look at this >>>>> change, your comment would be very much appreciated :) >>>>> >>>>> Following some minor comments from me: >>>>> >>>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>>> to get the actual number of configured nodes. This is good and >>>>> certainly an improvement over the previous implementation. However, >>>>> the man page for numa_num_configured_nodes() mentions that the >>>>> returned count may contain currently disabled nodes. Do we currently >>>>> handle disabled nodes? What will be the consequence if we would use >>>>> such a disabled node (e.g. mbind() warnings)? >>>> >>>> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in >>>> found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just >>>> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >>>> number of nodes with memory in the system. To the best of my knowledge there is >>>> no system configuration on Linux/PPC64 that could match such a notion of >>>> "disabled nodes" as it appears in libnuma's manual. 
If it is enabled, it's in >>>> that dir and just the ones with memory will be taken into account. If it's >>>> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >>>> mbind() tried against it). >>>> >>>> On Power it's possible to have a numa node without memory (memory-less node, a >>>> case covered in this change), a numa node without cpus at all but with memory >>>> (a configured node anyway, so a case already covered), but to disable a specific >>>> numa node so it does not appear in /sys/devices/system/node/* it's only possible >>>> from the inners of the control module. Or some other rare condition not visible / >>>> adjustable from the OS. Also I'm not aware of a case where a node is in this >>>> dir but is at the same time flagged as something like "disabled". There are >>>> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >>>> >>>> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >>>> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >>>> >>>> >>>>> - the same question applies to the usage of >>>>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>>>> Does isnode_in_configured_nodes() (i.e. the node set defined by >>>>> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >>>>> this be a potential problem (i.e. if we use a disabled node). >>>> >>>> On the meaning of "disabled nodes", it's the same case as above, so to the >>>> best of my knowledge it's not a potential problem. >>>> >>>> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >>>> i.e. "all nodes on which the calling task may allocate memory". It's exactly >>>> the same pointer returned by numa_get_membind() v2 [3] which: >>>> >>>> "returns the mask of nodes from which memory can currently be allocated" >>>> >>>> and that is used, for example, in "numactl --show" to show nodes from where >>>> memory can be allocated [4, 5].
>>>> >>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >>>> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >>>> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >>>> >>>> >>>>> - I'd like to suggest renaming the 'index' part of the following >>>>> variables and functions to 'nindex' ('node_index' is probably to long) >>>>> in the following code, to emphasize that we have node indexes pointing >>>>> to actual, not always consecutive node numbers: >>>>> >>>>> 2879 // Create an index -> node mapping, since nodes are not >>>>> always consecutive >>>>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>>>> GrowableArray(0, true); >>>>> 2881 rebuild_index_to_node_map(); >>>> >>>> Simple change but much better to read indeed. Done. >>>> >>>> >>>>> - can you please wrap the following one-line else statement into curly >>>>> braces (it's more readable and we usually do it that way in HotSpot >>>>> although there are no formal style guidelines :) >>>>> >>>>> 2953 } else >>>>> 2954 // Current node is already a configured node. >>>>> 2955 closest_node = index_to_node()->at(i); >>>> >>>> Done. >>>> >>>> >>>>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>>>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>>>> later avoid the check for '|| !closest_distance'. Also, according to >>>>> the man page, numa_distance() returns 0 if it can not determine the >>>>> distance. So with the above change, the condition on line 2974 should >>>>> read: >>>>> >>>>> 2947 if (distance && distance < closest_distance) { >>>>> >>>> >>>> Sure, much better to set the initial condition as distant as possible and >>>> adjust to a closer one bit by bit improving the if condition. Done. >>>> >>>> >>>>> Finally, and not directly related to your change, I'd suggest the >>>>> following clean-ups: >>>>> >>>>> - remove the usage of 'NCPUS = 32768' in >>>>> os::Linux::rebuild_cpu_to_node_map(). 
The comment on that line is >>>>> unclear to me and probably related to an older version/problem of >>>>> libnuma? I think we should simply use >>>>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>>>> >>>>> - we still use the NUMA version 1 function prototypes (e.g. >>>>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>>>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>>>> also "numa_interleave_memory()" and maybe others). I think we should >>>>> switch all prototypes to the new NUMA version 2 interface which you've >>>>> already used for the new functions which you've added. >>>> >>>> I agree. Could I open a new bug to address these clean-ups? >>>> >>>> >>>>> That said, I think these changes all require libnuma 2.0 (see >>>>> os::Linux::libnuma_dlsym). So before starting this, you should make >>>>> sure that libnuma 2.0 is available on all platforms to which you'd >>>>> like to down-port this change. For jdk10 we could definitely do it, >>>>> for jdk9 probably also, for jdk8 I'm not so sure. >>>> >>>> libnuma v1 last release dates back to 2008, but any idea how could I check that >>>> for sure since it's on shared code? >>>> >>>> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> Gustavo >>>> >>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> Any update on it? >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, >>>>>> Gustavo >>>>>> >>>>>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Could the following webrev be reviewed please? >>>>>>> >>>>>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>>>>> exist in the system. 
>>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>>>>> >>>>>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>>>>> consecutive and have memory, for example in a numa topology like: >>>>>>> >>>>>>> available: 2 nodes (0-1) >>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>> node 0 size: 65258 MB >>>>>>> node 0 free: 34 MB >>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>> node 1 size: 65320 MB >>>>>>> node 1 free: 150 MB >>>>>>> node distances: >>>>>>> node 0 1 >>>>>>> 0: 10 20 >>>>>>> 1: 20 10, >>>>>>> >>>>>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>>>>> topology like: >>>>>>> >>>>>>> available: 4 nodes (0-1,16-17) >>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>> node 0 size: 130706 MB >>>>>>> node 0 free: 7729 MB >>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>> node 1 size: 0 MB >>>>>>> node 1 free: 0 MB >>>>>>> node 16 cpus: 80 88 96 104 112 >>>>>>> node 16 size: 130630 MB >>>>>>> node 16 free: 5282 MB >>>>>>> node 17 cpus: 120 128 136 144 152 >>>>>>> node 17 size: 0 MB >>>>>>> node 17 free: 0 MB >>>>>>> node distances: >>>>>>> node 0 1 16 17 >>>>>>> 0: 10 20 40 40 >>>>>>> 1: 20 10 40 40 >>>>>>> 16: 40 40 10 20 >>>>>>> 17: 40 40 20 10, >>>>>>> >>>>>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>>>>> no memory. 
>>>>>>> >>>>>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>>>>> id as a hint that is not available in the system to be bound (it will receive >>>>>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>>>>> messages: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>>>>> >>>>>>> That change improves the detection by making the JVM numa API aware of the >>>>>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>>>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>>>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>>>>> be available: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>>>>> >>>>>>> The change has no effect on numa topologies where the problem does not occur, >>>>>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>>>>> numa topologies where memory-less nodes exist (like in the last example above), >>>>>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>>>>> to the closest node, otherwise they would not be associated with any node and >>>>>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising >>>>>>> performance. >>>>>>> >>>>>>> I found no regressions on x64 for the following numa topology: >>>>>>> >>>>>>> available: 2 nodes (0-1) >>>>>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>>>>> node 0 size: 24102 MB >>>>>>> node 0 free: 19806 MB >>>>>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>>>>> node 1 size: 24190 MB >>>>>>> node 1 free: 21951 MB >>>>>>> node distances: >>>>>>> node 0 1 >>>>>>> 0: 10 21 >>>>>>> 1: 21 10 >>>>>>> >>>>>>> I understand that fixing the current numa detection is a prerequisite to enable >>>>>>> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>>>>> >>>>>>> Thank you.
>>>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> Gustavo >>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>>>>> >>>>>> >>>>> >>>> >>> > From kim.barrett at oracle.com Fri May 5 20:07:01 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 5 May 2017 16:07:01 -0400 Subject: RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation In-Reply-To: <260360aa-9eb1-4320-f000-4b2a46cf5e86@oracle.com> References: <6B50E7BF-A0CC-4100-A332-9E0EC8054C39@oracle.com> <260360aa-9eb1-4320-f000-4b2a46cf5e86@oracle.com> Message-ID: <8F112CE8-F54F-4C51-A7F6-A0B98A80BF48@oracle.com> > On May 5, 2017, at 4:02 AM, David Holmes wrote: > > On 5/05/2017 3:43 PM, Kim Barrett wrote: >> Still looking for a Reviewer. > > Reviewed. > > I don't see an obviously better way to deal with the structure. Thanks. From ron.pressler at oracle.com Fri May 5 19:27:29 2017 From: ron.pressler at oracle.com (Ron Pressler) Date: Fri, 5 May 2017 22:27:29 +0300 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods Message-ID: <590CD221.4080100@oracle.com> Hi, Please review the following core/hotspot change: Bug: https://bugs.openjdk.java.net/browse/JDK-8159995 core webrev: http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-jdk/webrev/ hotspot webrev: http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-hotspot/webrev/ This change is covered by existing tests. The following renaming was applied: - compareAndExchange*Volatile -> compareAndExchange* - compareAndSwap* -> compareAndSet* - weakCompareAndSwap* -> weakCompareAndSet*Plain - weakCompareAndSwap*Volatile -> weakCompareAndSet* At this stage, only method and hotspot intrinsic names were changed; node names were left as-is, and may be handled in a separate issue. 
Ron From Derek.White at cavium.com Fri May 5 22:54:40 2017 From: Derek.White at cavium.com (White, Derek) Date: Fri, 5 May 2017 22:54:40 +0000 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: Hi Andrew, Really nice seeing the smaller code! I had some questions, observations, and suggestions. src/cpu/aarch64/vm/macroAssembler_aarch64.cpp: - zero_words(reg, reg): - Comment mentions that it's used by the ClearArray pattern, but it's also used by generate_fill(). - The old zero_words(), via block_zero() and fill_words(), would align base to 16-bytes before doing stps, the new code doesn't. It may be worth conditionally pre-aligning if AvoidUnalignedAcesses is true. And could conditionally remove pre-aligning in zero_blocks. Line 4999: This is pre-existing, but it's confusing - could you rename "ShortArraySize" to "SmallArraySize"? - zero_words(reg, int): - I don't see a great way to handle unaligned accesses without blowing up the code size. So no real request here. - It looks like worst case is 15 unaligned stp instructions. - With some benchmarking someday I might argue for calling this constant count version for smaller copies. - Unless ClearArray only zero's from the first array element on - then we could guess if base is 16-byte unaligned by looking at the array header size. - zero_dcache_blocks(): - Suggest a new comment that mentions that it's only called from zero_blocks()? Or if it's meant to be more general then add comments about the requirements for base alignment, and that cnt has to be >= 2*zva_length. - zero_blocks DOES align base to 16-bytes, so we don't need to check here? Or make it a runtime assert? 
Thanks, - Derek -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Roland Westrelin Sent: Friday, May 05, 2017 11:59 AM To: Andrew Haley ; hotspot-dev Source Developers Subject: Re: RFR: 8179444: AArch64: Put zero_words on a diet > http://cr.openjdk.java.net/~aph/8179444-2/ Looks ok to me. Roland. From volker.simonis at gmail.com Sat May 6 06:59:15 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Sat, 6 May 2017 08:59:15 +0200 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <590CD5E7.10809@linux.vnet.ibm.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> <590CD5E7.10809@linux.vnet.ibm.com> Message-ID: On Fri, May 5, 2017 at 9:43 PM, Gustavo Romero wrote: > Hi David, > > On 04-05-2017 21:32, David Holmes wrote: >> Hi Volker, Gustavo, >> >> On 4/05/2017 12:34 AM, Volker Simonis wrote: >>> Hi, >>> >>> I've reviewed Gustavo's change and I'm fine with the latest version at: >>> >>> http://cr.openjdk.java.net/~gromero/8175813/v3/ >> >> Nothing has really changed for me since I first looked at this - I don't know NUMA and I can't comment on any of the details. But no-one else has commented negatively so they are implicitly okay with >> this, or else they should have spoken up. So with Volker as the Reviewer and myself as a second reviewer, I will sponsor this. I'll run the current patch through JPRT while awaiting the final version. > > Thanks a lot for reviewing and sponsoring the change. > > >> One thing I was unclear on with all this numa code is the expectation regarding all those dynamically looked up functions - is it expected that we will have them all or else have none? It wasn't at >> all obvious what would happen if we don't have those functions but still executed this code - assuming that is even possible. 
I guess I would have expected that no numa code would execute unless >> -XX:+UseNUMA was set, in which case the VM would abort if any of the libnuma functions could not be found. That way we wouldn't need the null checks for the function pointers. > > If libnuma is not available in the system os::Linux::libnuma_init() will return > false and JVM will refuse to enable the UseNUMA features instead of aborting: > > 4904 if (UseNUMA) { > 4905 if (!Linux::libnuma_init()) { > 4906 UseNUMA = false; > 4907 } else { > > I understand those null checks as part of the initial design of JVM numa api to > enforce protection against the usage of its methods in other parts of the code > when JVM api failed to initialize properly, even tho it's expected that > UseNUMA = false should suffice to protect such a usages. > > That said, I could not find any recent Linux distribution that does not support > libnuma v2 api (and so also v1 api). On Ubuntu it will be installed as a > dependency of metapackage ubuntu-standard and because that requires "irqbalance" > it also requires libnuma. Libnuma was updated from libnuma v1 to v2 > around mid 2008: > > numactl (2.0.1-1) unstable; urgency=low > > * New upstream > * patches/static-lib.patch: update > * debian/watch: update to new SGI location > > -- Ian Wienand Sat, 07 Jun 2008 14:18:22 -0700 > > numactl (1.0.2-1) unstable; urgency=low > > * New upstream > * Closes: #442690 -- Add to rules a hack to remove libnuma.a after > unpatching > * Update README.debian > > > -- Ian Wienand Wed, 03 Oct 2007 21:49:27 +1000 > > > It's similar on RHEL, where "irqbalance" is in core group. 
Regarding > the libnuma version it was also updated in 2008 to v2, so since > Fedora 11 contains v2, hence RHEL 6 and RHEL 7 contains it: > > * Wed Feb 25 2009 Fedora Release Engineering - 2.0.2-3 > - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild > > * Mon Sep 29 2008 Neil Horman - 2.0.2-2 > - Fix build break due to register selection in asm > > * Mon Sep 29 2008 Neil Horman - 2.0.2-1 > - Update rawhide to version 2.0.2 of numactl > > * Fri Apr 25 2008 Neil Horman - 1.0.2-6 > - Fix buffer size passing and arg sanity check for physcpubind (bz 442521) > > > Also, the last release of libnuma v1 dates back to 2008: > https://github.com/numactl/numactl/releases/tag/v1.0.2 > > So it looks like libnuma v2 absence on Linux is by now uncommon. > > >> Style nits: >> - we should avoid implicit booleans, so the isnode_in_* functions should return bool not int; and check "distance != 0" >> - spaces around operators eg. node=0 should be node = 0 > > new webrev: http://cr.openjdk.java.net/~gromero/8175813/v4/ > Still good :) Thumbs up! And thanks a lot for digging into the history of libnuma and its incarnation in various Linux distros. That's really useful information! Regards, Volker > > Thank you and best regards, > Gustavo > >> Thanks, >> David >> >>> Can somebody please sponsor the change? >>> >>> Thank you and best regards, >>> Volker >>> >>> >>> On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero >>> wrote: >>>> Hi community, >>>> >>>> I understand that there is nothing that can be done additionally regarding this >>>> issue, at this point, on the PPC64 side. >>>> >>>> It's a change in the shared code - but that in effect does not change anything in >>>> the numa detection mechanism for other platforms - and hence it's necessary a >>>> conjoint community effort to review the change and a sponsor to run it against >>>> the JPRT.
>>>> >>>> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of >>>> great concern on PPC64 (specially on POWER8 machines) I would be very glad if >>>> the community could point out directions on how that change could move on. >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> Gustavo >>>> >>>> On 25-04-2017 23:49, Gustavo Romero wrote: >>>>> Dear Volker, >>>>> >>>>> On 24-04-2017 14:08, Volker Simonis wrote: >>>>>> Hi Gustavo, >>>>>> >>>>>> thanks for addressing this problem and sorry for my late reply. I >>>>>> think this is a good change which definitely improves the situation >>>>>> for uncommon NUMA configurations without changing the handling for >>>>>> common topologies. >>>>> >>>>> Thanks a lot for reviewing the change! >>>>> >>>>> >>>>>> It would be great if somebody could run this trough JPRT, but as >>>>>> Gustavo mentioned, I don't expect any regressions. >>>>>> >>>>>> @Igor: I think you've been the original author of the NUMA-aware >>>>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>>>> linux"). If you could find some spare minutes to take a look at this >>>>>> change, your comment would be very much appreciated :) >>>>>> >>>>>> Following some minor comments from me: >>>>>> >>>>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>>>> to get the actual number of configured nodes. This is good and >>>>>> certainly an improvement over the previous implementation. However, >>>>>> the man page for numa_num_configured_nodes() mentions that the >>>>>> returned count may contain currently disabled nodes. Do we currently >>>>>> handle disabled nodes? What will be the consequence if we would use >>>>>> such a disabled node (e.g. mbind() warnings)? 
From aph at redhat.com  Sun May  7 08:41:13 2017
From: aph at redhat.com (Andrew Haley)
Date: Sun, 7 May 2017 09:41:13 +0100
Subject: RFR: 8179444: AArch64: Put zero_words on a diet
In-Reply-To:
References:
Message-ID:

Hi,

On 05/05/17 23:54, White, Derek wrote:
> src/cpu/aarch64/vm/macroAssembler_aarch64.cpp:
>  - zero_words(reg, reg):
>    - Comment mentions that it's used by the ClearArray pattern, but
>      it's also used by generate_fill().

OK, but there's no contradiction there. The comment just explains why
zero_words must be small.

>    - The old zero_words(), via block_zero() and fill_words(), would
>      align base to 16 bytes before doing stps, the new code doesn't. It
>      may be worth conditionally pre-aligning if AvoidUnalignedAccesses is
>      true. And could conditionally remove pre-aligning in zero_blocks.

But that'd bloat zero_words, surely. And I haven't seen the new code
running any slower on any hardware.

> Line 4999: This is pre-existing, but it's confusing - could you
> rename "ShortArraySize" to "SmallArraySize"?

OK.

>  - zero_words(reg, int):
>    - I don't see a great way to handle unaligned accesses without
>      blowing up the code size. So no real request here.
>    - It looks like worst case is 15 unaligned stp instructions.
>    - With some benchmarking someday I might argue for calling this
>      constant count version for smaller copies.

I don't understand this.

>    - Unless ClearArray only zeroes from the first array element on -
>      then we could guess if base is 16-byte unaligned by looking at the
>      array header size.
>
>  - zero_dcache_blocks():
>    - Suggest a new comment that mentions that it's only called from
>      zero_blocks()?
Or if it's meant to be more general then add comments > about the requirements for base alignment, and that cnt has to be >= > 2*zva_length. OK. > - zero_blocks DOES align base to 16-bytes, so we don't need to > check here? Or make it a runtime assert? I did wonder about that, but left it in as a safety net because its cost is immeasurably small. Thanks for such a detailed review. Andrew. From david.holmes at oracle.com Sun May 7 20:45:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 06:45:09 +1000 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <590CD5E7.10809@linux.vnet.ibm.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> <590CD5E7.10809@linux.vnet.ibm.com> Message-ID: <4b26117f-508d-90ee-1ced-2a2c720a1047@oracle.com> Hi Gustavo, On 6/05/2017 5:43 AM, Gustavo Romero wrote: > Hi David, > > On 04-05-2017 21:32, David Holmes wrote: >> Hi Volker, Gustavo, >> >> On 4/05/2017 12:34 AM, Volker Simonis wrote: >>> Hi, >>> >>> I've reviewed Gustavo's change and I'm fine with the latest version at: >>> >>> http://cr.openjdk.java.net/~gromero/8175813/v3/ >> >> Nothing has really changed for me since I first looked at this - I don't know NUMA and I can't comment on any of the details. But no-one else has commented negatively so they are implicitly okay with >> this, or else they should have spoken up. So with Volker as the Reviewer and myself as a second reviewer, I will sponsor this. I'll run the current patch through JPRT while awaiting the final version. > > Thanks a lot for reviewing and sponsoring the change. > > >> One thing I was unclear on with all this numa code is the expectation regarding all those dynamically looked up functions - is it expected that we will have them all or else have none? 
>> It wasn't at all obvious what would happen if we don't have those
>> functions but still executed this code - assuming that is even possible.
>> I guess I would have expected that no numa code would execute unless
>> -XX:+UseNUMA was set, in which case the VM would abort if any of the
>> libnuma functions could not be found. That way we wouldn't need the
>> null checks for the function pointers.
>
> If libnuma is not available in the system, os::Linux::libnuma_init() will return
> false and the JVM will refuse to enable the UseNUMA features instead of aborting:
>
> 4904   if (UseNUMA) {
> 4905     if (!Linux::libnuma_init()) {
> 4906       UseNUMA = false;
> 4907     } else {
>
> I understand those null checks as part of the initial design of the JVM NUMA API:
> they enforce protection against the usage of its methods in other parts of the
> code when the JVM API failed to initialize properly, even though it's expected
> that UseNUMA = false should suffice to protect such usages.

Ok. Seems like they should be asserts rather than runtime checks if all
the paths are properly guarded by UseNUMA - but that isn't your problem.

> That said, I could not find any recent Linux distribution that does not support
> the libnuma v2 API (and so also the v1 API). On Ubuntu it will be installed as a
> dependency of the metapackage ubuntu-standard and, because that requires
> "irqbalance", it also requires libnuma. Libnuma was updated from v1 to v2
> around mid 2008:

Thanks for the additional info.

> numactl (2.0.1-1) unstable; urgency=low
>
>   * New upstream
>   * patches/static-lib.patch: update
>   * debian/watch: update to new SGI location
>
>  -- Ian Wienand  Sat, 07 Jun 2008 14:18:22 -0700
>
> numactl (1.0.2-1) unstable; urgency=low
>
>   * New upstream
>   * Closes: #442690 -- Add to rules a hack to remove libnuma.a after
>     unpatching
>   * Update README.debian
>
>  -- Ian Wienand  Wed, 03 Oct 2007 21:49:27 +1000
>
> It's similar on RHEL, where "irqbalance" is in the core group.
> Regarding the libnuma version, it was also updated to v2 in 2008: since
> Fedora 11 contains v2, RHEL 6 and RHEL 7 contain it as well:
>
> * Wed Feb 25 2009 Fedora Release Engineering - 2.0.2-3
> - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild
>
> * Mon Sep 29 2008 Neil Horman - 2.0.2-2
> - Fix build break due to register selection in asm
>
> * Mon Sep 29 2008 Neil Horman - 2.0.2-1
> - Update rawhide to version 2.0.2 of numactl
>
> * Fri Apr 25 2008 Neil Horman - 1.0.2-6
> - Fix buffer size passing and arg sanity check for physcpubind (bz 442521)
>
> Also, the last release of libnuma v1 dates back to 2008:
> https://github.com/numactl/numactl/releases/tag/v1.0.2
>
> So it looks like libnuma v2 absence on Linux is by now uncommon.
>
>> Style nits:
>> - we should avoid implicit booleans, so the isnode_in_* functions should
>>   return bool not int; and check "distance != 0"
>> - spaces around operators, e.g. node=0 should be node = 0
>
> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v4/

Looks good. Changes being pushed now.

David
-----

> Thank you and best regards,
> Gustavo
>
>> Thanks,
>> David
>>
>>> Can somebody please sponsor the change?
>>>
>>> Thank you and best regards,
>>> Volker
>>>
>>> On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero
>>> wrote:
>>>> Hi community,
>>>>
>>>> I understand that there is nothing that can be done additionally regarding this
>>>> issue, at this point, on the PPC64 side.
>>>>
>>>> It's a change in the shared code - but one that in effect does not change
>>>> anything in the numa detection mechanism for other platforms - and hence a
>>>> joint community effort is needed to review the change, and a sponsor to run
>>>> it against the JPRT.
>>>>
>>>> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of
>>>> great concern on PPC64 (especially on POWER8 machines) I would be very glad if
>>>> the community could point out directions on how that change could move on.
>>>> >>>> Thank you! >>>> >>>> Best regards, >>>> Gustavo >>>> >>>> On 25-04-2017 23:49, Gustavo Romero wrote: >>>>> Dear Volker, >>>>> >>>>> On 24-04-2017 14:08, Volker Simonis wrote: >>>>>> Hi Gustavo, >>>>>> >>>>>> thanks for addressing this problem and sorry for my late reply. I >>>>>> think this is a good change which definitely improves the situation >>>>>> for uncommon NUMA configurations without changing the handling for >>>>>> common topologies. >>>>> >>>>> Thanks a lot for reviewing the change! >>>>> >>>>> >>>>>> It would be great if somebody could run this trough JPRT, but as >>>>>> Gustavo mentioned, I don't expect any regressions. >>>>>> >>>>>> @Igor: I think you've been the original author of the NUMA-aware >>>>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>>>> linux"). If you could find some spare minutes to take a look at this >>>>>> change, your comment would be very much appreciated :) >>>>>> >>>>>> Following some minor comments from me: >>>>>> >>>>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>>>> to get the actual number of configured nodes. This is good and >>>>>> certainly an improvement over the previous implementation. However, >>>>>> the man page for numa_num_configured_nodes() mentions that the >>>>>> returned count may contain currently disabled nodes. Do we currently >>>>>> handle disabled nodes? What will be the consequence if we would use >>>>>> such a disabled node (e.g. mbind() warnings)? >>>>> >>>>> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in >>>>> found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just >>>>> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >>>>> number of nodes with memory in the system. To the best of my knowledge there is >>>>> no system configuration on Linux/PPC64 that could match such a notion of >>>>> "disabled nodes" as it appears in libnuma's manual. 
If it is enabled, it's in >>>>> that dir and just the ones with memory will be taken into account. If it's >>>>> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >>>>> mbind() tried against it). >>>>> >>>>> On Power it's possible to have a numa node without memory (memory-less node, a >>>>> case covered in this change), a numa node without cpus at all but with memory >>>>> (a configured node anyway, so a case already covered) but to disable a specific >>>>> numa node so it does not appear in /sys/devices/system/node/* it's only possible >>>>> from the inners of the control module. Or other rare condition not invisible / >>>>> adjustable from the OS. Also I'm not aware of a case where a node is in this >>>>> dir but is at the same time flagged as something like "disabled". There are >>>>> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >>>>> >>>>> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >>>>> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >>>>> >>>>> >>>>>> - the same question applies to the usage of >>>>>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>>>>> Does isnode_in_configured_nodes() (i.e. the node set defined by >>>>>> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >>>>>> this be a potential problem (i.e. if we use a disabled node). >>>>> >>>>> On the meaning of "disabled nodes", it's the same case as above, so to the >>>>> best of knowledge it's not a potential problem. >>>>> >>>>> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >>>>> i.e. "all nodes on which the calling task may allocate memory". 
It's exactly >>>>> the same pointer returned by numa_get_membind() v2 [3] which: >>>>> >>>>> "returns the mask of nodes from which memory can currently be allocated" >>>>> >>>>> and that is used, for example, in "numactl --show" to show nodes from where >>>>> memory can be allocated [4, 5]. >>>>> >>>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >>>>> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >>>>> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >>>>> >>>>> >>>>>> - I'd like to suggest renaming the 'index' part of the following >>>>>> variables and functions to 'nindex' ('node_index' is probably to long) >>>>>> in the following code, to emphasize that we have node indexes pointing >>>>>> to actual, not always consecutive node numbers: >>>>>> >>>>>> 2879 // Create an index -> node mapping, since nodes are not >>>>>> always consecutive >>>>>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>>>>> GrowableArray(0, true); >>>>>> 2881 rebuild_index_to_node_map(); >>>>> >>>>> Simple change but much better to read indeed. Done. >>>>> >>>>> >>>>>> - can you please wrap the following one-line else statement into curly >>>>>> braces (it's more readable and we usually do it that way in HotSpot >>>>>> although there are no formal style guidelines :) >>>>>> >>>>>> 2953 } else >>>>>> 2954 // Current node is already a configured node. >>>>>> 2955 closest_node = index_to_node()->at(i); >>>>> >>>>> Done. >>>>> >>>>> >>>>>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>>>>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>>>>> later avoid the check for '|| !closest_distance'. Also, according to >>>>>> the man page, numa_distance() returns 0 if it can not determine the >>>>>> distance. 
So with the above change, the condition on line 2974 should >>>>>> read: >>>>>> >>>>>> 2947 if (distance && distance < closest_distance) { >>>>>> >>>>> >>>>> Sure, much better to set the initial condition as distant as possible and >>>>> adjust to a closer one bit by bit improving the if condition. Done. >>>>> >>>>> >>>>>> Finally, and not directly related to your change, I'd suggest the >>>>>> following clean-ups: >>>>>> >>>>>> - remove the usage of 'NCPUS = 32768' in >>>>>> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >>>>>> unclear to me and probably related to an older version/problem of >>>>>> libnuma? I think we should simply use >>>>>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>>>>> >>>>>> - we still use the NUMA version 1 function prototypes (e.g. >>>>>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>>>>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>>>>> also "numa_interleave_memory()" and maybe others). I think we should >>>>>> switch all prototypes to the new NUMA version 2 interface which you've >>>>>> already used for the new functions which you've added. >>>>> >>>>> I agree. Could I open a new bug to address these clean-ups? >>>>> >>>>> >>>>>> That said, I think these changes all require libnuma 2.0 (see >>>>>> os::Linux::libnuma_dlsym). So before starting this, you should make >>>>>> sure that libnuma 2.0 is available on all platforms to which you'd >>>>>> like to down-port this change. For jdk10 we could definitely do it, >>>>>> for jdk9 probably also, for jdk8 I'm not so sure. >>>>> >>>>> libnuma v1 last release dates back to 2008, but any idea how could I check that >>>>> for sure since it's on shared code? >>>>> >>>>> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ >>>>> >>>>> Thank you! 
>>>>> >>>>> Best regards, >>>>> Gustavo >>>>> >>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >>>>>> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Any update on it? >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Regards, >>>>>>> Gustavo >>>>>>> >>>>>>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Could the following webrev be reviewed please? >>>>>>>> >>>>>>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>>>>>> exist in the system. >>>>>>>> >>>>>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>>>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>>>>>> >>>>>>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>>>>>> consecutive and have memory, for example in a numa topology like: >>>>>>>> >>>>>>>> available: 2 nodes (0-1) >>>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>>> node 0 size: 65258 MB >>>>>>>> node 0 free: 34 MB >>>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>>> node 1 size: 65320 MB >>>>>>>> node 1 free: 150 MB >>>>>>>> node distances: >>>>>>>> node 0 1 >>>>>>>> 0: 10 20 >>>>>>>> 1: 20 10, >>>>>>>> >>>>>>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>>>>>> topology like: >>>>>>>> >>>>>>>> available: 4 nodes (0-1,16-17) >>>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>>> node 0 size: 130706 MB >>>>>>>> node 0 free: 7729 MB >>>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>>> node 1 size: 0 MB >>>>>>>> node 1 free: 0 MB >>>>>>>> node 16 cpus: 80 88 96 104 112 >>>>>>>> node 16 size: 130630 MB >>>>>>>> node 16 free: 5282 MB >>>>>>>> node 17 cpus: 120 128 136 144 152 >>>>>>>> node 17 size: 0 MB >>>>>>>> node 17 free: 0 MB >>>>>>>> node distances: >>>>>>>> node 0 1 16 17 >>>>>>>> 0: 10 20 40 40 >>>>>>>> 1: 20 10 40 40 >>>>>>>> 16: 40 40 10 20 >>>>>>>> 17: 40 40 20 10, >>>>>>>> >>>>>>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>>>>>> no memory. 
>>>>>>>> >>>>>>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>>>>>> id as a hint that is not available in the system to be bound (it will receive >>>>>>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>>>>>> messages: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>>>>>> >>>>>>>> That change improves the detection by making the JVM numa API aware of the >>>>>>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>>>>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>>>>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>>>>>> be available: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>>>>>> >>>>>>>> The change has no effect on numa topologies were the problem does not occur, >>>>>>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>>>>>> numa topologies where memory-less nodes exist (like in the last example above), >>>>>>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>>>>>> to the closest node, otherwise they would be not associate to any node and >>>>>>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising the >>>>>>>> performance. >>>>>>>> >>>>>>>> I found no regressions on x64 for the following numa topology: >>>>>>>> >>>>>>>> available: 2 nodes (0-1) >>>>>>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>>>>>> node 0 size: 24102 MB >>>>>>>> node 0 free: 19806 MB >>>>>>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>>>>>> node 1 size: 24190 MB >>>>>>>> node 1 free: 21951 MB >>>>>>>> node distances: >>>>>>>> node 0 1 >>>>>>>> 0: 10 21 >>>>>>>> 1: 21 10 >>>>>>>> >>>>>>>> I understand that fixing the current numa detection is a prerequisite to enable >>>>>>>> UseNUMA by the default [1] and to extend the numa-aware allocation to the G1 GC [2]. 
>>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Gustavo >>>>>>>> >>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> > From david.holmes at oracle.com Mon May 8 00:47:08 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 10:47:08 +1000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms In-Reply-To: References: Message-ID: Added back jdk10-dev as a bcc. Added hotspot-dev and core-libs-dev (for launcher) for follow up discussions. Hi John, On 8/05/2017 10:33 AM, John Platts wrote: > I actually did a search through the code that implements > JNI_CreateJavaVM, and I found that the conversion of the strings is done > using java_lang_String::create_from_platform_dependent_str, which > converts from the platform-default encoding to Unicode. In the case of > Windows-based platforms, the conversion is done based on the ANSI > character encoding instead of UTF-8 or Modified UTF-8. > > > The platform encoding detection logic on Windows is implemented > java_props_md.c, which can be found at > jdk/src/windows/native/java/lang/java_props_md.c in releases prior to > JDK 9 and at src/java.base/windows/native/libjava/java_props_md.c in JDK > 9 and later. The encoding used for command-line arguments passed into > the JNI invocation API is Cp1252 for English locales on Windows > platforms, and not Modified UTF-8 or UTF-8. > > > The documentation found > at http://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html also > states that the strings passed into JNI_CreateJavaVM are in the > platform-default encoding. Thanks for the additional details. 
I assume you are referring to: typedef struct JavaVMOption { char *optionString; /* the option as a string in the default platform encoding */ that comment should not form part of the specification as it is non-normative text. If the intent is truly to use the platform default encoding and not UTF-8 then that should be very clearly spelt out in the spec! That said, the implementation is following this so it is a limitation. I suspect this is historical. > A version of JNI_CreateJavaVM that takes UTF-16-encoded strings should > be added to the JNI Invocation API. The java.exe launchers and javaw.exe > launchers should also be updated to use the UTF-16 version of the > JNI_CreateJavaVM function on Windows platforms and to use wmain and > wWinMain instead of main and WinMain. Why versions for UTF-16 instead of the missing UTF-8 variants? As I said the whole spec is intended to be based around UTF-8 so we would not want to throw in just a couple of UTF-16 based usages. Thanks, David > > A few files in HotSpot would need to be changed in order to implement > the UTF-16 version of JNI_CreateJavaVM, but the change would improve > consistency across different locales on Windows platforms and allow > arguments that contain Unicode characters that are not available in the > platform-default encoding to be passed into the JVM on the command line. > > > The UTF-16-based version of JNI_CreateJavaVM also makes it easier to > allocate string objects that contain non-ASCII characters as the strings > are already in UTF-16 format, at least in cases where the strings > contain Unicode characters that are not in Latin-1 or on VMs that do not > support compact Latin-1 strings. > > > The UTF-16-based version of JNI_CreateJavaVM should probably be > implemented as a separate function so that the solution could be > backported to JDK 8 and JDK 9 updates and so that backwards > compatibility with the current JNI_CreateJavaVM implementation is > maintained. 
> Here is what the new UTF-16-based API might look like:
>
> typedef struct JavaVMInitArgs_UTF16 {
>     jint version;
>     jint nOptions;
>     JavaVMOptionUTF16 *options;
>     jboolean ignoreUnrecognized;
> } JavaVMInitArgsUTF16;
>
> typedef struct JavaVMOption_UTF16 {
>     const jchar *optionString;  /* the option as a string in UTF-16 encoding */
>     void *extraInfo;
> } JavaVMOptionUTF16;
>
> /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */
>
> jint JNI_CreateJavaVM_UTF16(JavaVM **p_vm, void **p_env, void *vm_args);
>
> /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */
>
> jint JNI_GetDefaultJavaVMInitArgs_UTF16(void *vm_args);
>
> ------------------------------------------------------------------------
> *From:* David Holmes
> *Sent:* Thursday, May 4, 2017 11:07 PM
> *To:* John Platts; jdk10-dev at openjdk.java.net
> *Subject:* Re: Add support for Unicode versions of JNI_CreateJavaVM and
> JNI_GetDefaultJavaVMInitArgs on Windows platforms
>
> Hi John,
>
> The JNI is defined to use the Modified UTF-8 format for strings, so any
> Unicode character should be handled if passed in in the right format.
> Updating the JNI specification and implementation to accept UTF-16
> directly would be a major undertaking.
>
> Is the issue here that you want a tool, like the java launcher, to
> accept arbitrary Unicode strings in an end-user-friendly manner and then
> have it perform the Modified UTF-8 conversion when invoking the VM?
>
> Can you give a concrete example of what you would like to be able to
> pass as arguments to the JVM?
>
> Thanks,
> David
>
> On 5/05/2017 1:04 PM, John Platts wrote:
>> The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI
>> invocation API expect ANSI strings on Windows platforms instead of
>> Unicode-encoded strings.
This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in > the ANSI encoding on Windows platforms. >> >> >> There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. >> >> >> jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the > char16_t type. This change will make it possible to define jchar > character and string literals on Windows platforms and on non-Windows > platforms that support the C11 or C++11 standard. >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: >> >> #define JCHAR_LITERAL(x) L ## x >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: >> >> #define JCHAR_LITERAL(x) u ## x >> >> >> Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: >> >> typedef struct JavaVMUnicodeOption { >> const jchar *optionString; /* the option as a string in UTF-16 encoding */ >> void *extraInfo; >> } JavaVMUnicodeOption; >> >> typedef struct JavaVMUnicodeInitArgs { >> jint version; >> jint nOptions; >> JavaVMUnicodeOption *options; >> jboolean ignoreUnrecognized; >> } JavaVMUnicodeInitArgs; >> >> jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); >> jint JNI_GetDefaultJavaVMInitArgs(void *args); >> >> The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead 
of WinMain on Windows platforms. This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be > launched with arguments that contain Unicode characters that are not in > the platform-default encoding. >> >> All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments > that are passed to the JVM. >> From david.holmes at oracle.com Mon May 8 05:30:42 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 15:30:42 +1000 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods In-Reply-To: <590CD221.4080100@oracle.com> References: <590CD221.4080100@oracle.com> Message-ID: <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> Hi Ron, On 6/05/2017 5:27 AM, Ron Pressler wrote: > Hi, > Please review the following core/hotspot change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8159995 > core webrev: > http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-jdk/webrev/ > > hotspot webrev: > http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-hotspot/webrev/ > > > This change is covered by existing tests. > > The following renaming was applied: > > - compareAndExchange*Volatile -> compareAndExchange* > - compareAndSwap* -> compareAndSet* So to clarify this for others, there was confusion surrounding the use of "swap" versus "exchange" when both words mean the same thing effectively, but the "swap" functions return a boolean, while the "exchange" functions return the old value. 
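The swap/exchange relationship described here can be sketched in portable C11 atomics. This is an illustrative analogue, not HotSpot's code, and the function names are invented for the example:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative stand-ins for the two Unsafe flavours discussed above. */
static _Atomic int cell;

/* "exchange" flavour: returns the value that was actually in memory. */
static int compare_and_exchange(int expected, int desired) {
    int witness = expected;
    /* on failure, the current value is written back into 'witness';
       on success, 'witness' is left equal to 'expected' */
    atomic_compare_exchange_strong(&cell, &witness, desired);
    return witness;
}

/* "set" flavour: returns whether the update happened. As noted above,
   it is expressible in terms of the exchange flavour. */
static bool compare_and_set(int expected, int desired) {
    return compare_and_exchange(expected, desired) == expected;
}
```

The boolean flavour carries strictly less information than the exchange flavour, which is why both sets exist; the renaming only disambiguates the words, not the semantics.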
So we changed "swap" to "set" across the APIs - _except_ for the old /jdk.unsupported/share/classes/sun/misc/Unsafe.java because we can't change its exported API for compatibility reasons. Given any "swap(exp, new)" function can be implemented as "exchange(exp, new) == exp" I'm not sure why we have two complete sets of functions all the way through. But I guess that is a different issue. :) > - weakCompareAndSwap* -> weakCompareAndSet*Plain > - weakCompareAndSwap*Volatile -> weakCompareAndSet* > > At this stage, only method and hotspot intrinsic names were changed; > node names were left as-is, and may be handled in a separate issue. Overall looks good for libs and hotspot changes. One nit I spotted: src/java.base/share/classes/java/util/concurrent/atomic/AtomicLong.java + * compareAndSwap for longs. While the intrinsic compareAndSetLong compareAndSwap should be compareAndSet --- All hotspot files need their copyright years updated to 2017 (if not already). As there are hotspot changes this must be pushed using JPRT and "-testset hotspot" (but your sponsor should know that :) ). Thanks, David > Ron From david.holmes at oracle.com Mon May 8 06:27:39 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 16:27:39 +1000 Subject: JDK10/RFR(L): 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on Solaris). In-Reply-To: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> References: <6801aebd-eb60-6711-2859-0096818007eb@oracle.com> Message-ID: <2fb17c2c-684a-267f-5e3b-ae7defeed874@oracle.com> Hi Patric, I have read the below and looked through the proposed changes. While I can't validate the details (as I am not familiar with chip capabilities) the overall approach looks good and I prefer the capability-based tests to the "family" based tests. As you note there are a few fixme's and follow ups to do, but one suggestion I have is to remove the UseV8InstrsOnly flag. It doesn't make sense to me to keep this if we will abort on V8 anyway. 
The nature of the flag, in an unsupported environment, precludes it from following our more usual deprecation process, and it is a develop flag anyway. I do have concerns about how this may work on Fujitsu, but hopefully there is plenty of bake-time in JDK 10 to shake out any issue. Thanks, David On 28/04/2017 11:48 PM, Patric Hedlin wrote: > Dear all, > > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8172231 > > Webrev: http://cr.openjdk.java.net/~neliasso/8172231/ > > > > 8172231: SPARC ISA/CPU feature detection is broken/insufficient (on > Solaris). > > Updating SPARC feature/capability detection (incorporating changes > from Martin Walsh). > More complete set of features as provided by 'getisax(2)' interface, > propagated via JVMCI. > More robust hardware probing for additional features (up to Core S4). > Removing support for old, pre Niagara, hardware. > Removing support for old, pre 11.1, Solaris. > > Changed behaviour: > Changing SPARC setup for AllocatePrefetchLines and > AllocateInstancePrefetchLines > such that they will (still) be doubled when cache-line size is small > (32 bytes), > but more moderately increased on new/contemporary hardware (inc >= > 50%). > Changing to default instruction fetch alignment based on derived > caps. instead > of relying on default/configuration values. > > The above changes also subsumes: > 8035146: assert(is_T_family(features) == is_niagara(features), > "Niagara should be T series") is incorrect > 8054979: Remove unnecessary defines in SPARC's > VM_Version::platform_features > > > Rationale: > > Current hardware detection on Solaris/SPARC is not up to date with > the "latest" (here, > meaning commercially available server solutions, i.e. T7/M7). To > facilitate improved > use of the new hardware features provided (by Core S3&S4) these > capabilities need to > be recognised by the JVM. > > NOTE: This update is limited to Core S3&S4, i.e. 
not including Core > S5. Proper Core S5 > support will be added when regular testing and benchmarking > resources are available, > i.e. regular testing needs to include M8 hardware. > > > Caveat: > > This update will introduce some redundancies into the code base, > features and definitions > currently not used, as well as a (small) number of FIXMEs, addressed > by subsequent bug or > feature updates/patches. Fujitsu HW is treated very conservatively. > > > Testing: > > Mostly tested on JDK9 (RBT/hotspot/comp). Only local testing on > JDK10 (jtreg/hotspot). > > > Benchmarking: > > Benchmark reports from a limited set of runs can be found at: > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jbb05 > > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.jvm08 > > > > http://aurora.se.oracle.com/performance/reporting/report/patric.hedlin.TvM.octane.plus > > > > (Limited availability of M7 hardware prevents complete suites/runs.) > > > Best regards, > Patric > From aph at redhat.com Mon May 8 08:20:12 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 8 May 2017 09:20:12 +0100 Subject: RFR: 8179701: AArch64: Reinstate FP as an allocatable register Message-ID: http://cr.openjdk.java.net/~aph/8179701/ OK? Andrew. From aph at redhat.com Mon May 8 09:26:38 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 8 May 2017 10:26:38 +0100 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: <0405303a-0333-1e76-d1b1-09607e2d73f2@redhat.com> On 05/05/17 23:54, White, Derek wrote: > - Unless ClearArray only zeroes from the first array element on - > then we could guess if base is 16-byte unaligned by looking at the > array header size. I think the heap is only HeapWord-aligned anyway. I hope that in the future there will be no hardware penalty for word-aligned STRD. In the meantime I'm not at all eager to put a load of special-purpose tweaks into the AArch64 port. Andrew.
From thomas.stuefe at gmail.com Mon May 8 09:29:28 2017 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 8 May 2017 11:29:28 +0200 Subject: Question about vfprintf hook VM argument Message-ID: Hi all, what exactly is the purpose of the FILE* argument in the vfprintf hook? We had - actually several times already - the problem that our VM was embedded by a customized launcher which used the vfprintf hook to redirect the VM output. If the launcher uses the FILE* handed over by the VM to write to, it must be linked against the same C-Runtime as the VM itself. This is not necessarily a given, especially on Windows: the launcher may link against the debug C-Runtime (compiled with /MDd) whereas the JDK is built with "/MD" and links against the release C-Runtime. Or the launcher may even have been linked statically against the C-Runtime. Or... In my opinion it is not a good idea to hand over C-Runtime internals - be it malloced memory or FILE* pointers - to other binaries which may have been built with different build options. But I do not even understand the point of passing FILE* to the hook? If the point of the hook is to give embedding code the ability to write to somewhere else, why even bother giving it *my* file pointer? Thanks & Kind Regards, Thomas From adinn at redhat.com Mon May 8 09:31:29 2017 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 8 May 2017 10:31:29 +0100 Subject: RFR: 8179701: AArch64: Reinstate FP as an allocatable register In-Reply-To: References: Message-ID: On 08/05/17 09:20, Andrew Haley wrote: > http://cr.openjdk.java.net/~aph/8179701/ > > OK? Yes, that needs to go back in. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No.
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From gromero at linux.vnet.ibm.com Mon May 8 14:21:51 2017 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 8 May 2017 11:21:51 -0300 Subject: [10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <4b26117f-508d-90ee-1ced-2a2c720a1047@oracle.com> References: <58C1AE06.9060609@linux.vnet.ibm.com> <58EEAF7B.6020708@linux.vnet.ibm.com> <59000AC0.7050507@linux.vnet.ibm.com> <5909DAAC.3070202@linux.vnet.ibm.com> <590CD5E7.10809@linux.vnet.ibm.com> <4b26117f-508d-90ee-1ced-2a2c720a1047@oracle.com> Message-ID: <59107EFF.9000805@linux.vnet.ibm.com> Hi David, Volker Thanks a lot reviewing and pushing the change! Regards, Gustavo On 07-05-2017 17:45, David Holmes wrote: > Hi Gustavo, > > On 6/05/2017 5:43 AM, Gustavo Romero wrote: >> Hi David, >> >> On 04-05-2017 21:32, David Holmes wrote: >>> Hi Volker, Gustavo, >>> >>> On 4/05/2017 12:34 AM, Volker Simonis wrote: >>>> Hi, >>>> >>>> I've reviewed Gustavo's change and I'm fine with the latest version at: >>>> >>>> http://cr.openjdk.java.net/~gromero/8175813/v3/ >>> >>> Nothing has really changed for me since I first looked at this - I don't know NUMA and I can't comment on any of the details. But no-one else has commented negatively so they are implicitly okay with >>> this, or else they should have spoken up. So with Volker as the Reviewer and myself as a second reviewer, I will sponsor this. I'll run the current patch through JPRT while awaiting the final version. >> >> Thanks a lot for reviewing and sponsoring the change. >> >> >>> One thing I was unclear on with all this numa code is the expectation regarding all those dynamically looked up functions - is it expected that we will have them all or else have none? It wasn't at >>> all obvious what would happen if we don't have those functions but still executed this code - assuming that is even possible. 
I guess I would have expected that no numa code would execute unless >>> -XX:+UseNUMA was set, in which case the VM would abort if any of the libnuma functions could not be found. That way we wouldn't need the null checks for the function pointers. >> >> If libnuma is not available in the system os::Linux::libnuma_init() will return >> false and JVM will refuse to enable the UseNUMA features instead of aborting: >> >> 4904 if (UseNUMA) { >> 4905 if (!Linux::libnuma_init()) { >> 4906 UseNUMA = false; >> 4907 } else { >> >> I understand those null checks as part of the initial design of JVM numa api to >> enforce protection against the usage of its methods in other parts of the code >> when JVM api failed to initialize properly, even though it's expected that >> UseNUMA = false should suffice to protect such usages. > > Ok. Seems like they should be asserts rather than runtime checks if all the paths are properly guarded by UseNUMA - but that isn't your problem. > >> That said, I could not find any recent Linux distribution that does not support >> libnuma v2 api (and so also v1 api). On Ubuntu it will be installed as a >> dependency of metapackage ubuntu-standard and because that requires "irqbalance" >> it also requires libnuma. Libnuma was updated from libnuma v1 to v2 >> around mid 2008: > > Thanks for the additional info. > >> numactl (2.0.1-1) unstable; urgency=low >> >> * New upstream >> * patches/static-lib.patch: update >> * debian/watch: update to new SGI location >> >> -- Ian Wienand Sat, 07 Jun 2008 14:18:22 -0700 >> >> numactl (1.0.2-1) unstable; urgency=low >> >> * New upstream >> * Closes: #442690 -- Add to rules a hack to remove libnuma.a after >> unpatching >> * Update README.debian >> >> >> -- Ian Wienand Wed, 03 Oct 2007 21:49:27 +1000 >> >> >> It's similar on RHEL, where "irqbalance" is in core group.
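The fail-soft initialization quoted above (libnuma_init() returning false so that UseNUMA is simply switched off rather than aborting the VM) can be sketched in isolation roughly as follows. The helper names and the particular symbols looked up are illustrative, not the actual os::Linux::libnuma_init() code:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*numa_available_fn)(void);
typedef int (*numa_max_node_fn)(void);

static numa_available_fn p_numa_available;
static numa_max_node_fn  p_numa_max_node;

/* Returns false if the library or any required symbol is absent, in
   which case the caller turns the NUMA feature off instead of aborting. */
static bool libnuma_init(void) {
    void *handle = dlopen("libnuma.so.1", RTLD_LAZY);
    if (handle == NULL) {
        return false;
    }
    p_numa_available = (numa_available_fn)dlsym(handle, "numa_available");
    p_numa_max_node  = (numa_max_node_fn)dlsym(handle, "numa_max_node");
    if (p_numa_available == NULL || p_numa_max_node == NULL) {
        return false;
    }
    /* libnuma itself may report the system as unusable */
    return p_numa_available() != -1;
}
```

With this shape, per-call NULL checks on the function pointers are redundant as long as every caller is guarded by the feature flag, which is why asserts were suggested instead of runtime checks.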
Regarding >> the libnuma version it was also updated in 2008 to v2, so since >> Fedora 11 contains v2, hence RHEL 6 and RHEL 7 contains it: >> >> * Wed Feb 25 2009 Fedora Release Engineering - 2.0.2-3 >> - Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild >> >> * Mon Sep 29 2008 Neil Horman - 2.0.2-2 >> - Fix build break due to register selection in asm >> >> * Mon Sep 29 2008 Neil Horman - 2.0.2-1 >> - Update rawhide to version 2.0.2 of numactl >> >> * Fri Apr 25 2008 Neil Horman - 1.0.2-6 >> - Fix buffer size passing and arg sanity check for physcpubind (bz 442521) >> >> >> Also, the last release of libnuma v1 dates back to 2008: >> https://github.com/numactl/numactl/releases/tag/v1.0.2 >> >> So it looks like libnuma v2 absence on Linux is by now uncommon. >> >> >>> Style nits: >>> - we should avoid implicit booleans, so the isnode_in_* functions should return bool not int; and check "distance != 0" >>> - spaces around operators eg. node=0 should be node = 0 >> >> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v4/ > > Looks good. Changes being pushed now. > > David > ----- > >> >> Thank you and best regards, >> Gustavo >> >>> Thanks, >>> David >>> >>>> Can somebody please sponsor the change? >>>> >>>> Thank you and best regards, >>>> Volker >>>> >>>> >>>> On Wed, May 3, 2017 at 3:27 PM, Gustavo Romero >>>> wrote: >>>>> Hi community, >>>>> >>>>> I understand that there is nothing that can be done additionally regarding this >>>>> issue, at this point, on the PPC64 side. >>>>> >>>>> It's a change in the shared code - but that in effect does not change anything in >>>>> the numa detection mechanism for other platforms - and hence it's necessary a >>>>> conjoint community effort to review the change and a sponsor to run it against >>>>> the JPRT. 
>>>>> >>>>> I know it's a stabilizing moment of OpenJDK 9, but since that issue is of >>>>> great concern on PPC64 (specially on POWER8 machines) I would be very glad if >>>>> the community could point out directions on how that change could move on. >>>>> >>>>> Thank you! >>>>> >>>>> Best regards, >>>>> Gustavo >>>>> >>>>> On 25-04-2017 23:49, Gustavo Romero wrote: >>>>>> Dear Volker, >>>>>> >>>>>> On 24-04-2017 14:08, Volker Simonis wrote: >>>>>>> Hi Gustavo, >>>>>>> >>>>>>> thanks for addressing this problem and sorry for my late reply. I >>>>>>> think this is a good change which definitely improves the situation >>>>>>> for uncommon NUMA configurations without changing the handling for >>>>>>> common topologies. >>>>>> >>>>>> Thanks a lot for reviewing the change! >>>>>> >>>>>> >>>>>>> It would be great if somebody could run this trough JPRT, but as >>>>>>> Gustavo mentioned, I don't expect any regressions. >>>>>>> >>>>>>> @Igor: I think you've been the original author of the NUMA-aware >>>>>>> allocator port to Linux (i.e. "6684395: Port NUMA-aware allocator to >>>>>>> linux"). If you could find some spare minutes to take a look at this >>>>>>> change, your comment would be very much appreciated :) >>>>>>> >>>>>>> Following some minor comments from me: >>>>>>> >>>>>>> - in os::numa_get_groups_num() you now use numa_num_configured_nodes() >>>>>>> to get the actual number of configured nodes. This is good and >>>>>>> certainly an improvement over the previous implementation. However, >>>>>>> the man page for numa_num_configured_nodes() mentions that the >>>>>>> returned count may contain currently disabled nodes. Do we currently >>>>>>> handle disabled nodes? What will be the consequence if we would use >>>>>>> such a disabled node (e.g. mbind() warnings)? 
>>>>>> >>>>>> In [1] 'numa_memnode_ptr' is set to keep a list of *just nodes with memory in >>>>>> found in /sys/devices/system/node/* Hence numa_num_configured_nodes() just >>>>>> returns the number of nodes in 'numa_memnode_ptr' [2], thus just returns the >>>>>> number of nodes with memory in the system. To the best of my knowledge there is >>>>>> no system configuration on Linux/PPC64 that could match such a notion of >>>>>> "disabled nodes" as it appears in libnuma's manual. If it is enabled, it's in >>>>>> that dir and just the ones with memory will be taken into account. If it's >>>>>> disabled (somehow), it's not in the dir, so won't be taken into account (i.e. no >>>>>> mbind() tried against it). >>>>>> >>>>>> On Power it's possible to have a numa node without memory (memory-less node, a >>>>>> case covered in this change), a numa node without cpus at all but with memory >>>>>> (a configured node anyway, so a case already covered) but to disable a specific >>>>>> numa node so it does not appear in /sys/devices/system/node/* it's only possible >>>>>> from the inners of the control module. Or other rare condition not invisible / >>>>>> adjustable from the OS. Also I'm not aware of a case where a node is in this >>>>>> dir but is at the same time flagged as something like "disabled". There are >>>>>> cpu/memory hotplugs, but that does not change numa nodes status AFAIK. >>>>>> >>>>>> [1] https://github.com/numactl/numactl/blob/master/libnuma.c#L334-L347 >>>>>> [2] https://github.com/numactl/numactl/blob/master/libnuma.c#L614-L618 >>>>>> >>>>>> >>>>>>> - the same question applies to the usage of >>>>>>> Linux::isnode_in_configured_nodes() within os::numa_get_leaf_groups(). >>>>>>> Does isnode_in_configured_nodes() (i.e. the node set defined by >>>>>>> 'numa_all_nodes_ptr' take into account the disabled nodes or not? Can >>>>>>> this be a potential problem (i.e. if we use a disabled node). 
>>>>>> >>>>>> On the meaning of "disabled nodes", it's the same case as above, so to the >>>>>> best of knowledge it's not a potential problem. >>>>>> >>>>>> Anyway 'numa_all_nodes_ptr' just includes the configured nodes (with memory), >>>>>> i.e. "all nodes on which the calling task may allocate memory". It's exactly >>>>>> the same pointer returned by numa_get_membind() v2 [3] which: >>>>>> >>>>>> "returns the mask of nodes from which memory can currently be allocated" >>>>>> >>>>>> and that is used, for example, in "numactl --show" to show nodes from where >>>>>> memory can be allocated [4, 5]. >>>>>> >>>>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L1147 >>>>>> [4] https://github.com/numactl/numactl/blob/master/numactl.c#L144 >>>>>> [5] https://github.com/numactl/numactl/blob/master/numactl.c#L177 >>>>>> >>>>>> >>>>>>> - I'd like to suggest renaming the 'index' part of the following >>>>>>> variables and functions to 'nindex' ('node_index' is probably to long) >>>>>>> in the following code, to emphasize that we have node indexes pointing >>>>>>> to actual, not always consecutive node numbers: >>>>>>> >>>>>>> 2879 // Create an index -> node mapping, since nodes are not >>>>>>> always consecutive >>>>>>> 2880 _index_to_node = new (ResourceObj::C_HEAP, mtInternal) >>>>>>> GrowableArray(0, true); >>>>>>> 2881 rebuild_index_to_node_map(); >>>>>> >>>>>> Simple change but much better to read indeed. Done. >>>>>> >>>>>> >>>>>>> - can you please wrap the following one-line else statement into curly >>>>>>> braces (it's more readable and we usually do it that way in HotSpot >>>>>>> although there are no formal style guidelines :) >>>>>>> >>>>>>> 2953 } else >>>>>>> 2954 // Current node is already a configured node. >>>>>>> 2955 closest_node = index_to_node()->at(i); >>>>>> >>>>>> Done. 
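The "index -> node" mapping mentioned in the review amounts to compacting possibly sparse node ids into a dense array. A minimal standalone sketch, with a hypothetical predicate standing in for "is a configured node" and the 0/1/16/17 topology from this thread as the demo:

```c
#include <assert.h>
#include <stdbool.h>

/* Build a dense index -> node map, since node ids need not be
   consecutive. Returns the number of configured nodes in 0..highest_node. */
static int build_index_to_node(int highest_node,
                               bool (*is_configured)(int),
                               int *index_to_node) {
    int count = 0;
    for (int node = 0; node <= highest_node; node++) {
        if (is_configured(node)) {
            index_to_node[count++] = node;
        }
    }
    return count;
}

/* Demo predicate: in the topology discussed in this thread only nodes
   0 and 16 have memory; nodes 1 and 17 are memory-less. */
static bool demo_is_configured(int node) {
    return node == 0 || node == 16;
}
```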
>>>>>> >>>>>> >>>>>>> - in os::Linux::rebuild_cpu_to_node_map(), if you set >>>>>>> 'closest_distance' to INT_MAX at the beginning of the loop, you can >>>>>>> later avoid the check for '|| !closest_distance'. Also, according to >>>>>>> the man page, numa_distance() returns 0 if it can not determine the >>>>>>> distance. So with the above change, the condition on line 2974 should >>>>>>> read: >>>>>>> >>>>>>> 2947 if (distance && distance < closest_distance) { >>>>>>> >>>>>> >>>>>> Sure, much better to set the initial condition as distant as possible and >>>>>> adjust to a closer one bit by bit improving the if condition. Done. >>>>>> >>>>>> >>>>>>> Finally, and not directly related to your change, I'd suggest the >>>>>>> following clean-ups: >>>>>>> >>>>>>> - remove the usage of 'NCPUS = 32768' in >>>>>>> os::Linux::rebuild_cpu_to_node_map(). The comment on that line is >>>>>>> unclear to me and probably related to an older version/problem of >>>>>>> libnuma? I think we should simply use >>>>>>> numa_allocate_cpumask()/numa_free_cpumask() instead. >>>>>>> >>>>>>> - we still use the NUMA version 1 function prototypes (e.g. >>>>>>> "numa_node_to_cpus(int node, unsigned long *buffer, int buffer_len)" >>>>>>> instead of "numa_node_to_cpus(int node, struct bitmask *mask)", but >>>>>>> also "numa_interleave_memory()" and maybe others). I think we should >>>>>>> switch all prototypes to the new NUMA version 2 interface which you've >>>>>>> already used for the new functions which you've added. >>>>>> >>>>>> I agree. Could I open a new bug to address these clean-ups? >>>>>> >>>>>> >>>>>>> That said, I think these changes all require libnuma 2.0 (see >>>>>>> os::Linux::libnuma_dlsym). So before starting this, you should make >>>>>>> sure that libnuma 2.0 is available on all platforms to which you'd >>>>>>> like to down-port this change. For jdk10 we could definitely do it, >>>>>>> for jdk9 probably also, for jdk8 I'm not so sure. 
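The suggestion above (start closest_distance at INT_MAX and skip candidates where numa_distance() reports 0, meaning "unknown") shapes the nearest-node search roughly as follows. This is a simplified sketch with hypothetical names, using distances modeled on the four-node example from this thread:

```c
#include <assert.h>
#include <limits.h>

/* Pick the configured node closest to 'from_node'. A distance of 0 means
   the distance could not be determined, so such candidates are skipped. */
static int closest_configured_node(int from_node,
                                   const int *configured, int n_configured,
                                   int (*distance)(int, int)) {
    int closest_node = -1;
    int closest_distance = INT_MAX;
    for (int i = 0; i < n_configured; i++) {
        int d = distance(from_node, configured[i]);
        if (d && d < closest_distance) {
            closest_distance = d;
            closest_node = configured[i];
        }
    }
    return closest_node;
}

/* Toy distance function: from memory-less node 1, node 0 is at distance
   20 and node 16 at distance 40; 0 encodes "unknown" everywhere else. */
static int demo_distance(int from, int to) {
    if (from == 1 && to == 0)  return 20;
    if (from == 1 && to == 16) return 40;
    return 0;
}
```

A CPU on memory-less node 1 is thus mapped to node 0, the nearest node that actually has memory.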
>>>>>> >>>>>> libnuma v1 last release dates back to 2008, but any idea how could I check that >>>>>> for sure since it's on shared code? >>>>>> >>>>>> new webrev: http://cr.openjdk.java.net/~gromero/8175813/v3/ >>>>>> >>>>>> Thank you! >>>>>> >>>>>> Best regards, >>>>>> Gustavo >>>>>> >>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>>> >>>>>>> On Thu, Apr 13, 2017 at 12:51 AM, Gustavo Romero >>>>>>> wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Any update on it? >>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Gustavo >>>>>>>> >>>>>>>> On 09-03-2017 16:33, Gustavo Romero wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Could the following webrev be reviewed please? >>>>>>>>> >>>>>>>>> It improves the numa node detection when non-consecutive or memory-less nodes >>>>>>>>> exist in the system. >>>>>>>>> >>>>>>>>> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/ >>>>>>>>> bug : https://bugs.openjdk.java.net/browse/JDK-8175813 >>>>>>>>> >>>>>>>>> Currently, although no problem exists when the JVM detects numa nodes that are >>>>>>>>> consecutive and have memory, for example in a numa topology like: >>>>>>>>> >>>>>>>>> available: 2 nodes (0-1) >>>>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>>>> node 0 size: 65258 MB >>>>>>>>> node 0 free: 34 MB >>>>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>>>> node 1 size: 65320 MB >>>>>>>>> node 1 free: 150 MB >>>>>>>>> node distances: >>>>>>>>> node 0 1 >>>>>>>>> 0: 10 20 >>>>>>>>> 1: 20 10, >>>>>>>>> >>>>>>>>> it fails on detecting numa nodes to be used in the Parallel GC in a numa >>>>>>>>> topology like: >>>>>>>>> >>>>>>>>> available: 4 nodes (0-1,16-17) >>>>>>>>> node 0 cpus: 0 8 16 24 32 >>>>>>>>> node 0 size: 130706 MB >>>>>>>>> node 0 free: 7729 MB >>>>>>>>> node 1 cpus: 40 48 56 64 72 >>>>>>>>> node 1 size: 0 MB >>>>>>>>> node 1 free: 0 MB >>>>>>>>> node 16 cpus: 80 88 96 104 112 >>>>>>>>> node 16 size: 130630 MB >>>>>>>>> node 16 free: 5282 MB >>>>>>>>> node 17 cpus: 120 128 136 144 152 >>>>>>>>> node 17 size: 0 MB >>>>>>>>> node 
17 free: 0 MB >>>>>>>>> node distances: >>>>>>>>> node 0 1 16 17 >>>>>>>>> 0: 10 20 40 40 >>>>>>>>> 1: 20 10 40 40 >>>>>>>>> 16: 40 40 10 20 >>>>>>>>> 17: 40 40 20 10, >>>>>>>>> >>>>>>>>> where node 16 is not consecutive in relation to 1 and also nodes 1 and 17 have >>>>>>>>> no memory. >>>>>>>>> >>>>>>>>> If a topology like that exists, os::numa_make_local() will receive a local group >>>>>>>>> id as a hint that is not available in the system to be bound (it will receive >>>>>>>>> all nodes from 0 to 17), causing a proliferation of "mbind: Invalid argument" >>>>>>>>> messages: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log >>>>>>>>> >>>>>>>>> That change improves the detection by making the JVM numa API aware of the >>>>>>>>> existence of numa nodes that are non-consecutive from 0 to the highest node >>>>>>>>> number and also of nodes that might be memory-less nodes, i.e. that might not >>>>>>>>> be, in libnuma terms, a configured node. Hence just the configured nodes will >>>>>>>>> be available: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log >>>>>>>>> >>>>>>>>> The change has no effect on numa topologies were the problem does not occur, >>>>>>>>> i.e. no change in the number of nodes and no change in the cpu to node map. On >>>>>>>>> numa topologies where memory-less nodes exist (like in the last example above), >>>>>>>>> cpus from a memory-less node won't be able to bind locally so they are mapped >>>>>>>>> to the closest node, otherwise they would be not associate to any node and >>>>>>>>> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising the >>>>>>>>> performance. 
>>>>>>>>> >>>>>>>>> I found no regressions on x64 for the following numa topology: >>>>>>>>> >>>>>>>>> available: 2 nodes (0-1) >>>>>>>>> node 0 cpus: 0 1 2 3 8 9 10 11 >>>>>>>>> node 0 size: 24102 MB >>>>>>>>> node 0 free: 19806 MB >>>>>>>>> node 1 cpus: 4 5 6 7 12 13 14 15 >>>>>>>>> node 1 size: 24190 MB >>>>>>>>> node 1 free: 21951 MB >>>>>>>>> node distances: >>>>>>>>> node 0 1 >>>>>>>>> 0: 10 21 >>>>>>>>> 1: 21 10 >>>>>>>>> >>>>>>>>> I understand that fixing the current numa detection is a prerequisite to enable >>>>>>>>> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2]. >>>>>>>>> >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Gustavo >>>>>>>>> >>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate) >>>>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation) >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> >> > From rkennke at redhat.com Mon May 8 14:43:25 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 8 May 2017 16:43:25 +0200 Subject: RFR: JDK-8167659: Access of mark word should use oopDesc::mark_offset_in_bytes() instead of '0' In-Reply-To: <23550b76-2f10-bb4f-315d-4b2137ad796e@redhat.com> References: <23550b76-2f10-bb4f-315d-4b2137ad796e@redhat.com> Message-ID: Ping? > I posted this before: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-October/024889.html > > And just realized it's not been fixed yet. Maybe now would be a good > time to include it? > > I believe this changeset addresses all the issues mentioned in the above > discussions. > > http://cr.openjdk.java.net/~rkennke/8167659/webrev.00/ > > > Testing: jcstress -m sanity, specjvm > (cannot currently run jtreg tests because I've no idea how to run the > jcstress-jtreg tests..)
> > Roman > > From Derek.White at cavium.com Mon May 8 15:56:00 2017 From: Derek.White at cavium.com (White, Derek) Date: Mon, 8 May 2017 15:56:00 +0000 Subject: RFR: 8179444: AArch64: Put zero_words on a diet In-Reply-To: References: Message-ID: Hi Andrew, Looks OK then. If I find a major performance issue on misaligned stp I'll propose a fix separately. In some HW I've seen a up to a 2x performance penalty on unaligned destinations in some memcpy-like code. - Derek -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Sunday, May 07, 2017 4:41 AM To: White, Derek ; Roland Westrelin ; hotspot-dev Source Developers Subject: Re: RFR: 8179444: AArch64: Put zero_words on a diet Hi, On 05/05/17 23:54, White, Derek wrote: > src/cpu/aarch64/vm/macroAssembler_aarch64.cpp: > - zero_words(reg, reg): > - Comment mentions that it's used by the ClearArray pattern, but > it's also used by generate_fill(). OK, but there's no contradiction there. The comment just explains why zero_words must be small. > - The old zero_words(), via block_zero() and fill_words(), would > align base to 16-bytes before doing stps, the new code doesn't. It may > be worth conditionally pre-aligning if AvoidUnalignedAcesses is true. > And could conditionally remove pre-aligning in zero_blocks. But that'd bloat zero_words, surely. And I haven't see the new code running any slower on any hardware, > Line 4999: This is pre-existing, but it's confusing - could you rename > "ShortArraySize" to "SmallArraySize"? OK. > - zero_words(reg, int): > - I don't see a great way to handle unaligned accesses without > blowing up the code size. So no real request here. > - It looks like worst case is 15 unaligned stp instructions. > - With some benchmarking someday I might argue for calling this > constant count version for smaller copies. I don't understand this. 
> - Unless ClearArray only zeroes from the first array element on - > then we could guess if base is 16-byte unaligned by looking at the > array header size. > > - zero_dcache_blocks(): > - Suggest a new comment that mentions that it's only called from > zero_blocks()? Or if it's meant to be more general then add comments > about the requirements for base alignment, and that cnt has to be >= > 2*zva_length. OK. > - zero_blocks DOES align base to 16-bytes, so we don't need to check > here? Or make it a runtime assert? I did wonder about that, but left it in as a safety net because its cost is immeasurably small. Thanks for such a detailed review. Andrew. From david.holmes at oracle.com Mon May 8 21:25:33 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 May 2017 07:25:33 +1000 Subject: Question about vfprintf hook VM argument In-Reply-To: References: Message-ID: Hi Thomas, On 8/05/2017 7:29 PM, Thomas Stüfe wrote: > Hi all, > > what exactly is the purpose of the FILE* argument in the vfprintf hook? I see your point. :) The vfprintf hook is a replacement vfprintf function to be called from jio_vfprintf: int jio_vfprintf(FILE* f, const char *fmt, va_list args) { if (Arguments::vfprintf_hook() != NULL) { return Arguments::vfprintf_hook()(f, fmt, args); } else { return vfprintf(f, fmt, args); } } so whatever gets passed to jio_vfprintf gets passed through to the hook. But ... > We had - actually several times already - the problem that our VM was > embedded by a customized launcher which used the vfprintf hook to redirect > the VM output. If the launcher uses the FILE* handed over by the VM to > write to, it must be linked against the same C-Runtime as the VM itself. > This is not necessarily a given, especially on Windows: the launcher may > link against the debug C-Runtime (compiled with /MDd) whereas the JDK is > built with "/MD" and links against the release C-Runtime. Or the launcher > may even have been linked statically against the C-Runtime. Or...
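A launcher-side hook of the kind Thomas describes would ignore the FILE* the VM passes in and write only to a stream the launcher itself owns, sidestepping the mixed C-runtime problem. A minimal sketch (per the JNI invocation API, the hook is installed through a JavaVMOption whose optionString is "vfprintf" and whose extraInfo points at the function; the names here are hypothetical):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stream owned by the launcher; all VM output is redirected here. */
static FILE *launcher_log;

/* Signature expected for the vfprintf hook. The FILE* the VM passes in
   is deliberately unused: dereferencing it would require the launcher
   and the VM to share one C runtime. */
static int launcher_vfprintf_hook(FILE *vm_stream, const char *format,
                                  va_list args) {
    (void)vm_stream;
    return vfprintf(launcher_log != NULL ? launcher_log : stderr,
                    format, args);
}

/* Small variadic wrapper so the hook can be exercised directly. */
static int call_hook(FILE *stream, const char *format, ...) {
    va_list args;
    va_start(args, format);
    int n = launcher_vfprintf_hook(stream, format, args);
    va_end(args);
    return n;
}
```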
> > In my opinion it is not a good idea to hand over C-Runtime internals - be > it malloced memory or FILE* pointers - to other binaries which may have > been built with different build options. But I do not even understand the > point of passing FILE* to the hook? If the point of the hook is to give > embedding code the ability to write to somewhere else, why even bother > giving it *my* file pointer? ... I confess I had no idea why this vfprintf hook exists, but this somewhat explains it: https://bugs.openjdk.java.net/browse/JDK-4015550 and yes it does suggest that although the FILE* is passed in, the expectation is that the function will actually write somewhere else. IIUC the intent was to allow fd's 0,1 and 2 to be re-mapped by the hook to match whatever the embedded app had changed System.out/err to. But as fd's were per-dll they couldn't pass through the fd so they passed through the FILE*. But how they expected that to be mapped to stdout/stderr I have no idea. Cheers, David > Thanks & Kind Regards, Thomas > From david.holmes at oracle.com Mon May 8 21:28:32 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 May 2017 07:28:32 +1000 Subject: RFR: 8179701: AArch64: Reinstate FP as an allocatable register In-Reply-To: References: Message-ID: <12a122c9-f316-fb48-6b7a-38190deb4e2b@oracle.com> I'll add the Reviewer stamp based on Andrew D.'s review. David On 8/05/2017 7:31 PM, Andrew Dinn wrote: > On 08/05/17 09:20, Andrew Haley wrote: >> http://cr.openjdk.java.net/~aph/8179701/ >> >> OK? > > Yes, that needs to go back in. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From paul.sandoz at oracle.com Mon May 8 21:43:56 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 8 May 2017 14:43:56 -0700 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods In-Reply-To: <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> References: <590CD221.4080100@oracle.com> <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> Message-ID: <06957B86-7C6A-4F9B-AF23-327DAC46F0CE@oracle.com> > On 7 May 2017, at 22:30, David Holmes wrote: > > Hi Ron, > > On 6/05/2017 5:27 AM, Ron Pressler wrote: >> Hi, >> Please review the following core/hotspot change: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8159995 >> core webrev: >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-jdk/webrev/ >> >> hotspot webrev: >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-hotspot/webrev/ >> >> >> This change is covered by existing tests. >> >> The following renaming was applied: >> >> - compareAndExchange*Volatile -> compareAndExchange* >> - compareAndSwap* -> compareAndSet* > > So to clarify this for others, there was confusion surrounding the use of "swap" versus "exchange" when both words mean the same thing effectively, but the "swap" functions return a boolean, while the "exchange" functions return the old value. So we changed "swap" to "set" across the APIs - _except_ for the old /jdk.unsupported/share/classes/sun/misc/Unsafe.java because we can't change its exported API for compatibility reasons. > > Given any "swap(exp, new)" function can be implemented as "exchange(exp, new) == exp" I'm not sure why we have two complete sets of functions all the way through. But I guess that is a different issue. :) > Yes, it might be possible after some careful performance analysis (we might run into some subtle issues). 
>> - weakCompareAndSwap* -> weakCompareAndSet*Plain >> - weakCompareAndSwap*Volatile -> weakCompareAndSet* >> >> At this stage, only method and hotspot intrinsic names were changed; >> node names were left as-is, and may be handled in a separate issue. > > Overall looks good for libs and hotspot changes. > > One nit I spotted: > > src/java.base/share/classes/java/util/concurrent/atomic/AtomicLong.java > > + * compareAndSwap for longs. While the intrinsic compareAndSetLong > > compareAndSwap should be compareAndSet > > --- > > All hotspot files need their copyright years updated to 2017 (if not already). > > As there are hotspot changes this must be pushed using JPRT and "-testset hotspot" (but your sponsor should know that :) ). > I do :-) Paul. From david.holmes at oracle.com Mon May 8 23:50:57 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 May 2017 09:50:57 +1000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms In-Reply-To: References: Message-ID: Hi John, Responding back on the mailing lists. There are people on the mailing lists who are in a better position to evaluate the merits of the proposal. I searched the bug database and could not see this issue being raised in the past. On 9/05/2017 8:46 AM, John Platts wrote: > The real reasons to add UTF-16 versions of these APIs is the following: > > * The arguments passed into the wmain and wWinMain functions use > UTF-16-encoded strings instead of UTF-8 strings > * The arguments passed into the main and WinMain functions on > Windows-platforms are in the ANSI character encoding instead of the > UTF-8 character encoding > * The NewString and GetStringChars APIs in the JNI already use > UTF-16-encoded strings Yes you are right the String functions already support UTF-16 as that is the format for char[] and so java.lang.String. 
> * Unicode APIs on Windows normally use UTF-16-encoded strings > * The C11 and C++11 standards support UTF-16 strings through the > char16_t type and support for UTF-16 character literals with a u prefix Thanks for the additional input. David > > ------------------------------------------------------------------------ > *From:* David Holmes > *Sent:* Sunday, May 7, 2017 7:47 PM > *To:* John Platts > *Cc:* hotspot-dev developers; core-libs-dev Libs > *Subject:* Re: Add support for Unicode versions of JNI_CreateJavaVM and > JNI_GetDefaultJavaVMInitArgs on Windows platforms > > Added back jdk10-dev as a bcc. > > Added hotspot-dev and core-libs-dev (for launcher) for follow up > discussions. > > Hi John, > > On 8/05/2017 10:33 AM, John Platts wrote: >> I actually did a search through the code that implements >> JNI_CreateJavaVM, and I found that the conversion of the strings is done >> using java_lang_String::create_from_platform_dependent_str, which >> converts from the platform-default encoding to Unicode. In the case of >> Windows-based platforms, the conversion is done based on the ANSI >> character encoding instead of UTF-8 or Modified UTF-8. >> >> >> The platform encoding detection logic on Windows is implemented >> java_props_md.c, which can be found at >> jdk/src/windows/native/java/lang/java_props_md.c in releases prior to >> JDK 9 and at src/java.base/windows/native/libjava/java_props_md.c in JDK >> 9 and later. The encoding used for command-line arguments passed into >> the JNI invocation API is Cp1252 for English locales on Windows >> platforms, and not Modified UTF-8 or UTF-8. >> >> >> The documentation found >> at http://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html > also > The Invocation API - Oracle > > docs.oracle.com > The Invocation API allows software vendors to load the Java VM into an > arbitrary native application. Vendors can deliver Java-enabled > applications without having to ... 
> > > >> states that the strings passed into JNI_CreateJavaVM are in the >> platform-default encoding. > > Thanks for the additional details. I assume you are referring to: > > typedef struct JavaVMOption { > char *optionString; /* the option as a string in the default > platform encoding */ > > that comment should not form part of the specification as it is > non-normative text. If the intent is truly to use the platform default > encoding and not UTF-8 then that should be very clearly spelt out in the > spec! > > That said, the implementation is following this so it is a limitation. I > suspect this is historical. > >> A version of JNI_CreateJavaVM that takes UTF-16-encoded strings should >> be added to the JNI Invocation API. The java.exe launchers and javaw.exe >> launchers should also be updated to use the UTF-16 version of the >> JNI_CreateJavaVM function on Windows platforms and to use wmain and >> wWinMain instead of main and WinMain. > > Why versions for UTF-16 instead of the missing UTF-8 variants? As I said > the whole spec is intended to be based around UTF-8 so we would not want > to throw in just a couple of UTF-16 based usages. > > Thanks, > David > >> >> A few files in HotSpot would need to be changed in order to implement >> the UTF-16 version of JNI_CreateJavaVM, but the change would improve >> consistency across different locales on Windows platforms and allow >> arguments that contain Unicode characters that are not available in the >> platform-default encoding to be passed into the JVM on the command line. >> >> >> The UTF-16-based version of JNI_CreateJavaVM also makes it easier to >> allocate string objects that contain non-ASCII characters as the strings >> are already in UTF-16 format, at least in cases where the strings >> contain Unicode characters that are not in Latin-1 or on VMs that do not >> support compact Latin-1 strings. 
>> >> >> The UTF-16-based version of JNI_CreateJavaVM should probably be >> implemented as a separate function so that the solution could be >> backported to JDK 8 and JDK 9 updates and so that backwards >> compatibility with the current JNI_CreateJavaVM implementation is >> maintained. >> >> >> Here is what the new UTF-16-based API might look like: >> >> typedef struct JavaVMInitArgs_UTF16 { >> jint version; >> jint nOptions; >> JavaVMOptionUTF16 *options; >> jboolean ignoreUnrecognized; >> } JavaVMInitArgsUTF16; >> >> >> typedef struct JavaVMOption_UTF16 { >> char *optionString; /* the option as a string in UTF-16 encoding */ >> void *extraInfo; >> } JavaVMOptionUTF16; >> >> /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */ >> >> jint JNI_CreateJavaVM_UTF16(JavaVM **p_vm, void **p_env, void *vm_args); >> >> >> /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */ >> >> jint JNI_GetDefaultJavaVMInitArgs_UTF16(void *vm_args); >> >> ------------------------------------------------------------------------ >> *From:* David Holmes >> *Sent:* Thursday, May 4, 2017 11:07 PM >> *To:* John Platts; jdk10-dev at openjdk.java.net >> *Subject:* Re: Add support for Unicode versions of JNI_CreateJavaVM and >> JNI_GetDefaultJavaVMInitArgs on Windows platforms >> >> Hi John, >> >> The JNI is defined to use Modified UTF-8 format for strings, so any >> Unicode character should be handled if passed in in the right format. >> Updating the JNI specification and implementation to accept UTF-16 >> directly would be a major undertaking. >> >> Is the issue here that you want a tool, like the java launcher, to >> accept arbitrary Unicode strings in an end-user friendly manner and then >> have it perform the modified UTF-8 conversion when invoking the VM? >> >> Can you give a concrete example of what you would like to be able to >> pass as arguments to the JVM? 
>> >> Thanks, >> David >> >> On 5/05/2017 1:04 PM, John Platts wrote: >>> The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI invocation API expect ANSI strings on Windows platforms instead of Unicode-encoded strings. This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in >> the ANSI encoding on Windows platforms. >>> >>> >>> There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. >>> >>> >>> jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the >> char16_t type. This change will make it possible to define jchar >> character and string literals on Windows platforms and on non-Windows >> platforms that support the C11 or C++11 standard. 
>>> >>> >>> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: >>> >>> #define JCHAR_LITERAL(x) L ## x >>> >>> >>> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: >>> >>> #define JCHAR_LITERAL(x) u ## x >>> >>> >>> Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: >>> >>> typedef struct JavaVMUnicodeOption { >>> const jchar *optionString; /* the option as a string in UTF-16 encoding */ >>> void *extraInfo; >>> } JavaVMUnicodeOption; >>> >>> typedef struct JavaVMUnicodeInitArgs { >>> jint version; >>> jint nOptions; >>> JavaVMUnicodeOption *options; >>> jboolean ignoreUnrecognized; >>> } JavaVMUnicodeInitArgs; >>> >>> jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); >>> jint JNI_GetDefaultJavaVMInitArgs(void *args); >>> >>> The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead of WinMain on Windows platforms. This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be >> launched with arguments that contain Unicode characters that are not in >> the platform-default encoding. >>> >>> All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments >> that are passed to the JVM. 
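[Editor's note: whatever shape the proposed API takes, the conversion an embedder must perform today - UTF-16 command-line arguments down to the "modified UTF-8" that JNI strings use - is mechanical. A minimal sketch follows; the helper name is made up and this is not part of any JDK or proposed API. Modified UTF-8 differs from standard UTF-8 in that U+0000 becomes the two bytes 0xC0 0x80 and supplementary characters stay as surrogate pairs, each surrogate encoded as a 3-byte sequence, so 4-byte forms never appear.]

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a JDK API: encode UTF-16 code units into
 * JNI "modified UTF-8".  The caller must size 'out' for the worst case
 * (3 bytes per input code unit).  Returns the number of bytes written. */
static size_t utf16_to_modified_utf8(const uint16_t *in, size_t in_len,
                                     uint8_t *out) {
    size_t o = 0;
    for (size_t i = 0; i < in_len; i++) {
        uint16_t c = in[i];
        if (c != 0 && c < 0x80) {
            out[o++] = (uint8_t)c;                    /* plain ASCII */
        } else if (c < 0x800) {
            /* includes c == 0, which becomes 0xC0 0x80 */
            out[o++] = (uint8_t)(0xC0 | (c >> 6));
            out[o++] = (uint8_t)(0x80 | (c & 0x3F));
        } else {
            /* BMP character, or one half of a surrogate pair:
             * surrogates are deliberately kept as 3-byte sequences */
            out[o++] = (uint8_t)(0xE0 | (c >> 12));
            out[o++] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (c & 0x3F));
        }
    }
    return o;
}
```

A `wmain`-based launcher could run each `wchar_t*` argument through something like this before filling in the existing `JavaVMOption.optionString` fields.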
>>> From david.holmes at oracle.com Tue May 9 01:17:20 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 May 2017 11:17:20 +1000 Subject: RFR: JDK-8167659: Access of mark word should use oopDesc::mark_offset_in_bytes() instead of '0' In-Reply-To: References: <23550b76-2f10-bb4f-315d-4b2137ad796e@redhat.com> Message-ID: Pinging Coleen as she indicated the other platforms would be looked at, which is necessary before this is accepted and pushed. The changes as presented seem fine to me. Thanks, David On 9/05/2017 12:43 AM, Roman Kennke wrote: > Ping? > > >> I posted this before: >> >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-October/024889.html >> >> And just realized it's not been fixed yet. Maybe now would be a good >> time to include it? >> >> I believe this changeset addresses all the issues mentioned in the above >> discussions. >> >> http://cr.openjdk.java.net/~rkennke/8167659/webrev.00/ >> >> >> Testing: jcstress -m sanity, specjvm >> (cannot currently run jtreg tests because I've no idea how to run the >> jcstress-jtreg tests..) >> >> Roman >> >> > > From thomas.stuefe at gmail.com Tue May 9 06:54:51 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 9 May 2017 08:54:51 +0200 Subject: Question about vfprintf hook VM argument In-Reply-To: References: Message-ID: Hi David, On Mon, May 8, 2017 at 11:25 PM, David Holmes wrote: > Hi Thomas, > > On 8/05/2017 7:29 PM, Thomas Stüfe wrote: > >> Hi all, >> >> what exactly is the purpose of the FILE* argument in the vfprintf hook? >> > > I see your point. :) The vfprintf_hook is a replacement vfprintf > function to be called from jio_vfprintf: > > int jio_vfprintf(FILE* f, const char *fmt, va_list args) { > if (Arguments::vfprintf_hook() != NULL) { > return Arguments::vfprintf_hook()(f, fmt, args); > } else { > return vfprintf(f, fmt, args); > } > } > > so whatever gets passed to jio_vfprintf gets passed through to the hook. > > But ... 
> > We had - actually several times already - the problem that our VM was >> embedded by a customized launcher which used the vfprintf hook to redirect >> the VM output. If the launcher uses the FILE* handed over by the VM to >> write to, it must be linked against the same C-Runtime as the VM itself. >> This is not necessarily a given, especially on Windows: the launcher may >> link against the debug C-Runtime (compiled with /MDd) whereas the JDK is >> built with "/MD" and links against the release C-Runtime. Or the launcher >> may even have been linked statically against the C-Runtime. Or... >> >> In my opinion it is not a good idea to hand over C-Runtime internals - be >> it malloced memory or FILE* pointers - to other binaries which may have >> been built with different build options. But I do not even understand the >> point of passing FILE* to the hook? If the point of the hook is to give >> embedding code the ability to write to somewhere else, why even bother >> giving it *my* file pointer? >> > > ... I confess I had no idea why this vfprintf hook exists, but this > somewhat explains it: > > https://bugs.openjdk.java.net/browse/JDK-4015550 > > and yes it does suggest that although the FILE* is passed in, the > expectation is that the function will actually write somewhere else. IIUC > the intent was to allow fd's 0,1 and 2 to be re-mapped by the hook to match > whatever the embedded app had changed System.out/err to. But as fd's were > per-dll they couldn't pass through the fd so they passed through the FILE*. > But how they expected that to be mapped to stdout/stderr I have no idea. > > Thanks for looking at this, interesting piece of history! Well, maybe this was just not that well thought out. Handing the va_list up to the hookee and letting him unwrap it is also unconventional, but probably does no harm, even when done by a different C-Runtime. I guess we continue living with it. We have a checklist for potential embedders writing launchers (e.g. 
not to use the primordial thread on AIX), and will add "Use the same C-Runtime as the JDK on Windows" to the list. Regards, Thomas > Cheers, > David > > > Thanks & Kind Regards, Thomas > > From david.holmes at oracle.com Tue May 9 07:17:01 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 9 May 2017 17:17:01 +1000 Subject: Question about vfprintf hook VM argument In-Reply-To: References: Message-ID: <80d6d7b1-bb3a-e335-0c06-4517125d2858@oracle.com> Hi Thomas, On 9/05/2017 4:54 PM, Thomas Stüfe wrote: > Hi David, > > On Mon, May 8, 2017 at 11:25 PM, David Holmes > wrote: > > Hi Thomas, > > On 8/05/2017 7:29 PM, Thomas Stüfe wrote: > > Hi all, > > what exactly is the purpose of the FILE* argument in the > vfprintf hook? > > > I see your point. :) The vfprintf_hook is a replacement vfprintf > function to be called from jio_vfprintf: > > int jio_vfprintf(FILE* f, const char *fmt, va_list args) { > if (Arguments::vfprintf_hook() != NULL) { > return Arguments::vfprintf_hook()(f, fmt, args); > } else { > return vfprintf(f, fmt, args); > } > } > > so whatever gets passed to jio_vfprintf gets passed through to the hook. > > But ... > > We had - actually several times already - the problem that our > VM was > embedded by a customized launcher which used the vfprintf hook > to redirect > the VM output. If the launcher uses the FILE* handed over by the > VM to > write to, it must be linked against the same C-Runtime as the VM > itself. > This is not necessarily a given, especially on Windows: the > launcher may > link against the debug C-Runtime (compiled with /MDd) whereas the > JDK is > built with "/MD" and links against the release C-Runtime. Or the > launcher > may even have been linked statically against the C-Runtime. Or... 
But I do not even > understand the > point of passing FILE* to the hook? If the point of the hook is > to give > embedding code the ability to write to somewhere else, why even > bother > giving it *my* file pointer? > > > ... I confess I had no idea why this vfprintf hook exists, but this > somewhat explains it: > > https://bugs.openjdk.java.net/browse/JDK-4015550 > > > and yes it does suggest that although the FILE* is passed in, the > expectation is that the function will actually write somewhere else. > IIUC the intent was to allow fd's 0,1 and 2 to be re-mapped by the > hook to match whatever the embedded app had changed System.out/err > to. But as fd's were per-dll they couldn't pass through the fd so > they passed through the FILE*. > But how they expected that to be > mapped to stdout/stderr I have no idea. > > > Thanks for looking at this, interesting piece of history! > > Well, maybe this was just not that well thought out. Handing the va_list > up to the hookee and letting him unwrap it is also unconventional, > but probably does no harm, even when done by a different C-Runtime. > > I guess we continue living with it. We have a checklist for potential > embedders writing launchers (e.g. not to use the primordial thread on > AIX), and will add "Use the same C-Runtime as the JDK on Windows" to the > list. I still don't understand how this is supposed to work anyway. The intent is to provide a means for the VM to write to System.out/err, but there is no general way to determine whether the FILE* the VM passes through represents "stdout" or "stderr". ??? 
Cheers, David > Regards, Thomas > > > > Cheers, > David > > > Thanks & Kind Regards, Thomas > > From thomas.stuefe at gmail.com Tue May 9 07:27:50 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 9 May 2017 09:27:50 +0200 Subject: Question about vfprintf hook VM argument In-Reply-To: <80d6d7b1-bb3a-e335-0c06-4517125d2858@oracle.com> References: <80d6d7b1-bb3a-e335-0c06-4517125d2858@oracle.com> Message-ID: On Tue, May 9, 2017 at 9:17 AM, David Holmes wrote: > Hi Thomas, > > On 9/05/2017 4:54 PM, Thomas Stüfe wrote: > >> Hi David, >> >> On Mon, May 8, 2017 at 11:25 PM, David Holmes > > wrote: >> >> Hi Thomas, >> >> On 8/05/2017 7:29 PM, Thomas Stüfe wrote: >> >> Hi all, >> >> what exactly is the purpose of the FILE* argument in the >> vfprintf hook? >> >> >> I see your point. :) The vfprintf_hook is a replacement vfprintf >> function to be called from jio_vfprintf: >> >> int jio_vfprintf(FILE* f, const char *fmt, va_list args) { >> if (Arguments::vfprintf_hook() != NULL) { >> return Arguments::vfprintf_hook()(f, fmt, args); >> } else { >> return vfprintf(f, fmt, args); >> } >> } >> >> so whatever gets passed to jio_vfprintf gets passed through to the >> hook. >> >> But ... >> >> We had - actually several times already - the problem that our >> VM was >> embedded by a customized launcher which used the vfprintf hook >> to redirect >> the VM output. If the launcher uses the FILE* handed over by the >> VM to >> write to, it must be linked against the same C-Runtime as the VM >> itself. >> This is not necessarily a given, especially on Windows: the >> launcher may >> link against the debug C-Runtime (compiled with /MDd) whereas the >> JDK is >> built with "/MD" and links against the release C-Runtime. Or the >> launcher >> may even have been linked statically against the C-Runtime. Or... 
>> >> In my opinion it is not a good idea to hand over C-Runtime >> internals - be >> it malloced memory or FILE* pointers - to other binaries which >> may have >> been built with different build options. But I do not even >> understand the >> point of passing FILE* to the hook? If the point of the hook is >> to give >> embedding code the ability to write to somewhere else, why even >> bother >> giving it *my* file pointer? >> >> >> ... I confess I had no idea why this vfprintf hook exists, but this >> somewhat explains it: >> >> https://bugs.openjdk.java.net/browse/JDK-4015550 >> >> >> and yes it does suggest that although the FILE* is passed in, the >> expectation is that the function will actually write somewhere else. >> IIUC the intent was to allow fd's 0,1 and 2 to be re-mapped by the >> hook to match whatever the embedded app had changed System.out/err >> to. But as fd's were per-dll they couldn't pass through the fd so >> they passed through the FILE*. But how they expected that to be >> mapped to stdout/stderr I have no idea. >> >> >> Thanks for looking at this, interesting piece of history! >> >> Well, maybe this was just not that well thought out. Handing the va_list >> up to the hookee and letting him unwrap it is also unconventional, >> but probably does no harm, even when done by a different C-Runtime. >> >> I guess we continue living with it. We have a checklist for potential >> embedders writing launchers (e.g. not to use the primordial thread on >> AIX), and will add "Use the same C-Runtime as the JDK on Windows" to the >> list. >> > > I still don't understand how this is supposed to work anyway. The intent > is to provide a means for the VM to write to System.out/err, but there is > no general way to determine whether the FILE* the VM passes through > represents "stdout" or "stderr". ??? > > Cheers, > David > > Yes, it is confusing. 
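[Editor's note: the defensive pattern the checklist implies - a hook that formats with its *own* C runtime into its own buffer and never dereferences the FILE* - can be sketched as follows. This is illustrative only; the function name and log target are invented, and a real launcher would register such a hook via the invocation API's "vfprintf" JavaVMOption.]

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Embedder-side sketch: capture the VM's message with vsnprintf from
 * our own C runtime; the FILE* may belong to a different runtime and
 * is deliberately never used. */
static char captured[1024];

static int my_vfprintf_hook(FILE *ignored, const char *fmt, va_list args) {
    (void)ignored;  /* possibly a foreign runtime's FILE* - do not touch */
    int n = vsnprintf(captured, sizeof(captured), fmt, args);
    /* a real launcher would forward 'captured' to its own log sink here */
    return n;
}

/* variadic wrapper, only to exercise the hook the way jio_vfprintf would */
static int call_hook(FILE *f, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = my_vfprintf_hook(f, fmt, ap);
    va_end(ap);
    return n;
}
```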
The only (far fetched) explanation I have is that the intent was to reformulate the message to be written but still write it to the original output FILE*. In that case you would not have to know if FILE* is stderr or stdout. But if I would have to bet, I'd say this looks like someone was in a rush and needed a quick solution. There is also almost no documentation about it other than one half sentence I found in the official Invocation API doc ( http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/invocation.html ). ..Thomas > Regards, Thomas >> >> >> >> Cheers, >> David >> >> >> Thanks & Kind Regards, Thomas >> >> >> From aph at redhat.com Tue May 9 08:17:18 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 9 May 2017 09:17:18 +0100 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods In-Reply-To: <06957B86-7C6A-4F9B-AF23-327DAC46F0CE@oracle.com> References: <590CD221.4080100@oracle.com> <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> <06957B86-7C6A-4F9B-AF23-327DAC46F0CE@oracle.com> Message-ID: On 08/05/17 22:43, Paul Sandoz wrote: >> Given any "swap(exp, new)" function can be implemented as >> "exchange(exp, new) == exp" I'm not sure why we have two complete >> sets of functions all the way through. But I guess that is a >> different issue. :) > > Yes, it might be possible after some careful performance analysis > (we might run into some subtle issues). They don't quite generate the same code, and there is no way to write an "Exchange" version of a weak "Swap". Andrew. 
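[Editor's note: the swap/exchange relationship discussed above can be made concrete with C11 atomics - an analogy, not the JDK's Unsafe implementation. A strong compare-exchange reports the witnessed old value through `expected`, which is why a boolean "set" can be derived from an "exchange"; the weak form may fail spuriously, so it is only meaningful inside a retry loop and has no single-shot "return the old value" shape.]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* "compareAndSet" derived from a strong compare-exchange: on failure,
 * 'expected' is overwritten with the value actually observed, which is
 * the "exchange" result; success means observed == expected. */
static bool compare_and_set(atomic_int *v, int expected, int desired) {
    return atomic_compare_exchange_strong(v, &expected, desired);
}

/* The weak form may fail spuriously even when the value matches, so it
 * only makes sense in a retry loop, e.g. an atomic add returning the
 * previous value: */
static int add_via_weak_cas(atomic_int *v, int delta) {
    int old = atomic_load(v);
    while (!atomic_compare_exchange_weak(v, &old, old + delta)) {
        /* 'old' was refreshed by the failed CAS; just retry */
    }
    return old;
}
```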
From john_platts at hotmail.com Tue May 9 00:28:09 2017 From: john_platts at hotmail.com (John Platts) Date: Tue, 9 May 2017 00:28:09 +0000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms In-Reply-To: References: , Message-ID: Reasons to add UTF-16 versions of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs APIs include the following: * The arguments passed into the wmain and wWinMain functions use UTF-16-encoded strings instead of UTF-8 strings. * The arguments passed into the main and WinMain functions on Windows-platforms are in the ANSI character encoding instead of the UTF-8 character encoding. * The arguments passed into the wmain and wWinMain functions would need to be converted to UTF-8 or modified UTF-8 encoding unless a UTF-16 version of JNI_CreateJavaVM is added. * The NewString and GetStringChars APIs in the JNI already use UTF-16-encoded strings. * Unicode APIs on Windows normally use UTF-16-encoded strings. * The C11 and C++11 standards support UTF-16 strings through the char16_t type and support for UTF-16 literals with a u prefix. * Windows platforms have long supported UTF-16 strings in C and C++ through the wchar_t type and support for UTF-16 literals with a L prefix. * A UTF-16 version of JNI_CreateJavaVM would allow command line arguments to be passed into the JVM without having to perform the platform-dependent encoding to UTF-16 conversion that currently has to be done in the JVM. * A UTF-16 version of JNI_CreateJavaVM would improve consistency across different locales on Windows-based platforms since the command-line arguments can be passed into the JVM in a locale-independent manner on Windows-based platforms. 
________________________________ From: David Holmes Sent: Sunday, May 7, 2017 7:47 PM To: John Platts Cc: hotspot-dev developers; core-libs-dev Libs Subject: Re: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms Added back jdk10-dev as a bcc. Added hotspot-dev and core-libs-dev (for launcher) for follow up discussions. Hi John, On 8/05/2017 10:33 AM, John Platts wrote: > I actually did a search through the code that implements > JNI_CreateJavaVM, and I found that the conversion of the strings is done > using java_lang_String::create_from_platform_dependent_str, which > converts from the platform-default encoding to Unicode. In the case of > Windows-based platforms, the conversion is done based on the ANSI > character encoding instead of UTF-8 or Modified UTF-8. > > > The platform encoding detection logic on Windows is implemented > java_props_md.c, which can be found at > jdk/src/windows/native/java/lang/java_props_md.c in releases prior to > JDK 9 and at src/java.base/windows/native/libjava/java_props_md.c in JDK > 9 and later. The encoding used for command-line arguments passed into > the JNI invocation API is Cp1252 for English locales on Windows > platforms, and not Modified UTF-8 or UTF-8. > > > The documentation found > at http://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html also The Invocation API - Oracle docs.oracle.com The Invocation API allows software vendors to load the Java VM into an arbitrary native application. Vendors can deliver Java-enabled applications without having to ... > states that the strings passed into JNI_CreateJavaVM are in the > platform-default encoding. Thanks for the additional details. I assume you are referring to: typedef struct JavaVMOption { char *optionString; /* the option as a string in the default platform encoding */ that comment should not form part of the specification as it is non-normative text. 
If the intent is truly to use the platform default encoding and not UTF-8 then that should be very clearly spelt out in the spec! That said, the implementation is following this so it is a limitation. I suspect this is historical. > A version of JNI_CreateJavaVM that takes UTF-16-encoded strings should > be added to the JNI Invocation API. The java.exe launchers and javaw.exe > launchers should also be updated to use the UTF-16 version of the > JNI_CreateJavaVM function on Windows platforms and to use wmain and > wWinMain instead of main and WinMain. Why versions for UTF-16 instead of the missing UTF-8 variants? As I said the whole spec is intended to be based around UTF-8 so we would not want to throw in just a couple of UTF-16 based usages. Thanks, David > > A few files in HotSpot would need to be changed in order to implement > the UTF-16 version of JNI_CreateJavaVM, but the change would improve > consistency across different locales on Windows platforms and allow > arguments that contain Unicode characters that are not available in the > platform-default encoding to be passed into the JVM on the command line. > > > The UTF-16-based version of JNI_CreateJavaVM also makes it easier to > allocate string objects that contain non-ASCII characters as the strings > are already in UTF-16 format, at least in cases where the strings > contain Unicode characters that are not in Latin-1 or on VMs that do not > support compact Latin-1 strings. > > > The UTF-16-based version of JNI_CreateJavaVM should probably be > implemented as a separate function so that the solution could be > backported to JDK 8 and JDK 9 updates and so that backwards > compatibility with the current JNI_CreateJavaVM implementation is > maintained. 
> > > Here is what the new UTF-16-based API might look like: > > typedef struct JavaVMInitArgs_UTF16 { > jint version; > jint nOptions; > JavaVMOptionUTF16 *options; > jboolean ignoreUnrecognized; > } JavaVMInitArgs_UTF16; > > > typedef struct JavaVMOption_UTF16 { > jchar *optionString; /* the option as a string in > UTF-16 encoding */ > void *extraInfo; > } JavaVMOptionUTF16; > > /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */ > > jint JNI_CreateJavaVM_UTF16(JavaVM **p_vm, void **p_env, void *vm_args); > > > /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */ > > jint JNI_GetDefaultJavaVMInitArgs_UTF16(void *vm_args); > > ------------------------------------------------------------------------ > *From:* David Holmes > *Sent:* Thursday, May 4, 2017 11:07 PM > *To:* John Platts; jdk10-dev at openjdk.java.net > *Subject:* Re: Add support for Unicode versions of JNI_CreateJavaVM and > JNI_GetDefaultJavaVMInitArgs on Windows platforms > > Hi John, > > The JNI is defined to use Modified UTF-8 format for strings, so any > Unicode character should be handled if passed in in the right format. > Updating the JNI specification and implementation to accept UTF-16 > directly would be a major undertaking. > > Is the issue here that you want a tool, like the java launcher, to > accept arbitrary Unicode strings in an end-user friendly manner and then > have it perform the modified UTF-8 conversion when invoking the VM? > > Can you give a concrete example of what you would like to be able to > pass as arguments to the JVM? > > Thanks, > David > > On 5/05/2017 1:04 PM, John Platts wrote: >> The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI invocation API expect ANSI strings on Windows platforms instead of Unicode-encoded strings.
This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in > the ANSI encoding on Windows platforms. >> >> >> There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. >> >> >> jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the > char16_t type. This change will make it possible to define jchar > character and string literals on Windows platforms and on non-Windows > platforms that support the C11 or C++11 standard. >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: >> >> #define JCHAR_LITERAL(x) L ## x >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: >> >> #define JCHAR_LITERAL(x) u ## x >> >> >> Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: >> >> typedef struct JavaVMUnicodeOption { >> const jchar *optionString; /* the option as a string in UTF-16 encoding */ >> void *extraInfo; >> } JavaVMUnicodeOption; >> >> typedef struct JavaVMUnicodeInitArgs { >> jint version; >> jint nOptions; >> JavaVMUnicodeOption *options; >> jboolean ignoreUnrecognized; >> } JavaVMUnicodeInitArgs; >> >> jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); >> jint JNI_GetDefaultJavaVMInitArgsUnicode(void *args); >> >> The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead
of WinMain on Windows platforms. This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be > launched with arguments that contain Unicode characters that are not in > the platform-default encoding. >> >> All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments > that are passed to the JVM. >> From aph at redhat.com Tue May 9 17:18:12 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 9 May 2017 18:18:12 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent Message-ID: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> In C2 we use LDAR/STLR to handle volatile accesses, but in C1 and the interpreter we use separate DMB instructions and relaxed loads. When used together, these do not form a sequentially-consistent memory ordering. For example, if stores use STLR and loads use LDR;DMB a simple Dekker idiom will fail. This is extremely hard to test because the loads and stores have to be in separately-compiled methods, but it is incorrect, and likely to fail in very weakly-ordered implementations. Note: this is for JDK 9. http://cr.openjdk.java.net/~aph/8179954/ Andrew. 
From igor.ignatyev at oracle.com Tue May 9 17:20:44 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 9 May 2017 10:20:44 -0700 Subject: RFR(XXS) : 8179930: jdk.test.lib.artifacts.ArtifactResolver::resolve should return Map instead of HashMap Message-ID: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html > 8 lines changed: 1 ins; 0 del; 7 mod; Hi all, could you please review this small patch which changes jdk.test.lib.artifacts.ArtifactResolver::resolve signature to return a Map instead of HashMap and updates the tests accordingly? I have also changed the argument type from raw Class to Class, so we don't need to have casts in the method. webrev: http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8179930 testing: the affected tests (hotspot/test/applications) -- Igor From shade at redhat.com Tue May 9 17:30:27 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 9 May 2017 19:30:27 +0200 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> Message-ID: <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> On 05/09/2017 07:18 PM, Andrew Haley wrote: > In C2 we use LDAR/STLR to handle volatile accesses, but in C1 and the > interpreter we use separate DMB instructions and relaxed loads. When > used together, these do not form a sequentially-consistent memory > ordering. For example, if stores use STLR and loads use LDR;DMB a > simple Dekker idiom will fail. > > This is extremely hard to test because the loads and stores have to be > in separately-compiled methods, but it is incorrect, and likely to > fail in very weakly-ordered implementations. > > Note: this is for JDK 9. > > http://cr.openjdk.java.net/~aph/8179954/ Makes sense to me.
-Aleksey From harold.seigel at oracle.com Tue May 9 17:39:49 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 9 May 2017 13:39:49 -0400 Subject: RFR 8153646: Move vm/utilities/array.hpp to vm/oops Message-ID: Hi, Please review this JDK-10 change to move hotspot header file array.hpp from the vm/utilities directory to the vm/oops directory. This was done because after moving typedefs for basic type arrays to growableArray.hpp, the only remaining declaration in array.hpp is the metaspace class Array. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8153646/webrev/ JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8153646 The fix was tested with JCK Lang and VM tests, the JTreg hotspot, java/io, java/lang, java/util and other tests, the co-located NSK tests, and with JPRT. Thanks, Harold From coleen.phillimore at oracle.com Tue May 9 17:40:11 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 May 2017 13:40:11 -0400 Subject: RFR: JDK-8167659: Access of mark word should use oopDesc::mark_offset_in_bytes() instead of '0' In-Reply-To: References: <23550b76-2f10-bb4f-315d-4b2137ad796e@redhat.com> Message-ID: The changes seem fine to me. It's a nice improvement to not have raw 0s there. We could file RFE's for the other platforms. I don't think they have to be fixed together. I guess you need a sponsor. I could sponsor it. thanks, Coleen On 5/8/17 9:17 PM, David Holmes wrote: > Pinging Coleen as she indicated the other platforms would be looked > at, which is necessary before this is accepted and pushed. > > The changes as presented seem fine to me. > > Thanks, > David > > On 9/05/2017 12:43 AM, Roman Kennke wrote: >> Ping? >> >> >>> I posted this before: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-October/024889.html >>> >>> >>> And just realized it's not been fixed yet. Maybe now would be a good >>> time to include it? 
>>> >>> I believe this changeset addresses all the issues mentioned in the above >>> discussions. >>> >>> http://cr.openjdk.java.net/~rkennke/8167659/webrev.00/ >>> >>> >>> Testing: jcstress -m sanity, specjvm >>> (cannot currently run jtreg tests because I've no idea how to run the >>> jcstress-jtreg tests..) >>> >>> Roman >>> >>> >> >> From george.triantafillou at oracle.com Tue May 9 17:55:55 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Tue, 9 May 2017 13:55:55 -0400 Subject: RFR(XXS) : 8179930: jdk.test.lib.artifacts.ArtifactResolver::resolve should return Map instead of HashMap In-Reply-To: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> References: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> Message-ID: <8ba067b5-a076-200d-a717-c933caef1ebc@oracle.com> Hi Igor, Looks good. -George On 5/9/2017 1:20 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html >> 8 lines changed: 1 ins; 0 del; 7 mod; > Hi all, > > could you please review this small patch which changes jdk.test.lib.artifacts.ArtifactResolver::resolve signature to return a Map instead of HashMap and updates the tests accordingly? > I have also changed the argument type from raw Class to Class, so we don't need to have casts in the method.
> > webrev: http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8179930 > testing: the affected tests (hotspot/test/applications) > > -- Igor From mikhailo.seledtsov at oracle.com Tue May 9 18:40:24 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Tue, 9 May 2017 11:40:24 -0700 Subject: RFR(XXS) : 8179930: jdk.test.lib.artifacts.ArtifactResolver::resolve should return Map instead of HashMap In-Reply-To: <8ba067b5-a076-200d-a717-c933caef1ebc@oracle.com> References: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> <8ba067b5-a076-200d-a717-c933caef1ebc@oracle.com> Message-ID: Looks good, Misha On 05/09/2017 10:55 AM, George Triantafillou wrote: > Hi Igor, > > Looks good. > > -George > > On 5/9/2017 1:20 PM, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html >>> 8 lines changed: 1 ins; 0 del; 7 mod; >> Hi all, >> >> could you please review this small patch which changes >> jdk.test.lib.artifacts.ArtifactResolver::resolve signature to return >> a Map instead of HasMap and updates the tests accordingly? >> I have also changed the argument type from raw Class to Class, so >> we don't need to have casts in the method. >> >> webrev: >> http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html >> JBS: https://bugs.openjdk.java.net/browse/JDK-8179930 >> testing: the affected tests (hotspot/test/applications) >> >> -- Igor > From rkennke at redhat.com Tue May 9 18:47:42 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 9 May 2017 20:47:42 +0200 Subject: RFR: JDK-8167659: Access of mark word should use oopDesc::mark_offset_in_bytes() instead of '0' In-Reply-To: References: <23550b76-2f10-bb4f-315d-4b2137ad796e@redhat.com> Message-ID: <2165c681-3750-879b-4e4a-cbc6f6950798@redhat.com> Hi Coleen & all, > The changes seem fine to me. It's a nice improvement to not have raw > 0s there. We could file RFE's for the other platforms. 
I don't > think they have to be fixed together. Ok, great. > I guess you need a sponsor. Yes :-) > I could sponsor it. Thanks! Roman > > thanks, > Coleen > > On 5/8/17 9:17 PM, David Holmes wrote: >> Pinging Coleen as she indicated the other platforms would be looked >> at, which is necessary before this is accepted and pushed. >> >> The changes as presented seem fine to me. >> >> Thanks, >> David >> >> On 9/05/2017 12:43 AM, Roman Kennke wrote: >>> Ping? >>> >>> >>>> I posted this before: >>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-October/024889.html >>>> >>>> >>>> And just realized it's not been fixed yet. Maybe now would be a good >>>> time to include it? >>>> >>>> I believe this changeset address all the issues mentioned in the above >>>> discussions. >>>> >>>> http://cr.openjdk.java.net/~rkennke/8167659/webrev.00/ >>>> >>>> >>>> Testing: jcstress -m sanity, specjvm >>>> (cannot currently run jtreg tests because I've no idea how to run the >>>> jcstress-jtreg tests..) >>>> >>>> Roman >>>> >>>> >>> >>> > From igor.ignatyev at oracle.com Tue May 9 21:05:57 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 9 May 2017 14:05:57 -0700 Subject: RFR(XXS) : 8180004: jdk.test.lib.DynamicVMOption should be moved to jdk.test.lib.management Message-ID: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html > 8 lines changed: 1 ins; 0 del; 7 mod; Hi all, could you please review this tiny patch which moves jdk.test.lib.DynamicVMOption class to jdk.test.lib.management package and updates the tests which use it? j.t.l.DynamicVMOption uses classes from jdk.management module, so having it in common testlibrary package might cause redundant module dependencies. 
webrev: http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html jbs: https://bugs.openjdk.java.net/browse/JDK-8180004 testing: :hotspot_all Thanks, -- Igor From mikael.vidstedt at oracle.com Tue May 9 21:29:13 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 9 May 2017 14:29:13 -0700 Subject: RFR(S): 8180003: Remove sys/ prefix from poll.h and signal.h includes Message-ID: Please review this small change which removes the sys/ prefix from a bunch of includes of poll.h and signal.h. hotspot: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/hotspot/webrev/ jdk: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/jdk/webrev/ Using the sys/ prefix works on many platforms, but the posix spec makes it clear that the poll.h and signal.h header files should be included without the prefix. I have verified that this change works on all the Oracle supported platforms, but I could use some help verifying it on AIX. Cheers, Mikael From Derek.White at cavium.com Tue May 9 21:39:00 2017 From: Derek.White at cavium.com (White, Derek) Date: Tue, 9 May 2017 21:39:00 +0000 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> Message-ID: Hi Andrew, Good catch! My only comment is in src/cpu/aarch64/vm/templateTable_aarch64.cpp, TemplateTable::getfield_or_static(): The comment in the trailing membar around line 2542 says: - "It's really not worth bothering to check whether this field really is volatile in the slow case." But getfield_or_static() is used once to "quicken" getfield byte codes, as well as used forevermore on all getstatic bytecodes (and some weird cases in class sharing?).
I can't claim that makes a definite performance difference (it's just the interpreter), but adding an additional unconditional membar might make it more likely to matter. FYI, the ppc and arm64 ports do check if the field is volatile before executing the membar(s) (plural in the ppc case). Thanks, - Derek -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Aleksey Shipilev Sent: Tuesday, May 09, 2017 1:30 PM To: Andrew Haley ; hotspot-dev Source Developers Subject: Re: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent On 05/09/2017 07:18 PM, Andrew Haley wrote: > In C2 we use LDAR/STLR to handle volatile accesses, but in C1 and the > interpreter we use separate DMB instructions and relaxed loads. When > used together, these do not form a sequentially-consistent memory > ordering. For example, if stores use STLR and loads use LDR;DMB a > simple Dekker idiom will fail. > > This is extremely hard to test because the loads and stores have to be > in separately-compiled methods, but it is incorrect, and likely to > fail in very weakly-ordered implementations. > > Note: this is for JDK 9. > > http://cr.openjdk.java.net/~aph/8179954/ Makes sense to me. -Aleksey From brian.burkhalter at oracle.com Tue May 9 21:45:04 2017 From: brian.burkhalter at oracle.com (Brian Burkhalter) Date: Tue, 9 May 2017 14:45:04 -0700 Subject: RFR(S): 8180003: Remove sys/ prefix from poll.h and signal.h includes In-Reply-To: References: Message-ID: <83DEDA3B-2BD5-4F99-A3BE-2F3AE8F2C39B@oracle.com> On May 9, 2017, at 2:29 PM, Mikael Vidstedt wrote: > Please review this small change which removes the sys/ prefix from a bunch of includes of poll.h and signal.h. > > hotspot: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/hotspot/webrev/ > jdk: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/jdk/webrev/ The JDK NIO changes look fine at least.
> Using the sys/ prefix works on many platforms, but the posix spec makes it clear that the poll.h and signal.h header files should be included without the prefix. Just had to look: [1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/poll.h.html [2] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html > I have verified that this change works on all the Oracle supported platforms, but I could use some help verifying it on AIX. Good about the Oracle platforms. Brian From serguei.spitsyn at oracle.com Tue May 9 22:37:22 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 9 May 2017 15:37:22 -0700 Subject: RFR 8153646: Move vm/utilities/array.hpp to vm/oops In-Reply-To: References: Message-ID: <09ce165b-1e52-f41d-85b4-50463808c761@oracle.com> Hi Harold, The fix looks good. Thanks, Serguei On 5/9/17 10:39, harold seigel wrote: > Hi, > > Please review this JDK-10 change to move hotspot header file array.hpp > from the vm/utilities directory to the vm/oops directory. This was > done because after moving typedefs for basic type arrays to > growableArray.hpp, the only remaining declaration in array.hpp is the > metaspace class Array. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8153646/webrev/ > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8153646 > > The fix was tested with JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK > tests, and with JPRT. > > Thanks, Harold > From mikael.vidstedt at oracle.com Tue May 9 22:40:34 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 9 May 2017 15:40:34 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser Message-ID: Warning: It may be wise to stock up on coffee or tea before reading this.
Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ * Background (from the JBS description) x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler to do so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file.
bytes_x86.hpp has this to say about its implementation: // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering // (no special code is needed since x86 CPUs can access unaligned data) While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here's the "funny" & interesting part: With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay.
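The standard-conforming way to express such an access, and the general shape of the fix described below, is to read through memcpy instead of dereferencing a misaligned u2*. With a constant size, GCC and Clang typically lower the memcpy to a single 16-bit load on x86, but because no aligned lvalue is ever formed, the compiler may not assume alignment and so cannot legally widen the accesses into movdqa. A hedged sketch (the function name is illustrative, not the actual bytes_x86.hpp code):

```c
#include <stdint.h>
#include <string.h>

/* Read a big-endian u2 (class-file byte order) from a possibly
 * unaligned address. The memcpy makes no alignment promise to the
 * compiler, so this is well-defined even for odd addresses. */
static uint16_t get_java_u2(const void *p) {
    uint8_t b[2];
    memcpy(b, p, sizeof b);
    return (uint16_t)((b[0] << 8) | b[1]);
}
```

Reading from an odd offset - e.g. get_java_u2(class_data + 1) - is then well-defined no matter how aggressively the optimizer vectorizes the surrounding loop.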
With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. * About the change The key changes are in three different areas: 1. copy.[ch]pp Introducing: conjoint_swap_if_needed conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). 2. classFile{Parser,Stream} The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. Changing to void* may or may not be the right thing to do here. In a way I'd actually like to "carry" the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. 3.
bytes_x86.hpp This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template "magic" in there to avoid duplicating code, but hopefully the magic is relatively straightforward. * Testing I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! Cheers, Mikael From david.holmes at oracle.com Tue May 9 23:19:23 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 May 2017 09:19:23 +1000 Subject: RFR(S): 8180003: Remove sys/ prefix from poll.h and signal.h includes In-Reply-To: References: Message-ID: <3e50ead4-6b81-c0f2-1654-847390981357@oracle.com> Hi Mikael, To repeat myself from: http://mail.openjdk.java.net/pipermail/portola-dev/2017-April/000025.html Changes look okay. I agree with the rationale. Looking at actual implementations, linux and mac OS are trivially fine (poll.h just includes sys/poll.h). Solaris is non-trivially fine - poll.h does more than what sys/poll.h does, but nothing that affects our sources. Thanks, David :) On 10/05/2017 7:29 AM, Mikael Vidstedt wrote: > > Please review this small change which removes the sys/ prefix from a bunch of includes of poll.h and signal.h.
> > hotspot: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/hotspot/webrev/ > jdk: http://cr.openjdk.java.net/~mikael/webrevs/8180003/webrev.00/jdk/webrev/ > > Using the sys/ prefix works on many platforms, but the posix spec makes it clear that the poll.h and signal.h header files should be included without the prefix. > > I have verified that this change works on all the Oracle supported platforms, but I could use some help verifying it on AIX. > > Cheers, > Mikael > From david.holmes at oracle.com Wed May 10 00:12:20 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 May 2017 10:12:20 +1000 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: Message-ID: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> Hi Mikael, On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: > > Warning: It may be wise to stock up on coffee or tea before reading this. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 > Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. A couple of oddities I noticed: src/share/vm/classfile/classFileStream.hpp Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. 
--- src/share/vm/classfile/classFileParser.cpp Why do we have void* here: 1707 const void* const exception_table_start = cfs->get_u1_buffer(); but u1* here: 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); Thanks, David > * Background (from the JBS description) > > x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. > > Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler to do so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. > > We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. > > This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file.
bytes_x86.hpp has this to say about its implementation: > > // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering > // (no special code is needed since x86 CPUs can access unaligned data) > > While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. > > I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. > > bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here's the "funny" & interesting part:
> > To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. > > > > * About the change > > The key changes are in three different areas: > > 1. copy.[ch]pp > > Introducing: conjoint_swap_if_needed > > conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). > > 2. classFile{Parser,Stream} > > The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. > > However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. > > Changing to void* may or may not be the right thing to do here. In a way I'd actually like to 'carry' the type information, but in some way still prevent the pointer from being directly dereferenced. 
Taking suggestions. > > > 3. bytes_x86.hpp > > This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: > > It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template 'magic' in there to avoid duplicating code, but hopefully the magic is relatively straightforward. > > > * Testing > > I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! > > Cheers, > Mikael > From david.holmes at oracle.com Wed May 10 04:04:58 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 May 2017 14:04:58 +1000 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> Message-ID: Hi Andrew, On 10/05/2017 3:18 AM, Andrew Haley wrote: > In C2 we use LDAR/STLR to handle volatile accesses, but in C1 and the > interpreter we use separate DMB instructions and relaxed loads. When > used together, these do not form a sequentially-consistent memory > ordering. For example, if stores use STLR and loads use LDR;DMB a > simple Dekker idiom will fail. I'm somewhat confused by this description. 
Outside of Aarch64 the general approach, for C1 and Unsafe at least, is that a volatile-read is a load-acquire() (or a fence-load-acquire if you want the IRIW support) and a volatile write is a release-store-fence (or just release-store with IRIW support). Does Aarch64 not follow this pattern? I'm trying to see if the issue here is the original code generation or a subtle incompatibility between the ld-acq/st-rel instructions and explicit DMB. Thanks, David > This is extremely hard to test because the loads and stores have to be > in separately-compiled methods, but it is incorrect, and likely to > fail in very weakly-ordered implementations. > > Note: this is for JDK 9. > > http://cr.openjdk.java.net/~aph/8179954/ > > Andrew. > From david.holmes at oracle.com Wed May 10 04:27:51 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 May 2017 14:27:51 +1000 Subject: RFR 8153646: Move vm/utilities/array.hpp to vm/oops In-Reply-To: References: Message-ID: Hi Harold, On 10/05/2017 3:39 AM, harold seigel wrote: > Hi, > > Please review this JDK-10 change to move hotspot header file array.hpp > from the vm/utilities directory to the vm/oops directory. This was done > because after moving typedefs for basic type arrays to > growableArray.hpp, the only remaining declaration in array.hpp is the > metaspace class Array. > > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8153646/webrev/ Looks good - two minor comments: Can I ask a favour - for these modified "empty" files: src/cpu/aarch64/vm/c1_FpuStackSim_aarch64.cpp src/cpu/arm/vm/c1_FpuStackSim_arm.cpp src/cpu/sparc/vm/c1_FpuStackSim_sparc.cpp can you delete all of the #include lines, please. --- src/share/vm/c1/c1_CodeStubs.hpp This uses GrowableArray so should #include its header. 
Thanks, David > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8153646 > > The fix was tested with JCK Lang and VM tests, the JTreg hotspot, > java/io, java/lang, java/util and other tests, the co-located NSK tests, > and with JPRT. > > Thanks, Harold > From igor.ignatyev at oracle.com Wed May 10 04:30:54 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 9 May 2017 21:30:54 -0700 Subject: RFR(S) : 8180037 : move jdk.test.lib.InMemoryJavaCompiler to a separate package Message-ID: <4C45D0B5-C136-4E4D-B146-E458219CB06D@oracle.com> http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html > 41 lines changed: 3 ins; 13 del; 25 mod; Hi all, could you please review this small patch which moves jdk.test.lib.InMemoryJavaCompiler to jdk.test.lib.compiler package and updates the tests? InMemoryJavaCompiler depends on java.compiler module, so in order to avoid unneeded module dependency it should be moved to a separate package. webrev: http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8180037 testing: :hotspot_all Thanks, -- Igor From aph at redhat.com Wed May 10 06:07:25 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 May 2017 07:07:25 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> Message-ID: <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> On 09/05/17 22:39, White, Derek wrote: > My only comment is in src/cpu/aarch64/vm/templateTable_aarch64.cpp, TemplateTable::getfield_or_static(): > > The comment in the trailing membar around line 2542 says: > - "It's really not worth bothering to check whether this field > really is volatile in the slow case." 
> > But getfield_or_static() is used once to "quicken" getfield byte > codes, as well as used forevermore on all getstatic bytecodes (and > some weird cases in class sharing?). > > I can't claim that makes a definite performance difference (it's > just the interpreter), but adding an additional unconditional membar > might make it more likely to matter. > > FYI, the ppc and arm64 ports do check if the field is volatile > before executing the membar (s in the ppc case). Sure. There's always a trade-off between code complexity and absolute speed, and we tend to concentrate our fire where it can help the most, and that's probably not in the slow getfield part of the interpreter. I know that mispredicted branches hurt, and that fence instructions hurt; I don't know which hurts the most in general. I'm not at all sure that the conditional branches around the fences are right, and it might work better if they all were removed. The right way to fix this case is to rewrite the interpreter to use stlr and ldar, but that's too complex for JDK9. It'd be nice to get it done for JDK10. So yeah, you might have a point, but I really don't think it's worth changing the patch for JDK9. Andrew. From aph at redhat.com Wed May 10 06:21:18 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 May 2017 07:21:18 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> Message-ID: <77dd684f-c0af-9b25-4b64-c89b8b1773e2@redhat.com> On 10/05/17 05:04, David Holmes wrote: > I'm somewhat confused by this description. Outside of Aarch64 the > general approach, for C1 and Unsafe at least, is that a volatile-read is > a load-acquire() (or a fence-load-acquire if you want the IRIW support) > and a volatile write is a release-store-fence (or just release-store > with IRIW support). Does Aarch64 not follow this pattern? No. 
AArch64 has its own sequentially-consistent load and store instructions which are designed to provide just enough for volatiles but no more. These are preferable to using fences, but that's hard in C1 because shared code inserts fences, regardless of the target machine. This is wrong, but it's legacy code. > I'm trying to see if the issue here is the original code generation or a > subtle incompatibility between the ld-acq/st-rel instructions and > explicit DMB. I wouldn't be surprised. The problem is that the approach taken in HotSpot is much too naive. There is not an exact correspondence between real processors' fence instructions and what we need for Hotspot. The best mappings are here: https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html The ones we need for volatiles are the Seq Cst set. As you can see, the choices of instruction sequences are different for different processors. C1 (and all the compilers) should delegate this to the back ends, but instead they try to map volatile accesses onto acquire/release/fence. PPC has special code which is #ifdef'd in the shared code in the compilers, so I'm sure it gets this right. Andrew. From adinn at redhat.com Wed May 10 07:48:35 2017 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 10 May 2017 08:48:35 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> Message-ID: On 09/05/17 18:18, Andrew Haley wrote: > In C2 we use LDAR/STLR to handle volatile accesses, but in C1 and the > interpreter we use separate DMB instructions and relaxed loads. When > used together, these do not form a sequentially-consistent memory > ordering. For example, if stores use STLR and loads use LDR;DMB a > simple Dekker idiom will fail. Oh, well caught! 
> This is extremely hard to test because the loads and stores have to be > in separately-compiled methods, but it is incorrect, and likely to > fail in very weakly-ordered implementations. Not to mention hard to debug ;-) > Note: this is for JDK 9. > > http://cr.openjdk.java.net/~aph/8179954/ Yes, this patch looks good and really ought to go into jdk9. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From harold.seigel at oracle.com Wed May 10 11:56:29 2017 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 10 May 2017 07:56:29 -0400 Subject: RFR 8153646: Move vm/utilities/array.hpp to vm/oops In-Reply-To: <09ce165b-1e52-f41d-85b4-50463808c761@oracle.com> References: <09ce165b-1e52-f41d-85b4-50463808c761@oracle.com> Message-ID: Thanks Serguei! Harold On 5/9/2017 6:37 PM, serguei.spitsyn at oracle.com wrote: > Hi Harold, > > The fix looks good. > > Thanks, > Serguei > > > On 5/9/17 10:39, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to move hotspot header file >> array.hpp from the vm/utilities directory to the vm/oops directory. >> This was done because after moving typedefs for basic type arrays to >> growableArray.hpp, the only remaining declaration in array.hpp is the >> metaspace class Array. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8153646/webrev/ >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8153646 >> >> The fix was tested with JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK >> tests, and with JPRT. 
>> >> Thanks, Harold >> > From harold.seigel at oracle.com Wed May 10 11:57:09 2017 From: harold.seigel at oracle.com (harold seigel) Date: Wed, 10 May 2017 07:57:09 -0400 Subject: RFR 8153646: Move vm/utilities/array.hpp to vm/oops In-Reply-To: References: Message-ID: <22b30dd3-62ef-d26b-f34b-4879790dab49@oracle.com> Thanks David! I'll make those changes before pushing it. Harold On 5/10/2017 12:27 AM, David Holmes wrote: > Hi Harold, > > On 10/05/2017 3:39 AM, harold seigel wrote: >> Hi, >> >> Please review this JDK-10 change to move hotspot header file array.hpp >> from the vm/utilities directory to the vm/oops directory. This was done >> because after moving typedefs for basic type arrays to >> growableArray.hpp, the only remaining declaration in array.hpp is the >> metaspace class Array. >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8153646/webrev/ > > Looks good - two minor comments: > > Can I ask a favour - for these modified "empty" files: > > src/cpu/aarch64/vm/c1_FpuStackSim_aarch64.cpp > src/cpu/arm/vm/c1_FpuStackSim_arm.cpp > src/cpu/sparc/vm/c1_FpuStackSim_sparc.cpp > > can you delete all of the #include lines, please. > > --- > > src/share/vm/c1/c1_CodeStubs.hpp > > This uses GrowableArray so should #include its header. > > Thanks, > David > > >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8153646 >> >> The fix was tested with JCK Lang and VM tests, the JTreg hotspot, >> java/io, java/lang, java/util and other tests, the co-located NSK tests, >> and with JPRT. 
>> >> Thanks, Harold >> From aph at redhat.com Wed May 10 14:18:42 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 May 2017 15:18:42 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> Message-ID: So you were right. Dammit. :-) This test: @State(Scope.Benchmark) public static class BenchmarkState { static int nn = 99; } @Benchmark public int testMethod(BenchmarkState state) { return state.nn; } Interpreter-only, before my patch: Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 5 92.938 ± 0.870 ns/op After my patch, it's slower: # Run complete. Total time: 00:00:07 Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 5 94.518 ± 0.562 ns/op But if I insert conditional branches around the fences as you suggest, the result is much better: Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 25 83.825 ± 0.161 ns/op New patch at http://cr.openjdk.java.net/~aph/8179954-2/ OK? Andrew. From Derek.White at cavium.com Wed May 10 16:42:37 2017 From: Derek.White at cavium.com (White, Derek) Date: Wed, 10 May 2017 16:42:37 +0000 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> Message-ID: So good news - you made getstatic 5x faster. Bad news - only in the interpreter. So this one goes out to all the static initializers! Actual code review question: In templateTable_aarch64.cpp: Line 2411: We tbz on flags. Line 2423: We extract into flags. Line 2547: We tbz on flags. - Is the volatile bit still in flags? Thanks! 
- Derek -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Wednesday, May 10, 2017 10:19 AM To: White, Derek ; Aleksey Shipilev ; hotspot-dev Source Developers Subject: Re: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent So you were right. Dammit. :-) This test: @State(Scope.Benchmark) public static class BenchmarkState { static int nn = 99; } @Benchmark public int testMethod(BenchmarkState state) { return state.nn; } Interpreter-only, before my patch: Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 5 92.938 ± 0.870 ns/op After my patch, it's slower: # Run complete. Total time: 00:00:07 Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 5 94.518 ± 0.562 ns/op But if I insert conditional branches around the fences as you suggest, the result is much better: Benchmark Mode Cnt Score Error Units MyBenchmark.testMethod avgt 25 83.825 ± 0.161 ns/op New patch at http://cr.openjdk.java.net/~aph/8179954-2/ OK? Andrew. From paul.sandoz at oracle.com Wed May 10 17:25:14 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 10 May 2017 10:25:14 -0700 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods In-Reply-To: <590CD221.4080100@oracle.com> References: <590CD221.4080100@oracle.com> Message-ID: <43110285-7C52-4DC5-B213-61DC1468119F@oracle.com> Hi, Looks good. Some minor comments. Paul. src/share/vm/classfile/vmSymbols.cpp - A prior bug: in the original code the case statements for the _weakCompareAndSwapLongVolatile were missing, so we need to add: case vmIntrinsics::_weakCompareAndSetLong etc. I think this is mostly benign as it's related to disabling intrinsics. src/share/vm/opto/library_call.cpp - 
2594 // LS_cmp_swap_weak: 2595 // 2596 // boolean weakCompareAndSetObjectPlain( Object o, long offset, Object expected, Object x); 2597 // boolean weakCompareAndSetObjectAcquire(Object o, long offset, Object expected, Object x); 2598 // boolean weakCompareAndSetObjectRelease(Object o, long offset, Object expected, Object x); 2599 // 2600 // boolean weakCompareAndSetIntPlain( Object o, long offset, int expected, int x); 2601 // boolean weakCompareAndSetIntAcquire( Object o, long offset, int expected, int x); 2602 // boolean weakCompareAndSetIntRelease( Object o, long offset, int expected, int x); 2603 // 2604 // boolean weakCompareAndSetLongPlain( Object o, long offset, long expected, long x); 2605 // boolean weakCompareAndSetLongAcquire( Object o, long offset, long expected, long x); 2606 // boolean weakCompareAndSetLongRelease( Object o, long offset, long expected, long x); 2607 // Missing volatile variants in the comment. @@ -4962,7 +4962,7 @@ // See arraycopy_restore_alloc_state() comment // if alloc == NULL we don't have to worry about a tightly coupled allocation so we can emit all needed guards // if saved_jvms != NULL (then alloc != NULL) then we can handle guards and a tightly coupled allocation - // if saved_jvms == NULL and alloc != NULL, we can?t emit any guards + // if saved_jvms == NULL and alloc != NULL, we can???t emit any guards Rogue characters that substituted ?," test/compiler/unsafe/ - In this directory there are templates you can update from which the test source is generated by running a script. 
See: hotspot/test/compiler/unsafe/generate-unsafe-access-tests.sh hotspot/test/compiler/unsafe/X-UnsafeAccessTest.java.template > On 5 May 2017, at 12:27, Ron Pressler wrote: > > Hi, > Please review the following core/hotspot change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8159995 > core webrev: http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-jdk/webrev/ > hotspot webrev: http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-hotspot/webrev/ > > This change is covered by existing tests. > > The following renaming was applied: > > - compareAndExchange*Volatile -> compareAndExchange* > - compareAndSwap* -> compareAndSet* > - weakCompareAndSwap* -> weakCompareAndSet*Plain > - weakCompareAndSwap*Volatile -> weakCompareAndSet* > > At this stage, only method and hotspot intrinsic names were changed; node names were left as-is, and may be handled in a separate issue. > > Ron From aph at redhat.com Wed May 10 20:10:11 2017 From: aph at redhat.com (Andrew Haley) Date: Wed, 10 May 2017 21:10:11 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> Message-ID: On 10/05/17 17:42, White, Derek wrote: > So this one goes out to all the static initializers! > > Actual code review question: > In templateTable_aarch64.cpp: > Line 2411: We tbz on flags. > Line 2423: We extract into flags. > Line 2547: We tbz on flags. > - Is the volatile bit still in flags? Ummm, probably not. Yuck. I'll think on. You can see why I couldn't be bothered to handle the volatile bit. Andrew. 
From ron.pressler at oracle.com Wed May 10 20:57:07 2017 From: ron.pressler at oracle.com (Ron Pressler) Date: Wed, 10 May 2017 23:57:07 +0300 Subject: RFR 10 JDK-8159995: Rename internal Unsafe.compare methods In-Reply-To: <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> References: <590CD221.4080100@oracle.com> <6cfadc10-f64b-c536-0d29-c74c3e81f2b7@oracle.com> Message-ID: <59137EA3.5020403@oracle.com> Thank you. Your comments, and Paul's, have all been addressed in a revised patch (same webrev). Ron On 08/05/2017 08:30, David Holmes wrote: > Hi Ron, > > On 6/05/2017 5:27 AM, Ron Pressler wrote: >> Hi, >> Please review the following core/hotspot change: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8159995 >> core webrev: >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-jdk/webrev/ >> >> >> hotspot webrev: >> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8159995-unsafe-compare-and-swap-to-set-hotspot/webrev/ >> >> >> >> This change is covered by existing tests. >> >> The following renaming was applied: >> >> - compareAndExchange*Volatile -> compareAndExchange* >> - compareAndSwap* -> compareAndSet* > > So to clarify this for others, there was confusion surrounding the use > of "swap" versus "exchange" when both words mean the same thing > effectively, but the "swap" functions return a boolean, while the > "exchange" functions return the old value. So we changed "swap" to > "set" across the APIs - _except_ for the old > /jdk.unsupported/share/classes/sun/misc/Unsafe.java because we can't > change its exported API for compatibility reasons. > > Given any "swap(exp, new)" function can be implemented as > "exchange(exp, new) == exp" I'm not sure why we have two complete sets > of functions all the way through. But I guess that is a different > issue. 
:) > >> - weakCompareAndSwap* -> weakCompareAndSet*Plain >> - weakCompareAndSwap*Volatile -> weakCompareAndSet* >> >> At this stage, only method and hotspot intrinsic names were changed; >> node names were left as-is, and may be handled in a separate issue. > > Overall looks good for libs and hotspot changes. > > One nit I spotted: > > src/java.base/share/classes/java/util/concurrent/atomic/AtomicLong.java > > + * compareAndSwap for longs. While the intrinsic compareAndSetLong > > compareAndSwap should be compareAndSet > > --- > > All hotspot files need their copyright years updated to 2017 (if not > already). > > As there are hotspot changes this must be pushed using JPRT and > "-testset hotspot" (but your sponsor should know that :) ). > > Thanks, > David > >> Ron From mikhailo.seledtsov at oracle.com Wed May 10 23:08:20 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Wed, 10 May 2017 16:08:20 -0700 Subject: RFR(S) : 8180037 : move jdk.test.lib.InMemoryJavaCompiler to a separate package In-Reply-To: <4C45D0B5-C136-4E4D-B146-E458219CB06D@oracle.com> References: <4C45D0B5-C136-4E4D-B146-E458219CB06D@oracle.com> Message-ID: <00b70bc1-faaf-0fed-3e00-774c9019c74e@oracle.com> Hi Igor, Changes look good, Misha On 05/09/2017 09:30 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html >> 41 lines changed: 3 ins; 13 del; 25 mod; > Hi all, > > could you please review this small patch which moves jdk.test.lib.InMemoryJavaCompiler to jdk.test.lib.compiler package and updates the tests? > InMemoryJavaCompiler depends on java.compiler module, so in order to avoid unneeded module dependency it should be moved to a separate package. 
> > webrev: http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8180037 > testing: :hotspot_all > > Thanks, > -- Igor From mikhailo.seledtsov at oracle.com Wed May 10 23:11:49 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Wed, 10 May 2017 16:11:49 -0700 Subject: RFR(XXS) : 8180004: jdk.test.lib.DynamicVMOption should be moved to jdk.test.lib.management In-Reply-To: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: <433618d5-78fc-7281-8220-9b1bf292abe6@oracle.com> Looks good to me, Misha On 05/09/2017 02:05 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html >> 8 lines changed: 1 ins; 0 del; 7 mod; > Hi all, > > could you please review this tiny patch which moves jdk.test.lib.DynamicVMOption class to jdk.test.lib.management package and updates the tests which use it? > j.t.l.DynamicVMOption uses classes from jdk.management module, so having it in common testlibrary package might cause redundant module dependencies. > > webrev: http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html > jbs: https://bugs.openjdk.java.net/browse/JDK-8180004 > testing: :hotspot_all > > Thanks, > -- Igor > From aph at redhat.com Thu May 11 09:02:14 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 11 May 2017 10:02:14 +0100 Subject: Optimizing byte reverse code for int value In-Reply-To: References: <622fa3e77da546dfb5155a1e4afacd7c@sap.com> <362a21f4-277c-c3f3-f7f0-08b55c8b2b0b@redhat.com> <89abbea5-9998-2e4d-62d3-e1f3e9bbd1d5@redhat.com> <2e13a32b56cd4d9f89758f4042602e9a@sap.com> <174bf72968b5473cb3757a4f1c125bf7@sap.com> Message-ID: <4bdec074-3884-497e-ec86-f5a2dab6202f@redhat.com> On 11/05/17 07:46, Michihiro Horie wrote: > Thanks a lot for your helpful comments. I fixed my code. 
> http://cr.openjdk.java.net/~horii/8178294/webrev.06/ > >> @Andrew: Do you think this is the right way to do it and is there a chance > to get it in jdk8u? > Andrew, I would be grateful if you would approve this change for jdk8u. The list of jdk8u reviewers is at http://openjdk.java.net/census#jdk8u. You'll want someone who is on the HotSpot team. I have mixed feelings about this patch. It seems too specific to me: if you had something that would work with any integer type it would be more useful, I feel. And - generally speaking - the rule is that patches go into JDK 9 first, but JDK 9 is closed for enhancements. So, I'm sorry for the bad news. Your patch looks interesting and useful but I do not know how to get it committed. Andrew. From aph at redhat.com Thu May 11 12:31:25 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 11 May 2017 13:31:25 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> Message-ID: <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Thanks for finding that bug. It's a reminder that we need to concentrate on doing the minimum at this stage to ensure correctness. Our performance is good, and this change is not hugely profitable. It's not worth risking the whole ship for. Nevertheless, I've made a new webrev, which does the right thing. I stepped through the code to make sure. I've come this far, so I might as well get it right. (Yes, I'm aware that I just fell into the sunk cost fallacy, but I want something to show for the work I've done.) http://cr.openjdk.java.net/~aph/8179954-3/ In JDK 10 we should look at replacing all the explicit fences used for volatiles with LDAR/STLR. OK? I'd like two reviewers for this one. Andrew. 
From stuart.monteith at linaro.org Thu May 11 15:34:13 2017 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Thu, 11 May 2017 16:34:13 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Message-ID: Hi, It looks fine as far as I can tell. Could the comment be explicit about replacing the code sequence: STLR LDR DMB with: STLR DMB LDR as initially, I was thinking about them being on different threads. (although, looking at the thread, it was probably just me that thought that). BR, Stuart On 11 May 2017 at 13:31, Andrew Haley wrote: > Thanks for finding that bug. It's a reminder that we need to > concentrate on doing the minimum at this stage to ensure correctness. > Our performance is good, and this change is not hugely profitable. It's > not worth risking the whole ship for. > > Nevertheless, I've made a new webrev, which does the right thing. I > stepped through the code to make sure. I've come this far, so I might > as well get it right. (Yes, I'm aware that I just fell into the sunk > cost fallacy, but I want something to show for the work I've done.) > > http://cr.openjdk.java.net/~aph/8179954-3/ > > In JDK 10 we should look at replacing all the explicit fences used for > volatiles with LDAR/STLR. > > OK? I'd like two reviewers for this one. > > Andrew. 
From Derek.White at cavium.com Thu May 11 15:58:32 2017 From: Derek.White at cavium.com (White, Derek) Date: Thu, 11 May 2017 15:58:32 +0000 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Message-ID: Hi Andrew, I understand your point on correctness. But thanks for fighting through this one. Code looks good. - Derek -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Thursday, May 11, 2017 8:31 AM To: White, Derek ; Aleksey Shipilev ; hotspot-dev Source Developers Subject: Re: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent Thanks for finding that bug. It's a reminder that we need to concentrate on doing the minimum at this stage to ensure correctness. Our performance is good, and this change is not hugely profitable. It's not worth risking the whole ship for. Nevertheless, I've made a new webrev, which does the right thing. I stepped through the code to make sure. I've come this far, so I might as well get it right. (Yes, I'm aware that I just fell into the sunk cost fallacy, but I want something to show for the work I've done.) http://cr.openjdk.java.net/~aph/8179954-3/ In JDK 10 we should look at replacing all the explicit fences used for volatiles with LDAR/STLR. OK? I'd like two reviewers for this one. Andrew. 
From aph at redhat.com Thu May 11 15:58:59 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 11 May 2017 16:58:59 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Message-ID: <9a39bbad-6c48-b5aa-70b5-964cd50cc8b0@redhat.com> On 11/05/17 16:34, Stuart Monteith wrote: > It looks fine as far as I can tell. Could the comment be explicit > about replacing the code sequence: > > STLR > LDR > DMB > > with: > STLR > DMB > LDR I've put it in the bug report. I think that should do it. Andrew. From rwestrel at redhat.com Thu May 11 16:23:47 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 11 May 2017 18:23:47 +0200 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Message-ID: > http://cr.openjdk.java.net/~aph/8179954-3/ That looks good to me. Roland. 
From adinn at redhat.com Thu May 11 16:27:00 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 11 May 2017 17:27:00 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> Message-ID: <05005c14-8931-0059-069d-6d42187c8a8a@redhat.com> On 11/05/17 13:31, Andrew Haley wrote: > Thanks for finding that bug. It's a reminder that we need to > concentrate on doing the minimum at this stage to ensure correctness. > Our performance is good, and this change is not hugely profitable. It's > not worth risking the whole ship for. > > Nevertheless, I've made a new webrev, which does the right thing. I > stepped through the code to make sure. I've come this far, so I might > as well get it right. (Yes, I'm aware that I just fell into the sunk > cost fallacy, but I want something to show for the work I've done.) > > http://cr.openjdk.java.net/~aph/8179954-3/ > > In JDK 10 we should look at replacing all the explicit fences used for > volatiles with LDAR/STLR. > > OK? I'd like two reviewers for this one. The latest patch also looks good by eyeball. I'm currently checking it builds and runs ok. Will get back soon re that. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From dmitry.fazunenko at oracle.com Thu May 11 16:40:57 2017 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Thu, 11 May 2017 19:40:57 +0300 Subject: RFR(XXS) : 8180183: Confusing javadoc comment to the getOutput(ProcessBuilder processBuilder) method of jdk.test.lib.process.ProcessTools In-Reply-To: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: http://cr.openjdk.java.net/~dfazunen/8180183/webrev.00/ > 1 line changed: 0 ins; 0 del; 1 mod; Hi everyone, a tiny patch is waiting for reviewers. Thanks, Dima http://cr.openjdk.java.net/~dfazunen/8180183/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8180183 From shade at redhat.com Thu May 11 16:42:38 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 11 May 2017 18:42:38 +0200 Subject: RFR(XXS) : 8180183: Confusing javadoc comment to the getOutput(ProcessBuilder processBuilder) method of jdk.test.lib.process.ProcessTools In-Reply-To: References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: On 05/11/2017 06:40 PM, Dmitry Fazunenko wrote: > http://cr.openjdk.java.net/~dfazunen/8180183/webrev.00/ >> 1 line changed: 0 ins; 0 del; 1 mod; Looks good. Thanks, -Aleksey From adinn at redhat.com Thu May 11 16:52:54 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 11 May 2017 17:52:54 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: <05005c14-8931-0059-069d-6d42187c8a8a@redhat.com> References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> <05005c14-8931-0059-069d-6d42187c8a8a@redhat.com> Message-ID: On 11/05/17 17:27, Andrew Dinn wrote: > On 11/05/17 13:31, Andrew Haley wrote: >> OK? 
I'd like two reviewers for this one. > > The latest patch also looks good by eyeball. > > I'm currently checking it builds and runs ok. Will get back soon re that. Yes, it builds for me and my build runs java Hello, javac Hello.java and netbeans ok. I cannot guarantee that I am likely to have tested the actual fix since, as Andrew points out, it needs to have code with different levels of compilation happen to mix in the right way. But this at least suggests that nothing got side-swiped along the way. So, I say ship it :-) regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From dmitry.fazunenko at oracle.com Thu May 11 17:04:18 2017 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko) Date: Thu, 11 May 2017 20:04:18 +0300 Subject: RFR(XXS) : 8180183: Confusing javadoc comment to the getOutput(ProcessBuilder processBuilder) method of jdk.test.lib.process.ProcessTools In-Reply-To: References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: Thanks, Aleksey! On 11.05.2017 19:42, Aleksey Shipilev wrote: > On 05/11/2017 06:40 PM, Dmitry Fazunenko wrote: >> http://cr.openjdk.java.net/~dfazunen/8180183/webrev.00/ >>> 1 line changed: 0 ins; 0 del; 1 mod; > Looks good. 
> > Thanks, > -Aleksey > > From stuart.monteith at linaro.org Thu May 11 18:02:12 2017 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Thu, 11 May 2017 19:02:12 +0100 Subject: RFR: AArch64: 8179954: AArch64: C1 and C2 volatile accesses are not sequentially consistent In-Reply-To: References: <3769fcaf-4bbb-c9d2-6e0f-dceb8947e6ba@redhat.com> <8aacce34-7cdb-2d3c-e19c-906a61b648be@redhat.com> <026f928f-b91e-6aa2-595f-1b051d0b5dda@redhat.com> <15a7e75e-b67d-53d7-7310-a4987caf6d0f@redhat.com> <05005c14-8931-0059-069d-6d42187c8a8a@redhat.com> Message-ID: Looks good to me - a quick "sanity" check on JCStress ran cleanly too. On 11 May 2017 at 17:52, Andrew Dinn wrote: > On 11/05/17 17:27, Andrew Dinn wrote: >> On 11/05/17 13:31, Andrew Haley wrote: >>> OK? I'd like two reviewers for this one. >> >> The latest patch also looks good by eyeball. >> >> I'm currently checking it builds and runs ok. Will get back soon re that. > > Yes, it builds for me and my build runs java Hello, javac Hello.java and > netbeans ok. > > I cannot guarantee that I am likely to have tested the actual fix since, > as Andrew points out, it needs to have code with different levels of > compilation happen to mix in the right way. But this at least suggests > that nothing got side-swiped along the way. > > So, I say ship it :-) > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From vladimir.x.ivanov at oracle.com Thu May 11 18:30:32 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 11 May 2017 21:30:32 +0300 Subject: RFR(XXS) : 8180004: jdk.test.lib.DynamicVMOption should be moved to jdk.test.lib.management In-Reply-To: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: Reviewed. 
Best regards, Vladimir Ivanov On 5/10/17 12:05 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html >> 8 lines changed: 1 ins; 0 del; 7 mod; > > Hi all, > > could you please review this tiny patch which moves jdk.test.lib.DynamicVMOption class to jdk.test.lib.management package and updates the tests which use it? > j.t.l.DynamicVMOption uses classes from jdk.management module, so having it in common testlibrary package might cause redundant module dependencies. > > webrev: http://cr.openjdk.java.net/~iignatyev//8180004/webrev.00/index.html > jbs: https://bugs.openjdk.java.net/browse/JDK-8180004 > testing: :hotspot_all > > Thanks, > -- Igor > From vladimir.x.ivanov at oracle.com Thu May 11 18:31:04 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 11 May 2017 21:31:04 +0300 Subject: RFR(S) : 8180037 : move jdk.test.lib.InMemoryJavaCompiler to a separate package In-Reply-To: <4C45D0B5-C136-4E4D-B146-E458219CB06D@oracle.com> References: <4C45D0B5-C136-4E4D-B146-E458219CB06D@oracle.com> Message-ID: <7e1f06e3-1f6c-e2f3-1198-42c81cd6a8c3@oracle.com> Reviewed. Best regards, Vladimir Ivanov On 5/10/17 7:30 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html >> 41 lines changed: 3 ins; 13 del; 25 mod; > > Hi all, > > could you please review this small patch which moves jdk.test.lib.InMemoryJavaCompiler to jdk.test.lib.compiler package and updates the tests? > InMemoryJavaCompiler depends on java.compiler module, so in order to avoid unneeded module dependency it should be moved to a separate package. 
> > webrev: http://cr.openjdk.java.net/~iignatyev//8180037/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8180037 > testing: :hotspot_all > > Thanks, > -- Igor > From vladimir.x.ivanov at oracle.com Thu May 11 18:31:32 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 11 May 2017 21:31:32 +0300 Subject: RFR(XXS) : 8179930: jdk.test.lib.artifacts.ArtifactResolver::resolve should return Map instead of HashMap In-Reply-To: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> References: <67CAF340-65EA-41FF-B1DC-20A2B4C0BBD3@oracle.com> Message-ID: <3186404a-de94-c377-204d-9ca97bcd20e1@oracle.com> Reviewed. Best regards, Vladimir Ivanov On 5/9/17 8:20 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html >> 8 lines changed: 1 ins; 0 del; 7 mod; > > Hi all, > > could you please review this small patch which changes jdk.test.lib.artifacts.ArtifactResolver::resolve signature to return a Map instead of HasMap and updates the tests accordingly? > I have also changed the argument type from raw Class to Class, so we don't need to have casts in the method. 
> > webrev: http://cr.openjdk.java.net/~iignatyev//8179930/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8179930 > testing: the affected tests (hotspot/test/applications) > > -- Igor > From Derek.White at cavium.com Thu May 11 18:33:18 2017 From: Derek.White at cavium.com (White, Derek) Date: Thu, 11 May 2017 18:33:18 +0000 Subject: Optimizing byte reverse code for int value In-Reply-To: <4bdec074-3884-497e-ec86-f5a2dab6202f@redhat.com> References: <622fa3e77da546dfb5155a1e4afacd7c@sap.com> <362a21f4-277c-c3f3-f7f0-08b55c8b2b0b@redhat.com> <89abbea5-9998-2e4d-62d3-e1f3e9bbd1d5@redhat.com> <2e13a32b56cd4d9f89758f4042602e9a@sap.com> <174bf72968b5473cb3757a4f1c125bf7@sap.com> <4bdec074-3884-497e-ec86-f5a2dab6202f@redhat.com> Message-ID: Hi Michihiro, Not a jdk8u reviewer OR C2 expert, but a possible simplification: I think a tree like: // AndI // /\ // LoadB ConI(255) will get turned into a LoadUBNode, via AndINode::Ideal() and AndINode::Identity(). It certainly should, considering how often this code pattern is used! If so, you should be able to simplify your pattern matching greatly. - Derek -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley Sent: Thursday, May 11, 2017 5:02 AM To: Michihiro Horie ; Doerr, Martin Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; Hiroshi H Horii ; Simonis, Volker Subject: Re: Optimizing byte reverse code for int value On 11/05/17 07:46, Michihiro Horie wrote: > Thanks a lot for your helpful comments. I fixed my code. > http://cr.openjdk.java.net/~horii/8178294/webrev.06/ > >> @Andrew: Do you think this is the right way to do it and is there a >> chance > to get it in jdk8u? > Andrew, I would be grateful if you would approve this change for jdk8u. The list of jdk8u reviewers is at http://openjdk.java.net/census#jdk8u. You'll want someone who is on the HotSpot team. I have mixed feelings about this patch. 
It seems too specific to me: if you had something that would work with any integer type it would be more useful, I feel. And - generally speaking - the rule is that patches go into JDK 9 first, but JDK 9 is closed for enhancements. So, I'm sorry for the bad news. Your patch looks interesting and useful but I do not know how to get it committed. Andrew. From vladimir.kozlov at oracle.com Fri May 12 19:29:30 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 12 May 2017 12:29:30 -0700 Subject: jdk10/hs is CLOSED for pushes today Message-ID: <04b3db28-85c4-04a5-a9bd-5deec3a26d6e@oracle.com> Please, don't push anything into jdk10/hs today. I am helping Jesper to merge jdk10/jdk10 into jdk10/hs. It is very painful due to Jigsaw/JVMCI/Graal changes pulled from jdk 9. I will let you know when I am done. Thanks, Vladimir From serguei.spitsyn at oracle.com Fri May 12 23:29:29 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 12 May 2017 16:29:29 -0700 Subject: RFR(XXS) : 8180183: Confusing javadoc comment to the getOutput(ProcessBuilder processBuilder) method of jdk.test.lib.process.ProcessTools In-Reply-To: References: <6F983C21-9097-4909-98C3-93C225640C93@oracle.com> Message-ID: +1 Thanks, Serguei On 5/11/17 09:42, Aleksey Shipilev wrote: > On 05/11/2017 06:40 PM, Dmitry Fazunenko wrote: >> http://cr.openjdk.java.net/~dfazunen/8180183/webrev.00/ >>> 1 line changed: 0 ins; 0 del; 1 mod; > Looks good. 
> > Thanks, > -Aleksey > > From mikael.vidstedt at oracle.com Mon May 15 18:34:54 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 15 May 2017 11:34:54 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> References: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> Message-ID: <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> New webrevs: full: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ incremental (from webrev.00): http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01.incr/hotspot/webrev/ I definitely want a second reviewer of this, and I'm (still) taking suggestions on tests to run etc.! Also, comments inline below.. > On May 9, 2017, at 5:12 PM, David Holmes wrote: > > Hi Mikael, > > On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: >> >> Warning: It may be wise to stock up on coffee or tea before reading this. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ > > Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. Thanks, I like it too ;) > The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. > > Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. Right, I could go either way here, but when I personally see a u2* I feel tempted to just dereference it, and that's not valid in these cases. With void* it's more obvious that you need to do something else to get the data, but the width/type of the underlying data is lost.
It may make sense to introduce a few helpful typedefs of void* which include in their names the underlying data type, or something along those lines. I suggest wrapping up and pushing what I have and working on improving the story here as a separate change. Reasonable? > A couple of oddities I noticed: > > src/share/vm/classfile/classFileStream.hpp > > Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. Good point, get_u1_buffer can be removed (and it's gone in the new webrev). > > --- > > src/share/vm/classfile/classFileParser.cpp > > Why do we have void* here: > > 1707 const void* const exception_table_start = cfs->get_u1_buffer(); > > but u1* here: > > 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); Good catch. Changed to void*. Cheers, Mikael > > Thanks, > David > >> * Background (from the JBS description) >> >> x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. >> >> Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler to do so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. >> >> We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. 
That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. >> >> This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation: >> >> // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering >> // (no special code is needed since x86 CPUs can access unaligned data) >> >> While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. >> >> I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. >> >> bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here's the "funny" & interesting part: >> >> With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. 
Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. >> >> With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. >> >> To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. >> >> >> >> * About the change >> >> The key changes are in three different areas: >> >> 1. copy.[ch]pp >> >> Introducing: conjoint_swap_if_needed >> >> conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). >> >> 2. classFile{Parser,Stream} >> >> The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. 
While there?s nothing wrong with *having* an unaligned pointer in C - as long as it?s not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. >> >> Changing to void* may or may not be the right thing to do here. In a way I?d actually like to ?carry? the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. >> >> >> 3. bytes_x86.hpp >> >> This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren?t checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: >> >> It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or does a memcpy (if unaligned). There?s some template ?magic? in there avoid duplicating code, but hopefully the magic is relatively straightforward. >> >> >> * Testing >> >> I?ve run some basic testing on this, but I?m very much looking for advice on which tests to run. Let me know if you have any suggestions! 
>> >> * Testing >> >> I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! >> >> Cheers, >> Mikael >> From aph at redhat.com Mon May 15 14:49:44 2017 From: aph at redhat.com (Andrew Haley) Date: Mon, 15 May 2017 15:49:44 +0100 Subject: Where do JDK 9,10 fixes get pushed to? Message-ID: <3cfa045b-7e89-a8f0-36dd-6433472a2124@redhat.com> Looks like nothing is going into jdk9 / hs / hotspot. Everything now goes into jdk9 / dev / hotspot. I guess this is normal during rampdown? But with JDK 10, jdk10 / hs / hotspot is live, and so is jdk10 / jdk10 / hotspot. How does this work? Is there a two-way merge going on? Thanks, Andrew. From jesper.wilhelmsson at oracle.com Tue May 16 12:16:33 2017 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 16 May 2017 14:16:33 +0200 Subject: Where do JDK 9,10 fixes get pushed to? In-Reply-To: <3cfa045b-7e89-a8f0-36dd-6433472a2124@redhat.com> References: <3cfa045b-7e89-a8f0-36dd-6433472a2124@redhat.com> Message-ID: <4FC1DEC3-A531-447A-A047-ACE5EC78DB93@oracle.com> Right, jdk9/hs is closed as we are ramping down and the number of Hotspot changes in JDK 9 is fairly small at this point. jdk10/jdk10 (a.k.a. 10/10) is the main JDK 10 forest. jdk10/hs (10/hs) is a child of 10/10. 10/hs is where JDK 10 Hotspot changes should be pushed. These changes will go through an extra layer of testing before being integrated to 10/10. Currently changes are frequently pulled from 10/10 to 10/hs to keep 10/hs in sync, but so far we have not started to push changes from 10/hs to 10/10. Pushing 10/hs to 10/10 would complicate the forward ports currently being done on a weekly basis from JDK 9 so we are holding off with hs integrations until the frequency of changes in JDK 9 goes down. Hth, /Jesper > On 15 May 2017, at 16:49, Andrew Haley wrote: > > Looks like nothing is going into jdk9 / hs / hotspot. Everything now goes into > jdk9 / dev / hotspot. I guess this is normal during rampdown? > > But with JDK 10, jdk10 / hs / hotspot is live, and so is jdk10 / jdk10 / hotspot. > > How does this work? 
Is there a two-way merge going on? > > Thanks, > > Andrew. From harold.seigel at oracle.com Tue May 16 19:07:16 2017 From: harold.seigel at oracle.com (harold seigel) Date: Tue, 16 May 2017 15:07:16 -0400 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> References: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> Message-ID: Hi Mikael, The changes look good. One minor typo in bytes_sparc.hpp: "nativ byte". It might be good to run the RBT tier2 - tier5 tests on one big endian and one little endian platform. Thanks, Harold On 5/15/2017 2:34 PM, Mikael Vidstedt wrote: > New webrevs: > > full: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > incremental (from webrev.00): http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01.incr/hotspot/webrev/ > > I definitely want a second reviewer of this, and I'm (still) taking suggestions on tests to run etc.! > > Also, comments inline below.. > >> On May 9, 2017, at 5:12 PM, David Holmes wrote: >> >> Hi Mikael, >> >> On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: >>> Warning: It may be wise to stock up on coffee or tea before reading this. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >> Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. > Thanks, I like it too ;) > >> The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. >> >> Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. 
> Right, I could go either way here, but when I personally see a u2* I feel tempted to just dereference it, and that's not valid in these cases. With void* it's more obvious that you need to do something else to get the data, but the width/type of the underlying data is lost. It may make sense to introduce a few helpful typedefs of void* which include in their names the underlying data type, or something along those lines. > > I suggest wrapping up and pushing what I have and working on improving the story here as a separate change. Reasonable? > >> A couple of oddities I noticed: >> >> src/share/vm/classfile/classFileStream.hpp >> >> Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. > Good point, get_u1_buffer can be removed (and it's gone in the new webrev). > >> --- >> >> src/share/vm/classfile/classFileParser.cpp >> >> Why do we have void* here: >> >> 1707 const void* const exception_table_start = cfs->get_u1_buffer(); >> >> but u1* here: >> >> 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); > Good catch. Changed to void*. > > Cheers, > Mikael > >> Thanks, >> David >> >>> * Background (from the JBS description) >>> >>> x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. >>> >>> Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler to do so under the covers. 
Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. >>> >>> We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. >>> >>> This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation: >>> >>> // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering >>> // (no special code is needed since x86 CPUs can access unaligned data) >>> >>> While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. >>> >>> I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. >>> >>> bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. 
Now here's the "funny" & interesting part: >>> >>> With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. >>> >>> With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. >>> >>> To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. >>> >>> >>> >>> * About the change >>> >>> The key changes are in three different areas: >>> >>> 1. copy.[ch]pp >>> >>> Introducing: conjoint_swap_if_needed >>> >>> conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). >>> >>> 2. classFile{Parser,Stream} >>> >>> The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. 
>>> >>> However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. >>> >>> Changing to void* may or may not be the right thing to do here. In a way I'd actually like to "carry" the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. >>> >>> >>> 3. bytes_x86.hpp >>> >>> This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: >>> >>> It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template "magic" in there to avoid duplicating code, but hopefully the magic is relatively straightforward. 
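[Editor's note: the alignment-checking get primitive just described can be sketched roughly as follows. This is an illustrative stand-in, not the actual patch: is_ptr_aligned is approximated here with a modulo check, and the real bytes.hpp code layers endian handling on top.]

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a T from p: direct load when p is suitably aligned, memcpy otherwise.
// The memcpy path performs only byte-sized accesses, so it is always legal;
// an optimizing compiler typically turns it into a single unaligned load.
template <typename T>
static inline T get_native(const void* p) {
  assert(p != nullptr);
  T x;
  if (reinterpret_cast<uintptr_t>(p) % sizeof(T) == 0) {
    x = *static_cast<const T*>(p);  // aligned: raw memory access
  } else {
    memcpy(&x, p, sizeof(T));       // unaligned: alignment-safe copy
  }
  return x;
}
```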
>>> >>> >>> * Testing >>> >>> I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! >>> >>> Cheers, >>> Mikael >>> From vladimir.kozlov at oracle.com Tue May 16 23:11:50 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 May 2017 16:11:50 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> References: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> Message-ID: <6786e0bc-095d-49ea-8416-5d40776a130b@oracle.com> I am concerned about using the libc call memcpy() for copying just a few bytes. Mikael says that he verified that it is intrinsified by most C++ compilers. What about Solaris C++, which is less advanced than gcc? Also, what happens in a fastdebug JVM? Maybe we should also put TODO comments into bytes_aarch64.hpp and bytes_s390.hpp. Or file bugs. Thanks, Vladimir On 5/15/17 11:34 AM, Mikael Vidstedt wrote: > > New webrevs: > > full: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > incremental (from webrev.00): http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01.incr/hotspot/webrev/ > > I definitely want a second reviewer of this, and I'm (still) taking suggestions on tests to run etc.! > > Also, comments inline below.. > >> On May 9, 2017, at 5:12 PM, David Holmes wrote: >> >> Hi Mikael, >> >> On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: >>> >>> Warning: It may be wise to stock up on coffee or tea before reading this. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >> >> Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. 
> > Thanks, I like it too ;) > >> The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. >> >> Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. > > Right, I could go either way here, but when I personally see a u2* I feel tempted to just dereference it, and that's not valid in these cases. With void* it's more obvious that you need to do something else to get the data, but the width/type of the underlying data is lost. It may make sense to introduce a few helpful typedefs of void* which include in their names the underlying data type, or something along those lines. > > I suggest wrapping up and pushing what I have and working on improving the story here as a separate change. Reasonable? > >> A couple of oddities I noticed: >> >> src/share/vm/classfile/classFileStream.hpp >> >> Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. > > Good point, get_u1_buffer can be removed (and it's gone in the new webrev). > >> >> --- >> >> src/share/vm/classfile/classFileParser.cpp >> >> Why do we have void* here: >> >> 1707 const void* const exception_table_start = cfs->get_u1_buffer(); >> >> but u1* here: >> >> 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); > > Good catch. Changed to void*. > > Cheers, > Mikael > >> >> Thanks, >> David >> >>> * Background (from the JBS description) >>> >>> x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. 
Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. >>> >>> Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler from doing so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. >>> >>> We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. >>> >>> This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation: >>> >>> // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering >>> // (no special code is needed since x86 CPUs can access unaligned data) >>> >>> While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. >>> >>> I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. 
It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. >>> >>> bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here's the "funny" & interesting part: >>> >>> With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. >>> >>> With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. >>> >>> To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. >>> >>> >>> >>> * About the change >>> >>> The key changes are in three different areas: >>> >>> 1. copy.[ch]pp >>> >>> Introducing: conjoint_swap_if_needed >>> >>> conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). 
In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). >>> >>> 2. classFile{Parser,Stream} >>> >>> The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. >>> >>> However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. >>> >>> Changing to void* may or may not be the right thing to do here. In a way I'd actually like to "carry" the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. >>> >>> >>> 3. bytes_x86.hpp >>> >>> This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: >>> >>> It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. 
I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template "magic" in there to avoid duplicating code, but hopefully the magic is relatively straightforward. >>> >>> >>> * Testing >>> >>> I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! >>> >>> Cheers, >>> Mikael >>> From cthalinger at twitter.com Tue May 16 23:17:15 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Tue, 16 May 2017 13:17:15 -1000 Subject: RFR (S): 8180453: mx eclipseinit doesn't pick up generated sources Message-ID: <6CEF4715-E5BB-4CDA-98CF-27777FE41D8B@twitter.com> https://bugs.openjdk.java.net/browse/JDK-8180453 The fix is trivial and does not affect production code. diff --git a/.mx.jvmci/mx_jvmci.py b/.mx.jvmci/mx_jvmci.py index b87bab7..37a9baf 100644 --- a/.mx.jvmci/mx_jvmci.py +++ b/.mx.jvmci/mx_jvmci.py @@ -303,9 +303,9 @@ class HotSpotProject(mx.NativeProject): out.close('link') out.open('link') - out.element('name', data='generated') + out.element('name', data='gensrc') out.element('type', data='2') - generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'generated') + generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'gensrc') out.element('locationURI', data=mx.get_eclipse_project_rel_locationURI(generated, eclProjectDir)) out.close('link') @@ -620,18 +620,12 @@ _jvmci_bootclasspath_prepends = [] def _get_hotspot_build_dir(jvmVariant=None, debugLevel=None): """ Gets the directory in which a particular HotSpot configuration is built - (e.g., /build/macosx-x86_64-normal-server-release/hotspot/bsd_amd64_compiler2) + (e.g., /build/macosx-x86_64-normal-server-release/hotspot/variant-) """ if jvmVariant is None: jvmVariant = 
_vm.jvmVariant - os = mx.get_os() - if os == 'darwin': - os = 'bsd' - arch = mx.get_arch() - buildname = {'client': 'compiler1', 'server': 'compiler2'}.get(jvmVariant, jvmVariant) - - name = '{}_{}_{}'.format(os, arch, buildname) + name = 'variant-{}'.format(jvmVariant) return join(_get_jdk_build_dir(debugLevel=debugLevel), 'hotspot', name) class JVMCI9JDKConfig(mx.JDKConfig): From vladimir.kozlov at oracle.com Tue May 16 23:29:12 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 May 2017 16:29:12 -0700 Subject: RFR (S): 8180453: mx eclipseinit doesn't pick up generated sources In-Reply-To: <6CEF4715-E5BB-4CDA-98CF-27777FE41D8B@twitter.com> References: <6CEF4715-E5BB-4CDA-98CF-27777FE41D8B@twitter.com> Message-ID: <11bd1bb6-a3aa-230f-36c8-10be51f74d79@oracle.com> JDK 9 or JDK 10? Changes are fine but you need approval for JDK 9. Thanks, Vladimir On 5/16/17 4:17 PM, Christian Thalinger wrote: > https://bugs.openjdk.java.net/browse/JDK-8180453 > > The fix is trivial and does not affect production code. 
> > diff --git a/.mx.jvmci/mx_jvmci.py b/.mx.jvmci/mx_jvmci.py > index b87bab7..37a9baf 100644 > --- a/.mx.jvmci/mx_jvmci.py > +++ b/.mx.jvmci/mx_jvmci.py > @@ -303,9 +303,9 @@ class HotSpotProject(mx.NativeProject): > out.close('link') > > out.open('link') > - out.element('name', data='generated') > + out.element('name', data='gensrc') > out.element('type', data='2') > - generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'generated') > + generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'gensrc') > out.element('locationURI', data=mx.get_eclipse_project_rel_locationURI(generated, eclProjectDir)) > out.close('link') > > @@ -620,18 +620,12 @@ _jvmci_bootclasspath_prepends = [] > def _get_hotspot_build_dir(jvmVariant=None, debugLevel=None): > """ > Gets the directory in which a particular HotSpot configuration is built > - (e.g., /build/macosx-x86_64-normal-server-release/hotspot/bsd_amd64_compiler2) > + (e.g., /build/macosx-x86_64-normal-server-release/hotspot/variant-) > """ > if jvmVariant is None: > jvmVariant = _vm.jvmVariant > > - os = mx.get_os() > - if os == 'darwin': > - os = 'bsd' > - arch = mx.get_arch() > - buildname = {'client': 'compiler1', 'server': 'compiler2'}.get(jvmVariant, jvmVariant) > - > - name = '{}_{}_{}'.format(os, arch, buildname) > + name = 'variant-{}'.format(jvmVariant) > return join(_get_jdk_build_dir(debugLevel=debugLevel), 'hotspot', name) > > class JVMCI9JDKConfig(mx.JDKConfig): > From cthalinger at twitter.com Tue May 16 23:32:51 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Tue, 16 May 2017 13:32:51 -1000 Subject: RFR (S): 8180453: mx eclipseinit doesn't pick up generated sources In-Reply-To: <11bd1bb6-a3aa-230f-36c8-10be51f74d79@oracle.com> References: <6CEF4715-E5BB-4CDA-98CF-27777FE41D8B@twitter.com> <11bd1bb6-a3aa-230f-36c8-10be51f74d79@oracle.com> Message-ID: > On May 16, 2017, at 1:29 PM, Vladimir Kozlov wrote: > > JDK 9 or JDK 10? Both. 
> > Changes are fine but you need approval for JDK 9. How? Bunch of labels, I guess? :-) > > Thanks, > Vladimir > > On 5/16/17 4:17 PM, Christian Thalinger wrote: >> https://bugs.openjdk.java.net/browse/JDK-8180453 >> >> The fix is trivial and does not affect production code. >> >> diff --git a/.mx.jvmci/mx_jvmci.py b/.mx.jvmci/mx_jvmci.py >> index b87bab7..37a9baf 100644 >> --- a/.mx.jvmci/mx_jvmci.py >> +++ b/.mx.jvmci/mx_jvmci.py >> @@ -303,9 +303,9 @@ class HotSpotProject(mx.NativeProject): >> out.close('link') >> >> out.open('link') >> - out.element('name', data='generated') >> + out.element('name', data='gensrc') >> out.element('type', data='2') >> - generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'generated') >> + generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'gensrc') >> out.element('locationURI', data=mx.get_eclipse_project_rel_locationURI(generated, eclProjectDir)) >> out.close('link') >> >> @@ -620,18 +620,12 @@ _jvmci_bootclasspath_prepends = [] >> def _get_hotspot_build_dir(jvmVariant=None, debugLevel=None): >> """ >> Gets the directory in which a particular HotSpot configuration is built >> - (e.g., /build/macosx-x86_64-normal-server-release/hotspot/bsd_amd64_compiler2) >> + (e.g., /build/macosx-x86_64-normal-server-release/hotspot/variant-) >> """ >> if jvmVariant is None: >> jvmVariant = _vm.jvmVariant >> >> - os = mx.get_os() >> - if os == 'darwin': >> - os = 'bsd' >> - arch = mx.get_arch() >> - buildname = {'client': 'compiler1', 'server': 'compiler2'}.get(jvmVariant, jvmVariant) >> - >> - name = '{}_{}_{}'.format(os, arch, buildname) >> + name = 'variant-{}'.format(jvmVariant) >> return join(_get_jdk_build_dir(debugLevel=debugLevel), 'hotspot', name) >> >> class JVMCI9JDKConfig(mx.JDKConfig): >> From cthalinger at twitter.com Tue May 16 23:38:41 2017 From: cthalinger at twitter.com (Christian Thalinger) Date: Tue, 16 May 2017 13:38:41 -1000 Subject: RFR (S): 8180453: mx eclipseinit doesn't pick up generated 
sources In-Reply-To: References: <6CEF4715-E5BB-4CDA-98CF-27777FE41D8B@twitter.com> <11bd1bb6-a3aa-230f-36c8-10be51f74d79@oracle.com> Message-ID: > On May 16, 2017, at 1:32 PM, Christian Thalinger wrote: > > >> On May 16, 2017, at 1:29 PM, Vladimir Kozlov wrote: >> >> JDK 9 or JDK 10? > > Both. > >> >> Changes are fine but you need approval for JDK 9. > > How? Bunch of labels, I guess? :-) Found it: http://openjdk.java.net/projects/jdk9/fix-request-process > >> >> Thanks, >> Vladimir >> >> On 5/16/17 4:17 PM, Christian Thalinger wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8180453 >>> >>> The fix is trivial and does not affect production code. >>> >>> diff --git a/.mx.jvmci/mx_jvmci.py b/.mx.jvmci/mx_jvmci.py >>> index b87bab7..37a9baf 100644 >>> --- a/.mx.jvmci/mx_jvmci.py >>> +++ b/.mx.jvmci/mx_jvmci.py >>> @@ -303,9 +303,9 @@ class HotSpotProject(mx.NativeProject): >>> out.close('link') >>> >>> out.open('link') >>> - out.element('name', data='generated') >>> + out.element('name', data='gensrc') >>> out.element('type', data='2') >>> - generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'generated') >>> + generated = join(_get_hotspot_build_dir(jvmVariant, debugLevel), 'gensrc') >>> out.element('locationURI', data=mx.get_eclipse_project_rel_locationURI(generated, eclProjectDir)) >>> out.close('link') >>> >>> @@ -620,18 +620,12 @@ _jvmci_bootclasspath_prepends = [] >>> def _get_hotspot_build_dir(jvmVariant=None, debugLevel=None): >>> """ >>> Gets the directory in which a particular HotSpot configuration is built >>> - (e.g., /build/macosx-x86_64-normal-server-release/hotspot/bsd_amd64_compiler2) >>> + (e.g., /build/macosx-x86_64-normal-server-release/hotspot/variant-) >>> """ >>> if jvmVariant is None: >>> jvmVariant = _vm.jvmVariant >>> >>> - os = mx.get_os() >>> - if os == 'darwin': >>> - os = 'bsd' >>> - arch = mx.get_arch() >>> - buildname = {'client': 'compiler1', 'server': 'compiler2'}.get(jvmVariant, jvmVariant) >>> - >>> - name = 
'{}_{}_{}'.format(os, arch, buildname) >>> + name = 'variant-{}'.format(jvmVariant) >>> return join(_get_jdk_build_dir(debugLevel=debugLevel), 'hotspot', name) >>> >>> class JVMCI9JDKConfig(mx.JDKConfig): >>> > From kim.barrett at oracle.com Wed May 17 01:46:41 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 16 May 2017 21:46:41 -0400 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: Message-ID: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> > On May 9, 2017, at 6:40 PM, Mikael Vidstedt wrote: > > > Warning: It may be wise to stock up on coffee or tea before reading this. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 > Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ Not a review, just a question. ------------------------------------------------------------------------------ src/cpu/x86/vm/bytes_x86.hpp

 40 template <typename T>
 41 static inline T get_native(const void* p) {
 42   assert(p != NULL, "null pointer");
 43
 44   T x;
 45
 46   if (is_ptr_aligned(p, sizeof(T))) {
 47     x = *(T*)p;
 48   } else {
 49     memcpy(&x, p, sizeof(T));
 50   }
 51
 52   return x;

I'm looking at this and wondering if there's a good reason to not just unconditionally use memcpy here. gcc -O will generate a single move instruction for that on x86_64. I'm not sure what happens on 32-bit with an 8-byte value, but I suspect it will do something similarly sensible, e.g. two 4-byte memory-to-memory transfers. 
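[Editor's note: the unconditional-memcpy variant Kim asks about can be sketched as follows. This is an illustration, not proposed HotSpot code; the "single move" claim is Kim's observation about gcc -O on x86_64, not something the sketch itself guarantees.]

```cpp
#include <cstdint>
#include <cstring>

// Load/store a T purely through memcpy: correct for any pointer alignment,
// since memcpy has no alignment requirement on its arguments.
template <typename T>
static inline T load_any(const void* p) {
  T x;
  memcpy(&x, p, sizeof(T));
  return x;
}

template <typename T>
static inline void store_any(void* p, T x) {
  memcpy(p, &x, sizeof(T));
}
```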
------------------------------------------------------------------------------ From david.holmes at oracle.com Wed May 17 03:55:18 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 May 2017 13:55:18 +1000 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> References: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> Message-ID: On 16/05/2017 4:34 AM, Mikael Vidstedt wrote: > > New webrevs: > > full: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > incremental (from webrev.00): http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01.incr/hotspot/webrev/ > > I definitely want a second reviewer of this, and I'm (still) taking suggestions on tests to run etc.! > > Also, comments inline below.. Response inline below.. >> On May 9, 2017, at 5:12 PM, David Holmes wrote: >> >> Hi Mikael, >> >> On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: >>> >>> Warning: It may be wise to stock up on coffee or tea before reading this. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >> >> Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. > > Thanks, I like it too ;) > >> The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. >> >> Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. > > Right, I could go either way here, but when I personally see a u2* I feel tempted to just dereference it, and that's not valid in these cases. 
With void* it's more obvious that you need to do something else to get the data, but the width/type of the underlying data is lost. It may make sense to introduce a few helpful typedefs of void* which include in their names the underlying data type, or something along those lines. That seems quite reasonable. I agree de-referencing a void* is not something that would be done without thinking. > I suggest wrapping up and pushing what I have and working on improving the story here as a separate change. Reasonable? Yes. >> A couple of oddities I noticed: >> >> src/share/vm/classfile/classFileStream.hpp >> >> Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. > > Good point, get_u1_buffer can be removed (and it's gone in the new webrev). Looks good. Thanks, David >> >> --- >> >> src/share/vm/classfile/classFileParser.cpp >> >> Why do we have void* here: >> >> 1707 const void* const exception_table_start = cfs->get_u1_buffer(); >> >> but u1* here: >> >> 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); > > Good catch. Changed to void*. > > Cheers, > Mikael > >> >> Thanks, >> David >> >>> * Background (from the JBS description) >>> >>> x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. >>> >>> Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler from doing so under the covers. 
Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. >>> >>> We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. >>> >>> This time the problem is with ClassFileParser. Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation: >>> >>> // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering >>> // (no special code is needed since x86 CPUs can access unaligned data) >>> >>> While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. >>> >>> I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. >>> >>> bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. 
Now here's the "funny" & interesting part: >>> >>> With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. >>> >>> With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. >>> >>> To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. >>> >>> >>> >>> * About the change >>> >>> The key changes are in three different areas: >>> >>> 1. copy.[ch]pp >>> >>> Introducing: conjoint_swap_if_needed >>> >>> conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). >>> >>> 2. classFile{Parser,Stream} >>> >>> The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. 
>>> >>> However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. >>> >>> Changing to void* may or may not be the right thing to do here. In a way I'd actually like to "carry" the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. >>> >>> >>> 3. bytes_x86.hpp >>> >>> This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: >>> >>> It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template "magic" in there to avoid duplicating code, but hopefully the magic is relatively straightforward. 
>>> >>> >>> * Testing >>> >>> I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! >>> >>> Cheers, >>> Mikael >>> > From shade at redhat.com Wed May 17 11:02:40 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 17 May 2017 13:02:40 +0200 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table Message-ID: https://bugs.openjdk.java.net/browse/JDK-8180482 Current table is garbled because VM ops names are too long, headers are not accounted in format specifiers, etc. Patch: http://cr.openjdk.java.net/~shade/8180482/webrev.01/ Before/after: http://cr.openjdk.java.net/~shade/8180482/before.txt http://cr.openjdk.java.net/~shade/8180482/after.txt Thanks, -Aleksey From david.holmes at oracle.com Wed May 17 12:19:54 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 May 2017 22:19:54 +1000 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table In-Reply-To: References: Message-ID: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> Hi Aleksey, On 17/05/2017 9:02 PM, Aleksey Shipilev wrote: > https://bugs.openjdk.java.net/browse/JDK-8180482 > > Current table is garbled because VM ops names are too long, headers are not > accounted in format specifiers, etc. > > Patch: > http://cr.openjdk.java.net/~shade/8180482/webrev.01/ Can't say I like all the magic numbers. Is this: tty->print("[ %5s %7s %7s %7s %7s %7s ] ", "time:", "spin", "block", "sync", "cleanup", "vmop"); really worth the effort versus: // widest column name needs 7 chars so space accordingly tty->print("[ time: spin block sync cleanup vmop ]"); ? There's nothing to link the width specifiers in the header with those used in the print function. 
David > Before/after: > http://cr.openjdk.java.net/~shade/8180482/before.txt > http://cr.openjdk.java.net/~shade/8180482/after.txt > > Thanks, > -Aleksey > From shade at redhat.com Wed May 17 13:24:21 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 17 May 2017 15:24:21 +0200 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table In-Reply-To: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> References: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> Message-ID: Hi David, On 05/17/2017 02:19 PM, David Holmes wrote: > On 17/05/2017 9:02 PM, Aleksey Shipilev wrote: >> Patch: >> http://cr.openjdk.java.net/~shade/8180482/webrev.01/ > > Can't say I like all the magic numbers. Is this: > > tty->print("[ %5s %7s %7s %7s %7s %7s ] ", > "time:", "spin", "block", "sync", "cleanup", "vmop"); > > really worth the effort versus: > > // widest column name needs 7 chars so space accordingly > tty->print("[ time: spin block sync cleanup vmop ]"); > > ? Agreed, it was easier to handle with width specifiers when tidying up the code. The final version can just use the appropriate number of spaces: http://cr.openjdk.java.net/~shade/8180482/webrev.02/ Thanks, -Aleksey From coleen.phillimore at oracle.com Wed May 17 16:01:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 May 2017 12:01:19 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table Message-ID: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> Summary: Add a Java type called ResolvedMethodName which is immutable and can be stored in a hashtable, that is weakly collected by gc Thanks to John for his help with MemberName, and to-be-filed RFEs for further improvements. Thanks to Stefan for GC help. 
open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8174749 Tested with RBT nightly, compiler/jsr292 tests (included in rbt nightly), JPRT, jdk/test/java/lang/invoke, jdk/test/java/lang/instrument tests. There are platform dependent changes in this change. They are very straightforward, i.e. add an indirection to MemberName invocations, but could people with access to these platforms test this out for me? Performance testing showed no regression, and a large 1000% improvement for the cases that caused us to back out previous attempts at this change. Thanks, Coleen From ekaterina.pavlova at oracle.com Wed May 17 20:50:25 2017 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Wed, 17 May 2017 13:50:25 -0700 Subject: RFR(XS) 8180324: failed JVMCI junit test NativeCallTest.java Message-ID: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> Hi, Please review this small change that fixes compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java test. The test missed the 'native' declaration and as a result failed with UnsatisfiedLinkError when '-nativepath' flag was not passed to jtreg. Now the test will fail with "Use -nativepath to specify the location of native code" error message which is more understandable. Fixed also runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java and serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java tests which had the same issue. bug: https://bugs.openjdk.java.net/browse/JDK-8180324 webrev: http://cr.openjdk.java.net/~epavlova//8180324/webrev.00/ Tested by running jprt plus manual testing of fixed tests. thanks, -katya p.s. Igor Ignatyev volunteered to sponsor this change. 
From igor.ignatyev at oracle.com Wed May 17 21:07:46 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 17 May 2017 14:07:46 -0700 Subject: RFR(XS) 8180324: failed JVMCI junit test NativeCallTest.java In-Reply-To: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> References: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> Message-ID: <8BBE7C6F-5149-48DC-A0BF-AAFF45687828@oracle.com> Hi Katya, the fix looks good to me. -- Igor > On May 17, 2017, at 1:50 PM, Ekaterina Pavlova wrote: > > Hi, > > Please review this small change that fixes compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java test. > The test missed 'native' declaration and as result failed with UnsatisfiedLinkError when '-nativepath' flag was not passed to > jtreg. Now the test will fail with "Use -nativepath to specify the location of native code" error message which is more > understandable. > > Fixed also runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java and > serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java tests which had the same issue. > > bug: https://bugs.openjdk.java.net/browse/JDK-8180324 > webrev: http://cr.openjdk.java.net/~epavlova//8180324/webrev.00/ > > Tested by running jprt plus manual testing of fixed tests. > > thanks, > -katya > > p.s. > Igor Ignatyev volunteered to sponsor this change. From david.holmes at oracle.com Wed May 17 21:09:29 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 May 2017 07:09:29 +1000 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table In-Reply-To: References: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> Message-ID: <6b6a0845-ce86-113d-cfb3-3603aa27f45b@oracle.com> On 17/05/2017 11:24 PM, Aleksey Shipilev wrote: > Hi David, > > On 05/17/2017 02:19 PM, David Holmes wrote: >> On 17/05/2017 9:02 PM, Aleksey Shipilev wrote: >>> Patch: >>> http://cr.openjdk.java.net/~shade/8180482/webrev.01/ >> >> Can't say I like all the magic numbers. 
Is this: >> >> tty->print("[ %5s %7s %7s %7s %7s %7s ] ", >> "time:", "spin", "block", "sync", "cleanup", "vmop"); >> >> really worth the effort versus: >> >> // widest column name needs 7 chars so space accordingly >> tty->print("[ time: spin block sync cleanup vmop ]"); >> >> ? > > Agreed, it was easier to handle with width specifiers when tidying up the code. > The final version can just use the appropriate number of spaces: > http://cr.openjdk.java.net/~shade/8180482/webrev.02/ Looks much simpler :) This is also overkill for adding spaces: 1269 tty->print("[ %5s " but I'd take it as-is as well. Don't forget to update copyright year before pushing. I think this qualifies as a trivial change (only one Reviewer needed). Thanks, David > Thanks, > -Aleksey > From shade at redhat.com Wed May 17 21:20:07 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 17 May 2017 23:20:07 +0200 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table In-Reply-To: <6b6a0845-ce86-113d-cfb3-3603aa27f45b@oracle.com> References: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> <6b6a0845-ce86-113d-cfb3-3603aa27f45b@oracle.com> Message-ID: On 05/17/2017 11:09 PM, David Holmes wrote: > This is also overkill for adding spaces: > > 1269 tty->print("[ %5s " Oh yeah, fixed. > but I'd take it as-is as well. Don't forget to update copyright year before > pushing. Right. > I think this qualifies as a trivial change (only one Reviewer needed). I don't think I can push myself, JPRT and all that? 
Please sponsor: http://cr.openjdk.java.net/~shade/8180482/8180482.changeset -Aleksey From david.holmes at oracle.com Wed May 17 21:28:00 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 May 2017 07:28:00 +1000 Subject: RFR (S) 8180482: Reformat -XX:+PrintSafepointStatistics table In-Reply-To: References: <097a9f85-4f39-ac0f-5ed6-80153a7f5ef7@oracle.com> <6b6a0845-ce86-113d-cfb3-3603aa27f45b@oracle.com> Message-ID: <2b033274-3d35-6d91-838e-302a66bda868@oracle.com> On 18/05/2017 7:20 AM, Aleksey Shipilev wrote: > On 05/17/2017 11:09 PM, David Holmes wrote: >> This is also overkill for adding spaces: >> >> 1269 tty->print("[ %5s " > > Oh yeah, fixed. > >> but I'd take it as-is as well. Don't forget to update copyright year before >> pushing. > > Right. > >> I think this qualifies as a trivial change (only one Reviewer needed). > > I don't think I can push myself, JPRT and all that? Oops - forgot :) > Please sponsor: > http://cr.openjdk.java.net/~shade/8180482/8180482.changeset Will do. David > -Aleksey > > > From mikhailo.seledtsov at oracle.com Wed May 17 21:40:05 2017 From: mikhailo.seledtsov at oracle.com (mikhailo) Date: Wed, 17 May 2017 14:40:05 -0700 Subject: RFR(XS) 8180324: failed JVMCI junit test NativeCallTest.java In-Reply-To: <8BBE7C6F-5149-48DC-A0BF-AAFF45687828@oracle.com> References: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> <8BBE7C6F-5149-48DC-A0BF-AAFF45687828@oracle.com> Message-ID: Looks good, Misha On 05/17/2017 02:07 PM, Igor Ignatyev wrote: > Hi Katya, > > the fix looks good to me. > > -- Igor >> On May 17, 2017, at 1:50 PM, Ekaterina Pavlova wrote: >> >> Hi, >> >> Please review this small change that fixes compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java test. >> The test missed 'native' declaration and as result failed with UnsatisfiedLinkError when '-nativepath' flag was not passed to >> jtreg. 
Now the test will fail with "Use -nativepath to specify the location of native code" error message which is more >> understandable. >> >> Fixed also runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java and >> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java tests which had the same issue. >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8180324 >> webrev: http://cr.openjdk.java.net/~epavlova//8180324/webrev.00/ >> >> Tested by running jprt plus manual testing of fixed tests. >> >> thanks, >> -katya >> >> p.s. >> Igor Ignatyev volunteered to sponsor this change. From vladimir.kozlov at oracle.com Wed May 17 23:34:10 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 17 May 2017 16:34:10 -0700 Subject: RFR(XS) 8180324: failed JVMCI junit test NativeCallTest.java In-Reply-To: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> References: <027637b7-83e8-e9ea-7373-327ade20363a@oracle.com> Message-ID: <1b0efdee-9a5a-7dd6-b3ad-44dffa7a3bd4@oracle.com> Looks good. Thank you for fixing it. Vladimir On 5/17/17 1:50 PM, Ekaterina Pavlova wrote: > Hi, > > Please review this small change that fixes compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java test. > The test missed 'native' declaration and as result failed with UnsatisfiedLinkError when '-nativepath' flag was not passed to > jtreg. Now the test will fail with "Use -nativepath to specify the location of native code" error message which is more > understandable. > > Fixed also runtime/noClassDefFoundMsg/NoClassDefFoundMsg.java and > serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java tests which had the same issue. > > bug: https://bugs.openjdk.java.net/browse/JDK-8180324 > webrev: http://cr.openjdk.java.net/~epavlova//8180324/webrev.00/ > > Tested by running jprt plus manual testing of fixed tests. > > thanks, > -katya > > p.s. > Igor Ignatyev volunteered to sponsor this change. 
> From david.holmes at oracle.com Thu May 18 06:25:59 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 May 2017 16:25:59 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems Message-ID: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 webrevs: Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ First a big thank you to Thomas Stuefe for testing various versions of this on AIX. This is primarily a refactoring and cleanup exercise (ie lots of deleted duplicated code!). I have taken the PlatformEvent, PlatformParker and Parker::* code, out of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX and perhaps one day Solaris (more on that later). The Linux code was the most functionally complete, dealing with correct use of CLOCK_MONOTONIC for relative timed waits, and the default wall-clock for absolute timed waits. That functionality is not, unfortunately, supported by all our POSIX platforms so there are some configure time build checks to set some #defines, and then some dynamic lookup at runtime**. We allow for the runtime environment to be less capable than the build environment, but not the other way around (without build time support we don't know the runtime types needed to make library calls). ** There is some duplication of dynamic lookup code on Linux but this can be cleaned up in future work if we refactor the time/clock code into os_posix as well. 
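The relative-timed-wait side of this follows the usual POSIX pattern: read CLOCK_MONOTONIC and build an absolute timespec deadline, which a condition variable initialized via pthread_condattr_setclock(CLOCK_MONOTONIC) can then wait against. A minimal standalone sketch, assuming the platform defines CLOCK_MONOTONIC; the function name is invented for illustration and this is not the os_posix.cpp code:

```cpp
#include <cassert>
#include <time.h>

// Build an absolute deadline 'millis' milliseconds from now against
// CLOCK_MONOTONIC, normalizing tv_nsec into [0, 1e9). Waiting on such a
// deadline is immune to wall-clock adjustments, which is the point of
// preferring the monotonic clock for relative timeouts.
static struct timespec deadline_from_millis(long millis) {
  struct timespec now;
  clock_gettime(CLOCK_MONOTONIC, &now);
  struct timespec abstime;
  abstime.tv_sec = now.tv_sec + millis / 1000;
  long nanos = now.tv_nsec + (millis % 1000) * 1000000L;
  if (nanos >= 1000000000L) {  // carry the overflow into seconds
    abstime.tv_sec += 1;
    nanos -= 1000000000L;
  }
  abstime.tv_nsec = nanos;
  return abstime;
}
```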
The cleanup covers a number of things: - removal of linux anachronisms that got "ported" into the other platforms - eg EINTR can not be returned from the wait methods - removal of solaris anachronisms that got ported into the linux code and then on to other platforms - eg ETIMEDOUT is what we expect never ETIME - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() from the Parker methods - consolidation of unpackTime and compute_abstime into one utility function - use statics for things completely private to the implementation rather than making them part of the os* API (eg access to condAttr objects) - clean up commentary and style within methods of the same class - clean up coding style in places eg not using Names that start with capitals. I have not tried to clean up every single oddity, nor tried to reconcile differences between the very similar in places PlatformEvent and Park methods. For example PlatformEvent still examines the FilterSpuriousWakeups** flag, and Parker still ignores it. ** Perhaps a candidate for deprecation and future removal. There is one mini "enhancement" slipped in this. I now explicitly initialize mutexes with a mutexAttr object with its type set to PTHREAD_MUTEX_NORMAL, instead of relying on the definition of PTHREAD_MUTEX_DEFAULT. On FreeBSD the default is not "normal" but "error checking" and so is slow. On all other current platforms there is no effective change. Finally, Solaris is excluded from all this (other than the debug signal blocking cleanup) because it potentially supports three different low-level sync subsystems: UI thr*, Pthread, and direct LWP sync. Solaris cleanup would be a separate RFE. No doubt I've overlooked mentioning something that someone will spot. 
:) Thanks, David From magnus.ihse.bursie at oracle.com Thu May 18 07:32:27 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 18 May 2017 09:32:27 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> On 2017-05-18 08:25, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 > > webrevs: > > Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ Build changes look good. /Magnus > hotspot: > http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ > > First a big thank you to Thomas Stuefe for testing various versions of > this on AIX. > > This is primarily a refactoring and cleanup exercise (ie lots of > deleted duplicated code!). > > I have taken the PlatformEvent, PlatformParker and Parker::* code, out > of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX > and perhaps one day Solaris (more on that later). > > The Linux code was the most functionally complete, dealing with > correct use of CLOCK_MONOTONIC for relative timed waits, and the > default wall-clock for absolute timed waits. That functionality is > not, unfortunately, supported by all our POSIX platforms so there are > some configure time build checks to set some #defines, and then some > dynamic lookup at runtime**. We allow for the runtime environment to > be less capable than the build environment, but not the other way > around (without build time support we don't know the runtime types > needed to make library calls). > > ** There is some duplication of dynamic lookup code on Linux but this > can be cleaned up in future work if we refactor the time/clock code > into os_posix as well. 
> > The cleanup covers a number of things: > - removal of linux anachronisms that got "ported" into the other > platforms > - eg EINTR can not be returned from the wait methods > - removal of solaris anachronisms that got ported into the linux code > and then on to other platforms > - eg ETIMEDOUT is what we expect never ETIME > - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() > from the Parker methods > - consolidation of unpackTime and compute_abstime into one utility > function > - use statics for things completely private to the implementation > rather than making them part of the os* API (eg access to condAttr > objects) > - cleanup up commentary and style within methods of the same class > - clean up coding style in places eg not using Names that start with > capitals. > > I have not tried to cleanup every single oddity, nor tried to > reconcile differences between the very similar in places PlatformEvent > and Park methods. For example PlatformEvent still examines the > FilterSpuriousWakeups** flag, and Parker still ignores it. > > ** Perhaps a candidate for deprecation and future removal. > > There is one mini "enhancement" slipped in this. I now explicitly > initialize mutexes with a mutexAttr object with its type set to > PTHREAD_MUTEX_NORMAL, instead of relying on the definition of > PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but > "error checking" and so is slow. On all other current platforms there > is no effective change. > > Finally, Solaris is excluded from all this (other than the debug > signal blocking cleanup) because it potentially supports three > different low-level sync subsystems: UI thr*, Pthread, and direct LWP > sync. Solaris cleanup would be a separate RFE. > > No doubt I've overlooked mentioning something that someone will spot. 
:) > > Thanks, > David From david.holmes at oracle.com Thu May 18 07:35:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 May 2017 17:35:09 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: > On 2017-05-18 08:25, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >> >> webrevs: >> >> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ > > Build changes look good. Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging prints outs - do you want me to remove them? I suppose they may be useful if something goes wrong on some platform. David > /Magnus > >> hotspot: >> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >> >> First a big thank you to Thomas Stuefe for testing various versions of >> this on AIX. >> >> This is primarily a refactoring and cleanup exercise (ie lots of >> deleted duplicated code!). >> >> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >> and perhaps one day Solaris (more on that later). >> >> The Linux code was the most functionally complete, dealing with >> correct use of CLOCK_MONOTONIC for relative timed waits, and the >> default wall-clock for absolute timed waits. That functionality is >> not, unfortunately, supported by all our POSIX platforms so there are >> some configure time build checks to set some #defines, and then some >> dynamic lookup at runtime**. We allow for the runtime environment to >> be less capable than the build environment, but not the other way >> around (without build time support we don't know the runtime types >> needed to make library calls). 
>> >> ** There is some duplication of dynamic lookup code on Linux but this >> can be cleaned up in future work if we refactor the time/clock code >> into os_posix as well. >> >> The cleanup covers a number of things: >> - removal of linux anachronisms that got "ported" into the other >> platforms >> - eg EINTR can not be returned from the wait methods >> - removal of solaris anachronisms that got ported into the linux code >> and then on to other platforms >> - eg ETIMEDOUT is what we expect never ETIME >> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >> from the Parker methods >> - consolidation of unpackTime and compute_abstime into one utility >> function >> - use statics for things completely private to the implementation >> rather than making them part of the os* API (eg access to condAttr >> objects) >> - cleanup up commentary and style within methods of the same class >> - clean up coding style in places eg not using Names that start with >> capitals. >> >> I have not tried to cleanup every single oddity, nor tried to >> reconcile differences between the very similar in places PlatformEvent >> and Park methods. For example PlatformEvent still examines the >> FilterSpuriousWakeups** flag, and Parker still ignores it. >> >> ** Perhaps a candidate for deprecation and future removal. >> >> There is one mini "enhancement" slipped in this. I now explicitly >> initialize mutexes with a mutexAttr object with its type set to >> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >> "error checking" and so is slow. On all other current platforms there >> is no effective change. >> >> Finally, Solaris is excluded from all this (other than the debug >> signal blocking cleanup) because it potentially supports three >> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >> sync. Solaris cleanup would be a separate RFE. 
>> >> No doubt I've overlooked mentioning something that someone will spot. :) >> >> Thanks, >> David > From tobias.hartmann at oracle.com Thu May 18 07:47:57 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 09:47:57 +0200 Subject: [10] RFR(XS): 8180587: Assert in layout_helper_log2_element_size(jint) compares bits instead of bytes Message-ID: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8180587 http://cr.openjdk.java.net/~thartmann/8180587/webrev.00/ While working on value types, I wondered that the "l2esz <= LogBitsPerLong" assert in layout_helper_log2_element_size() did not fire although the array element size of a value type is greater than the size of a long. The problem is that the assert compares the log2 element size which is in bytes (see initialization in Klass::array_layout_helper()) to LogBitsPerLong which obviously is in bits. Thanks, Tobias From robbin.ehn at oracle.com Thu May 18 09:59:42 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 18 May 2017 11:59:42 +0200 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> Message-ID: Hi, On 05/17/2017 03:46 AM, Kim Barrett wrote: >> On May 9, 2017, at 6:40 PM, Mikael Vidstedt wrote: >> >> >> Warning: It may be wise to stock up on coffee or tea before reading this. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ > > Not a review, just a question. 
> > ------------------------------------------------------------------------------ > src/cpu/x86/vm/bytes_x86.hpp > 40 template <typename T> > 41 static inline T get_native(const void* p) { > 42 assert(p != NULL, "null pointer"); > 43 > 44 T x; > 45 > 46 if (is_ptr_aligned(p, sizeof(T))) { > 47 x = *(T*)p; > 48 } else { > 49 memcpy(&x, p, sizeof(T)); > 50 } > 51 > 52 return x; > > I'm looking at this and wondering if there's a good reason to not just > unconditionally use memcpy here. gcc -O will generate a single move > instruction for that on x86_64. I'm not sure what happens on 32bit > with an 8 byte value, but I suspect it will do something similarly > sensible, e.g. 2 4 byte memory to memory transfers. Unconditionally memcpy would be nice! Are you going to look into that, Mikael? /Robbin > > ------------------------------------------------------------------------------ > From magnus.ihse.bursie at oracle.com Thu May 18 10:06:58 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Thu, 18 May 2017 12:06:58 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: On 2017-05-18 09:35, David Holmes wrote: > On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >> On 2017-05-18 08:25, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>> >>> webrevs: >>> >>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >> >> Build changes look good. > > Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging > prints outs - do you want me to remove them? I suppose they may be > useful if something goes wrong on some platform. I didn't even notice them. :-/ It's a bit unfortunate we don't have a debug level on the logging from configure. :-( Otherwise they would have clearly belonged there. 
The AC_MSG_NOTICE messages stand out a lot from the rest of the configure log, so maybe it's better that you remove them. The logic itself is very simple: if the -D flags are missing then we can surely tell what happened. So yes, please remove them. Alternatively, rewrite them as CHECKING/RESULT, if you want to keep the logging. That way they match the rest of the configure log better (and also describe what you're doing). Just check if AC_SEARCH_LIBS prints some output (likely so, I think), then you can't do it in the middle of a CHECKING/RESULT pair, but have to do the CHECKING part after AC_SEARCH_LIBS. /Magnus > > David > >> /Magnus >> >>> hotspot: >>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>> >>> First a big thank you to Thomas Stuefe for testing various versions of >>> this on AIX. >>> >>> This is primarily a refactoring and cleanup exercise (ie lots of >>> deleted duplicated code!). >>> >>> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >>> and perhaps one day Solaris (more on that later). >>> >>> The Linux code was the most functionally complete, dealing with >>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>> default wall-clock for absolute timed waits. That functionality is >>> not, unfortunately, supported by all our POSIX platforms so there are >>> some configure time build checks to set some #defines, and then some >>> dynamic lookup at runtime**. We allow for the runtime environment to >>> be less capable than the build environment, but not the other way >>> around (without build time support we don't know the runtime types >>> needed to make library calls). 
>>> >>> The cleanup covers a number of things: >>> - removal of linux anachronisms that got "ported" into the other >>> platforms >>> - eg EINTR can not be returned from the wait methods >>> - removal of solaris anachronisms that got ported into the linux code >>> and then on to other platforms >>> - eg ETIMEDOUT is what we expect never ETIME >>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>> from the Parker methods >>> - consolidation of unpackTime and compute_abstime into one utility >>> function >>> - use statics for things completely private to the implementation >>> rather than making them part of the os* API (eg access to condAttr >>> objects) >>> - cleanup up commentary and style within methods of the same class >>> - clean up coding style in places eg not using Names that start with >>> capitals. >>> >>> I have not tried to cleanup every single oddity, nor tried to >>> reconcile differences between the very similar in places PlatformEvent >>> and Park methods. For example PlatformEvent still examines the >>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>> >>> ** Perhaps a candidate for deprecation and future removal. >>> >>> There is one mini "enhancement" slipped in this. I now explicitly >>> initialize mutexes with a mutexAttr object with its type set to >>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>> "error checking" and so is slow. On all other current platforms there >>> is no effective change. >>> >>> Finally, Solaris is excluded from all this (other than the debug >>> signal blocking cleanup) because it potentially supports three >>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>> sync. Solaris cleanup would be a separate RFE. >>> >>> No doubt I've overlooked mentioning something that someone will >>> spot. 
:) >>> >>> Thanks, >>> David >> From coleen.phillimore at oracle.com Thu May 18 12:52:33 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 18 May 2017 08:52:33 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <3e673a09-594b-86dd-b430-3574af214d83@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3e673a09-594b-86dd-b430-3574af214d83@oracle.com> Message-ID: <8090a8b8-0943-3b86-53da-59c6c7f3999d@oracle.com> Hi Serguei, Thank you for reviewing this. I should have called you out to do it :) On 5/18/17 4:10 AM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > > Nice refactoring! Thank you! > > Some quick comments. > > > http://oklahoma.us.oracle.com/~cphillim/webrev/8174749.01/webrev/src/share/vm/classfile/javaClasses.hpp.udiff.html > > The function name ResolvedMethodName (and the > j.l.i_ResolvedMethodName) sounds confusing. > Would it better name it ResolvedMethod (by its meaning it is > MemberNameResolvedMethod)? John and I liked that name. When I called it ResolvedMethod, in the vm code it didn't look like an oop. Also, we're envisioning it to replace MemberName in some places like LambdaForms and StackWalk frames, so I wanted to keep the Name in the name. > > > http://oklahoma.us.oracle.com/~cphillim/webrev/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.hpp.html > 85 // Called from MethodHandles > 86 static oop find_method(Method* vmtarget); > 87 static oop add_method(Handle mem_name_target); > > Do you mean, called from the MethodHandles.cpp ? > I do not see these functions are ever called from the MethodHandles.cpp. > But they are called from the javaClasses.cpp. Thanks for noticing that. I'd moved the code. 
Updated the comment like: // Called from java_lang_invoke_ResolvedMethodName static oop find_method(Method* vmtarget); static oop add_method(Handle mem_name_target); > > > http://oklahoma.us.oracle.com/~cphillim/webrev/8174749.01/webrev/src/share/vm/prims/methodHandles.cpp.udiff.html > void MethodHandles::expand_MemberName(Handle mname, int suppress, TRAPS) { > assert(java_lang_invoke_MemberName::is_instance(mname()), ""); > - Metadata* vmtarget = java_lang_invoke_MemberName::vmtarget(mname()); > int vmindex = java_lang_invoke_MemberName::vmindex(mname()); > - if (vmtarget == NULL) { > + bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); > + > + assert(have_defc, "defining class should be present"); > + > + if (!have_defc) { > THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand"); > } > > - bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); > bool have_name = (java_lang_invoke_MemberName::name(mname()) != NULL); > bool have_type = (java_lang_invoke_MemberName::type(mname()) != NULL); > int flags = java_lang_invoke_MemberName::flags(mname()); > > if (suppress != 0) { > - if (suppress & _suppress_defc) have_defc = true; > if (suppress & _suppress_name) have_name = true; > if (suppress & _suppress_type) have_type = true; > } > > It seems, the assert and THROW_MSG duplicate each other. > Also, the _suppress_defc is not used anymore. > Should it be removed from the enum in the methodHandles.hpp, > or there is a plan to use it later? There is no plan, I missed that flag. I had the assert to verify that the clazz is always initialized properly when the MemberName needs expansion. > > > http://oklahoma.us.oracle.com/~cphillim/webrev/8174749.01/webrev/src/share/vm/prims/jvmtiRedefineClasses.cpp.udiff.html > > + _any_class_has_resolved_methods |= the_class->has_resolved_methods(); > Should we use '||=' instead of '|=' ? Yes, fixed. 
Thanks, Coleen > > > Thanks, > Serguei > > > > > > On 5/17/17 09:01, coleen.phillimore at oracle.com wrote: >> Summary: Add a Java type called ResolvedMethodName which is immutable >> and can be stored in a hashtable, that is weakly collected by gc >> >> Thanks to John for his help with MemberName, and to-be-filed RFEs for >> further improvements. Thanks to Stefan for GC help. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >> >> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >> nightly), JPRT, jdk/test/java/lang/invoke, >> jdk/test/java/lang/instrument tests. >> >> There are platform dependent changes in this change. They are very >> straightforward, ie. add an indirection to MemberName invocations, >> but could people with access to these platforms test this out for me? >> >> Performance testing showed no regression, and large 1000% improvement >> for the cases that caused us to backout previous attempts at this >> change. >> >> Thanks, >> Coleen >> > From thomas.stuefe at gmail.com Thu May 18 14:40:38 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 18 May 2017 16:40:38 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: Hi David, Magnus, compiles and works fine on AIX, but as mentioned before off-list to David I see this stdout: configure: No CLOCK_GETTIME_IN_LIBRT configure: No CLOCK_GETTIME_IN_LIBRT Also, the -DSUPPORTS_CLOCK_MONOTONIC appears twice on the command line. 
Full compile command looks like this: /bin/xlC_r -q64 -qpic -D_REENTRANT -D__STDC_FORMAT_MACROS -DSUPPORTS_CLOCK_MONOTONIC -DSUPPORTS_CLOCK_MONOTONIC -DAIX -qtune=balanced -qalias=noansi -qstrict -qtls=default -qlanglvl=c99vla -qlanglvl=noredefmac -qnortti -qnoeh -qignerrno -qarch=ppc64 -DASSERT -DTARGET_ARCH_ppc -DINCLUDE_SUFFIX_OS=_aix -DINCLUDE_SUFFIX_CPU=_ppc -DTARGET_COMPILER_xlc -DPPC64 -DHOTSPOT_LIB_ARCH='"ppc64"' -DCOMPILER1 -DCOMPILER2 -DINCLUDE_JVMCI=0 -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/aix/vm -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/cpu/ppc/vm -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os_cpu/aix_ppc/vm -I/priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/gensrc -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/precompiled -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/prims -DDONT_USE_PRECOMPILED_HEADER -g -qsuppress=1540-0216 -qsuppress=1540-0198 -qsuppress=1540-1090 -qsuppress=1540-1639 -qsuppress=1540-1088 -qsuppress=1500-010 -O3 -qhot=level=1 -qinline -qinlglue -DTHIS_FILE='"os_posix.cpp"' -c -qmakedep=gcc -MF /priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.d -o /priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.o /priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm/os_posix.cpp -DSUPPORTS_CLOCK_MONOTONIC is the only switch appearing twice. I'm baffled. Do you have any idea? 
Regards, Thomas On Thu, May 18, 2017 at 12:06 PM, Magnus Ihse Bursie < magnus.ihse.bursie at oracle.com> wrote: > > > On 2017-05-18 09:35, David Holmes wrote: > >> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >> >>> On 2017-05-18 08:25, David Holmes wrote: >>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>> >>>> webrevs: >>>> >>>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>> >>> >>> Build changes look good. >>> >> >> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >> prints outs - do you want me to remove them? I suppose they may be useful >> if something goes wrong on some platform. >> > > I didn't even notice them. :-/ > > It's a bit unfortunate we don't have a debug level on the logging from > configure. :-( Otherwise they would have clearly belonged there. > > The AC_MSG_NOTICE messages stands out much from the rest of the configure > log, so maybe it's better that you remove them. The logic itself is very > simple, if the -D flags are missing then we can surely tell what happened. > So yes, please remove them. > > Alternatively, rewrite them as CHECKING/RESULT, if you want to keep the > logging. That way they match better the rest of the configure log (and also > describes what you're doing). Just check if AC_SEARCH_LIBS prints some > output (likely so, I think), then you can't do it in the middle of a > CHECKING/RESULT pair, but have to do the CHECKING part after AC_SEARCH_LIBS. > > /Magnus > > > >> David >> >> /Magnus >>> >>> hotspot: >>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>> >>>> First a big thank you to Thomas Stuefe for testing various versions of >>>> this on AIX. >>>> >>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>> deleted duplicated code!). 
>>>> >>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >>>> and perhaps one day Solaris (more on that later). >>>> >>>> The Linux code was the most functionally complete, dealing with >>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>> default wall-clock for absolute timed waits. That functionality is >>>> not, unfortunately, supported by all our POSIX platforms so there are >>>> some configure time build checks to set some #defines, and then some >>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>> be less capable than the build environment, but not the other way >>>> around (without build time support we don't know the runtime types >>>> needed to make library calls). >>>> >>>> ** There is some duplication of dynamic lookup code on Linux but this >>>> can be cleaned up in future work if we refactor the time/clock code >>>> into os_posix as well. >>>> >>>> The cleanup covers a number of things: >>>> - removal of linux anachronisms that got "ported" into the other >>>> platforms >>>> - eg EINTR can not be returned from the wait methods >>>> - removal of solaris anachronisms that got ported into the linux code >>>> and then on to other platforms >>>> - eg ETIMEDOUT is what we expect never ETIME >>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>> from the Parker methods >>>> - consolidation of unpackTime and compute_abstime into one utility >>>> function >>>> - use statics for things completely private to the implementation >>>> rather than making them part of the os* API (eg access to condAttr >>>> objects) >>>> - cleanup up commentary and style within methods of the same class >>>> - clean up coding style in places eg not using Names that start with >>>> capitals. 
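[Archive editor's note: the "consolidation of unpackTime and compute_abstime into one utility function" mentioned in the cleanup list above is, in spirit, a relative-to-absolute timeout conversion. A minimal sketch of that kind of helper — hypothetical name and signature, not the actual patch code:]

```cpp
#include <cassert>
#include <ctime>

// Convert a relative timeout in milliseconds to an absolute timespec,
// normalizing tv_nsec back into [0, 1e9). This illustrates the role the
// consolidated unpackTime/compute_abstime utility plays; the real patch
// also has to pick CLOCK_MONOTONIC vs wall-clock as described above.
inline void to_abstime(timespec* abst, const timespec& now, long millis) {
  const long NANOS_PER_MILLI = 1000000;
  const long NANOS_PER_SEC   = 1000000000;
  abst->tv_sec = now.tv_sec + millis / 1000;
  long nanos = now.tv_nsec + (millis % 1000) * NANOS_PER_MILLI;
  if (nanos >= NANOS_PER_SEC) {
    abst->tv_sec += 1;
    nanos -= NANOS_PER_SEC;
  }
  abst->tv_nsec = nanos;
}
```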
>>>> >>>> I have not tried to cleanup every single oddity, nor tried to >>>> reconcile differences between the very similar in places PlatformEvent >>>> and Park methods. For example PlatformEvent still examines the >>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>> >>>> ** Perhaps a candidate for deprecation and future removal. >>>> >>>> There is one mini "enhancement" slipped in this. I now explicitly >>>> initialize mutexes with a mutexAttr object with its type set to >>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>> "error checking" and so is slow. On all other current platforms there >>>> is no effective change. >>>> >>>> Finally, Solaris is excluded from all this (other than the debug >>>> signal blocking cleanup) because it potentially supports three >>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>> sync. Solaris cleanup would be a separate RFE. >>>> >>>> No doubt I've overlooked mentioning something that someone will spot. :) >>>> >>>> Thanks, >>>> David >>>> >>> >>> > From zoltan.majo at oracle.com Thu May 18 14:50:49 2017 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Thu, 18 May 2017 16:50:49 +0200 Subject: [10] RFR(XS): 8180587: Assert in layout_helper_log2_element_size(jint) compares bits instead of bytes In-Reply-To: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> References: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> Message-ID: Hi Tobias, the proposed change looks good to me. Thank you! 
Best regards, Zoltan On 05/18/2017 09:47 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8180587 > http://cr.openjdk.java.net/~thartmann/8180587/webrev.00/ > > While working on value types, I wondered that the "l2esz <= LogBitsPerLong" assert in layout_helper_log2_element_size() did not fire although the array element size of a value type is greater than the size of a long. The problem is that the assert compares the log2 element size which is in bytes (see initialization in Klass::array_layout_helper()) to LogBitsPerLong which obviously is in bits. > > Thanks, > Tobias From tobias.hartmann at oracle.com Thu May 18 14:52:30 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 May 2017 16:52:30 +0200 Subject: [10] RFR(XS): 8180587: Assert in layout_helper_log2_element_size(jint) compares bits instead of bytes In-Reply-To: References: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> Message-ID: <589945f2-d036-a396-4a37-1eec1a6b2e32@oracle.com> Hi Zoltan, thanks for the review! Best regards, Tobias On 18.05.2017 16:50, Zolt?n Maj? wrote: > Hi Tobias, > > > the proposed change looks good to me. > > Thank you! > > Best regards, > > > Zoltan > > > On 05/18/2017 09:47 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180587 >> http://cr.openjdk.java.net/~thartmann/8180587/webrev.00/ >> >> While working on value types, I wondered that the "l2esz <= LogBitsPerLong" assert in layout_helper_log2_element_size() did not fire although the array element size of a value type is greater than the size of a long. The problem is that the assert compares the log2 element size which is in bytes (see initialization in Klass::array_layout_helper()) to LogBitsPerLong which obviously is in bits. 
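[Archive editor's note to make the bits-versus-bytes confusion above concrete: with 8-byte longs, the log2 element size limit in bytes is 3, while LogBitsPerLong is 6, so the old assert lets through elements up to 64 bytes. A sketch with the constants restated locally rather than pulled from the HotSpot headers:]

```cpp
#include <cassert>

// log2(8 bytes) and log2(64 bits) for a Java long.
const int LogBytesPerLong = 3;
const int LogBitsPerLong  = 6;

// The assert discussed above applied the bit-based limit to a byte-based
// log2 element size (l2esz), making it far too permissive.
inline bool old_assert_holds(int l2esz) { return l2esz <= LogBitsPerLong; }
inline bool byte_bound_holds(int l2esz) { return l2esz <= LogBytesPerLong; }
```

For example, a hypothetical 32-byte value-type element has l2esz == 5: the old bound passes (5 <= 6) even though the element is larger than a long, while a byte-based bound catches it (5 > 3).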
>> >> Thanks, >> Tobias > From vladimir.kozlov at oracle.com Thu May 18 17:06:57 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 May 2017 10:06:57 -0700 Subject: [10] RFR(XS): 8180587: Assert in layout_helper_log2_element_size(jint) compares bits instead of bytes In-Reply-To: References: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> Message-ID: On 5/18/17 7:50 AM, Zolt?n Maj? wrote: > Hi Tobias, > > the proposed change looks good to me. +1 Vladimir > > Thank you! > > Best regards, > > > Zoltan > > > On 05/18/2017 09:47 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8180587 >> http://cr.openjdk.java.net/~thartmann/8180587/webrev.00/ >> >> While working on value types, I wondered that the "l2esz <= LogBitsPerLong" assert in >> layout_helper_log2_element_size() did not fire although the array element size of a value type is greater than the >> size of a long. The problem is that the assert compares the log2 element size which is in bytes (see initialization in >> Klass::array_layout_helper()) to LogBitsPerLong which obviously is in bits. >> >> Thanks, >> Tobias > From thomas.stuefe at gmail.com Thu May 18 18:39:47 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 18 May 2017 20:39:47 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: Okay, I regenerated the generated-configure.sh and the double definitions for -DSUPPORTS_CLOCK_MONOTONIC disappeared. So the generated-configure.sh you posted was just outdated. However, the debug output still appears twice: configure: No CLOCK_GETTIME_IN_LIBRT AC_MSG_NOTICE([No CLOCK_GETTIME_IN_LIBRT]) expands to two invocations of ac_echo(..). I am out of my depth here, not sure what the background is. 
But as you want to remove the debug output anyway, I think this is not an issue. I will take more time for a full review later. Just adding that I really like this fix, it takes a lot of coding off our (platform specific) backs, which is a good thing! Kind Regards, Thomas On Thu, May 18, 2017 at 4:40 PM, Thomas St?fe wrote: > Hi David, Magnus, > > compiles and works fine on AIX, but as mentioned before off-list to David > I see this stdout: > > configure: No CLOCK_GETTIME_IN_LIBRT > configure: No CLOCK_GETTIME_IN_LIBRT > > Also, the -DSUPPORTS_CLOCK_MONOTONIC appears twice on the command line. > Full compile command looks like this: > > /bin/xlC_r -q64 -qpic -D_REENTRANT -D__STDC_FORMAT_MACROS > -DSUPPORTS_CLOCK_MONOTONIC -DSUPPORTS_CLOCK_MONOTONIC -DAIX -qtune=balanced > -qalias=noansi -qstrict -qtls=default -qlanglvl=c99vla -qlanglvl=noredefmac > -qnortti -qnoeh -qignerrno -qarch=ppc64 -DASSERT -DTARGET_ARCH_ppc > -DINCLUDE_SUFFIX_OS=_aix -DINCLUDE_SUFFIX_CPU=_ppc -DTARGET_COMPILER_xlc > -DPPC64 -DHOTSPOT_LIB_ARCH='"ppc64"' -DCOMPILER1 -DCOMPILER2 > -DINCLUDE_JVMCI=0 -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/aix/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/cpu/ppc/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os_cpu/aix_ppc/vm > -I/priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/gensrc > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/precompiled > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/prims > -DDONT_USE_PRECOMPILED_HEADER -g -qsuppress=1540-0216 -qsuppress=1540-0198 > -qsuppress=1540-1090 -qsuppress=1540-1639 -qsuppress=1540-1088 > -qsuppress=1500-010 -O3 -qhot=level=1 -qinline -qinlglue > -DTHIS_FILE='"os_posix.cpp"' -c -qmakedep=gcc -MF > /priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.d > -o 
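[Archive editor's note: the CHECKING/RESULT rewrite Magnus describes above could look roughly like the following configure fragment. This is a sketch, not the actual patch: `ac_cv_search_clock_gettime` is the cache variable AC_SEARCH_LIBS really sets, but `CLOCK_GETTIME_FOUND` is an illustrative name. As Magnus notes, AC_SEARCH_LIBS prints its own output, so the CHECKING part has to come after it:]

```m4
AC_SEARCH_LIBS([clock_gettime], [rt],
    [CLOCK_GETTIME_FOUND=true], [CLOCK_GETTIME_FOUND=false])
AC_MSG_CHECKING([where clock_gettime was found])
if test "x$ac_cv_search_clock_gettime" = "xnone required"; then
  AC_MSG_RESULT([in libc, no extra library needed])
elif test "x$CLOCK_GETTIME_FOUND" = "xtrue"; then
  AC_MSG_RESULT([$ac_cv_search_clock_gettime])
else
  AC_MSG_RESULT([not found])
fi
```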
/priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.o > /priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm/os_posix.cpp > > -DSUPPORTS_CLOCK_MONOTONIC is the only switch appearing twice. I'm > baffled. Do you have any idea? > > Regards, Thomas > > > On Thu, May 18, 2017 at 12:06 PM, Magnus Ihse Bursie < > magnus.ihse.bursie at oracle.com> wrote: > >> >> >> On 2017-05-18 09:35, David Holmes wrote: >> >>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>> >>>> On 2017-05-18 08:25, David Holmes wrote: >>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>> >>>>> webrevs: >>>>> >>>>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>>> >>>> >>>> Build changes look good. >>>> >>> >>> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >>> prints outs - do you want me to remove them? I suppose they may be useful >>> if something goes wrong on some platform. >>> >> >> I didn't even notice them. :-/ >> >> It's a bit unfortunate we don't have a debug level on the logging from >> configure. :-( Otherwise they would have clearly belonged there. >> >> The AC_MSG_NOTICE messages stands out much from the rest of the configure >> log, so maybe it's better that you remove them. The logic itself is very >> simple, if the -D flags are missing then we can surely tell what happened. >> So yes, please remove them. >> >> Alternatively, rewrite them as CHECKING/RESULT, if you want to keep the >> logging. That way they match better the rest of the configure log (and also >> describes what you're doing). Just check if AC_SEARCH_LIBS prints some >> output (likely so, I think), then you can't do it in the middle of a >> CHECKING/RESULT pair, but have to do the CHECKING part after AC_SEARCH_LIBS. 
>> >> /Magnus >> >> >> >>> David >>> >>> /Magnus >>>> >>>> hotspot: >>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>>> >>>>> First a big thank you to Thomas Stuefe for testing various versions of >>>>> this on AIX. >>>>> >>>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>>> deleted duplicated code!). >>>>> >>>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >>>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >>>>> and perhaps one day Solaris (more on that later). >>>>> >>>>> The Linux code was the most functionally complete, dealing with >>>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>>> default wall-clock for absolute timed waits. That functionality is >>>>> not, unfortunately, supported by all our POSIX platforms so there are >>>>> some configure time build checks to set some #defines, and then some >>>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>>> be less capable than the build environment, but not the other way >>>>> around (without build time support we don't know the runtime types >>>>> needed to make library calls). >>>>> >>>>> ** There is some duplication of dynamic lookup code on Linux but this >>>>> can be cleaned up in future work if we refactor the time/clock code >>>>> into os_posix as well. 
>>>>> >>>>> The cleanup covers a number of things: >>>>> - removal of linux anachronisms that got "ported" into the other >>>>> platforms >>>>> - eg EINTR can not be returned from the wait methods >>>>> - removal of solaris anachronisms that got ported into the linux code >>>>> and then on to other platforms >>>>> - eg ETIMEDOUT is what we expect never ETIME >>>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>>> from the Parker methods >>>>> - consolidation of unpackTime and compute_abstime into one utility >>>>> function >>>>> - use statics for things completely private to the implementation >>>>> rather than making them part of the os* API (eg access to condAttr >>>>> objects) >>>>> - cleanup up commentary and style within methods of the same class >>>>> - clean up coding style in places eg not using Names that start with >>>>> capitals. >>>>> >>>>> I have not tried to cleanup every single oddity, nor tried to >>>>> reconcile differences between the very similar in places PlatformEvent >>>>> and Park methods. For example PlatformEvent still examines the >>>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>>> >>>>> ** Perhaps a candidate for deprecation and future removal. >>>>> >>>>> There is one mini "enhancement" slipped in this. I now explicitly >>>>> initialize mutexes with a mutexAttr object with its type set to >>>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>>> "error checking" and so is slow. On all other current platforms there >>>>> is no effective change. >>>>> >>>>> Finally, Solaris is excluded from all this (other than the debug >>>>> signal blocking cleanup) because it potentially supports three >>>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>>> sync. Solaris cleanup would be a separate RFE. >>>>> >>>>> No doubt I've overlooked mentioning something that someone will spot. 
>>>>> :) >>>>> >>>>> Thanks, >>>>> David >>>>> >>>> >>>> >> > From david.holmes at oracle.com Thu May 18 21:39:29 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 07:39:29 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: Hi Thomas, On 19/05/2017 4:39 AM, Thomas St?fe wrote: > Okay, I regenerated the generated-configure.sh and the double > definitions for -DSUPPORTS_CLOCK_MONOTONIC disappeared. So the > generated-configure.sh you posted was just outdated. Not sure how but as long as it is fixed. :) > However, the debug output still appears twice: configure: No > CLOCK_GETTIME_IN_LIBRT Yes that's a side-effect of the flag helper routine being called twice: once for build platform and once for target. > AC_MSG_NOTICE([No CLOCK_GETTIME_IN_LIBRT]) expands to two invocations of > ac_echo(..). I am out of my depth here, not sure what the background is. > But as you want to remove the debug output anyway, I think this is not > an issue. Right the messages will be gone. > I will take more time for a full review later. Just adding that I really > like this fix, it takes a lot of coding off our (platform specific) > backs, which is a good thing! Thanks for looking at this in so much detail already. Cheers, David > Kind Regards, Thomas > > On Thu, May 18, 2017 at 4:40 PM, Thomas St?fe > wrote: > > Hi David, Magnus, > > compiles and works fine on AIX, but as mentioned before off-list to > David I see this stdout: > > configure: No CLOCK_GETTIME_IN_LIBRT > configure: No CLOCK_GETTIME_IN_LIBRT > > Also, the -DSUPPORTS_CLOCK_MONOTONIC appears twice on the command > line. 
Full compile command looks like this: > > /bin/xlC_r -q64 -qpic -D_REENTRANT -D__STDC_FORMAT_MACROS > -DSUPPORTS_CLOCK_MONOTONIC -DSUPPORTS_CLOCK_MONOTONIC -DAIX > -qtune=balanced -qalias=noansi -qstrict -qtls=default > -qlanglvl=c99vla -qlanglvl=noredefmac -qnortti -qnoeh -qignerrno > -qarch=ppc64 -DASSERT -DTARGET_ARCH_ppc -DINCLUDE_SUFFIX_OS=_aix > -DINCLUDE_SUFFIX_CPU=_ppc -DTARGET_COMPILER_xlc -DPPC64 > -DHOTSPOT_LIB_ARCH='"ppc64"' -DCOMPILER1 -DCOMPILER2 > -DINCLUDE_JVMCI=0 > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/aix/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/cpu/ppc/vm > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os_cpu/aix_ppc/vm -I/priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/gensrc > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/precompiled > -I/priv/d031900/openjdk/jdk10-hs/source/hotspot/src/share/vm/prims > -DDONT_USE_PRECOMPILED_HEADER -g -qsuppress=1540-0216 > -qsuppress=1540-0198 -qsuppress=1540-1090 -qsuppress=1540-1639 > -qsuppress=1540-1088 -qsuppress=1500-010 -O3 -qhot=level=1 -qinline > -qinlglue -DTHIS_FILE='"os_posix.cpp"' -c -qmakedep=gcc -MF > /priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.d > -o > /priv/d031900/openjdk/jdk10-hs/output/hotspot/variant-server/libjvm/objs/os_posix.o > /priv/d031900/openjdk/jdk10-hs/source/hotspot/src/os/posix/vm/os_posix.cpp > > -DSUPPORTS_CLOCK_MONOTONIC is the only switch appearing twice. I'm > baffled. Do you have any idea? 
> > Regards, Thomas > > > On Thu, May 18, 2017 at 12:06 PM, Magnus Ihse Bursie > > wrote: > > > > On 2017-05-18 09:35, David Holmes wrote: > > On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: > > On 2017-05-18 08:25, David Holmes wrote: > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8174231 > > > webrevs: > > Build-related: > http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ > > > > Build changes look good. > > > Thanks Magnus! I just realized I left in the AC_MSG_NOTICE > debugging prints outs - do you want me to remove them? I > suppose they may be useful if something goes wrong on some > platform. > > > I didn't even notice them. :-/ > > It's a bit unfortunate we don't have a debug level on the > logging from configure. :-( Otherwise they would have clearly > belonged there. > > The AC_MSG_NOTICE messages stands out much from the rest of the > configure log, so maybe it's better that you remove them. The > logic itself is very simple, if the -D flags are missing then we > can surely tell what happened. So yes, please remove them. > > Alternatively, rewrite them as CHECKING/RESULT, if you want to > keep the logging. That way they match better the rest of the > configure log (and also describes what you're doing). Just check > if AC_SEARCH_LIBS prints some output (likely so, I think), then > you can't do it in the middle of a CHECKING/RESULT pair, but > have to do the CHECKING part after AC_SEARCH_LIBS. > > /Magnus > > > > David > > /Magnus > > hotspot: > http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ > > > First a big thank you to Thomas Stuefe for testing > various versions of > this on AIX. > > This is primarily a refactoring and cleanup exercise > (ie lots of > deleted duplicated code!). > > I have taken the PlatformEvent, PlatformParker and > Parker::* code, out > of os_linux and moved it into os_posix for use by > Linux, OSX, BSD, AIX > and perhaps one day Solaris (more on that later). 
> > The Linux code was the most functionally complete, > dealing with > correct use of CLOCK_MONOTONIC for relative timed > waits, and the > default wall-clock for absolute timed waits. That > functionality is > not, unfortunately, supported by all our POSIX > platforms so there are > some configure time build checks to set some > #defines, and then some > dynamic lookup at runtime**. We allow for the > runtime environment to > be less capable than the build environment, but not > the other way > around (without build time support we don't know the > runtime types > needed to make library calls). > > ** There is some duplication of dynamic lookup code > on Linux but this > can be cleaned up in future work if we refactor the > time/clock code > into os_posix as well. > > The cleanup covers a number of things: > - removal of linux anachronisms that got "ported" > into the other > platforms > - eg EINTR can not be returned from the wait methods > - removal of solaris anachronisms that got ported > into the linux code > and then on to other platforms > - eg ETIMEDOUT is what we expect never ETIME > - removal of the ancient/obsolete > os::*::allowdebug_blocked_signals() > from the Parker methods > - consolidation of unpackTime and compute_abstime > into one utility > function > - use statics for things completely private to the > implementation > rather than making them part of the os* API (eg > access to condAttr > objects) > - cleanup up commentary and style within methods of > the same class > - clean up coding style in places eg not using Names > that start with > capitals. > > I have not tried to cleanup every single oddity, nor > tried to > reconcile differences between the very similar in > places PlatformEvent > and Park methods. For example PlatformEvent still > examines the > FilterSpuriousWakeups** flag, and Parker still > ignores it. > > ** Perhaps a candidate for deprecation and future > removal. > > There is one mini "enhancement" slipped in this. 
I > now explicitly > initialize mutexes with a mutexAttr object with its > type set to > PTHREAD_MUTEX_NORMAL, instead of relying on the > definition of > PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is > not "normal" but > "error checking" and so is slow. On all other > current platforms there > is no effective change. > > Finally, Solaris is excluded from all this (other > than the debug > signal blocking cleanup) because it potentially > supports three > different low-level sync subsystems: UI thr*, > Pthread, and direct LWP > sync. Solaris cleanup would be a separate RFE. > > No doubt I've overlooked mentioning something that > someone will spot. :) > > Thanks, > David > > > > > From mikael.vidstedt at oracle.com Thu May 18 22:15:55 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 18 May 2017 15:15:55 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> Message-ID: <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> > On May 18, 2017, at 2:59 AM, Robbin Ehn wrote: > > Hi, > > On 05/17/2017 03:46 AM, Kim Barrett wrote: >>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt wrote: >>> >>> >>> Warning: It may be wise to stock up on coffee or tea before reading this. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >> Not a review, just a question. >> ------------------------------------------------------------------------------ >> src/cpu/x86/vm/bytes_x86.hpp >> 40 template >> 41 static inline T get_native(const void* p) { >> 42 assert(p != NULL, "null pointer"); >> 43 >> 44 T x; >> 45 >> 46 if (is_ptr_aligned(p, sizeof(T))) { >> 47 x = *(T*)p; >> 48 } else { >> 49 memcpy(&x, p, sizeof(T)); >> 50 } >> 51 >> 52 return x; >> I'm looking at this and wondering if there's a good reason to not just >> unconditionally use memcpy here. 
gcc -O will generate a single move >> instruction for that on x86_64. I'm not sure what happens on 32bit >> with an 8 byte value, but I suspect it will do something similarly >> sensible, e.g. 2 4 byte memory to memory transfers. > > Unconditionally memcpy would be nice! > > Are you going to look into that, Mikael? It's complicated... We may be able to switch, but there is (maybe) a subtle reason why the alignment check is in there: to avoid word tearing. Think of two threads racing: * thread 1 is writing to the memory location X * thread 2 is reading from the same memory location X Will thread 2 always see a consistent value (either the original value or the fully updated value)? In the unaligned/memcpy case I think we can agree that there's nothing preventing the compiler from doing individual loads/stores of the bytes making up the data. Especially in something like slowdebug that becomes more or less obvious - memcpy most likely isn't intrinsified and is quite likely just copying a byte at a time. Given that the data is, in fact, unaligned, there is really no simple way to prevent word tearing, so I'm pretty sure that we never depend on it - if needed, we're likely to already have some higher level synchronization in place guarding the accesses. And the fact that the other, non-x86 platforms already do individual byte loads/stores when the pointer is unaligned is a further indication that that's the case. However, the aligned case is where stuff gets more interesting. I don't think the C/C++ spec guarantees that accessing a memory location using a pointer of type T will result in code which does a single load/store of size >= sizeof(T), but for all the compilers we *actually* use that's likely to be the case. If it's true that the compilers don't split the memory accesses, that means we won't have word tearing when using the Bytes::get/put methods with *aligned* pointers.
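[Archive editor's note: the bytes_x86.hpp snippet quoted earlier in this thread lost its template parameter list in the archive, and `is_ptr_aligned` is a HotSpot-internal helper. A self-contained sketch of the same aligned-load-or-memcpy pattern, substituting a plain modulus check for `is_ptr_aligned` — an illustration of the reviewed idea, not the exact HotSpot source:]

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Aligned pointers take a single direct load (no tearing below sizeof(T)
// for the compilers considered above); unaligned pointers go through
// memcpy so the compiler never emits a load that assumes alignment.
template <typename T>
inline T get_native(const void* p) {
  assert(p != NULL && "null pointer");
  T x;
  if (reinterpret_cast<uintptr_t>(p) % sizeof(T) == 0) {
    x = *static_cast<const T*>(p);   // aligned: plain load
  } else {
    std::memcpy(&x, p, sizeof(T));   // unaligned: byte-wise copy is safe
  }
  return x;
}
```

Either branch yields the same bytes, so the result matches a plain memcpy regardless of which path is taken.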
If I switch to always using memcpy, there's a risk that it introduces tearing problems where earlier we had none. Two questions come to mind: * For the cases where the get/put methods get used *today*, is that a problem? * What happens if somebody in the *future* decides that put_Java_u4 seems like a great thing to use to write to a Java int field on the Java heap, and a Java thread is racing to read that same data? All that said though, I think this is worth exploring and it may well turn out that word tearing really isn't a problem. Also, I believe there may be opportunities to further clean up this code and perhaps unify it a bit across the various platforms. And *that* said, I think the change as it stands is still an improvement, so I'm leaning towards pushing it and filing an enhancement and following up on it separately. Let me know if you strongly feel that this should be looked into and addressed now and I may reconsider :) Cheers, Mikael > > /Robbin > >> ------------------------------------------------------------------------------ From mikael.vidstedt at oracle.com Thu May 18 22:17:27 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 18 May 2017 15:17:27 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: <42e3f5a6-0fd1-7fe8-c446-d72e86c5d315@oracle.com> <88E1F61C-5297-4948-83C3-25E146674F7A@oracle.com> Message-ID: <8F8DA7E1-07FB-411E-B69F-3030BCCC5A4B@oracle.com> > On May 16, 2017, at 12:07 PM, harold seigel wrote: > > Hi Mikael, > > The changes look good. One minor typo in bytes_sparc.hpp: "nativ byte". I believe you're referring to the removed code, in which case it will go away automatically if/when I push this :) > > It might be good to run the RBT tier2 - tier5 tests on one big endian and one little endian platform. Thanks, I will do that!
Cheers, Mikael > > Thanks, Harold > > > On 5/15/2017 2:34 PM, Mikael Vidstedt wrote: >> New webrevs: >> >> full: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ >> incremental (from webrev.00): http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01.incr/hotspot/webrev/ >> >> I definitely want a second reviewer of this, and I'm (still) taking suggestions on tests to run etc.! >> >> Also, comments inline below.. >> >>> On May 9, 2017, at 5:12 PM, David Holmes wrote: >>> >>> Hi Mikael, >>> >>> On 10/05/2017 8:40 AM, Mikael Vidstedt wrote: >>>> Warning: It may be wise to stock up on coffee or tea before reading this. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>> Overall this looks good to me. I like the refactoring from Bytes to Endian - that simplifies things a lot I think. >> Thanks, I like it too ;) >> >>> The actual "copy/swap" changes and templates I'm not an expert on but I get the gist and they seemed okay. >>> >>> Was a little unsure about all the changes to void* from u2*/u1* in classFileParser.h/cpp - does that just simplify use of the copy/swap code? Though I see some casts to u2* are no longer necessary as well. >> Right, I could go either way here, but when I personally see a u2* I feel tempted to just dereference it, and that's not valid in these cases. With void* it's more obvious that you need to do something else to get the data, but the width/type of the underlying data is lost. It may make sense to introduce a few helpful typedefs of void* which include in their names the underlying data type, or something along those lines. >> >> I suggest wrapping up and pushing what I have and working on improving the story here as a separate change. Reasonable? 
>> >>> A couple of oddities I noticed: >>> >>> src/share/vm/classfile/classFileStream.hpp >>> >>> Without the get_u2_buffer/get_u1_buffer distinction get_u1_buffer seems superfluous and all uses can be replaced by the existing current() accessor. >> Good point, get_u1_buffer can be removed (and it's gone in the new webrev). >> >>> --- >>> >>> src/share/vm/classfile/classFileParser.cpp >>> >>> Why do we have void* here: >>> >>> 1707 const void* const exception_table_start = cfs->get_u1_buffer(); >>> >>> but u1* here: >>> >>> 1845 const u1* const localvariable_table_start = cfs->get_u1_buffer(); >> Good catch. Changed to void*. >> >> Cheers, >> Mikael >> >>> Thanks, >>> David >>> >>>> * Background (from the JBS description) >>>> >>>> x86 is normally very forgiving when it comes to dereferencing unaligned pointers. Even if the pointer isn't aligned on the natural size of the element being accessed, the hardware will do the Right Thing(tm) and nobody gets hurt. However, turns out there are exceptions to this. Specifically, SSE2 introduced the movdqa instruction, which is a 128-bit load/store which *does* require that the pointer is 128-bit aligned. >>>> >>>> Normally this isn't a problem, because after all we don't typically use 128-bit data types in our C/C++ code. However, just because we don't use any such data types explicitly, there's nothing preventing the C compiler from doing so under the covers. Specifically, the C compiler tends to do loop unrolling and vectorization, which can turn pretty much any data access into vectorized SIMD accesses. >>>> >>>> We've actually run into a variation on this exact same problem a while back when upgrading to gcc 4.9.2. That time the problem (as captured in JDK-8141491) was in nio/Bits.c, and it was fixed by moving the copy functionality into hotspot (copy.[ch]pp), making sure the copy logic does the relevant alignment checks etc. >>>> >>>> This time the problem is with ClassFileParser. 
Or more accurately, it's in the methods ClassFileParser makes use of. Specifically, the problem is with the copy_u2_with_conversion method, used to copy various data from the class file and put it in the "native" endian order in memory. It, in turn, uses Bytes::get_Java_u2 to read and potentially byte swap a 16-bit entry from the class file. bytes_x86.hpp has this to say about its implementation: >>>> >>>> // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering >>>> // (no special code is needed since x86 CPUs can access unaligned data) >>>> >>>> While that is /almost/ always true for the x86 architecture in itself, the C standard still expects accesses to be aligned, and the C compiler is free to make use of that expectation to, for example, vectorize operations and make use of the movdqa instruction. >>>> >>>> I noticed this when working on the portola/musl port, and in that environment the bug is tickled immediately. Why is it only a problem with musl? Turns out it sorta isn't. It seems that this isn't an actual problem on any of the current platforms/toolchains we're using, but it's a latent bug which may be triggered at any point. >>>> >>>> bytes_x86.hpp will, in the end, actually use system library functions to do byte swapping. Specifically, on linux_x86 it will come down to bytes_linux_x86.inline.hpp which, on AMD64, uses the system functions/macros swap_u{16,32,64} to do the actual byte swapping. Now here's the "funny" & interesting part: >>>> >>>> With glibc, the swap_u{16,32,64} methods are implemented using inline assembly - in the end it comes down to an inline rotate "rorw" instruction. Since GCC can't see through the inline assembly, it will not realize that there are loop unrolling/vectorization opportunities, and of specific interest to us: the movdqa instruction will not be used. The code will potentially not be as efficient as it could be, but it will be functional. 
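As a conceptual illustration (not the actual hotspot implementation), reading a big-endian "Java order" u2 from class file data with explicit byte loads and shifts looks like the sketch below, and it is exactly this kind of shift-based byte-swap pattern that a compiler can recognize, vectorize, and turn into movdqa accesses:

```cpp
#include <cassert>
#include <cstdint>

// Conceptual sketch of what a Bytes::get_Java_u2-style helper does:
// class file data is big-endian, so on a little-endian host reading a
// u2 means loading two individual bytes and combining them. Using byte
// loads makes no alignment assumption about p.
static inline uint16_t get_Java_u2_sketch(const unsigned char* p) {
  return static_cast<uint16_t>((p[0] << 8) | p[1]);
}
```

A loop over many such reads is precisely where the compiler's vectorizer may kick in, which is how the musl (shift-based) byte-swap macros end up triggering the alignment fault described above.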
>>>> >>>> With musl, the swap methods are instead implemented as normal macros, shifting bits around to achieve the desired effect. GCC recognizes the bit shifting patterns, will realize that it's just byte swapping a bunch of values, will vectorize the loop, and *will* make use of the movdqa instruction. Kaboom. >>>> >>>> To recap: dereferencing unaligned pointers in C/C++ is a no-no, even in cases where you think it should be okay. With the existing compilers and header files we are not currently running into this problem, but even a small change in the byte swap implementation exposes the problem. >>>> >>>> >>>> >>>> * About the change >>>> >>>> The key changes are in three different areas: >>>> >>>> 1. copy.[ch]pp >>>> >>>> Introducing: conjoint_swap_if_needed >>>> >>>> conjoint_swap_if_needed copies data, and byte swaps it on-the-fly if the specified endianness differs from the native/CPU endianness. It does this by either delegating to conjoint_swap (on endian mismatch), or conjoint_copy (on match). In copy.cpp, the changes all boil down to making the innermost do_conjoint_swap method more flexible so that it can be reused for both cases (straight copy as well as copy+swap). >>>> >>>> 2. classFile{Parser,Stream} >>>> >>>> The key change is in classFileParser.cpp, switching to copying data from the class file using the new conjoint_swap_if_needed method, replacing the loop implemented in copy_u2_with_conversion/Bytes::get_Java_u2. >>>> >>>> However, in addition to that change, I noticed that there are a lot of u2* passed around in the code, pointers which are not necessarily 16-bit aligned. While there's nothing wrong with *having* an unaligned pointer in C - as long as it's not dereferenced everything is peachy - it made me uneasy to see it passed around and used the way it is. Specifically, ClassFileStream::get_u2_buffer() could, to the untrained eye, be a bit misleading. 
One could accidentally and incorrectly assume that the returned pointer is, in fact, 16-bit aligned and start dereferencing it directly, where in fact there is no such guarantee. Perhaps even use it as an array and attract the wrath of the C compiler. >>>> >>>> Changing to void* may or may not be the right thing to do here. In a way I'd actually like to "carry" the type information, but in some way still prevent the pointer from being directly dereferenced. Taking suggestions. >>>> >>>> >>>> 3. bytes_x86.hpp >>>> >>>> This is addressing the wider concern that other parts of hotspot may use the same primitives in much the same (potentially broken) way, and in particular the fact that the get/put primitives aren't checking whether the pointer argument is aligned before they dereference it. It may well be that a simple assert or two would do the trick here. That said: >>>> >>>> It turns out that the various platforms all have their own unique ways of implementing bytes.hpp, duplicating some logic which could/should be platform independent. I tried to clean up and unify it all a bit while at it by introducing an Endian helper class in bytes.hpp. The primitives for accessing data in memory now check for alignment and either perform the raw memory access (when the pointer is aligned), or do a memcpy (if unaligned). There's some template "magic" in there to avoid duplicating code, but hopefully the magic is relatively straightforward. >>>> >>>> >>>> * Testing >>>> >>>> I've run some basic testing on this, but I'm very much looking for advice on which tests to run. Let me know if you have any suggestions! 
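The copy-and-swap dispatch described under point 1 above could be sketched roughly as follows. This is a deliberately simplified, hypothetical model of the behavior; the real conjoint_swap_if_needed in copy.[ch]pp differs in signature and implementation:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class Endianness { Little, Big };

// Detect the host byte order at runtime via the representation of a u16.
static Endianness host_endianness() {
  const uint16_t probe = 1;
  unsigned char low_byte;
  std::memcpy(&low_byte, &probe, 1);
  return (low_byte == 1) ? Endianness::Little : Endianness::Big;
}

// Hypothetical sketch of the conjoint_swap_if_needed idea: a straight
// copy when the data's endianness matches the host's, otherwise a copy
// that reverses the bytes of each elem_size-sized element. Names are
// illustrative, not the actual copy.hpp API.
static void conjoint_swap_if_needed_sketch(const void* src, void* dst,
                                           size_t bytes, size_t elem_size,
                                           Endianness data_endianness) {
  const unsigned char* s = static_cast<const unsigned char*>(src);
  unsigned char* d = static_cast<unsigned char*>(dst);
  if (data_endianness == host_endianness()) {
    std::memcpy(d, s, bytes);                    // endian match: plain copy
  } else {
    for (size_t i = 0; i < bytes; i += elem_size) {
      for (size_t j = 0; j < elem_size; j++) {   // reverse bytes per element
        d[i + j] = s[i + elem_size - 1 - j];
      }
    }
  }
}
```

For class file parsing, the data endianness would always be big (Java order), so on little-endian hosts the swapping path is taken and on big-endian hosts it degenerates to a plain copy.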
>>>> >>>> Cheers, >>>> Mikael >>>> > From david.holmes at oracle.com Thu May 18 22:50:25 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 08:50:25 +1000 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> Message-ID: <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> Hi Mikael, On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: > >> On May 18, 2017, at 2:59 AM, Robbin Ehn wrote: >> >> Hi, >> >> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt wrote: >>>> >>>> >>>> Warning: It may be wise to stock up on coffee or tea before reading this. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>> Not a review, just a question. >>> ------------------------------------------------------------------------------ >>> src/cpu/x86/vm/bytes_x86.hpp >>> 40 template >>> 41 static inline T get_native(const void* p) { >>> 42 assert(p != NULL, "null pointer"); >>> 43 >>> 44 T x; >>> 45 >>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>> 47 x = *(T*)p; >>> 48 } else { >>> 49 memcpy(&x, p, sizeof(T)); >>> 50 } >>> 51 >>> 52 return x; >>> I'm looking at this and wondering if there's a good reason to not just >>> unconditionally use memcpy here. gcc -O will generate a single move >>> instruction for that on x86_64. I'm not sure what happens on 32bit >>> with an 8 byte value, but I suspect it will do something similarly >>> sensible, e.g. two 4 byte memory to memory transfers. >> >> Unconditionally memcpy would be nice! >> >> Are you going to look into that Mikael? > > It's complicated… > > We may be able to switch, but there is (maybe) a subtle reason why the alignment check is in there: to avoid word tearing. 
> > Think of two threads racing: > > * thread 1 is writing to the memory location X > * thread 2 is reading from the same memory location X > > Will thread 2 always see a consistent value (either the original value or the fully updated value)? We're talking about internal VM loads and stores, right? For those we need to use an appropriate atomic routine if there are potential races. But we should never be mixing these kinds of accesses with Java level field accesses - that would be very broken. For classFileParser we should have no concurrency issues. David > In the unaligned/memcpy case I think we can agree that there's nothing preventing the compiler from doing individual loads/stores of the bytes making up the data. Especially in something like slowdebug that becomes more or less obvious - memcpy most likely isn't intrinsified and is quite likely just copying a byte at a time. Given that the data is, in fact, unaligned, there is really no simple way to prevent word tearing, so I'm pretty sure that we never depend on it - if needed, we're likely to already have some higher level synchronization in place guarding the accesses. And the fact that the other, non-x86 platforms already do individual byte loads/stores when the pointer is unaligned is a further indication that that's the case. > > However, the aligned case is where stuff gets more interesting. I don't think the C/C++ spec guarantees that accessing a memory location using a pointer of type T will result in code which does a single load/store of size >= sizeof(T), but for all the compilers we *actually* use that's likely to be the case. If it's true that the compilers don't split the memory accesses, that means we won't have word tearing when using the Bytes::get/put methods with *aligned* pointers. > > If I switch to always using memcpy, there's a risk that it introduces tearing problems where earlier we had none. 
Two questions come to mind: > > * For the cases where the get/put methods get used *today*, is that a problem? > * What happens if somebody in the *future* decides that put_Java_u4 seems like a great thing to use to write to a Java int field on the Java heap, and a Java thread is racing to read that same data? > > > All that said though, I think this is worth exploring and it may well turn out that word tearing really isn't a problem. Also, I believe there may be opportunities to further clean up this code and perhaps unify it a bit across the various platforms. > > And *that* said, I think the change as it stands is still an improvement, so I'm leaning towards pushing it and filing an enhancement and following up on it separately. Let me know if you strongly feel that this should be looked into and addressed now and I may reconsider :) > > Cheers, > Mikael > >> >> /Robbin >> >>> ------------------------------------------------------------------------------ > From mikael.vidstedt at oracle.com Thu May 18 23:19:38 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 18 May 2017 16:19:38 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> Message-ID: > On May 18, 2017, at 3:50 PM, David Holmes wrote: > > Hi Mikael, > > On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: >> >>> On May 18, 2017, at 2:59 AM, Robbin Ehn wrote: >>> >>> Hi, >>> >>> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt wrote: >>>>> >>>>> >>>>> Warning: It may be wise to stock up on coffee or tea before reading this. 
>>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>>> Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>>> Not a review, just a question. >>>> ------------------------------------------------------------------------------ >>>> src/cpu/x86/vm/bytes_x86.hpp >>>> 40 template >>>> 41 static inline T get_native(const void* p) { >>>> 42 assert(p != NULL, "null pointer"); >>>> 43 >>>> 44 T x; >>>> 45 >>>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>>> 47 x = *(T*)p; >>>> 48 } else { >>>> 49 memcpy(&x, p, sizeof(T)); >>>> 50 } >>>> 51 >>>> 52 return x; >>>> I'm looking at this and wondering if there's a good reason to not just >>>> unconditionally use memcpy here. gcc -O will generate a single move >>>> instruction for that on x86_64. I'm not sure what happens on 32bit >>>> with an 8 byte value, but I suspect it will do something similarly >>>> sensible, e.g. two 4 byte memory to memory transfers. >>> >>> Unconditionally memcpy would be nice! >>> >>> Are you going to look into that Mikael? >> >> It's complicated… >> >> We may be able to switch, but there is (maybe) a subtle reason why the alignment check is in there: to avoid word tearing. >> >> Think of two threads racing: >> >> * thread 1 is writing to the memory location X >> * thread 2 is reading from the same memory location X >> >> Will thread 2 always see a consistent value (either the original value or the fully updated value)? > > We're talking about internal VM loads and stores, right? For those we need to use an appropriate atomic routine if there are potential races. But we should never be mixing these kinds of accesses with Java level field accesses - that would be very broken. That seems reasonable, but for my untrained eye it's not trivially true that relaxing the implementation is correct for all the uses of the get/put primitives. I am therefore a bit reluctant to do so without understanding the implications. > For classFileParser we should have no concurrency issues. 
That seems reasonable. What degree of certainty does your "should" come with? :) Cheers, Mikael > > David > >> In the unaligned/memcpy case I think we can agree that there's nothing preventing the compiler from doing individual loads/stores of the bytes making up the data. Especially in something like slowdebug that becomes more or less obvious - memcpy most likely isn't intrinsified and is quite likely just copying a byte at a time. Given that the data is, in fact, unaligned, there is really no simple way to prevent word tearing, so I'm pretty sure that we never depend on it - if needed, we're likely to already have some higher level synchronization in place guarding the accesses. And the fact that the other, non-x86 platforms already do individual byte loads/stores when the pointer is unaligned is a further indication that that's the case. >> >> However, the aligned case is where stuff gets more interesting. I don't think the C/C++ spec guarantees that accessing a memory location using a pointer of type T will result in code which does a single load/store of size >= sizeof(T), but for all the compilers we *actually* use that's likely to be the case. If it's true that the compilers don't split the memory accesses, that means we won't have word tearing when using the Bytes::get/put methods with *aligned* pointers. >> >> If I switch to always using memcpy, there's a risk that it introduces tearing problems where earlier we had none. Two questions come to mind: >> >> * For the cases where the get/put methods get used *today*, is that a problem? >> * What happens if somebody in the *future* decides that put_Java_u4 seems like a great thing to use to write to a Java int field on the Java heap, and a Java thread is racing to read that same data? >> >> >> All that said though, I think this is worth exploring and it may well turn out that word tearing really isn't a problem. 
Also, I believe there may be opportunities to further clean up this code and perhaps unify it a bit across the various platforms. >> >> And *that* said, I think the change as it stands is still an improvement, so I'm leaning towards pushing it and filing an enhancement and following up on it separately. Let me know if you strongly feel that this should be looked into and addressed now and I may reconsider :) >> >> Cheers, >> Mikael >> >>> >>> /Robbin >>> >>>> ------------------------------------------------------------------------------ From david.holmes at oracle.com Fri May 19 00:18:47 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 10:18:47 +1000 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> Message-ID: <6a0f369e-4d34-02dc-653d-90a8aa19b901@oracle.com> On 19/05/2017 9:19 AM, Mikael Vidstedt wrote: > >> On May 18, 2017, at 3:50 PM, David Holmes > > wrote: >> >> Hi Mikael, >> >> On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: >>> >>>> On May 18, 2017, at 2:59 AM, Robbin Ehn >>> > wrote: >>>> >>>> Hi, >>>> >>>> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt >>>>>> > >>>>>> wrote: >>>>>> >>>>>> >>>>>> Warning: It may be wise to stock up on coffee or tea before >>>>>> reading this. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>>>>> >>>>> Not a review, just a question. 
>>>>> ------------------------------------------------------------------------------ >>>>> src/cpu/x86/vm/bytes_x86.hpp >>>>> 40 template >>>>> 41 static inline T get_native(const void* p) { >>>>> 42 assert(p != NULL, "null pointer"); >>>>> 43 >>>>> 44 T x; >>>>> 45 >>>>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>>>> 47 x = *(T*)p; >>>>> 48 } else { >>>>> 49 memcpy(&x, p, sizeof(T)); >>>>> 50 } >>>>> 51 >>>>> 52 return x; >>>>> I'm looking at this and wondering if there's a good reason to not just >>>>> unconditionally use memcpy here. gcc -O will generate a single move >>>>> instruction for that on x86_64. I'm not sure what happens on 32bit >>>>> with an 8 byte value, but I suspect it will do something similarly >>>>> sensible, e.g. two 4 byte memory to memory transfers. >>>> >>>> Unconditionally memcpy would be nice! >>>> >>>> Are you going to look into that Mikael? >>> >>> It's complicated… >>> >>> We may be able to switch, but there is (maybe) a subtle reason why >>> the alignment check is in there: to avoid word tearing. >>> >>> Think of two threads racing: >>> >>> * thread 1 is writing to the memory location X >>> * thread 2 is reading from the same memory location X >>> >>> Will thread 2 always see a consistent value (either the original >>> value or the fully updated value)? >> >> We're talking about internal VM loads and stores, right? For those we >> need to use an appropriate atomic routine if there are potential races. >> But we should never be mixing these kinds of accesses with Java level >> field accesses - that would be very broken. > > That seems reasonable, but for my untrained eye it's not trivially true > that relaxing the implementation is correct for all the uses of the > get/put primitives. I am therefore a bit reluctant to do so without > understanding the implications. If a Copy routine doesn't have Atomic in its name then I don't expect atomicity. Even then unaligned accesses are not atomic even in the Atomic routine! 
But I'm not clear exactly how all these routines get used. >> For classFileParser we should have no concurrency issues. > > That seems reasonable. What degree of certainty does your "should" come > with? :) Pretty high. We're parsing a stream of bytes and writing values into local structures that will eventually be passed across to a klass instance, which in turn will eventually be published via the SD as a loaded class. The actual parsing phase is purely single-threaded. David > Cheers, > Mikael > >> >> David >> >>> In the unaligned/memcpy case I think we can agree that there's >>> nothing preventing the compiler from doing individual loads/stores of >>> the bytes making up the data. Especially in something like slowdebug >>> that becomes more or less obvious - memcpy most likely isn't >>> intrinsified and is quite likely just copying a byte at a time. Given >>> that the data is, in fact, unaligned, there is really no simple way >>> to prevent word tearing, so I'm pretty sure that we never depend on >>> it - if needed, we're likely to already have some higher level >>> synchronization in place guarding the accesses. And the fact that the >>> other, non-x86 platforms already do individual byte loads/stores when >>> the pointer is unaligned is a further indication that >>> that's the case. >>> >>> However, the aligned case is where stuff gets more interesting. I >>> don't think the C/C++ spec guarantees that accessing a memory >>> location using a pointer of type T will result in code which does a >>> single load/store of size >= sizeof(T), but for all the compilers we >>> *actually* use that's likely to be the case. If it's true that the >>> compilers don't split the memory accesses, that means we won't have >>> word tearing when using the Bytes::get/put methods with *aligned* >>> pointers. >>> >>> If I switch to always using memcpy, there's a risk that it introduces >>> tearing problems where earlier we had none. 
Two questions come to mind: >>> >>> * For the cases where the get/put methods get used *today*, is that a >>> problem? >>> * What happens if somebody in the *future* decides that put_Java_u4 >>> seems like a great thing to use to write to a Java int field on the >>> Java heap, and a Java thread is racing to read that same data? >>> >>> >>> All that said though, I think this is worth exploring and it may well >>> turn out that word tearing really isn't a problem. Also, I believe >>> there may be opportunities to further clean up this code and perhaps >>> unify it a bit across the various platforms. >>> >>> And *that* said, I think the change as it stands is still an >>> improvement, so I'm leaning towards pushing it and filing an >>> enhancement and following up on it separately. Let me know if you >>> strongly feel that this should be looked into and addressed now and I >>> may reconsider :) >>> >>> Cheers, >>> Mikael >>> >>>> >>>> /Robbin >>>> >>>>> ------------------------------------------------------------------------------ > From tobias.hartmann at oracle.com Fri May 19 06:26:47 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 19 May 2017 08:26:47 +0200 Subject: [10] RFR(XS): 8180587: Assert in layout_helper_log2_element_size(jint) compares bits instead of bytes In-Reply-To: References: <93ea86ab-a94d-c36b-a57a-1fc1fb0e9bf3@oracle.com> Message-ID: <1a5ddf8b-44f2-f067-ec93-86c679a2a44c@oracle.com> Hi Vladimir, thanks for the review! Best regards, Tobias On 18.05.2017 19:06, Vladimir Kozlov wrote: > On 5/18/17 7:50 AM, Zoltán Majó wrote: >> Hi Tobias, >> >> the proposed change looks good to me. > > +1 > > Vladimir > >> >> Thank you! 
>> >> Best regards, >> >> >> Zoltan >> >> >> On 05/18/2017 09:47 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8180587 >>> http://cr.openjdk.java.net/~thartmann/8180587/webrev.00/ >>> >>> While working on value types, I was surprised that the "l2esz <= LogBitsPerLong" assert in >>> layout_helper_log2_element_size() did not fire although the array element size of a value type is greater than the >>> size of a long. The problem is that the assert compares the log2 element size which is in bytes (see initialization in >>> Klass::array_layout_helper()) to LogBitsPerLong which obviously is in bits. >>> >>> Thanks, >>> Tobias >> From david.holmes at oracle.com Fri May 19 07:15:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 17:15:46 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> Message-ID: <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> Hi Magnus, On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: > > > On 2017-05-18 09:35, David Holmes wrote: >> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>> On 2017-05-18 08:25, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>> >>>> webrevs: >>>> >>>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>> >>> Build changes look good. >> >> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >> print outs - do you want me to remove them? I suppose they may be >> useful if something goes wrong on some platform. > > I didn't even notice them. :-/ > > It's a bit unfortunate we don't have a debug level on the logging from > configure. :-( Otherwise they would have clearly belonged there. 
> > The AC_MSG_NOTICE messages stand out so much from the rest of the > configure log, so maybe it's better that you remove them. The logic > itself is very simple; if the -D flags are missing then we can surely > tell what happened. So yes, please remove them. Webrev updated in place. I have removed them to avoid noise - particularly as they get executed twice. I also made an adjustment to AC_SEARCH_LIBS as I don't want to pass the saved_LIBS value in explicitly, but want it to use the LIBS value - which is no longer cleared before the call. I've verified all platforms are okay - except AIX which I'll need Thomas to recheck when he can. I also discovered an oddity in that our ARM64 builds seem to use different system libraries in that librt.so is not needed for clock_gettime. This still seems to work ok. Of more concern, if we were to expand this kind of function-existence check, is that libc seems to contain "dummy" (or at least dysfunctional) versions of a number of the core pthread APIs! Thanks, David > Alternatively, rewrite them as CHECKING/RESULT, if you want to keep the > logging. That way they better match the rest of the configure log (and > also describe what you're doing). Just check if AC_SEARCH_LIBS prints > some output (likely so, I think), then you can't do it in the middle of > a CHECKING/RESULT pair, but have to do the CHECKING part after > AC_SEARCH_LIBS. > > /Magnus > >> >> David >> >>> /Magnus >>> >>>> hotspot: >>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>> >>>> First a big thank you to Thomas Stuefe for testing various versions of >>>> this on AIX. >>>> >>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>> deleted duplicated code!). >>>> >>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >>>> and perhaps one day Solaris (more on that later). 
>>>> >>>> The Linux code was the most functionally complete, dealing with >>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>> default wall-clock for absolute timed waits. That functionality is >>>> not, unfortunately, supported by all our POSIX platforms so there are >>>> some configure time build checks to set some #defines, and then some >>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>> be less capable than the build environment, but not the other way >>>> around (without build time support we don't know the runtime types >>>> needed to make library calls). >>>> >>>> ** There is some duplication of dynamic lookup code on Linux but this >>>> can be cleaned up in future work if we refactor the time/clock code >>>> into os_posix as well. >>>> >>>> The cleanup covers a number of things: >>>> - removal of linux anachronisms that got "ported" into the other >>>> platforms >>>> - eg EINTR can not be returned from the wait methods >>>> - removal of solaris anachronisms that got ported into the linux code >>>> and then on to other platforms >>>> - eg ETIMEDOUT is what we expect never ETIME >>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>> from the Parker methods >>>> - consolidation of unpackTime and compute_abstime into one utility >>>> function >>>> - use statics for things completely private to the implementation >>>> rather than making them part of the os* API (eg access to condAttr >>>> objects) >>>> - cleanup of commentary and style within methods of the same class >>>> - clean up coding style in places eg not using Names that start with >>>> capitals. >>>> >>>> I have not tried to clean up every single oddity, nor tried to >>>> reconcile differences between the in places very similar PlatformEvent >>>> and Parker methods. For example PlatformEvent still examines the >>>> FilterSpuriousWakeups** flag, and Parker still ignores it. 
>>>> >>>> ** Perhaps a candidate for deprecation and future removal. >>>> >>>> There is one mini "enhancement" slipped in this. I now explicitly >>>> initialize mutexes with a mutexAttr object with its type set to >>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>> "error checking" and so is slow. On all other current platforms there >>>> is no effective change. >>>> >>>> Finally, Solaris is excluded from all this (other than the debug >>>> signal blocking cleanup) because it potentially supports three >>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>> sync. Solaris cleanup would be a separate RFE. >>>> >>>> No doubt I've overlooked mentioning something that someone will >>>> spot. :) >>>> >>>> Thanks, >>>> David >>> > From erik.joelsson at oracle.com Fri May 19 08:07:35 2017 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 19 May 2017 10:07:35 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> Message-ID: <48795ff4-1b3e-0919-d008-93c3e0def878@oracle.com> Build changes look good to me. /Erik On 2017-05-19 09:15, David Holmes wrote: > Hi Magnus, > > On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: >> >> >> On 2017-05-18 09:35, David Holmes wrote: >>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>>> On 2017-05-18 08:25, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>> >>>>> webrevs: >>>>> >>>>> Build-related: >>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>> >>>> Build changes look good. >>> >>> Thanks Magnus! 
I just realized I left in the AC_MSG_NOTICE debugging >>> printouts - do you want me to remove them? I suppose they may be >>> useful if something goes wrong on some platform. >> >> I didn't even notice them. :-/ >> >> It's a bit unfortunate we don't have a debug level on the logging >> from configure. :-( Otherwise they would have clearly belonged there. >> >> The AC_MSG_NOTICE messages stand out too much from the rest of the >> configure log, so maybe it's better that you remove them. The logic >> itself is very simple: if the -D flags are missing then we can surely >> tell what happened. So yes, please remove them. > > Webrev updated in place. > > I have removed them to avoid noise - particularly as they get executed > twice. > > I also made an adjustment to AC_SEARCH_LIBS as I don't want to pass > the saved_LIBS value in explicitly, but want it to use the LIBS value > - which is no longer cleared before the call. I've verified all > platforms are okay - except AIX, which I'll need Thomas to recheck when > he can. > > I also discovered an oddity in that our ARM64 builds seem to use > different system libraries in that librt.so is not needed for > clock_gettime. This still seems to work ok. Of more concern, if we > were to expand this kind of function-existence check, is that libc seems > to contain "dummy" (or at least dysfunctional) versions of a number of > the core pthread APIs! > > Thanks, > David > >> Alternatively, rewrite them as CHECKING/RESULT, if you want to keep >> the logging. That way they match the rest of the configure log better >> (and also describe what you're doing). Just check if AC_SEARCH_LIBS >> prints some output (likely so, I think), then you can't do it in the >> middle of a CHECKING/RESULT pair, but have to do the CHECKING part >> after AC_SEARCH_LIBS. 
>> >> /Magnus >> >>> >>> David >>> >>>> /Magnus >>>> >>>>> hotspot: >>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>>> >>>>> First a big thank you to Thomas Stuefe for testing various >>>>> versions of >>>>> this on AIX. >>>>> >>>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>>> deleted duplicated code!). >>>>> >>>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, >>>>> out >>>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, >>>>> AIX >>>>> and perhaps one day Solaris (more on that later). >>>>> >>>>> The Linux code was the most functionally complete, dealing with >>>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>>> default wall-clock for absolute timed waits. That functionality is >>>>> not, unfortunately, supported by all our POSIX platforms so there are >>>>> some configure time build checks to set some #defines, and then some >>>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>>> be less capable than the build environment, but not the other way >>>>> around (without build time support we don't know the runtime types >>>>> needed to make library calls). >>>>> >>>>> ** There is some duplication of dynamic lookup code on Linux but this >>>>> can be cleaned up in future work if we refactor the time/clock code >>>>> into os_posix as well. 
>>>>> >>>>> The cleanup covers a number of things: >>>>> - removal of linux anachronisms that got "ported" into the other >>>>> platforms >>>>> - eg EINTR can not be returned from the wait methods >>>>> - removal of solaris anachronisms that got ported into the linux code >>>>> and then on to other platforms >>>>> - eg ETIMEDOUT is what we expect never ETIME >>>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>>> from the Parker methods >>>>> - consolidation of unpackTime and compute_abstime into one utility >>>>> function >>>>> - use statics for things completely private to the implementation >>>>> rather than making them part of the os* API (eg access to condAttr >>>>> objects) >>>>> - cleanup up commentary and style within methods of the same class >>>>> - clean up coding style in places eg not using Names that start with >>>>> capitals. >>>>> >>>>> I have not tried to cleanup every single oddity, nor tried to >>>>> reconcile differences between the very similar in places >>>>> PlatformEvent >>>>> and Park methods. For example PlatformEvent still examines the >>>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>>> >>>>> ** Perhaps a candidate for deprecation and future removal. >>>>> >>>>> There is one mini "enhancement" slipped in this. I now explicitly >>>>> initialize mutexes with a mutexAttr object with its type set to >>>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>>> "error checking" and so is slow. On all other current platforms there >>>>> is no effective change. >>>>> >>>>> Finally, Solaris is excluded from all this (other than the debug >>>>> signal blocking cleanup) because it potentially supports three >>>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>>> sync. Solaris cleanup would be a separate RFE. >>>>> >>>>> No doubt I've overlooked mentioning something that someone will >>>>> spot. 
:) >>>>> >>>>> Thanks, >>>>> David >>>> >> From robbin.ehn at oracle.com Fri May 19 08:36:29 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 19 May 2017 10:36:29 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: Hi David, On 05/18/2017 08:25 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 > > webrevs: > > Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ > hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ > I like this, with neg delta of 700 loc, nice! It's hard to see if you've broken anything, since you combined 4 separate implementations into 1. I guess you have tested this properly? What stands out in os_posix.cpp is the static void to_abstime(timespec* abstime, jlong timeout, bool isAbsolute) The ifdef scopes of SUPPORTS_CLOCK_MONOTONIC are large and calculations are repeated 3 times. Please consider something like: #ifdef SUPPORTS_CLOCK_MONOTONIC if (_use_clock_monotonic_condattr && !isAbsolute) { // Why aren't we using this when not isAbsolute is set? // I suggest removing that check from this if and use monotonic for that also. struct timespec now; int status = _clock_gettime(CLOCK_MONOTONIC, &now); assert_status(status == 0, status, "clock_gettime"); calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_nsec, NANOUNITS); } else { #else { #endif struct timeval now; int status = gettimeofday(&now, NULL); assert(status == 0, "gettimeofday"); calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_usec, MICROUNITS); } #endif Thanks for fixing this! 
/Robbin From david.holmes at oracle.com Fri May 19 08:53:18 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 18:53:18 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <48795ff4-1b3e-0919-d008-93c3e0def878@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> <48795ff4-1b3e-0919-d008-93c3e0def878@oracle.com> Message-ID: <6cee21b8-bb88-8f3f-cf8f-698735eee456@oracle.com> Thanks Erik! David On 19/05/2017 6:07 PM, Erik Joelsson wrote: > Build changes look good to me. > > /Erik > > > On 2017-05-19 09:15, David Holmes wrote: >> Hi Magnus, >> >> On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: >>> >>> >>> On 2017-05-18 09:35, David Holmes wrote: >>>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>>>> On 2017-05-18 08:25, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>>> >>>>>> webrevs: >>>>>> >>>>>> Build-related: >>>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>>> >>>>> Build changes look good. >>>> >>>> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >>>> prints outs - do you want me to remove them? I suppose they may be >>>> useful if something goes wrong on some platform. >>> >>> I didn't even notice them. :-/ >>> >>> It's a bit unfortunate we don't have a debug level on the logging >>> from configure. :-( Otherwise they would have clearly belonged there. >>> >>> The AC_MSG_NOTICE messages stands out much from the rest of the >>> configure log, so maybe it's better that you remove them. The logic >>> itself is very simple, if the -D flags are missing then we can surely >>> tell what happened. So yes, please remove them. >> >> Webrev updated in place. >> >> I have removed them to avoid noise - particularly as they get executed >> twice. 
>> >> I also made an adjustment to AC_SEARCH_LIBS as I don't want to pass >> the saved_LIBS value in explicitly, but want it to use the LIBS value >> - which is no longer cleared before the call. I've verified all >> platforms are okay - except AIX which I'll need Thomas to recheck when >> he can. >> >> I also discovered an oddity in that our ARM64 builds seem to use >> different system libraries in that librt.so is not needed for >> clock_gettime. This still seems to work ok. Of more of a concern if we >> were expand this kind of function-existence check is that libc seems >> to contain "dummy" (or at least dysfunctional) versions of a number of >> the core pthread APIs! >> >> Thanks, >> David >> >>> Alternatively, rewrite them as CHECKING/RESULT, if you want to keep >>> the logging. That way they match better the rest of the configure log >>> (and also describes what you're doing). Just check if AC_SEARCH_LIBS >>> prints some output (likely so, I think), then you can't do it in the >>> middle of a CHECKING/RESULT pair, but have to do the CHECKING part >>> after AC_SEARCH_LIBS. >>> >>> /Magnus >>> >>>> >>>> David >>>> >>>>> /Magnus >>>>> >>>>>> hotspot: >>>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>>>> >>>>>> First a big thank you to Thomas Stuefe for testing various >>>>>> versions of >>>>>> this on AIX. >>>>>> >>>>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>>>> deleted duplicated code!). >>>>>> >>>>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, >>>>>> out >>>>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, >>>>>> AIX >>>>>> and perhaps one day Solaris (more on that later). >>>>>> >>>>>> The Linux code was the most functionally complete, dealing with >>>>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>>>> default wall-clock for absolute timed waits. 
That functionality is >>>>>> not, unfortunately, supported by all our POSIX platforms so there are >>>>>> some configure time build checks to set some #defines, and then some >>>>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>>>> be less capable than the build environment, but not the other way >>>>>> around (without build time support we don't know the runtime types >>>>>> needed to make library calls). >>>>>> >>>>>> ** There is some duplication of dynamic lookup code on Linux but this >>>>>> can be cleaned up in future work if we refactor the time/clock code >>>>>> into os_posix as well. >>>>>> >>>>>> The cleanup covers a number of things: >>>>>> - removal of linux anachronisms that got "ported" into the other >>>>>> platforms >>>>>> - eg EINTR can not be returned from the wait methods >>>>>> - removal of solaris anachronisms that got ported into the linux code >>>>>> and then on to other platforms >>>>>> - eg ETIMEDOUT is what we expect never ETIME >>>>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>>>> from the Parker methods >>>>>> - consolidation of unpackTime and compute_abstime into one utility >>>>>> function >>>>>> - use statics for things completely private to the implementation >>>>>> rather than making them part of the os* API (eg access to condAttr >>>>>> objects) >>>>>> - cleanup up commentary and style within methods of the same class >>>>>> - clean up coding style in places eg not using Names that start with >>>>>> capitals. >>>>>> >>>>>> I have not tried to cleanup every single oddity, nor tried to >>>>>> reconcile differences between the very similar in places >>>>>> PlatformEvent >>>>>> and Park methods. For example PlatformEvent still examines the >>>>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>>>> >>>>>> ** Perhaps a candidate for deprecation and future removal. >>>>>> >>>>>> There is one mini "enhancement" slipped in this. 
I now explicitly >>>>>> initialize mutexes with a mutexAttr object with its type set to >>>>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>>>> "error checking" and so is slow. On all other current platforms there >>>>>> is no effective change. >>>>>> >>>>>> Finally, Solaris is excluded from all this (other than the debug >>>>>> signal blocking cleanup) because it potentially supports three >>>>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>>>> sync. Solaris cleanup would be a separate RFE. >>>>>> >>>>>> No doubt I've overlooked mentioning something that someone will >>>>>> spot. :) >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>> > From david.holmes at oracle.com Fri May 19 09:07:46 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 19:07:46 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> Hi Robbin, Thanks for looking at this. On 19/05/2017 6:36 PM, Robbin Ehn wrote: > Hi David, > > On 05/18/2017 08:25 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >> >> webrevs: >> >> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >> hotspot: >> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >> > > I like this, with neg delta of 700 loc, nice! > > It's hard to see if you broken anything, since you combined 4 separate > implementation into 1. Well not really. As I said this is basically a cleaned up version of the Linux code. The BSD and AIX versions were already based on earlier versions of the Linux code, minus the proper handling of CLOCK_MONOTONIC and absolute timeouts. > I guess you have tested this proper? Only JPRT so far. 
I should have mentioned that I'm not expecting this to be reviewed and pushed within a couple of days, so some refinements and continual testing will occur. > What stands out in os_posix.cpp is the > static void to_abstime(timespec* abstime, jlong timeout, bool isAbsolute) > > The ifdef scopes of SUPPORTS_CLOCK_MONOTONIC is large and calculations > are repeated 3 times. They have to be as there are three cases: 1. Relative wait using CLOCK_MONOTONIC 2. Relative wait using gettimeofday() 3. Absolute wait using gettimeofday() > Please consider something like: > > #ifdef SUPPORTS_CLOCK_MONOTONIC > if (_use_clock_monotonic_condattr && !isAbsolute) { // Why aren't we > using this when not isAbsolute is set? > // I suggest removing that check from > this if and use monotonic for that also. Absolute waits have to be based on wall-clock time and follow any adjustments made to wall clock time. In contrast relative waits should never be affected by wall-clock time adjustments hence the use of CLOCK_MONOTONIC when available. In Java the relative timed-waits are: - Thread.sleep(ms) - Object.wait(ms)/wait(ms,ns) - LockSupport.parkNanos(ns) (and all the j.u.c blocking methods built on top of it) While the only absolute timed-wait we have is the LockSupport.parkUntil method(s). Hope that clarifies things. Thanks, David ----- > struct timespec now; > int status = _clock_gettime(CLOCK_MONOTONIC, &now); > assert_status(status == 0, status, "clock_gettime"); > calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_nsec, > NANOUNITS); > } else { > #else > { > #endif > struct timeval now; > int status = gettimeofday(&now, NULL); > assert(status == 0, "gettimeofday"); > calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_usec, > MICROUNITS); > } > #endif > > Thanks for fixing this! 
/Robbin From magnus.ihse.bursie at oracle.com Fri May 19 09:15:21 2017 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 19 May 2017 11:15:21 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> Message-ID: <4d2113c5-e3a0-b6be-4a12-ecb6462a75df@oracle.com> On 2017-05-19 09:15, David Holmes wrote: > Hi Magnus, > > On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: >> >> >> On 2017-05-18 09:35, David Holmes wrote: >>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>>> On 2017-05-18 08:25, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>> >>>>> webrevs: >>>>> >>>>> Build-related: >>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>> >>>> Build changes look good. >>> >>> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >>> prints outs - do you want me to remove them? I suppose they may be >>> useful if something goes wrong on some platform. >> >> I didn't even notice them. :-/ >> >> It's a bit unfortunate we don't have a debug level on the logging >> from configure. :-( Otherwise they would have clearly belonged there. >> >> The AC_MSG_NOTICE messages stands out much from the rest of the >> configure log, so maybe it's better that you remove them. The logic >> itself is very simple, if the -D flags are missing then we can surely >> tell what happened. So yes, please remove them. > > Webrev updated in place. Code looks good! In the future, I very much prefer if you do not update webrevs in place. It's hopeless if you start reading a thread after some updates have occurred, the mails don't make any sense, and it's hard to follow after-the-fact how the patch evolved. 
> > I have removed them to avoid noise - particularly as they get executed > twice. > > I also made an adjustment to AC_SEARCH_LIBS as I don't want to pass > the saved_LIBS value in explicitly, but want it to use the LIBS value > - which is no longer cleared before the call. I've verified all > platforms are okay - except AIX which I'll need Thomas to recheck when > he can. > > I also discovered an oddity in that our ARM64 builds seem to use > different system libraries in that librt.so is not needed for > clock_gettime. This still seems to work ok. Of more of a concern if we > were expand this kind of function-existence check is that libc seems > to contain "dummy" (or at least dysfunctional) versions of a number of > the core pthread APIs! That's good to know. /Magnus > > Thanks, > David > >> Alternatively, rewrite them as CHECKING/RESULT, if you want to keep >> the logging. That way they match better the rest of the configure log >> (and also describes what you're doing). Just check if AC_SEARCH_LIBS >> prints some output (likely so, I think), then you can't do it in the >> middle of a CHECKING/RESULT pair, but have to do the CHECKING part >> after AC_SEARCH_LIBS. >> >> /Magnus >> >>> >>> David >>> >>>> /Magnus >>>> >>>>> hotspot: >>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>>> >>>>> First a big thank you to Thomas Stuefe for testing various >>>>> versions of >>>>> this on AIX. >>>>> >>>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>>> deleted duplicated code!). >>>>> >>>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, >>>>> out >>>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, >>>>> AIX >>>>> and perhaps one day Solaris (more on that later). >>>>> >>>>> The Linux code was the most functionally complete, dealing with >>>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>>> default wall-clock for absolute timed waits. 
That functionality is >>>>> not, unfortunately, supported by all our POSIX platforms so there are >>>>> some configure time build checks to set some #defines, and then some >>>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>>> be less capable than the build environment, but not the other way >>>>> around (without build time support we don't know the runtime types >>>>> needed to make library calls). >>>>> >>>>> ** There is some duplication of dynamic lookup code on Linux but this >>>>> can be cleaned up in future work if we refactor the time/clock code >>>>> into os_posix as well. >>>>> >>>>> The cleanup covers a number of things: >>>>> - removal of linux anachronisms that got "ported" into the other >>>>> platforms >>>>> - eg EINTR can not be returned from the wait methods >>>>> - removal of solaris anachronisms that got ported into the linux code >>>>> and then on to other platforms >>>>> - eg ETIMEDOUT is what we expect never ETIME >>>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>>> from the Parker methods >>>>> - consolidation of unpackTime and compute_abstime into one utility >>>>> function >>>>> - use statics for things completely private to the implementation >>>>> rather than making them part of the os* API (eg access to condAttr >>>>> objects) >>>>> - cleanup up commentary and style within methods of the same class >>>>> - clean up coding style in places eg not using Names that start with >>>>> capitals. >>>>> >>>>> I have not tried to cleanup every single oddity, nor tried to >>>>> reconcile differences between the very similar in places >>>>> PlatformEvent >>>>> and Park methods. For example PlatformEvent still examines the >>>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>>> >>>>> ** Perhaps a candidate for deprecation and future removal. >>>>> >>>>> There is one mini "enhancement" slipped in this. 
I now explicitly >>>>> initialize mutexes with a mutexAttr object with its type set to >>>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>>> "error checking" and so is slow. On all other current platforms there >>>>> is no effective change. >>>>> >>>>> Finally, Solaris is excluded from all this (other than the debug >>>>> signal blocking cleanup) because it potentially supports three >>>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>>> sync. Solaris cleanup would be a separate RFE. >>>>> >>>>> No doubt I've overlooked mentioning something that someone will >>>>> spot. :) >>>>> >>>>> Thanks, >>>>> David >>>> >> From david.holmes at oracle.com Fri May 19 09:18:21 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 19:18:21 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <4d2113c5-e3a0-b6be-4a12-ecb6462a75df@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> <4d2113c5-e3a0-b6be-4a12-ecb6462a75df@oracle.com> Message-ID: Replying to all this time :) On 19/05/2017 7:15 PM, Magnus Ihse Bursie wrote: > > On 2017-05-19 09:15, David Holmes wrote: >> Hi Magnus, >> >> On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: >>> >>> >>> On 2017-05-18 09:35, David Holmes wrote: >>>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>>>> On 2017-05-18 08:25, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>>> >>>>>> webrevs: >>>>>> >>>>>> Build-related: >>>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>>> >>>>> Build changes look good. >>>> >>>> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging >>>> prints outs - do you want me to remove them? 
I suppose they may be >>>> useful if something goes wrong on some platform. >>> >>> I didn't even notice them. :-/ >>> >>> It's a bit unfortunate we don't have a debug level on the logging >>> from configure. :-( Otherwise they would have clearly belonged there. >>> >>> The AC_MSG_NOTICE messages stands out much from the rest of the >>> configure log, so maybe it's better that you remove them. The logic >>> itself is very simple, if the -D flags are missing then we can surely >>> tell what happened. So yes, please remove them. >> >> Webrev updated in place. > Code looks good! Thanks for the re-review. > In the future, I very much prefer if you do not update webrevs in place. > It's hopeless if you start reading a thread after some updates have > occured, the mails don't make any sense, and it's hard to follow > after-the-fact how the patch evolved. Sorry. Point taken. David ----- >> >> I have removed them to avoid noise - particularly as they get executed >> twice. >> >> I also made an adjustment to AC_SEARCH_LIBS as I don't want to pass >> the saved_LIBS value in explicitly, but want it to use the LIBS value >> - which is no longer cleared before the call. I've verified all >> platforms are okay - except AIX which I'll need Thomas to recheck when >> he can. >> >> I also discovered an oddity in that our ARM64 builds seem to use >> different system libraries in that librt.so is not needed for >> clock_gettime. This still seems to work ok. Of more of a concern if we >> were expand this kind of function-existence check is that libc seems >> to contain "dummy" (or at least dysfunctional) versions of a number of >> the core pthread APIs! > That's good to know. > > /Magnus >> >> Thanks, >> David >> >>> Alternatively, rewrite them as CHECKING/RESULT, if you want to keep >>> the logging. That way they match better the rest of the configure log >>> (and also describes what you're doing). 
Just check if AC_SEARCH_LIBS >>> prints some output (likely so, I think), then you can't do it in the >>> middle of a CHECKING/RESULT pair, but have to do the CHECKING part >>> after AC_SEARCH_LIBS. >>> >>> /Magnus >>> >>>> >>>> David >>>> >>>>> /Magnus >>>>> >>>>>> hotspot: >>>>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>>>>> >>>>>> First a big thank you to Thomas Stuefe for testing various >>>>>> versions of >>>>>> this on AIX. >>>>>> >>>>>> This is primarily a refactoring and cleanup exercise (ie lots of >>>>>> deleted duplicated code!). >>>>>> >>>>>> I have taken the PlatformEvent, PlatformParker and Parker::* code, >>>>>> out >>>>>> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, >>>>>> AIX >>>>>> and perhaps one day Solaris (more on that later). >>>>>> >>>>>> The Linux code was the most functionally complete, dealing with >>>>>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>>>>> default wall-clock for absolute timed waits. That functionality is >>>>>> not, unfortunately, supported by all our POSIX platforms so there are >>>>>> some configure time build checks to set some #defines, and then some >>>>>> dynamic lookup at runtime**. We allow for the runtime environment to >>>>>> be less capable than the build environment, but not the other way >>>>>> around (without build time support we don't know the runtime types >>>>>> needed to make library calls). >>>>>> >>>>>> ** There is some duplication of dynamic lookup code on Linux but this >>>>>> can be cleaned up in future work if we refactor the time/clock code >>>>>> into os_posix as well. 
>>>>>> >>>>>> The cleanup covers a number of things: >>>>>> - removal of linux anachronisms that got "ported" into the other >>>>>> platforms >>>>>> - eg EINTR can not be returned from the wait methods >>>>>> - removal of solaris anachronisms that got ported into the linux code >>>>>> and then on to other platforms >>>>>> - eg ETIMEDOUT is what we expect never ETIME >>>>>> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >>>>>> from the Parker methods >>>>>> - consolidation of unpackTime and compute_abstime into one utility >>>>>> function >>>>>> - use statics for things completely private to the implementation >>>>>> rather than making them part of the os* API (eg access to condAttr >>>>>> objects) >>>>>> - cleanup up commentary and style within methods of the same class >>>>>> - clean up coding style in places eg not using Names that start with >>>>>> capitals. >>>>>> >>>>>> I have not tried to cleanup every single oddity, nor tried to >>>>>> reconcile differences between the very similar in places >>>>>> PlatformEvent >>>>>> and Park methods. For example PlatformEvent still examines the >>>>>> FilterSpuriousWakeups** flag, and Parker still ignores it. >>>>>> >>>>>> ** Perhaps a candidate for deprecation and future removal. >>>>>> >>>>>> There is one mini "enhancement" slipped in this. I now explicitly >>>>>> initialize mutexes with a mutexAttr object with its type set to >>>>>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>>>>> PTHREAD_MUTEX_DEFAULT. On FreesBSD the default is not "normal" but >>>>>> "error checking" and so is slow. On all other current platforms there >>>>>> is no effective change. >>>>>> >>>>>> Finally, Solaris is excluded from all this (other than the debug >>>>>> signal blocking cleanup) because it potentially supports three >>>>>> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >>>>>> sync. Solaris cleanup would be a separate RFE. 
>>>>>> >>>>>> No doubt I've overlooked mentioning something that someone will >>>>>> spot. :) >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>> > From doug.simon at oracle.com Fri May 19 09:19:42 2017 From: doug.simon at oracle.com (Doug Simon) Date: Fri, 19 May 2017 11:19:42 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <4d2113c5-e3a0-b6be-4a12-ecb6462a75df@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <1c0a1da9-0c45-785c-9393-e79aad985746@oracle.com> <1ef68ed4-91ca-b52d-2a75-1d66226ad21b@oracle.com> <4d2113c5-e3a0-b6be-4a12-ecb6462a75df@oracle.com> Message-ID: <51717D86-1BCB-4C6F-947F-65A43E67554D@oracle.com> > On 19 May 2017, at 11:15, Magnus Ihse Bursie wrote: > > > On 2017-05-19 09:15, David Holmes wrote: >> Hi Magnus, >> >> On 18/05/2017 8:06 PM, Magnus Ihse Bursie wrote: >>> >>> >>> On 2017-05-18 09:35, David Holmes wrote: >>>> On 18/05/2017 5:32 PM, Magnus Ihse Bursie wrote: >>>>> On 2017-05-18 08:25, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>>>> >>>>>> webrevs: >>>>>> >>>>>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>>>> >>>>> Build changes look good. >>>> >>>> Thanks Magnus! I just realized I left in the AC_MSG_NOTICE debugging prints outs - do you want me to remove them? I suppose they may be useful if something goes wrong on some platform. >>> >>> I didn't even notice them. :-/ >>> >>> It's a bit unfortunate we don't have a debug level on the logging from configure. :-( Otherwise they would have clearly belonged there. >>> >>> The AC_MSG_NOTICE messages stands out much from the rest of the configure log, so maybe it's better that you remove them. The logic itself is very simple, if the -D flags are missing then we can surely tell what happened. So yes, please remove them. >> >> Webrev updated in place. > Code looks good! 
> > In the future, I very much prefer if you do not update webrevs in place. It's hopeless if you start reading a thread after some updates have occured, the mails don't make any sense, and it's hard to follow after-the-fact how the patch evolved. Is there any chance openjdk code reviewing will adopt a slightly more modern process than webrevs such as Crucible where a full history of code evolution during a review is preserved? -Doug From robbin.ehn at oracle.com Fri May 19 09:25:13 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 19 May 2017 11:25:13 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> Message-ID: <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> On 05/19/2017 11:07 AM, David Holmes wrote: > > They have to be as there are three cases: > > 1. Relative wait using CLOCK_MONOTONIC > 2. Relative wait using gettimeofday() > 3. Absolute wait using gettimeofday() > >> Please consider something like: >> >> #ifdef SUPPORTS_CLOCK_MONOTONIC >> if (_use_clock_monotonic_condattr && !isAbsolute) { // Why aren't we using this when not isAbsolute is set? >> // I suggest removing that check from this if and use monotonic for that also. > > Absolute waits have to be based on wall-clock time and follow any adjustments made to wall clock time. In contrast relative waits should never be affected by wall-clock > time adjustments hence the use of CLOCK_MONOTONIC when available. > > In Java the relative timed-waits are: > - Thread.sleep(ms) > - Object.wait(ms)/wait(ms,ns) > - LockSupport.parkNanos(ns) (and all the j.u.c blocking methods built on top of it) > > While the only absolute timed-wait we have is the LockSupport.parkUntil method(s). > > Hope that clarifies things. Yes thanks! 
But you can still refactor this into something similar to what I suggested, and two of the calculations should be the same, just ns vs. us, correct? Leaving the if statement with the "!isAbsolute" check, in my head calc_time is something like: void calc_time(...) { if (isAbsolute) { calc_abs_time(...); } else { calc_rel_time(...); } } I do not see a problem with this, only better readability? /Robbin > > Thanks, > David > ----- > >> struct timespec now; >> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >> assert_status(status == 0, status, "clock_gettime"); >> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >> now.tv_nsec, NANOUNITS); >> } else { >> #else >> { >> #endif >> struct timeval now; >> int status = gettimeofday(&now, NULL); >> assert(status == 0, "gettimeofday"); >> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >> now.tv_usec, MICROUNITS); >> } >> #endif >> >> Thanks for fixing this! >> >> /Robbin From david.holmes at oracle.com Fri May 19 10:53:50 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 20:53:50 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> Message-ID: On 19/05/2017 7:25 PM, Robbin Ehn wrote: > On 05/19/2017 11:07 AM, David Holmes wrote: >> >> They have to be as there are three cases: >> >> 1. Relative wait using CLOCK_MONOTONIC >> 2. Relative wait using gettimeofday() >> 3. Absolute wait using gettimeofday() >> >>> Please consider something like: >>> >>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>> if (_use_clock_monotonic_condattr && !isAbsolute) { // Why aren't >>> we using this when not isAbsolute is set? >>> // I suggest removing that check from >>> this if and use monotonic for that also.
>> >> Absolute waits have to be based on wall-clock time and follow any >> adjustments made to wall clock time. In contrast relative waits should >> never be affected by wall-clock time adjustments hence the use of >> CLOCK_MONOTONIC when available. >> >> In Java the relative timed-waits are: >> - Thread.sleep(ms) >> - Object.wait(ms)/wait(ms,ns) >> - LockSupport.parkNanos(ns) (and all the j.u.c blocking methods built >> on top of it) >> >> While the only absolute timed-wait we have is the >> LockSupport.parkUntil method(s). >> >> Hope that clarifies things. > > Yes thanks! > > But you can still re-factoring to something similar to what I suggested > and two of the calculation should be the same just ns vs us, correct? There are three different forms of the calculation. The two relative time versions use a different time function and so a different time structure (timeval vs timespec) and a different calculation. > Leaving the if statement with the "!isAbsolute" check, in my head > calc_time is something like: > > void calc_time(...) { > if (isAbsolute) { > calc_abs_time(...); > } else { #ifdef SUPPORTS_CLOCK_MONOTONIC > calc_rel_time_from_clock_monotonic(...); #else > calc_rel_time_from_gettimeofday(...); #endif > } > } David ----- > I do not see a problem with this, only better readability? > > /Robbin > >> >> Thanks, >> David >> ----- >> >>> struct timespec now; >>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>> assert_status(status == 0, status, "clock_gettime"); >>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>> now.tv_nsec, NANOUNITS); >>> } else { >>> #else >>> { >>> #endif >>> struct timeval now; >>> int status = gettimeofday(&now, NULL); >>> assert(status == 0, "gettimeofday"); >>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>> now.tv_usec, MICROUNITS); >>> } >>> #endif >>> >>> Thanks for fixing this! 
>>> >>> /Robbin From david.holmes at oracle.com Fri May 19 11:36:11 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 May 2017 21:36:11 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> Message-ID: Correction ... On 19/05/2017 8:53 PM, David Holmes wrote: > On 19/05/2017 7:25 PM, Robbin Ehn wrote: >> On 05/19/2017 11:07 AM, David Holmes wrote: >>> >>> They have to be as there are three cases: >>> >>> 1. Relative wait using CLOCK_MONOTONIC >>> 2. Relative wait using gettimeofday() >>> 3. Absolute wait using gettimeofday() >>> >>>> Please consider something like: >>>> >>>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>>> if (_use_clock_monotonic_condattr && !isAbsolute) { // Why >>>> aren't we using this when not isAbsolute is set? >>>> // I suggest removing that check >>>> from this if and use monotonic for that also. >>> >>> Absolute waits have to be based on wall-clock time and follow any >>> adjustments made to wall clock time. In contrast relative waits >>> should never be affected by wall-clock time adjustments hence the use >>> of CLOCK_MONOTONIC when available. >>> >>> In Java the relative timed-waits are: >>> - Thread.sleep(ms) >>> - Object.wait(ms)/wait(ms,ns) >>> - LockSupport.parkNanos(ns) (and all the j.u.c blocking methods built >>> on top of it) >>> >>> While the only absolute timed-wait we have is the >>> LockSupport.parkUntil method(s). >>> >>> Hope that clarifies things. >> >> Yes thanks! >> >> But you can still re-factoring to something similar to what I >> suggested and two of the calculation should be the same just ns vs us, >> correct? > > There are three different forms of the calculation. 
The two relative > time versions use a different time function and so a different time > structure (timeval vs timespec) and a different calculation. > >> Leaving the if statement with the "!isAbsolute" check, in my head >> calc_time is something like: >> >> void calc_time(...) { >> if (isAbsolute) { >> calc_abs_time(...); >> } else { > #ifdef SUPPORTS_CLOCK_MONOTONIC >> calc_rel_time_from_clock_monotonic(...); > #else > > calc_rel_time_from_gettimeofday(...); > #endif >> } >> } It's more complicated than that because we may have build time SUPPORTS_CLOCK_MONOTONIC but we still need the runtime check as well. to_abstime is the old linux unpackTime with the addition of the build time conditionals. David > David > ----- > >> I do not see a problem with this, only better readability? >> >> /Robbin >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> struct timespec now; >>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>> assert_status(status == 0, status, "clock_gettime"); >>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>>> now.tv_nsec, NANOUNITS); >>>> } else { >>>> #else >>>> { >>>> #endif >>>> struct timeval now; >>>> int status = gettimeofday(&now, NULL); >>>> assert(status == 0, "gettimeofday"); >>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>>> now.tv_usec, MICROUNITS); >>>> } >>>> #endif >>>> >>>> Thanks for fixing this! >>>> >>>> /Robbin From robbin.ehn at oracle.com Fri May 19 12:33:46 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 19 May 2017 14:33:46 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> Message-ID: <9bd7dd4b-b513-88b4-92fb-0c815b479252@oracle.com> Hi David On 05/19/2017 01:36 PM, David Holmes wrote: >> >> There are three different forms of the calculation. 
The two relative time versions use a different time function and so a different time structure (timeval vs timespec) >> and a different calculation. Yes that's why I included unit in my example signature. > > It's more complicated than that because we may have build time SUPPORTS_CLOCK_MONOTONIC but we still need the runtime check as well. > > to_abstime is the old linux unpackTime with the addition of the build time conditionals. I'm not changing that, I'm not sure how to explain better, so here is the patch (only compiled and silly naming convention): http://cr.openjdk.java.net/~rehn/8174231/webrev/ This makes the calculation independent of source/unit. Thanks! /Robbin > > David > >> David >> ----- >> >>> I do not see a problem with this, only better readability? >>> >>> /Robbin >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> struct timespec now; >>>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>>> assert_status(status == 0, status, "clock_gettime"); >>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_nsec, NANOUNITS); >>>>> } else { >>>>> #else >>>>> { >>>>> #endif >>>>> struct timeval now; >>>>> int status = gettimeofday(&now, NULL); >>>>> assert(status == 0, "gettimeofday"); >>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_usec, MICROUNITS); >>>>> } >>>>> #endif >>>>> >>>>> Thanks for fixing this! >>>>> >>>>> /Robbin From stefan.karlsson at oracle.com Fri May 19 12:37:29 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 May 2017 14:37:29 +0200 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> Message-ID: Hi Coleen, I'm mainly reviewing the GC specific parts. 
http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html 143 void ResolvedMethodTable::unlink_or_oops_do(BoolObjectClosure* is_alive, OopClosure* f) { ... 151 if (f != NULL) { 152 f->do_oop((oop*)entry->literal_addr()); 153 p = entry->next_addr(); 154 } else { 155 if (!is_alive->do_object_b(entry->literal())) { 156 _oops_removed++; 157 if (log_is_enabled(Debug, membername, table)) { 158 ResourceMark rm; 159 Method* m = (Method*)java_lang_invoke_ResolvedMethodName::vmtarget(entry->literal()); 160 log_debug(membername, table) ("ResolvedMethod vmtarget entry removed for %s index %d", 161 m->name_and_sig_as_C_string(), i); 162 } 163 *p = entry->next(); 164 _the_table->free_entry(entry); 165 } else { 166 p = entry->next_addr(); 167 } 168 } This code looks backwards to me. If you pass in both an is_alive closure and an f (OopClosure), then you ignore the is_alive closure. This will break if we someday want to clear these entries during a copying GC. Those GCs want to unlink dead entries and apply the f closure to the oop*s of the live entries. Could you change this to mimic the code in the StringTable?: http://hg.openjdk.java.net/jdk10/hs/hotspot/file/094298f42cc7/src/share/vm/classfile/stringTable.cpp if (is_alive->do_object_b(entry->literal())) { if (f != NULL) { f->do_oop((oop*)entry->literal_addr()); } p = entry->next_addr(); } else { *p = entry->next(); the_table()->free_entry(entry); (*removed)++; } ------------------------------------------------------------------------------ http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/gc/g1/g1CollectedHeap.cpp.frames.html 3888 // The parallel work done by all worker threads. 3889 void work(uint worker_id) { 3890 // Do first pass of code cache cleaning. 3891 _code_cache_task.work_first_pass(worker_id); 3892 3893 // Let the threads mark that the first pass is done. 3894 _code_cache_task.barrier_mark(worker_id); 3895 3896 // Clean the Strings and Symbols. 
3897 _string_symbol_task.work(worker_id); 3898 3899 // Wait for all workers to finish the first code cache cleaning pass. 3900 _code_cache_task.barrier_wait(worker_id); 3901 3902 // Do the second code cache cleaning work, which realize on 3903 // the liveness information gathered during the first pass. 3904 _code_cache_task.work_second_pass(worker_id); 3905 3906 // Clean all klasses that were not unloaded. 3907 _klass_cleaning_task.work(); 3908 3909 // Clean unreferenced things in the ResolvedMethodTable 3910 _resolved_method_cleaning_task.work(); 3911 } The GC workers wait in the barrier_wait function as long as there are workers left that have not passed the barrier_mark point. If you move the _resolved_method_cleaning_task.work() to somewhere between barrier_mark and barrier_wait, there might be some opportunity for one of the workers to do work instead of waiting in the mark_wait barrier. ------------------------------------------------------------------------------ 3876 G1MemberNameCleaningTask _resolved_method_cleaning_task; There seems to be a naming confusion in this patch. Sometimes it talks about MemberNames and sometimes ResolvedMethods. Could you make this more consistent throughout the patch? 
------------------------------------------------------------------------------ http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html 25 #include "precompiled.hpp" 26 #include "gc/shared/gcLocker.hpp" 27 #include "memory/allocation.hpp" 28 #include "oops/oop.inline.hpp" 29 #include "oops/method.hpp" 30 #include "oops/symbol.hpp" 31 #include "prims/resolvedMethodTable.hpp" 32 #include "runtime/handles.inline.hpp" 33 #include "runtime/mutexLocker.hpp" 34 #include "utilities/hashtable.inline.hpp" 35 #include "utilities/macros.hpp" 36 #if INCLUDE_ALL_GCS 37 #include "gc/g1/g1CollectedHeap.hpp" 38 #include "gc/g1/g1SATBCardTableModRefBS.hpp" 39 #include "gc/g1/g1StringDedup.hpp" 40 #endif I don't thing you should include gcLocker.hpp, g1CollectedHeap.hpp, or g1StringDedup.hpp from this file. Thanks, StefanK On 2017-05-17 18:01, coleen.phillimore at oracle.com wrote: > Summary: Add a Java type called ResolvedMethodName which is immutable > and can be stored in a hashtable, that is weakly collected by gc > > Thanks to John for his help with MemberName, and to-be-filed RFEs for > further improvements. Thanks to Stefan for GC help. > > open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8174749 > > Tested with RBT nightly, compiler/jsr292 tests (included in rbt > nightly), JPRT, jdk/test/java/lang/invoke, jdk/test/java/lang/instrument > tests. > > There are platform dependent changes in this change. They are very > straightforward, ie. add an indirection to MemberName invocations, but > could people with access to these platforms test this out for me? > > Performance testing showed no regression, and large 1000% improvement > for the cases that caused us to backout previous attempts at this change. 
> > Thanks, > Coleen > From coleen.phillimore at oracle.com Fri May 19 15:05:27 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 19 May 2017 11:05:27 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> Message-ID: <3caa6d26-89f9-e885-4fc0-63319e9b4822@oracle.com> Stefan, Thank you for reviewing the GC code (and your help). On 5/19/17 8:37 AM, Stefan Karlsson wrote: > Hi Coleen, > > I'm mainly reviewing the GC specific parts. > > http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html > > > 143 void ResolvedMethodTable::unlink_or_oops_do(BoolObjectClosure* > is_alive, OopClosure* f) { > ... > 151 if (f != NULL) { > 152 f->do_oop((oop*)entry->literal_addr()); > 153 p = entry->next_addr(); > 154 } else { > 155 if (!is_alive->do_object_b(entry->literal())) { > 156 _oops_removed++; > 157 if (log_is_enabled(Debug, membername, table)) { > 158 ResourceMark rm; > 159 Method* m = > (Method*)java_lang_invoke_ResolvedMethodName::vmtarget(entry->literal()); > 160 log_debug(membername, table) ("ResolvedMethod > vmtarget entry removed for %s index %d", > 161 m->name_and_sig_as_C_string(), i); > 162 } > 163 *p = entry->next(); > 164 _the_table->free_entry(entry); > 165 } else { > 166 p = entry->next_addr(); > 167 } > 168 } > > This code looks backwards to me. If you pass in both an is_alive > closure and an f (OopClosure), then you ignore the is_alive closure. > This will break if we someday want to clear these entries during a > copying GC. Those GCs want to unlink dead entries and apply the f > closure to the oop*s of the live entries. The reason I did this is because i didn't want to copy the same loop for oops_do() and unlink(), so it's not the same as StringTable. is_alive closure is null for the oops_do case. 
I'll decouple and just have unlink and oops_do, and if we decide to clear these during copy GC, it can be easily changed to unlink_or_oops_do(). > > Could you change this to mimic the code in the StringTable?: > > http://hg.openjdk.java.net/jdk10/hs/hotspot/file/094298f42cc7/src/share/vm/classfile/stringTable.cpp > > > if (is_alive->do_object_b(entry->literal())) { > if (f != NULL) { > f->do_oop((oop*)entry->literal_addr()); > } > p = entry->next_addr(); > } else { > *p = entry->next(); > the_table()->free_entry(entry); > (*removed)++; > } > I can't. is_alive is null when called for oops_do. > ------------------------------------------------------------------------------ > > > http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/gc/g1/g1CollectedHeap.cpp.frames.html > > > 3888 // The parallel work done by all worker threads. > 3889 void work(uint worker_id) { > 3890 // Do first pass of code cache cleaning. > 3891 _code_cache_task.work_first_pass(worker_id); > 3892 > 3893 // Let the threads mark that the first pass is done. > 3894 _code_cache_task.barrier_mark(worker_id); > 3895 > 3896 // Clean the Strings and Symbols. > 3897 _string_symbol_task.work(worker_id); > 3898 > 3899 // Wait for all workers to finish the first code cache > cleaning pass. > 3900 _code_cache_task.barrier_wait(worker_id); > 3901 > 3902 // Do the second code cache cleaning work, which realize on > 3903 // the liveness information gathered during the first pass. > 3904 _code_cache_task.work_second_pass(worker_id); > 3905 > 3906 // Clean all klasses that were not unloaded. > 3907 _klass_cleaning_task.work(); > 3908 > 3909 // Clean unreferenced things in the ResolvedMethodTable > 3910 _resolved_method_cleaning_task.work(); > 3911 } > > The GC workers wait in the barrier_wait function as long as there are > workers left that have not passed the barrier_mark point. 
If you move > the _resolved_method_cleaning_task.work() to somewhere between > barrier_mark and barrier_wait, there might be some opportunity for one > of the workers to do work instead of waiting in the mark_wait barrier. Okay, yes, now I see it. I added the resolved_method cleaning task to after string_symbol_task. > > ------------------------------------------------------------------------------ > > > 3876 G1MemberNameCleaningTask _resolved_method_cleaning_task; > > There seems to be a naming confusion in this patch. Sometimes it talks > about MemberNames and sometimes ResolvedMethods. Could you make this > more consistent throughout the patch? > I missed that in the renaming. Thank you for catching it. > ------------------------------------------------------------------------------ > > > http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html > > > 25 #include "precompiled.hpp" > 26 #include "gc/shared/gcLocker.hpp" > 27 #include "memory/allocation.hpp" > 28 #include "oops/oop.inline.hpp" > 29 #include "oops/method.hpp" > 30 #include "oops/symbol.hpp" > 31 #include "prims/resolvedMethodTable.hpp" > 32 #include "runtime/handles.inline.hpp" > 33 #include "runtime/mutexLocker.hpp" > 34 #include "utilities/hashtable.inline.hpp" > 35 #include "utilities/macros.hpp" > 36 #if INCLUDE_ALL_GCS > 37 #include "gc/g1/g1CollectedHeap.hpp" > 38 #include "gc/g1/g1SATBCardTableModRefBS.hpp" > 39 #include "gc/g1/g1StringDedup.hpp" > 40 #endif > > I don't thing you should include gcLocker.hpp, g1CollectedHeap.hpp, or > g1StringDedup.hpp from this file. I need gcLocker.hpp because NoSafepointVerifier is declared there, but removed the others unnecessary #include. Webrev with changes tested with my test case (more testing in progress): open webrev at http://cr.openjdk.java.net/~coleenp/8174749.02/webrev Thank you!! 
Coleen > > Thanks, > StefanK > > On 2017-05-17 18:01, coleen.phillimore at oracle.com wrote: >> Summary: Add a Java type called ResolvedMethodName which is immutable >> and can be stored in a hashtable, that is weakly collected by gc >> >> Thanks to John for his help with MemberName, and to-be-filed RFEs for >> further improvements. Thanks to Stefan for GC help. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >> >> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >> nightly), JPRT, jdk/test/java/lang/invoke, jdk/test/java/lang/instrument >> tests. >> >> There are platform dependent changes in this change. They are very >> straightforward, ie. add an indirection to MemberName invocations, but >> could people with access to these platforms test this out for me? >> >> Performance testing showed no regression, and large 1000% improvement >> for the cases that caused us to backout previous attempts at this >> change. >> >> Thanks, >> Coleen >> From stefan.karlsson at oracle.com Fri May 19 15:56:23 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 May 2017 17:56:23 +0200 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <3caa6d26-89f9-e885-4fc0-63319e9b4822@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3caa6d26-89f9-e885-4fc0-63319e9b4822@oracle.com> Message-ID: <0da1ffbb-e4fa-8e7f-a6da-093a6a31a6d6@oracle.com> On 2017-05-19 17:05, coleen.phillimore at oracle.com wrote: > > Stefan, Thank you for reviewing the GC code (and your help). > > On 5/19/17 8:37 AM, Stefan Karlsson wrote: >> Hi Coleen, >> >> I'm mainly reviewing the GC specific parts. 
>> >> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html >> >> >> 143 void ResolvedMethodTable::unlink_or_oops_do(BoolObjectClosure* >> is_alive, OopClosure* f) { >> ... >> 151 if (f != NULL) { >> 152 f->do_oop((oop*)entry->literal_addr()); >> 153 p = entry->next_addr(); >> 154 } else { >> 155 if (!is_alive->do_object_b(entry->literal())) { >> 156 _oops_removed++; >> 157 if (log_is_enabled(Debug, membername, table)) { >> 158 ResourceMark rm; >> 159 Method* m = >> (Method*)java_lang_invoke_ResolvedMethodName::vmtarget(entry->literal()); >> 160 log_debug(membername, table) ("ResolvedMethod >> vmtarget entry removed for %s index %d", >> 161 m->name_and_sig_as_C_string(), i); >> 162 } >> 163 *p = entry->next(); >> 164 _the_table->free_entry(entry); >> 165 } else { >> 166 p = entry->next_addr(); >> 167 } >> 168 } >> >> This code looks backwards to me. If you pass in both an is_alive >> closure and an f (OopClosure), then you ignore the is_alive closure. >> This will break if we someday want to clear these entries during a >> copying GC. Those GCs want to unlink dead entries and apply the f >> closure to the oop*s of the live entries. > > The reason I did this is because i didn't want to copy the same loop for > oops_do() and unlink(), so it's not the same as StringTable. is_alive > closure is null for the oops_do case. See my comment below: I'll decouple and just have > unlink and oops_do, and if we decide to clear these during copy GC, it > can be easily changed to unlink_or_oops_do(). > >> >> Could you change this to mimic the code in the StringTable?: >> >> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/094298f42cc7/src/share/vm/classfile/stringTable.cpp >> >> >> if (is_alive->do_object_b(entry->literal())) { >> if (f != NULL) { >> f->do_oop((oop*)entry->literal_addr()); >> } >> p = entry->next_addr(); >> } else { >> *p = entry->next(); >> the_table()->free_entry(entry); >> (*removed)++; >> } >> > > I can't. 
is_alive is null when called for oops_do. OK. There are ways to do it anyway. You could change the code to: if (is_alive == NULL || is_alive->do_object_b(entry->literal())) { Or, use the AlwaysTrueClosure, as we do in in jniHandles.hpp: void JNIHandles::weak_oops_do(OopClosure* f) { AlwaysTrueClosure always_true; weak_oops_do(&always_true, f); } The proposed decoupling is good as well. >> ------------------------------------------------------------------------------ >> >> >> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/gc/g1/g1CollectedHeap.cpp.frames.html >> >> >> 3888 // The parallel work done by all worker threads. >> 3889 void work(uint worker_id) { >> 3890 // Do first pass of code cache cleaning. >> 3891 _code_cache_task.work_first_pass(worker_id); >> 3892 >> 3893 // Let the threads mark that the first pass is done. >> 3894 _code_cache_task.barrier_mark(worker_id); >> 3895 >> 3896 // Clean the Strings and Symbols. >> 3897 _string_symbol_task.work(worker_id); >> 3898 >> 3899 // Wait for all workers to finish the first code cache >> cleaning pass. >> 3900 _code_cache_task.barrier_wait(worker_id); >> 3901 >> 3902 // Do the second code cache cleaning work, which realize on >> 3903 // the liveness information gathered during the first pass. >> 3904 _code_cache_task.work_second_pass(worker_id); >> 3905 >> 3906 // Clean all klasses that were not unloaded. >> 3907 _klass_cleaning_task.work(); >> 3908 >> 3909 // Clean unreferenced things in the ResolvedMethodTable >> 3910 _resolved_method_cleaning_task.work(); >> 3911 } >> >> The GC workers wait in the barrier_wait function as long as there are >> workers left that have not passed the barrier_mark point. If you move >> the _resolved_method_cleaning_task.work() to somewhere between >> barrier_mark and barrier_wait, there might be some opportunity for one >> of the workers to do work instead of waiting in the mark_wait barrier. > > Okay, yes, now I see it. 
I added the resolved_method cleaning task to > after string_symbol_task. > >> >> ------------------------------------------------------------------------------ >> >> >> 3876 G1MemberNameCleaningTask _resolved_method_cleaning_task; >> >> There seems to be a naming confusion in this patch. Sometimes it talks >> about MemberNames and sometimes ResolvedMethods. Could you make this >> more consistent throughout the patch? >> > > I missed that in the renaming. Thank you for catching it. >> ------------------------------------------------------------------------------ >> >> >> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html >> >> >> 25 #include "precompiled.hpp" >> 26 #include "gc/shared/gcLocker.hpp" >> 27 #include "memory/allocation.hpp" >> 28 #include "oops/oop.inline.hpp" >> 29 #include "oops/method.hpp" >> 30 #include "oops/symbol.hpp" >> 31 #include "prims/resolvedMethodTable.hpp" >> 32 #include "runtime/handles.inline.hpp" >> 33 #include "runtime/mutexLocker.hpp" >> 34 #include "utilities/hashtable.inline.hpp" >> 35 #include "utilities/macros.hpp" >> 36 #if INCLUDE_ALL_GCS >> 37 #include "gc/g1/g1CollectedHeap.hpp" >> 38 #include "gc/g1/g1SATBCardTableModRefBS.hpp" >> 39 #include "gc/g1/g1StringDedup.hpp" >> 40 #endif >> >> I don't thing you should include gcLocker.hpp, g1CollectedHeap.hpp, or >> g1StringDedup.hpp from this file. > > I need gcLocker.hpp because NoSafepointVerifier is declared there, but > removed the others unnecessary #include. Right. Forgot about that one. > > Webrev with changes tested with my test case (more testing in progress): > > open webrev at http://cr.openjdk.java.net/~coleenp/8174749.02/webrev The GC parts look good. Thanks, StefanK > > Thank you!! 
> Coleen >> >> Thanks, >> StefanK >> >> On 2017-05-17 18:01, coleen.phillimore at oracle.com wrote: >>> Summary: Add a Java type called ResolvedMethodName which is immutable >>> and can be stored in a hashtable, that is weakly collected by gc >>> >>> Thanks to John for his help with MemberName, and to-be-filed RFEs for >>> further improvements. Thanks to Stefan for GC help. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >>> >>> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >>> nightly), JPRT, jdk/test/java/lang/invoke, jdk/test/java/lang/instrument >>> tests. >>> >>> There are platform dependent changes in this change. They are very >>> straightforward, ie. add an indirection to MemberName invocations, but >>> could people with access to these platforms test this out for me? >>> >>> Performance testing showed no regression, and large 1000% improvement >>> for the cases that caused us to backout previous attempts at this >>> change. >>> >>> Thanks, >>> Coleen >>> > From coleen.phillimore at oracle.com Fri May 19 18:10:43 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 19 May 2017 14:10:43 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <0da1ffbb-e4fa-8e7f-a6da-093a6a31a6d6@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3caa6d26-89f9-e885-4fc0-63319e9b4822@oracle.com> <0da1ffbb-e4fa-8e7f-a6da-093a6a31a6d6@oracle.com> Message-ID: <45c537ed-1635-86ab-3cad-a72b2159d36d@oracle.com> On 5/19/17 11:56 AM, Stefan Karlsson wrote: > > > On 2017-05-19 17:05, coleen.phillimore at oracle.com wrote: >> >> Stefan, Thank you for reviewing the GC code (and your help). >> >> On 5/19/17 8:37 AM, Stefan Karlsson wrote: >>> Hi Coleen, >>> >>> I'm mainly reviewing the GC specific parts. 
>>> >>> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html >>> >>> >>> >>> 143 void ResolvedMethodTable::unlink_or_oops_do(BoolObjectClosure* >>> is_alive, OopClosure* f) { >>> ... >>> 151 if (f != NULL) { >>> 152 f->do_oop((oop*)entry->literal_addr()); >>> 153 p = entry->next_addr(); >>> 154 } else { >>> 155 if (!is_alive->do_object_b(entry->literal())) { >>> 156 _oops_removed++; >>> 157 if (log_is_enabled(Debug, membername, table)) { >>> 158 ResourceMark rm; >>> 159 Method* m = >>> (Method*)java_lang_invoke_ResolvedMethodName::vmtarget(entry->literal()); >>> >>> 160 log_debug(membername, table) ("ResolvedMethod >>> vmtarget entry removed for %s index %d", >>> 161 m->name_and_sig_as_C_string(), i); >>> 162 } >>> 163 *p = entry->next(); >>> 164 _the_table->free_entry(entry); >>> 165 } else { >>> 166 p = entry->next_addr(); >>> 167 } >>> 168 } >>> >>> This code looks backwards to me. If you pass in both an is_alive >>> closure and an f (OopClosure), then you ignore the is_alive closure. >>> This will break if we someday want to clear these entries during a >>> copying GC. Those GCs want to unlink dead entries and apply the f >>> closure to the oop*s of the live entries. >> >> The reason I did this is because i didn't want to copy the same loop for >> oops_do() and unlink(), so it's not the same as StringTable. is_alive >> closure is null for the oops_do case. > > See my comment below: > > I'll decouple and just have >> unlink and oops_do, and if we decide to clear these during copy GC, it >> can be easily changed to unlink_or_oops_do(). 
>> >>> >>> Could you change this to mimic the code in the StringTable?: >>> >>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/094298f42cc7/src/share/vm/classfile/stringTable.cpp >>> >>> >>> >>> if (is_alive->do_object_b(entry->literal())) { >>> if (f != NULL) { >>> f->do_oop((oop*)entry->literal_addr()); >>> } >>> p = entry->next_addr(); >>> } else { >>> *p = entry->next(); >>> the_table()->free_entry(entry); >>> (*removed)++; >>> } >>> >> >> I can't. is_alive is null when called for oops_do. > > OK. There are ways to do it anyway. > > You could change the code to: > if (is_alive == NULL || is_alive->do_object_b(entry->literal())) { > > Or, use the AlwaysTrueClosure, as we do in in jniHandles.hpp: > > void JNIHandles::weak_oops_do(OopClosure* f) { > AlwaysTrueClosure always_true; > weak_oops_do(&always_true, f); > } > > The proposed decoupling is good as well. Thanks, I think we might need to add to this in the future so I'll keep it simple for now. > >>> ------------------------------------------------------------------------------ >>> >>> >>> >>> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/gc/g1/g1CollectedHeap.cpp.frames.html >>> >>> >>> >>> 3888 // The parallel work done by all worker threads. >>> 3889 void work(uint worker_id) { >>> 3890 // Do first pass of code cache cleaning. >>> 3891 _code_cache_task.work_first_pass(worker_id); >>> 3892 >>> 3893 // Let the threads mark that the first pass is done. >>> 3894 _code_cache_task.barrier_mark(worker_id); >>> 3895 >>> 3896 // Clean the Strings and Symbols. >>> 3897 _string_symbol_task.work(worker_id); >>> 3898 >>> 3899 // Wait for all workers to finish the first code cache >>> cleaning pass. >>> 3900 _code_cache_task.barrier_wait(worker_id); >>> 3901 >>> 3902 // Do the second code cache cleaning work, which realize on >>> 3903 // the liveness information gathered during the first pass. 
>>> 3904 _code_cache_task.work_second_pass(worker_id); >>> 3905 >>> 3906 // Clean all klasses that were not unloaded. >>> 3907 _klass_cleaning_task.work(); >>> 3908 >>> 3909 // Clean unreferenced things in the ResolvedMethodTable >>> 3910 _resolved_method_cleaning_task.work(); >>> 3911 } >>> >>> The GC workers wait in the barrier_wait function as long as there are >>> workers left that have not passed the barrier_mark point. If you move >>> the _resolved_method_cleaning_task.work() to somewhere between >>> barrier_mark and barrier_wait, there might be some opportunity for one >>> of the workers to do work instead of waiting in the mark_wait barrier. >> >> Okay, yes, now I see it. I added the resolved_method cleaning task to >> after string_symbol_task. >> >>> >>> ------------------------------------------------------------------------------ >>> >>> >>> >>> 3876 G1MemberNameCleaningTask _resolved_method_cleaning_task; >>> >>> There seems to be a naming confusion in this patch. Sometimes it talks >>> about MemberNames and sometimes ResolvedMethods. Could you make this >>> more consistent throughout the patch? >>> >> >> I missed that in the renaming. Thank you for catching it. 
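The single-loop pattern debated earlier in this thread — one bucket walk serving both oops_do() and unlink(), with a missing liveness filter treated as "always alive" — can be sketched as follows. This is a self-contained illustration only: the closure types and entry layout here are simplified stand-ins, not the real HotSpot classes.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for HotSpot's closure types (hypothetical, not the VM classes).
typedef void* oop;

struct BoolObjectClosure {
  virtual bool do_object_b(oop obj) = 0;
  virtual ~BoolObjectClosure() {}
};

struct OopClosure {
  virtual void do_oop(oop* p) = 0;
  virtual ~OopClosure() {}
};

// The reviewer's suggestion: a filter that answers "alive" for everything,
// so the same loop can be reused when no liveness check is wanted.
struct AlwaysTrueClosure : BoolObjectClosure {
  bool do_object_b(oop) { return true; }
};

struct Entry {
  oop literal;
  Entry* next;
};

// One shared loop: dead entries are unlinked, live ones optionally visited.
// Note the liveness check is consulted first, unlike the code under review.
int unlink_or_oops_do(Entry** head, BoolObjectClosure* is_alive, OopClosure* f) {
  int removed = 0;
  Entry** p = head;
  while (*p != NULL) {
    Entry* entry = *p;
    if (is_alive->do_object_b(entry->literal)) {
      if (f != NULL) {
        f->do_oop(&entry->literal);  // visit the live entry's oop slot
      }
      p = &entry->next;
    } else {
      *p = entry->next;              // unlink the dead entry
      delete entry;
      removed++;
    }
  }
  return removed;
}
```

With AlwaysTrueClosure passed as is_alive, this degenerates to a pure oops_do(); with f == NULL it is a pure unlink() — which is why either the null-check or the always-true closure resolves the reviewer's objection.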
>>> ------------------------------------------------------------------------------ >>> >>> >>> >>> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/prims/resolvedMethodTable.cpp.html >>> >>> >>> >>> 25 #include "precompiled.hpp" >>> 26 #include "gc/shared/gcLocker.hpp" >>> 27 #include "memory/allocation.hpp" >>> 28 #include "oops/oop.inline.hpp" >>> 29 #include "oops/method.hpp" >>> 30 #include "oops/symbol.hpp" >>> 31 #include "prims/resolvedMethodTable.hpp" >>> 32 #include "runtime/handles.inline.hpp" >>> 33 #include "runtime/mutexLocker.hpp" >>> 34 #include "utilities/hashtable.inline.hpp" >>> 35 #include "utilities/macros.hpp" >>> 36 #if INCLUDE_ALL_GCS >>> 37 #include "gc/g1/g1CollectedHeap.hpp" >>> 38 #include "gc/g1/g1SATBCardTableModRefBS.hpp" >>> 39 #include "gc/g1/g1StringDedup.hpp" >>> 40 #endif >>> >>> I don't think you should include gcLocker.hpp, g1CollectedHeap.hpp, or >>> g1StringDedup.hpp from this file. >> >> I need gcLocker.hpp because NoSafepointVerifier is declared there, but >> removed the other unnecessary #includes. > > Right. Forgot about that one. > >> >> Webrev with changes tested with my test case (more testing in progress): >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.02/webrev > > The GC parts look good. thank you!! Coleen > > Thanks, > StefanK > >> >> Thank you!! >> Coleen >>> >>> Thanks, >>> StefanK >>> >>> On 2017-05-17 18:01, coleen.phillimore at oracle.com wrote: >>>> Summary: Add a Java type called ResolvedMethodName which is immutable >>>> and can be stored in a hashtable, that is weakly collected by gc >>>> >>>> Thanks to John for his help with MemberName, and to-be-filed RFEs for >>>> further improvements. Thanks to Stefan for GC help.
>>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >>>> >>>> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >>>> nightly), JPRT, jdk/test/java/lang/invoke, >>>> jdk/test/java/lang/instrument >>>> tests. >>>> >>>> There are platform dependent changes in this change. They are very >>>> straightforward, i.e. add an indirection to MemberName invocations, but >>>> could people with access to these platforms test this out for me? >>>> >>>> Performance testing showed no regression, and large 1000% improvement >>>> for the cases that caused us to back out previous attempts at this >>>> change. >>>> >>>> Thanks, >>>> Coleen >>>> >> From david.holmes at oracle.com Sat May 20 13:07:26 2017 From: david.holmes at oracle.com (David Holmes) Date: Sat, 20 May 2017 23:07:26 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <9bd7dd4b-b513-88b4-92fb-0c815b479252@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> <9bd7dd4b-b513-88b4-92fb-0c815b479252@oracle.com> Message-ID: <6209a37f-b717-8455-0ae7-f1b235911a69@oracle.com> Hi Robbin, On 19/05/2017 10:33 PM, Robbin Ehn wrote: > Hi David > > On 05/19/2017 01:36 PM, David Holmes wrote: >>> >>> There are three different forms of the calculation. The two relative >>> time versions use a different time function and so a different time >>> structure (timeval vs timespec) and a different calculation. > > Yes that's why I included unit in my example signature. > >> >> It's more complicated than that because we may have build time >> SUPPORTS_CLOCK_MONOTONIC but we still need the runtime check as well.
>> >> to_abstime is the old linux unpackTime with the addition of the build >> time conditionals. > I'm not changing that, I'm not sure how to explain better, > so here is the patch (only compiled and silly naming convention): > http://cr.openjdk.java.net/~rehn/8174231/webrev/ Thanks for taking the time to do this. > This makes the calculation independent of source/unit. Okay I see. Took me a few read-throughs to get the gist of it - and it helps to read from the bottom functions up :) Not sure why you are returning a value from the functions though?? Let's see what others think. It's somewhat harder to compare against the existing code. Thanks again. David > Thanks! > > /Robbin > > > >> >> David >> >>> David >>> ----- >>> >>>> I do not see a problem with this, only better readability? >>>> >>>> /Robbin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> struct timespec now; >>>>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>>>> assert_status(status == 0, status, "clock_gettime"); >>>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>>>>> now.tv_nsec, NANOUNITS); >>>>>> } else { >>>>>> #else >>>>>> { >>>>>> #endif >>>>>> struct timeval now; >>>>>> int status = gettimeofday(&now, NULL); >>>>>> assert(status == 0, "gettimeofday"); >>>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, >>>>>> now.tv_usec, MICROUNITS); >>>>>> } >>>>>> #endif >>>>>> >>>>>> Thanks for fixing this!
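The unit-independent calculation under discussion — where the caller passes the current time in whatever unit its clock reports (nanoseconds for clock_gettime, microseconds for gettimeofday) plus that unit's scale — can be sketched roughly like this. The names are illustrative only; this is not the actual patch, and the absolute-deadline (isAbsolute) case is omitted.

```c
#include <assert.h>
#include <time.h>

#define NANOS_PER_SEC 1000000000LL

/* Convert "now" (sec + fractional part in the clock's native unit) plus a
 * relative timeout in nanoseconds into an absolute timespec. Because the
 * caller supplies units_per_sec, the same function serves both a
 * nanosecond-resolution and a microsecond-resolution time source. */
static void calc_abstime(struct timespec* abstime, long long timeout_ns,
                         long long now_sec, long long now_frac,
                         long long units_per_sec) {
  long long nanos_per_unit = NANOS_PER_SEC / units_per_sec;
  long long total_ns = now_frac * nanos_per_unit + timeout_ns;
  abstime->tv_sec  = (time_t)(now_sec + total_ns / NANOS_PER_SEC);
  abstime->tv_nsec = (long)(total_ns % NANOS_PER_SEC);
}
```

A caller holding a struct timeval would pass (tv_sec, tv_usec, 1000000), one holding a struct timespec would pass (tv_sec, tv_nsec, 1000000000) — the two call sites in the quoted #ifdef block collapse onto this one conversion.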
>>>>>> >>>>>> /Robbin From robbin.ehn at oracle.com Mon May 22 08:47:05 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 22 May 2017 10:47:05 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <6209a37f-b717-8455-0ae7-f1b235911a69@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <6ce9d6d2-d448-f571-0d73-ad6518c92834@oracle.com> <16450c53-7032-72aa-116e-b8757db2786c@oracle.com> <9bd7dd4b-b513-88b4-92fb-0c815b479252@oracle.com> <6209a37f-b717-8455-0ae7-f1b235911a69@oracle.com> Message-ID: <0eefabb3-bb58-1217-ed27-b24efdf9885a@oracle.com> Hi David, On 05/20/2017 03:07 PM, David Holmes wrote: > Okay I see. Took me a few read throughs to get the gist of it - and it helps to read from the bottom functions up :) Great! Yes, C-style with static functions tends to end up that way, since you don't want a lot of forward declarations. > Not sure why you are returning a value from the functions though ?? I skipped (re-)moving an assert on the max_secs value, 1660 assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); just to make the code the same. So there are some minors/nits in the patch. > > Let's see what others think. It's somewhat harder to compare against the existing code. Yes agreed. /Robbin > > Thanks again. > David > >> Thanks! >> >> /Robbin >> >> >> >>> >>> David >>> >>>> David >>>> ----- >>>> >>>>> I do not see a problem with this, only better readability? 
>>>>> >>>>> /Robbin >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> struct timespec now; >>>>>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>>>>> assert_status(status == 0, status, "clock_gettime"); >>>>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_nsec, NANOUNITS); >>>>>>> } else { >>>>>>> #else >>>>>>> { >>>>>>> #endif >>>>>>> struct timeval now; >>>>>>> int status = gettimeofday(&now, NULL); >>>>>>> assert(status == 0, "gettimeofday"); >>>>>>> calc_time(abstime, timeout, isAbsolute, now.tv_sec, now.tv_usec, MICROUNITS); >>>>>>> } >>>>>>> #endif >>>>>>> >>>>>>> Thanks for fixing this! >>>>>>> >>>>>>> /Robbin From igor.ignatyev at oracle.com Mon May 22 18:09:00 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 22 May 2017 11:09:00 -0700 Subject: RFR(XXS) : 8180721: clean up ProblemList Message-ID: <0BE3FBED-6F5D-4B1F-A40C-2A3E322BA565@oracle.com> http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html > 3 lines changed: 0 ins; 1 del; 2 mod; Hi all, could you please review this tiny patch which cleans up hotspot problem list? BootAppendTests.java was problem listed due to 8150683[1] which has been closed as CNR, however the test has been fixed only by 8179103[2] which is not integrated into JDK 9. AllModulesCommandTest.java was problem listed due to 8168478[3] which has been closed as NAI. this test also has @ignore 8170541[4], but the test should be quarantined rather than excluded due to this bug. 
webrev: http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html jbs: https://bugs.openjdk.java.net/browse/JDK-8180721 Thanks, -- Igor [1] https://bugs.openjdk.java.net/browse/JDK-8150683 [2] https://bugs.openjdk.java.net/browse/JDK-8179103 [3] https://bugs.openjdk.java.net/browse/JDK-8168478 [4] https://bugs.openjdk.java.net/browse/JDK-8170541 From george.triantafillou at oracle.com Mon May 22 18:46:24 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Mon, 22 May 2017 14:46:24 -0400 Subject: RFR(XXS) : 8180721: clean up ProblemList In-Reply-To: <0BE3FBED-6F5D-4B1F-A40C-2A3E322BA565@oracle.com> References: <0BE3FBED-6F5D-4B1F-A40C-2A3E322BA565@oracle.com> Message-ID: <566228fa-d66a-f91b-0f50-415d76adce8c@oracle.com> Hi Igor, Looks good. -George On 5/22/2017 2:09 PM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html >> 3 lines changed: 0 ins; 1 del; 2 mod; > Hi all, > > could you please review this tiny patch which cleans up hotspot problem list? > BootAppendTests.java was problem listed due to 8150683[1] which has been closed as CNR, however the test has been fixed only by 8179103[2] which is not integrated into JDK 9. > AllModulesCommandTest.java was problem listed due to 8168478[3] which has been closed as NAI. this test also has @ignore 8170541[4], but the test should be quarantined rather than excluded due to this bug. 
> > webrev: http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html > jbs: https://bugs.openjdk.java.net/browse/JDK-8180721 > > Thanks, > -- Igor > > [1] https://bugs.openjdk.java.net/browse/JDK-8150683 > [2] https://bugs.openjdk.java.net/browse/JDK-8179103 > [3] https://bugs.openjdk.java.net/browse/JDK-8168478 > [4] https://bugs.openjdk.java.net/browse/JDK-8170541 From igor.ignatyev at oracle.com Mon May 22 20:24:46 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 22 May 2017 13:24:46 -0700 Subject: RFR(XXS) : 8180721: clean up ProblemList In-Reply-To: <04c35e1d-9175-a150-9419-98dce26ef50d@oracle.com> References: <0BE3FBED-6F5D-4B1F-A40C-2A3E322BA565@oracle.com> <04c35e1d-9175-a150-9419-98dce26ef50d@oracle.com> Message-ID: Hi Serguei, by mistake, I've rewritten open part by closed part, should be fixed now. -- Igor > On May 22, 2017, at 1:12 PM, serguei.spitsyn at oracle.com wrote: > > Igor, > > > On 5/22/17 11:09, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html >>> 3 lines changed: 0 ins; 1 del; 2 mod; >> Hi all, >> >> could you please review this tiny patch which cleans up hotspot problem list? >> BootAppendTests.java was problem listed due to 8150683[1] which has been closed as CNR, however the test has been fixed only by 8179103[2] which is not integrated into JDK 9. >> AllModulesCommandTest.java was problem listed due to 8168478[3] which has been closed as NAI. this test also has @ignore 8170541[4], but the test should be quarantined rather than excluded due to this bug. >> >> webrev: http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html > This webrev was supposed to fix the open part with the tests BootAppendTests.java > and AllModulesCommandTest.java but the fix is for closed test Test6329104.java . > Do I miss anything? 
> > > Thanks, > Serguei > > >> jbs: https://bugs.openjdk.java.net/browse/JDK-8180721 >> >> Thanks, >> -- Igor >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8150683 >> [2] https://bugs.openjdk.java.net/browse/JDK-8179103 >> [3] https://bugs.openjdk.java.net/browse/JDK-8168478 >> [4] https://bugs.openjdk.java.net/browse/JDK-8170541 From igor.ignatyev at oracle.com Mon May 22 21:31:20 2017 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 22 May 2017 14:31:20 -0700 Subject: RFR(XXS) : 8180793 : move jdk.test.lib.wrappers.* to jdk.test.lib package Message-ID: <50A10F9F-93BC-49DC-97F4-96807ABBD248@oracle.com> http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html > 6 lines changed: 0 ins; 0 del; 6 mod; Hi all, could you please review this tiny changeset which moves TimeLimitedRunner and InfiniteLoop from jdk.test.lib.wrappers to jdk.test.lib package? webrev: http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html jbs: https://bugs.openjdk.java.net/browse/JDK-8180793 Thanks, -- Igor From mandy.chung at oracle.com Mon May 22 22:23:27 2017 From: mandy.chung at oracle.com (Mandy Chung) Date: Mon, 22 May 2017 15:23:27 -0700 Subject: RFR(XXS) : 8180793 : move jdk.test.lib.wrappers.* to jdk.test.lib package In-Reply-To: <50A10F9F-93BC-49DC-97F4-96807ABBD248@oracle.com> References: <50A10F9F-93BC-49DC-97F4-96807ABBD248@oracle.com> Message-ID: <44441FE5-1B5B-421B-A633-825EBCA7BFA6@oracle.com> > On May 22, 2017, at 2:31 PM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html >> 6 lines changed: 0 ins; 0 del; 6 mod; > > Hi all, > > could you please review this tiny changeset which moves TimeLimitedRunner and InfiniteLoop from jdk.test.lib.wrappers to jdk.test.lib package? 
> > webrev: http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html > jbs: https://bugs.openjdk.java.net/browse/JDK-8180793 +1 Mandy From mikhailo.seledtsov at oracle.com Mon May 22 22:40:02 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 22 May 2017 15:40:02 -0700 Subject: RFR(XXS) : 8180793 : move jdk.test.lib.wrappers.* to jdk.test.lib package In-Reply-To: <44441FE5-1B5B-421B-A633-825EBCA7BFA6@oracle.com> References: <50A10F9F-93BC-49DC-97F4-96807ABBD248@oracle.com> <44441FE5-1B5B-421B-A633-825EBCA7BFA6@oracle.com> Message-ID: <592368C2.3070003@oracle.com> Looks good to me, Misha On 5/22/17, 3:23 PM, Mandy Chung wrote: >> On May 22, 2017, at 2:31 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html >>> 6 lines changed: 0 ins; 0 del; 6 mod; >> Hi all, >> >> could you please review this tiny changeset which moves TimeLimitedRunner and InfiniteLoop from jdk.test.lib.wrappers to jdk.test.lib package? >> >> webrev: http://cr.openjdk.java.net/~iignatyev//8180793/webrev.00/index.html >> jbs: https://bugs.openjdk.java.net/browse/JDK-8180793 > > +1 > Mandy From mikhailo.seledtsov at oracle.com Mon May 22 22:43:00 2017 From: mikhailo.seledtsov at oracle.com (Mikhailo Seledtsov) Date: Mon, 22 May 2017 15:43:00 -0700 Subject: RFR(XXS) : 8180721: clean up ProblemList In-Reply-To: References: <0BE3FBED-6F5D-4B1F-A40C-2A3E322BA565@oracle.com> <04c35e1d-9175-a150-9419-98dce26ef50d@oracle.com> Message-ID: <59236974.7080809@oracle.com> +1, Misha On 5/22/17, 1:24 PM, Igor Ignatyev wrote: > Hi Serguei, > > by mistake, I've rewritten open part by closed part, should be fixed now. 
> -- Igor > >> On May 22, 2017, at 1:12 PM, serguei.spitsyn at oracle.com wrote: >> >> Igor, >> >> >> On 5/22/17 11:09, Igor Ignatyev wrote: >>> http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html >>>> 3 lines changed: 0 ins; 1 del; 2 mod; >>> Hi all, >>> >>> could you please review this tiny patch which cleans up hotspot problem list? >>> BootAppendTests.java was problem listed due to 8150683[1] which has been closed as CNR, however the test has been fixed only by 8179103[2] which is not integrated into JDK 9. >>> AllModulesCommandTest.java was problem listed due to 8168478[3] which has been closed as NAI. this test also has @ignore 8170541[4], but the test should be quarantined rather than excluded due to this bug. >>> >>> webrev: http://cr.openjdk.java.net/~iignatyev/8180721/webrev.00/index.html >> This webrev was supposed to fix the open part with the tests BootAppendTests.java >> and AllModulesCommandTest.java but the fix is for closed test Test6329104.java . >> Do I miss anything? >> >> >> Thanks, >> Serguei >> >> >>> jbs: https://bugs.openjdk.java.net/browse/JDK-8180721 >>> >>> Thanks, >>> -- Igor >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8150683 >>> [2] https://bugs.openjdk.java.net/browse/JDK-8179103 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8168478 >>> [4] https://bugs.openjdk.java.net/browse/JDK-8170541 From coleen.phillimore at oracle.com Tue May 23 14:10:54 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 May 2017 10:10:54 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <8c9bdd72-f239-dfc5-66e5-29b91428f897@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <8c9bdd72-f239-dfc5-66e5-29b91428f897@oracle.com> Message-ID: On 5/22/17 11:23 PM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > > I've finished reviewing, it looks great! Thank you, Serguei! > > Just some nits. 
> > http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/classfile/javaClasses.cpp.frames.html > > A dot is missed at the end of the comment line 3259: > 3258 // Add a reference to the loader (actually mirror because > anonymous classes will not have > 3259 // distinct loaders) to ensure the metadata is kept alive > 3260 // This mirror may be different than the one in clazz field. > ok. > Dots are also missed in a couple of places in the > resolvedMethodTable.?pp comments: > resolvedMethodTable.hpp: L31, L35 > resolvedMethodTable.cpp: > L130, L141 (L140 has an unnecessary dot) > Those comments didn't get fixed when I changed the code. I wish the compiler would check this! I fixed them: // Serially invoke removed unused oops from the table. // This is done late during GC. void ResolvedMethodTable::unlink(BoolObjectClosure* is_alive) { ... // Serially invoke "f->do_oop" on the locations of all oops in the table. void ResolvedMethodTable::oops_do(OopClosure* f) { ... > resolvedMethodTable.cpp: > > Values of the counters _oops_removed and _oops_counted are not used. > Is it a leftover from the debugging or there was a plan to log them? They are used for counting and I added this logging message to the end of ResolvedMethodTable::unlink() to print them. log_debug(membername, table) ("ResolvedMethod entries counted %d removed %d", _oos_counted, _oops_removed); > > 195 // For each entry in MNT, change to new method > > MNT should be RMT now. :) > Changed it. Nice catch. Thank you for reviewing this! Coleen > > Thanks, > Serguei > > > > On 5/17/17 09:01, coleen.phillimore at oracle.com wrote: >> Summary: Add a Java type called ResolvedMethodName which is immutable >> and can be stored in a hashtable, that is weakly collected by gc >> >> Thanks to John for his help with MemberName, and to-be-filed RFEs for >> further improvements. Thanks to Stefan for GC help.
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >> >> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >> nightly), JPRT, jdk/test/java/lang/invoke, >> jdk/test/java/lang/instrument tests. >> >> There are platform dependent changes in this change. They are very >> straightforward, ie. add an indirection to MemberName invocations, >> but could people with access to these platforms test this out for me? >> >> Performance testing showed no regression, and large 1000% improvement >> for the cases that caused us to backout previous attempts at this >> change. >> >> Thanks, >> Coleen >> > From coleen.phillimore at oracle.com Tue May 23 14:17:20 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 May 2017 10:17:20 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <8c9bdd72-f239-dfc5-66e5-29b91428f897@oracle.com> Message-ID: <034043ea-1885-91be-00f5-358f93d540d2@oracle.com> On 5/23/17 10:10 AM, coleen.phillimore at oracle.com wrote: > > > On 5/22/17 11:23 PM, serguei.spitsyn at oracle.com wrote: >> Hi Coleen, >> >> >> I've finished reviewing, it looks great! > > Thank you, Serguei! >> >> Just some nits. >> >> http://cr.openjdk.java.net/~coleenp/8174749.01/webrev/src/share/vm/classfile/javaClasses.cpp.frames.html >> >> >> A dot is missed at the end of the comment line 3259: >> 3258 // Add a reference to the loader (actually mirror because >> anonymous classes will not have >> 3259 // distinct loaders) to ensure the metadata is kept alive >> 3260 // This mirror may be different than the one in clazz field. >> > ok. 
> >> Dots are also missed in a couple of places in the >> resolvedMethodTable.?pp comments: >> resolvedMethodTable.hpp: L31, L35 >> resolvedMethodTable.cpp: >> L130, L141 (L140 has an unnecessary dot) >> > > Those comments didn't get fixed when I changed the code. I wish the > compiler would check this! I fixed them: > > // Serially invoke removed unused oops from the table. > // This is done late during GC. > void ResolvedMethodTable::unlink(BoolObjectClosure* is_alive) { > > ... > > // Serially invoke "f->do_oop" on the locations of all oops in the table. > void ResolvedMethodTable::oops_do(OopClosure* f) { > ... > >> resolvedMethodTable.cpp: >> >> Values of the counters _oops_removed and_oops_counted are not used. >> Is it a leftover from the debugging or there was a plan to log them? > > They are used for counting and I added this logging message to the end > of ResolvedMethodTable::unlink() to print them. > > log_debug(membername, table) ("ResolvedMethod entries counted %d > removed %d", > _oos_counted, _oops_removed); Make that _oops_counted, _oops_removed. This is useful logging. thanks, Coleen > >> >> 195 // For each entry in MNT, change to new method >> >> MNT should be RMT now. :) >> > > Changed it. Nice catch. > > Thank you for reviewing this! > Coleen > >> >> Thanks, >> Serguei >> >> >> >> On 5/17/17 09:01, coleen.phillimore at oracle.com wrote: >>> Summary: Add a Java type called ResolvedMethodName which is >>> immutable and can be stored in a hashtable, that is weakly collected >>> by gc >>> >>> Thanks to John for his help with MemberName, and to-be-filed RFEs >>> for further improvements. Thanks to Stefan for GC help. 
>>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.01/webrev >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/8174749.jdk.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8174749 >>> >>> Tested with RBT nightly, compiler/jsr292 tests (included in rbt >>> nightly), JPRT, jdk/test/java/lang/invoke, >>> jdk/test/java/lang/instrument tests. >>> >>> There are platform dependent changes in this change. They are very >>> straightforward, ie. add an indirection to MemberName invocations, >>> but could people with access to these platforms test this out for me? >>> >>> Performance testing showed no regression, and large 1000% >>> improvement for the cases that caused us to backout previous >>> attempts at this change. >>> >>> Thanks, >>> Coleen >>> >> From gromero at linux.vnet.ibm.com Wed May 24 13:31:53 2017 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 24 May 2017 10:31:53 -0300 Subject: [8u] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used Message-ID: <59258B49.9080602@linux.vnet.ibm.com> Hi, Could this backport of 8175813 for jdk8u be reviewed, please? It applies cleanly to jdk8u except for a chunk in os::Linux::libnuma_init(), but it's just due to an indentation change introduced with cleanup [1]. It improves JVM NUMA node detection on PPC64. Currently there are no Linux distros that package only libnuma v1, so the libnuma v2 API used in that change is always available. webrev : http://cr.openjdk.java.net/~gromero/8175813/backport/ bug : https://bugs.openjdk.java.net/browse/JDK-8175813 review thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-May/026788.html Thank you.
Regards, Gustavo [1] https://bugs.openjdk.java.net/browse/JDK-8057107 From matthias.baesken at sap.com Wed May 24 15:42:00 2017 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 24 May 2017 15:42:00 +0000 Subject: [XS] RFR : 8180945 : vmError.cpp : adjust dup and fclose Message-ID: <22d5e5a59fc7417eafef35c0468dafea@sap.com> Hello, could I please have a review for the following small change. In vmError.cpp there is a part where the dup return code in case of an error is not handled, and additionally fclose might be called with parameter NULL. The change adjusts this. Bug : https://bugs.openjdk.java.net/browse/JDK-8180945 webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8180945/ Thanks, Matthias From kim.barrett at oracle.com Wed May 24 23:42:41 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 24 May 2017 19:42:41 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters Message-ID: Please review this change to Atomic::load and OrderAccess::load_acquire overloads to make their source const qualified, e.g. instead of "volatile T*" make them "const volatile T*". This eliminates the need for casting away const when, for example, applying one of these operations to a member variable when in a const-qualified method. There are probably places that previously required casting away const but now do not. Similarly, there are probably places where values couldn't be const or member functions couldn't be const qualified, but now can be. I did a little searching and found a few candidates, but none that were otherwise trivial to add to this change, so haven't included any. This change touches platform-specific code for non-Oracle supported platforms that I can't test, so I'd like reviews from the respective platform owners. Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't const-qualify the source argument; that seems like a bug. Or maybe they are, but not documented that way.
And I wonder why the aarch64 port uses __atomic_load rather than __atomic_load_n. CR: https://bugs.openjdk.java.net/browse/JDK-8166651 Webrev: http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 Testing: JPRT From david.holmes at oracle.com Thu May 25 08:44:18 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 May 2017 18:44:18 +1000 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: Message-ID: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> Hi Kim, On 25/05/2017 9:42 AM, Kim Barrett wrote: > Please review this change to Atomic::load and OrderAccess::load_acquire > overloads to make their source const qualified, e.g. instead of > "volatile T*" make them "const volatile T*". This eliminates the need > for casting away const when, for example, applying one of these > operations to a member variable when in a const-qualified method. This looks quite reasonable - thanks - provided ... > There are probably places that previously required casting away const > but now do not. Similarly, there are probably places where values ... our compilers do not complain about unnecessary casts :) Cheers, David > couldn't be const or member functions couldn't be const qualified, but > now can be. I did a little searching and found a few candidates, but > none that were otherwise trivial to add to this change, so haven't > included any. > > This change touches platform-specific code for non-Oracle supported > platforms that I can't test, so I'd like reviews from the respective > platform owners. > > Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't > const-qualify the source argument; that seems like a bug. Or maybe > they are, but not documented that way. And I wonder why the aarch64 > port uses __atomic_load rather than __atomic_load_n. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8166651 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 > > Testing: > JPRT > From coleen.phillimore at oracle.com Thu May 25 11:43:56 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 25 May 2017 07:43:56 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> References: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> Message-ID: <27eb0395-c73f-abae-9727-1dc2c5ec8dfb@oracle.com> Looks good. Coleen On 5/25/17 4:44 AM, David Holmes wrote: > Hi Kim, > > On 25/05/2017 9:42 AM, Kim Barrett wrote: >> Please review this change to Atomic::load and OrderAccess::load_acquire >> overloads to make their source const qualified, e.g. instead of >> "volatile T*" make them "const volatile T*". This eliminates the need >> for casting away const when, for example, applying one of these >> operations to a member variable when in a const-qualified method. > > This looks quite reasonable - thanks - provided ... > >> There are probably places that previously required casting away const >> but now do not. Similarly, there are probably places where values > > ... our compilers do not complain about unnecessary casts :) > > Cheers, > David > > >> couldn't be const or member functions couldn't be const qualified, but >> now can be. I did a little searching and found a few candidates, but >> none that were otherwise trivial to add to this change, so haven't >> included any. >> >> This change touches platform-specific code for non-Oracle supported >> platforms that I can't test, so I'd like reviews from the respective >> platform owners. >> >> Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't >> const-qualify the source argument; that seems like a bug. Or maybe >> they are, but not documented that way. 
And I wonder why the aarch64 >> port uses __atomic_load rather than __atomic_load_n. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166651 >> >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 >> >> Testing: >> JPRT >> From adinn at redhat.com Thu May 25 13:16:40 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 14:16:40 +0100 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> Message-ID: Forwarding this to hotspot-dev which is probably the more appropriate destination. -------- Forwarded Message -------- Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException Date: Thu, 25 May 2017 14:12:53 +0100 From: Andrew Dinn To: jdk10-dev The following webrev fixes a race condition that is present in jdk10 and also jdk9 and jdk8. It is caused by a misplaced volatile keyword that failed to ensure correct ordering of writes by the compiler. Reviews welcome. http://cr.openjdk.java.net/~adinn/8181085/webrev.00/ Backporting: This same fix is required in jdk9 and jdk8. Testing: The reproducer posted with the original issue manifests the NPE reliably on jdk8. It does not manifest on jdk9/10 but that is only thanks to changes introduced into the resolution process in jdk9 which change the timing of execution. However, without this fix the out-of-order write problem is still present in jdk9/10, as can be seen by eyeballing the compiled code for ConstantPoolCacheEntry::set_direct_or_vtable_call. The patch has been validated on jdk8 by running the reproducer. It stops any resulting NPEs. The code for ConstantPoolCacheEntry::set_direct_or_vtable_call on jdk8-10 has been eyeballed to ensure that post-patch the assignments now occur in the correct order.
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From ramkri123 at gmail.com Thu May 25 14:35:22 2017 From: ramkri123 at gmail.com (Ram Krishnan) Date: Thu, 25 May 2017 07:35:22 -0700 Subject: output of jstack command Message-ID: Hi, I would like to leverage the output of jstack command for extracting additional information about the type of threads, thread ids etc. Since I will be parsing the output, I need the precise format. Is there any documentation on jstack output format changes and the openjdk release(s) where the changes happened? Thanks in advance. -- Thanks, Ramki From adinn at redhat.com Thu May 25 14:41:33 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 15:41:33 +0100 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: Message-ID: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> Hi Kim, On 25/05/17 00:42, Kim Barrett wrote: > > Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't > const-qualify the source argument; that seems like a bug. Or maybe > they are, but not documented that way. This is breaking when I compile it on AArch64. Specifically, it's the void * atomic load definition in orderAccess_linux_aarch64.inline.hpp that is causing the problem. Here is one of many errors: In file included from /home/adinn/openjdk/jdk10-hs/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:33:0, ... 
/home/adinn/openjdk/jdk10-hs/hotspot/src/os_cpu/linux_aarch64/vm/orderAccess_linux_aarch64.inline.hpp: In static member function 'static void* OrderAccess::load_ptr_acquire(const volatile void*)': /home/adinn/openjdk/jdk10-hs/hotspot/src/os_cpu/linux_aarch64/vm/orderAccess_linux_aarch64.inline.hpp:77:59: error: invalid const_cast from type 'const volatile void*' to type 'void* volatile*' { void* data; __atomic_load(const_cast<void* volatile*>(p), &data, __ATOMIC_ACQUIRE); return data; } The new declaration doesn't exactly look very convincing (but then nor did the old one). Here is the relevant change: ... inline void* OrderAccess::load_ptr_acquire(const volatile void* p) -{ void* data; __atomic_load((void* const volatile *)p, &data, __ATOMIC_ACQUIRE); return data; } +{ void* data; __atomic_load(const_cast<void* volatile*>(p), &data, __ATOMIC_ACQUIRE); return data; } ... I'm still puzzling over what is actually needed to do the right job here. I can see why you have made the change the way you have but the compiler definitely does not want to eat it. I'll play with the code and see what does compile. Oddly enough, I have just posted a fix to jdk10-dev (I forwarded the original note to hotspot-dev but the discussion now seems to be progressing in jdk10-dev -- apologies) that relates to a use of this code and to the correct declaration of volatile fields that store pointers vs fields that store volatile pointers. It might be worth looking at that to see if it bears on this issue. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From daniel.daugherty at oracle.com Thu May 25 14:59:22 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 25 May 2017 08:59:22 -0600 Subject: output of jstack command In-Reply-To: References: Message-ID: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com> Adding serviceability-dev at ... since jstack is a Serviceability tool. I believe jstack is experimental which means the output format can change at any time... Dan On 5/25/17 8:35 AM, Ram Krishnan wrote: > Hi, > > I would like to leverage the output of jstack command for extracting > additional information about the type of threads, thread ids etc. Since I > will be parsing the output, I need the precise format. Is there any > documentation on jstack output format changes and the openjdk release(s) > where the changes happened? > > Thanks in advance. > From kirk.pepperdine at gmail.com Thu May 25 15:07:26 2017 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Thu, 25 May 2017 17:07:26 +0200 Subject: output of jstack command In-Reply-To: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com> References: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com> Message-ID: <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com> Hi Ramki, The source for jstack is in openJDK. Feel free to create your own copy of jstack where you can output the information in any format you like. If you are suggesting that the existing format be changed, do be aware that there are many tools that expect the current format. These have been adjusted to a change in format that was introduced with Java 8. I don't see any reason why the format shouldn't include information that is currently missing and is relevant. However I'd want to make sure that it is relevant and important before breaking the tool chain once again. I believe thread ids are already in the header. Certainly thread names are there. Not sure what you mean by types of threads. Kind regards, Kirk > On May 25, 2017, at 4:59 PM, Daniel D. Daugherty wrote: > > Adding serviceability-dev at ... since jstack is a Serviceability tool. 
> > I believe jstack is experimental which means the output format can > change at any time... > > Dan > > On 5/25/17 8:35 AM, Ram Krishnan wrote: >> Hi, >> >> I would like to leverage the output of jstack command for extracting >> additional information about the type of threads, thread ids etc. Since I >> will be parsing the output, I need the precise format. Is there any >> documentation on jstack output format changes and the openjdk release(s) >> where the changes happened? >> >> Thanks in advance. >> > From adinn at redhat.com Thu May 25 15:54:56 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 16:54:56 +0100 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> References: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> Message-ID: <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> Hi Kim, On 25/05/17 15:41, Andrew Dinn wrote: > On 25/05/17 00:42, Kim Barrett wrote: >> >> Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't >> const-qualify the source argument; that seems like a bug. Or maybe >> they are, but not documented that way. And I wonder why the aarch64 >> port uses __atomic_load rather than __atomic_load_n. Hmm, this does not appear to be borne out by my experience (see below). > This is breaking when I compile it on AArch64. Specifically, it's the > void * atomic load definition in orderAccess_linux_aarch64.inline.hpp > that is causing the problem. Here is one of many errors: > . . . > > I'm still puzzling over what is actually needed to do the right job > here. I can see why you have made the change the way you have but the > compiler definitely does not want to eat it. I'll play with the code and > see what does compile. Ok, so I have several interesting things to report. Your patch states // __atomic_load's ptr parameter is non-const, so need casts. 
and then proceeds to insert const_cast into each inline load_acquire definition: inline jbyte OrderAccess::load_acquire(const volatile jbyte* p) { jbyte data; __atomic_load(const_cast<volatile jbyte*>(p), &data, __ATOMIC_ACQUIRE); return data; } inline jshort OrderAccess::load_acquire(const volatile jshort* p) { jshort data; __atomic_load(const_cast<volatile jshort*>(p), &data, __ATOMIC_ACQUIRE); return data; } ... I don't know what version of the compiler or the lib code for __atomic_load you derived that from (or maybe you read the fine manual???) but, leaving aside the void* flavour method which blew up earlier, the const_cast is not needed on my machine for any of the other load_acquire methods. So, I can compile all of them quite happily with no const_cast: inline jbyte OrderAccess::load_acquire(const volatile jbyte* p) { jbyte data; __atomic_load(p, &data, __ATOMIC_ACQUIRE); return data; } inline jshort OrderAccess::load_acquire(const volatile jshort* p) { jshort data; __atomic_load(p, &data, __ATOMIC_ACQUIRE); return data; } ... As regards the void* flavour which /does/ blow up, the problem does not seem strictly to be the constness but the fact that you are trying to pass a void* in a context where a void** is really needed. The const and volatile qualifiers simply get in the way of making that recast work in one go. I did manage to get it to compile with two casts: inline void* OrderAccess::load_ptr_acquire(const volatile void* p) { void* data; __atomic_load(static_cast<void* volatile*>(const_cast<volatile void*>(p)), &data, __ATOMIC_ACQUIRE); return data; } i.e. if you cast away the constness as a prior step then you can happily static_cast the resulting (volatile void*) to a (void * volatile *). Of course, I'll happily bow to your superior knowledge of C++ if you think this is wrong. 
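The two-step cast described above can be exercised standalone. The function name below is an illustrative stand-in for the real OrderAccess member: const_cast strips only the constness, leaving a plain volatile void*, which static_cast may then legally convert to the void* volatile* that __atomic_load wants. A single const_cast cannot do both, because it can only change cv-qualifiers, not turn a void* into a void**.

```cpp
#include <cassert>

// Sketch of the two-step cast (hypothetical helper, not the actual
// orderAccess_linux_aarch64.inline.hpp code). The qualifier change and the
// pointee-type change are done in separate steps so each cast stays legal.
static void* load_ptr_acquire_sketch(const volatile void* p) {
  void* data;
  __atomic_load(static_cast<void* volatile*>(const_cast<volatile void*>(p)),
                &data, __ATOMIC_ACQUIRE);
  return data;
}
```

Compiled with g++ or clang++, this accepts a const-qualified source pointer without any warning, which is the behavior the webrev is after.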
> Oddly enough, I have just posted a fix to jdk10-dev (I forwarded the > original note to hotspot-dev but the discussion now seems to be > progressing in jdk10-dev -- apologies) that relates to a use of this > code and to the correct declaration of volatile fields that store > pointers vs fields that store volatile pointers. It might be worth > looking at that to see if it bears on this issue. Please ignore that last remark. The problem was only in jdk8 not jdk9/10. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From ramkri123 at gmail.com Thu May 25 16:01:03 2017 From: ramkri123 at gmail.com (Ram Krishnan) Date: Thu, 25 May 2017 09:01:03 -0700 Subject: output of jstack command In-Reply-To: <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com> References: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com> <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com> Message-ID: Hi Kirk, Daniel, Many thanks for the immediate reply. By type of thread, I meant GC thread vs normal thread. Looks like that information is already there in the thread name. Looks like in Java 9 the output of jstack is different, so the tools need to change for Java 9. Is it fair to count on a consistent format per Java release for jstack? Thanks, Ramki On Thu, May 25, 2017 at 8:07 AM, Kirk Pepperdine wrote: > Hi Ramki, > > The source for jstack is in openJDK. Feel free to create your own copy of > jstack where you can output the information in any format you like. If you > are suggesting that the existing format be changed, do be aware that there > are many tools that expect the current format. These have been adjusted to > a change in format that was introduced with Java 8. I don't see any reason > why the format shouldn't include information that is currently missing and > is relevant. 
However I'd want to make sure that it is relevant and > important before breaking the tool chain once again. > > I believe thread ids are already in the header. Certainly thread names are > there. Not sure what you mean by types of threads. > > Kind regards, > Kirk > > On May 25, 2017, at 4:59 PM, Daniel D. Daugherty < > daniel.daugherty at oracle.com> wrote: > > > > Adding serviceability-dev at ... since jstack is a Serviceability tool. > > > > I believe jstack is experimental which means the output format can > > change at any time... > > > > Dan > > > > On 5/25/17 8:35 AM, Ram Krishnan wrote: > >> Hi, > >> > >> I would like to leverage the output of jstack command for extracting > >> additional information about the type of threads, thread ids etc. Since > I > >> will be parsing the output, I need the precise format. Is there any > >> documentation on jstack output format changes and the openjdk release(s) > >> where the changes happened? > >> > >> Thanks in advance. > >> > > > > -- Thanks, Ramki From adinn at redhat.com Thu May 25 16:25:38 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 17:25:38 +0100 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> Message-ID: <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> Apologies but this RFR is retracted -- the problem only applies to jdk8. I will be posting a revised RFR to jdk8u. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander On 25/05/17 14:16, Andrew Dinn wrote: > Forwarding this to hotspot-dev which is probably the more appropriate > destination. 
> > > -------- Forwarded Message -------- > Subject: RFR: 8181085: Race condition in method resolution may produce > spurious NullPointerException > Date: Thu, 25 May 2017 14:12:53 +0100 > From: Andrew Dinn > To: jdk10-dev > > The following webrev fixes a race condition that is present in jdk10 and > also jdk9 and jdk8. It is caused by a misplaced volatile keyword that > failed to ensure correct ordering of writes by the compiler. Reviews welcome. > > http://cr.openjdk.java.net/~adinn/8181085/webrev.00/ > > Backporting: > This same fix is required in jdk9 and jdk8. > > Testing: > The reproducer posted with the original issue manifests the NPE reliably > on jdk8. It does not manifest on jdk9/10 but that is only thanks to > changes introduced into the resolution process in jdk9 which change the > timing of execution. However, without this fix the out-of-order write > problem is still present in jdk9/10, as can be seen by eyeballing the > compiled code for ConstantPoolCacheEntry::set_direct_or_vtable_call. > > The patch has been validated on jdk8 by running the reproducer. It stops > any resulting NPEs. > > The code for ConstantPoolCacheEntry::set_direct_or_vtable_call on > jdk8-10 has been eyeballed to ensure that post-patch the assignments now > occur in the correct order. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > > From daniel.daugherty at oracle.com Thu May 25 16:48:54 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Thu, 25 May 2017 10:48:54 -0600 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: On 5/18/17 12:25 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 > > webrevs: > > Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ General comment(s): - Sometimes you've updated the copyright year for the file and sometimes you haven't. Please check before pushing. common/autoconf/flags.m4 No comments. common/autoconf/generated-configure.sh No comments. > hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ src/os/aix/vm/os_aix.cpp No comments; did not try to compare deleted code with os_posix.cpp. src/os/aix/vm/os_aix.hpp No comments; did not try to compare deleted code with os_posix.hpp. src/os/bsd/vm/os_bsd.cpp No comments; compared deleted code with os_posix.cpp version; nothing jumped out as wrong. src/os/bsd/vm/os_bsd.hpp No comments; compared deleted code with os_posix.hpp version; nothing jumped out as wrong. src/os/linux/vm/os_linux.cpp No comments; compared deleted code with os_posix.cpp version; nothing jumped out as wrong. src/os/linux/vm/os_linux.hpp No comments; compared deleted code with os_posix.hpp version; nothing jumped out as wrong. src/os/posix/vm/os_posix.cpp L1401: // Not currently usable by Solaris L1408: // time-of-day clock nit - needs period at end of the sentence L1433: // build time support then there can not be typo - "can not" -> "cannot" L1435: // int or int64_t. typo - needs a ')' before the period. L1446: // determine what POSIX API's are present and do appropriate L1447: // configuration nits - 'determine' -> 'Determine' - needs period at end of the sentence L1455: // 1. 
Check for CLOCK_MONOTONIC support nit - needs period at end of the sentence L1462: // we do dlopen's in this particular order due to bug in linux L1463: // dynamical loader (see 6348968) leading to crash on exit nits - 'we' -> 'We' - needs period at end of the sentence typo - 'dynamical' -> 'dynamic' L1481: // we assume that if both clock_gettime and clock_getres support L1482: // CLOCK_MONOTONIC then the OS provides true high-res monotonic clock nits - 'we' -> 'We' - needs period at end of the sentence L1486: clock_gettime_func(CLOCK_MONOTONIC, &tp) == 0) { nit - extra space before '==' L1487: // yes, monotonic clock is supported nits - 'yes' -> 'Yes' - needs period at end of the sentence L1491: // close librt if there is no monotonic clock nits - 'close' -> 'Close' - needs period at end of the sentence L1499: // 2. Check for pthread_condattr_setclock support L1503: // libpthread is already loaded L1511: // Now do general initialization nit - needs period at end of the sentence L1591: if (timeout < 0) L1592: timeout = 0; nit - missing braces L1609: // More seconds than we can add, so pin to max_secs L1658: // More seconds than we can add, so pin to max_secs nit - needs period at end of the sentence L1643: // Absolue seconds exceeds allow max, so pin to max_secs typo - 'Absolue' -> 'Absolute' nit - needs period at end of the sentence src/os/posix/vm/os_posix.hpp L149: ~PlatformEvent() { guarantee(0, "invariant"); } L185: ~PlatformParker() { guarantee(0, "invariant"); } nit - '0' should be 'false' or just call fatal() src/os/solaris/vm/os_solaris.cpp No comments. src/os/solaris/vm/os_solaris.hpp No comments. As Robbin said, this is very hard to review and be sure that everything is relocated correctly. I tried to look at this code a couple of different ways and nothing jumped out at me as wrong. I did my usual crawl style review through posix.cpp and posix.hpp. I only found nits and typos that you can choose to ignore since you're on a time crunch here. Thumbs up! 
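The timeout arithmetic that several of the review comments above touch on (clamping negative timeouts, pinning seconds to a maximum, carrying nanoseconds into the seconds field) can be sketched as one standalone helper. The function name and the exact clamp value are illustrative assumptions, not the real os_posix.cpp code:

```cpp
#include <cassert>
#include <limits>
#include <time.h>

// Illustrative sketch of converting a relative timeout in milliseconds to
// an absolute CLOCK_MONOTONIC deadline (hypothetical helper, not HotSpot's
// actual code). Seconds are pinned so the addition can never overflow time_t.
static void to_abstime(timespec* abstime, long timeout_ms) {
  const long NANOS_PER_MILLI = 1000000;
  const long NANOS_PER_SEC   = 1000000000;
  // Clamp point chosen for the sketch: half of time_t's range leaves headroom.
  const time_t MAX_SECS = std::numeric_limits<time_t>::max() / 2;

  if (timeout_ms < 0) timeout_ms = 0;  // treat a negative timeout as "now"

  timespec now;
  clock_gettime(CLOCK_MONOTONIC, &now);

  time_t secs = timeout_ms / 1000;
  long nanos  = (timeout_ms % 1000) * NANOS_PER_MILLI;

  if (secs >= MAX_SECS - now.tv_sec) {
    abstime->tv_sec  = MAX_SECS;       // more seconds than we can add: pin
    abstime->tv_nsec = 0;
  } else {
    abstime->tv_sec  = now.tv_sec + secs;
    abstime->tv_nsec = now.tv_nsec + nanos;
    if (abstime->tv_nsec >= NANOS_PER_SEC) {  // carry into the seconds field
      abstime->tv_sec  += 1;
      abstime->tv_nsec -= NANOS_PER_SEC;
    }
  }
}
```

For the deadline to be interpreted against the monotonic clock, the condition variable must have been initialized with pthread_condattr_setclock(CLOCK_MONOTONIC), which is exactly the capability the reviewed code probes for at build and run time.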
Dan > > First a big thank you to Thomas Stuefe for testing various versions of > this on AIX. > > This is primarily a refactoring and cleanup exercise (ie lots of > deleted duplicated code!). > > I have taken the PlatformEvent, PlatformParker and Parker::* code, out > of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX > and perhaps one day Solaris (more on that later). > > The Linux code was the most functionally complete, dealing with > correct use of CLOCK_MONOTONIC for relative timed waits, and the > default wall-clock for absolute timed waits. That functionality is > not, unfortunately, supported by all our POSIX platforms so there are > some configure time build checks to set some #defines, and then some > dynamic lookup at runtime**. We allow for the runtime environment to > be less capable than the build environment, but not the other way > around (without build time support we don't know the runtime types > needed to make library calls). > > ** There is some duplication of dynamic lookup code on Linux but this > can be cleaned up in future work if we refactor the time/clock code > into os_posix as well. > > The cleanup covers a number of things: > - removal of linux anachronisms that got "ported" into the other > platforms > - eg EINTR can not be returned from the wait methods > - removal of solaris anachronisms that got ported into the linux code > and then on to other platforms > - eg ETIMEDOUT is what we expect never ETIME > - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() > from the Parker methods > - consolidation of unpackTime and compute_abstime into one utility > function > - use statics for things completely private to the implementation > rather than making them part of the os* API (eg access to condAttr > objects) > - cleanup up commentary and style within methods of the same class > - clean up coding style in places eg not using Names that start with > capitals. 
> > I have not tried to cleanup every single oddity, nor tried to > reconcile differences between the very similar in places PlatformEvent > and Park methods. For example PlatformEvent still examines the > FilterSpuriousWakeups** flag, and Parker still ignores it. > > ** Perhaps a candidate for deprecation and future removal. > > There is one mini "enhancement" slipped in this. I now explicitly > initialize mutexes with a mutexAttr object with its type set to > PTHREAD_MUTEX_NORMAL, instead of relying on the definition of > PTHREAD_MUTEX_DEFAULT. On FreeBSD the default is not "normal" but > "error checking" and so is slow. On all other current platforms there > is no effective change. > > Finally, Solaris is excluded from all this (other than the debug > signal blocking cleanup) because it potentially supports three > different low-level sync subsystems: UI thr*, Pthread, and direct LWP > sync. Solaris cleanup would be a separate RFE. > > No doubt I've overlooked mentioning something that someone will spot. :) > > Thanks, > David > From kirk.pepperdine at gmail.com Thu May 25 18:06:44 2017 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Thu, 25 May 2017 20:06:44 +0200 Subject: output of jstack command In-Reply-To: References: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com> <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com> Message-ID: Hi, The format of thread dumps has been stable until 8 when it seems that we stopped caring about breaking tooling. That it changed in 9 is of no surprise. Again, pain we're willing to accept if there is valuable information being added. Kind regards, Kirk > On May 25, 2017, at 6:01 PM, Ram Krishnan wrote: > > Hi Kirk, Daniel, > > Many thanks for the immediate reply. > > By type of thread, I meant GC thread vs normal thread. Looks like that information is already there in the thread name. > > Looks like in Java 9 the output of jstack is different, so the tools need to change for Java 9. 
> > Is it fair to count on a consistent format per Java release for jstack? > > Thanks, > Ramki > > On Thu, May 25, 2017 at 8:07 AM, Kirk Pepperdine > wrote: > Hi Ramki, > > The source for jstack is in openJDK. Feel free to create your own copy of jstack where you can output the information in any format you like. If you are suggesting that the existing format be changed, do be aware that there are many tools that expect the current format. These have been adjusted to a change in format that was introduced with Java 8. I don't see any reason why the format shouldn't include information that is currently missing and is relevant. However I'd want to make sure that it is relevant and important before breaking the tool chain once again. > > I believe thread ids are already in the header. Certainly thread names are there. Not sure what you mean by types of threads. > > Kind regards, > Kirk > > On May 25, 2017, at 4:59 PM, Daniel D. Daugherty > wrote: > > > > Adding serviceability-dev at ... since jstack is a Serviceability tool. > > > > I believe jstack is experimental which means the output format can > > change at any time... > > > > Dan > > > > On 5/25/17 8:35 AM, Ram Krishnan wrote: > >> Hi, > >> > >> I would like to leverage the output of jstack command for extracting > >> additional information about the type of threads, thread ids etc. Since I > >> will be parsing the output, I need the precise format. Is there any > >> documentation on jstack output format changes and the openjdk release(s) > >> where the changes happened? > >> > >> Thanks in advance. 
> >> > > > > > > > -- > Thanks, > Ramki From martinrb at google.com Thu May 25 18:55:58 2017 From: martinrb at google.com (Martin Buchholz) Date: Thu, 25 May 2017 11:55:58 -0700 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> Message-ID: [+jdk8u-dev] We've been hunting the elusive spurious NPEs as well; the following seems to be working for us (but we don't have any small repro recipe); something like this should be put into jdk8: --- hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp 2016-11-22 15:30:39.000000000 -0800 +++ hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp 2017-04-27 18:12:33.000000000 -0700 @@ -32,6 +32,11 @@ // Implementation of class OrderAccess. +// A compiler barrier, forcing the C++ compiler to invalidate all memory assumptions +static inline void compiler_barrier() { + __asm__ volatile ("" : : : "memory"); +} + inline void OrderAccess::loadload() { acquire(); } inline void OrderAccess::storestore() { release(); } inline void OrderAccess::loadstore() { acquire(); } @@ -47,9 +52,7 @@ } inline void OrderAccess::release() { - // Avoid hitting the same cache-line from - // different threads. 
- volatile jint local_dummy = 0; + compiler_barrier(); } inline void OrderAccess::fence() { @@ -63,34 +66,34 @@ } } -inline jbyte OrderAccess::load_acquire(volatile jbyte* p) { return *p; } -inline jshort OrderAccess::load_acquire(volatile jshort* p) { return *p; } -inline jint OrderAccess::load_acquire(volatile jint* p) { return *p; } -inline jlong OrderAccess::load_acquire(volatile jlong* p) { return Atomic::load(p); } -inline jubyte OrderAccess::load_acquire(volatile jubyte* p) { return *p; } -inline jushort OrderAccess::load_acquire(volatile jushort* p) { return *p; } -inline juint OrderAccess::load_acquire(volatile juint* p) { return *p; } -inline julong OrderAccess::load_acquire(volatile julong* p) { return Atomic::load((volatile jlong*)p); } -inline jfloat OrderAccess::load_acquire(volatile jfloat* p) { return *p; } -inline jdouble OrderAccess::load_acquire(volatile jdouble* p) { return jdouble_cast(Atomic::load((volatile jlong*)p)); } - -inline intptr_t OrderAccess::load_ptr_acquire(volatile intptr_t* p) { return *p; } -inline void* OrderAccess::load_ptr_acquire(volatile void* p) { return *(void* volatile *)p; } -inline void* OrderAccess::load_ptr_acquire(const volatile void* p) { return *(void* const volatile *)p; } - -inline void OrderAccess::release_store(volatile jbyte* p, jbyte v) { *p = v; } -inline void OrderAccess::release_store(volatile jshort* p, jshort v) { *p = v; } -inline void OrderAccess::release_store(volatile jint* p, jint v) { *p = v; } -inline void OrderAccess::release_store(volatile jlong* p, jlong v) { Atomic::store(v, p); } -inline void OrderAccess::release_store(volatile jubyte* p, jubyte v) { *p = v; } -inline void OrderAccess::release_store(volatile jushort* p, jushort v) { *p = v; } -inline void OrderAccess::release_store(volatile juint* p, juint v) { *p = v; } -inline void OrderAccess::release_store(volatile julong* p, julong v) { Atomic::store((jlong)v, (volatile jlong*)p); } -inline void OrderAccess::release_store(volatile 
jfloat* p, jfloat v) { *p = v; } +inline jbyte OrderAccess::load_acquire(volatile jbyte* p) { jbyte v = *p; compiler_barrier(); return v; } +inline jshort OrderAccess::load_acquire(volatile jshort* p) { jshort v = *p; compiler_barrier(); return v; } +inline jint OrderAccess::load_acquire(volatile jint* p) { jint v = *p; compiler_barrier(); return v; } +inline jlong OrderAccess::load_acquire(volatile jlong* p) { jlong v = Atomic::load(p); compiler_barrier(); return v; } +inline jubyte OrderAccess::load_acquire(volatile jubyte* p) { jubyte v = *p; compiler_barrier(); return v; } +inline jushort OrderAccess::load_acquire(volatile jushort* p) { jushort v = *p; compiler_barrier(); return v; } +inline juint OrderAccess::load_acquire(volatile juint* p) { juint v = *p; compiler_barrier(); return v; } +inline julong OrderAccess::load_acquire(volatile julong* p) { julong v = Atomic::load((volatile jlong*)p); compiler_barrier(); return v; } +inline jfloat OrderAccess::load_acquire(volatile jfloat* p) { jfloat v = *p; compiler_barrier(); return v; } +inline jdouble OrderAccess::load_acquire(volatile jdouble* p) { jdouble v = jdouble_cast(Atomic::load((volatile jlong*)p)); compiler_barrier(); return v; } + +inline intptr_t OrderAccess::load_ptr_acquire(volatile intptr_t* p) { intptr_t v = *p; compiler_barrier(); return v; } +inline void* OrderAccess::load_ptr_acquire(volatile void* p) { void* v = *(void* volatile *)p; compiler_barrier(); return v; } +inline void* OrderAccess::load_ptr_acquire(const volatile void* p) { void* v = *(void* const volatile *)p; compiler_barrier(); return v; } + +inline void OrderAccess::release_store(volatile jbyte* p, jbyte v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile jshort* p, jshort v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile jint* p, jint v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile jlong* p, jlong v) { compiler_barrier(); 
Atomic::store(v, p); } +inline void OrderAccess::release_store(volatile jubyte* p, jubyte v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile jushort* p, jushort v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile juint* p, juint v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store(volatile julong* p, julong v) { compiler_barrier(); Atomic::store((jlong)v, (volatile jlong*)p); } +inline void OrderAccess::release_store(volatile jfloat* p, jfloat v) { compiler_barrier(); *p = v; } inline void OrderAccess::release_store(volatile jdouble* p, jdouble v) { release_store((volatile jlong *)p, jlong_cast(v)); } -inline void OrderAccess::release_store_ptr(volatile intptr_t* p, intptr_t v) { *p = v; } -inline void OrderAccess::release_store_ptr(volatile void* p, void* v) { *(void* volatile *)p = v; } +inline void OrderAccess::release_store_ptr(volatile intptr_t* p, intptr_t v) { compiler_barrier(); *p = v; } +inline void OrderAccess::release_store_ptr(volatile void* p, void* v) { compiler_barrier(); *(void* volatile *)p = v; } inline void OrderAccess::store_fence(jbyte* p, jbyte v) { __asm__ volatile ( "xchgb (%2),%0" On Thu, May 25, 2017 at 9:25 AM, Andrew Dinn wrote: > Apologies but this RFR is retracted -- the problem only applies to jdk8. > > I will be posting a revised RFR to jdk8u. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > > On 25/05/17 14:16, Andrew Dinn wrote: > > Forwarding this to hotpsot-dev which is probably the more appropriate > > destination. 
> > > > > -------- Forwarded Message -------- > > Subject: RFR: 8181085: Race condition in method resolution may produce > > spurious NullPointerException > > Date: Thu, 25 May 2017 14:12:53 +0100 > > From: Andrew Dinn > > To: jdk10-dev > > > > The following webrev fixes a race condition that is present in jdk10 and > > also jdk9 and jdk8. It is caused by a misplaced volatile keyword that > > failed to ensure correct ordering of writes by the compiler. Reviews > welcome. > > > > http://cr.openjdk.java.net/~adinn/8181085/webrev.00/ > > > > Backporting: > > This same fix is required in jdk9 and jdk8. > > > > Testing: > > The reproducer posted with the original issue manifests the NPE reliably > > on jdk8. It does not manifest on jdk9/10 but that is only thanks to > > changes introduced into the resolution process in jdk9 which change the > > timing of execution. However, without this fix the out-of-order write > > problem is still present in jdk9/10, as can be seen by eyeballing the > > compiled code for ConstantPoolCacheEntry::set_direct_or_vtable_call. > > > > The patch has been validated on jdk8 by running the reproducer. It stops > > any resulting NPEs. > > > > The code for ConstantPoolCacheEntry::set_direct_or_vtable_call on > > jdk8-10 has been eyeballed to ensure that post-patch the assignments now > > occur in the correct order. > > > > regards, > > > > > > Andrew Dinn > > ----------- > > Senior Principal Software Engineer > > Red Hat UK Ltd > > Registered in England and Wales under Company Registration No. 
03798903 > > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > > > > > > From kim.barrett at oracle.com Thu May 25 19:47:02 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 May 2017 15:47:02 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> References: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> Message-ID: > On May 25, 2017, at 11:54 AM, Andrew Dinn wrote: > > Ok, so I have several interesting things to report. Thanks for trying this out, and apologies for the blunders. I got the impression that casting away constness would be needed here from the documentation, which has the signature as __atomic_load(type *ptr, type *ret, int memorder) The lack of any mention of const, and that "type" can't be optionally const in both places (since ret *is* written to), led me to that apparently erroneous conclusion. I could have saved us both some trouble if I'd thought of simply experimenting with it on some platform that I *do* have access to. A little experimenting today (on x86_64) suggests the documentation is misleading. Sorry about that. Regarding the load_ptr_acquire problem, that's happening because const_cast can *only* modify the cv-qualifiers, and can't affect the non-cv part of the type. So my error here. In my (weak) defense, I try not to write code that needs const_cast, so haven't used it much and simply forgot its protocol. And this is another case I could have found by experimenting on another platform. It looks like I also forgot to remove one of the now duplicate load_ptr_acquire variants for aarch64 too. Strangely, what should be an equivalent variation of your suggested load_ptr generates a spurious compiler warning when I try it with gcc4.9 on x86_64. 
If the static_cast is captured in a variable, as in: void* foo(const volatile void* p) { void* data; void* const volatile* pp = static_cast<void* const volatile*>(p); __atomic_load(pp, &data, __ATOMIC_ACQUIRE); return data; } I get this warning: warning: variable 'pp' set but not used [-Wunused-but-set-variable] although the proper code gets generated. I have no explanation for this. But it doesn't matter for now, since we don't need to use that form. In fact, the old version of load_ptr_acquire(const volatile void*) works fine, so changes to it have been reverted. I've only removed the unneeded load_ptr_acquire(volatile void*) overload. And regarding my question about __atomic_load vs __atomic_load_n, at least for x86_64 the same code gets generated. Probably the same is true for aarch64. So here's the new webrev. The only changes are in orderAccess_linux_aarch64.inline.hpp, which I can't test. Hopefully I've not made any more blunders. http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.01/ From kim.barrett at oracle.com Thu May 25 19:51:07 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 May 2017 15:51:07 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> References: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> Message-ID: <2DA07E10-07D5-4B9F-B29A-D62161EB16B6@oracle.com> > On May 25, 2017, at 4:44 AM, David Holmes wrote: > > Hi Kim, > > On 25/05/2017 9:42 AM, Kim Barrett wrote: >> Please review this change to Atomic::load and OrderAccess::load_acquire >> overloads to make their source const qualified, e.g. instead of >> "volatile T*" make them "const volatile T*". This eliminates the need >> for casting away const when, for example, applying one of these >> operations to a member variable when in a const-qualified method. > > This looks quite reasonable - thanks - provided ... 
> >> There are probably places that previously required casting away const >> but now do not. Similarly, there are probably places where values > > ... our compilers do not complain about unnecessary casts :) We're in serious trouble if that starts happening; the number of unnecessary casts in our code seems to be legion :( Of course, I have this pipe-dream of someday turning on -Wold-style-cast :) From kim.barrett at oracle.com Thu May 25 19:51:32 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 May 2017 15:51:32 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <27eb0395-c73f-abae-9727-1dc2c5ec8dfb@oracle.com> References: <14874294-a199-62dd-a343-54be6ad956b9@oracle.com> <27eb0395-c73f-abae-9727-1dc2c5ec8dfb@oracle.com> Message-ID: > On May 25, 2017, at 7:43 AM, coleen.phillimore at oracle.com wrote: > > Looks good. > Coleen Thanks. > > On 5/25/17 4:44 AM, David Holmes wrote: >> Hi Kim, >> >> On 25/05/2017 9:42 AM, Kim Barrett wrote: >>> Please review this change to Atomic::load and OrderAccess::load_acquire >>> overloads to make their source const qualified, e.g. instead of >>> "volatile T*" make them "const volatile T*". This eliminates the need >>> for casting away const when, for example, applying one of these >>> operations to a member variable when in a const-qualified method. >> >> This looks quite reasonable - thanks - provided ... >> >>> There are probably places that previously required casting away const >>> but now do not. Similarly, there are probably places where values >> >> ... our compilers do not complain about unnecessary casts :) >> >> Cheers, >> David >> >> >>> couldn't be const or member functions couldn't be const qualified, but >>> now can be. I did a little searching and found a few candidates, but >>> none that were otherwise trivial to add to this change, so haven't >>> included any. 
>>> >>> This change touches platform-specific code for non-Oracle supported >>> platforms that I can't test, so I'd like reviews from the respective >>> platform owners. >>> >>> Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't >>> const-qualify the source argument; that seems like a bug. Or maybe >>> they are, but not documented that way. And I wonder why the aarch64 >>> port uses __atomic_load rather than __atomic_load_n. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8166651 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 >>> >>> Testing: >>> JPRT From mikael.vidstedt at oracle.com Thu May 25 22:00:38 2017 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 25 May 2017 15:00:38 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <6a0f369e-4d34-02dc-653d-90a8aa19b901@oracle.com> References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> <6a0f369e-4d34-02dc-653d-90a8aa19b901@oracle.com> Message-ID: I've been spending the last few days going down a rabbit hole of what turned out to be a totally unrelated performance issue. Long story short: startup time is affected, in some cases significantly, by the length of the path to the JDK in the file system. More on that in a separate thread/at another time. After having looked at generated code, and having run benchmarks stressing class loading/startup time my conclusion is that this change is performance neutral. For example, the alignment check introduced in bytes_x86.hpp get_native/put_native collapses down to a single unconditional load unless, of course, it's done in a loop in which case it gets unrolled+vectorized. I also ran hs-tier2, which should more than cover the changes in question, and there were no failures. 
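The bytes_x86.hpp get_native pattern discussed here can be sketched as follows (a simplification for illustration — modeled on, not copied from, the HotSpot code): read through the pointer when it is suitably aligned, fall back to memcpy otherwise. With optimization on x86_64 both arms typically collapse to a single load, consistent with the performance-neutral result reported above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// True if p is a multiple of 'alignment' bytes (alignment must be a
// power of two).
static inline bool is_ptr_aligned(const void* p, size_t alignment) {
  return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Read a T from a possibly unaligned address. The aligned fast path
// is a plain load (which x86 performs as a single access); the memcpy
// fallback is safe for any address but makes no atomicity promise --
// the word-tearing concern raised later in this thread.
template <typename T>
static inline T get_native(const void* p) {
  T x;
  if (is_ptr_aligned(p, sizeof(T))) {
    x = *static_cast<const T*>(p);
  } else {
    memcpy(&x, p, sizeof(T));
  }
  return x;
}
```

A typical use would be reading a u4 at an odd offset into a class-file byte buffer; both paths return the same value, the difference is only in the generated access pattern.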
With that in mind I would like to push the change in its current form[1] and handle a few things as follow-up work (roughly in order): * Introduce typedefs in classFileParser for potentially unaligned pointer types * Always using memcpy to do the read - need to investigate how the primitives are used wrt. tearing * Unify the Bytes::* impl across platforms - need to investigate/verify the implications on performance Reasonable? Cheers, Mikael [1] http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > On May 18, 2017, at 5:18 PM, David Holmes wrote: > > On 19/05/2017 9:19 AM, Mikael Vidstedt wrote: >> >>> On May 18, 2017, at 3:50 PM, David Holmes >> > wrote: >>> >>> Hi Mikael, >>> >>> On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: >>>> >>>>> On May 18, 2017, at 2:59 AM, Robbin Ehn >>>> > wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt >>>>>>> > >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Warning: It may be wise to stock up on coffee or tea before >>>>>>> reading this. >>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>>>>>> >>>>>> Not a review, just a question. >>>>>> ------------------------------------------------------------------------------ >>>>>> src/cpu/x86/vm/bytes_x86.hpp >>>>>> 40 template <typename T> >>>>>> 41 static inline T get_native(const void* p) { >>>>>> 42 assert(p != NULL, "null pointer"); >>>>>> 43 >>>>>> 44 T x; >>>>>> 45 >>>>>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>>>>> 47 x = *(T*)p; >>>>>> 48 } else { >>>>>> 49 memcpy(&x, p, sizeof(T)); >>>>>> 50 } >>>>>> 51 >>>>>> 52 return x; >>>>>> I'm looking at this and wondering if there's a good reason to not just >>>>>> unconditionally use memcpy here. gcc -O will generate a single move >>>>>> instruction for that on x86_64. 
I'm not sure what happens on 32bit >>>>>> with an 8 byte value, but I suspect it will do something similarly >>>>>> sensible, e.g. 2 4 byte memory to memory transfers. >>>>> >>>>> Unconditionally memcpy would be nice! >>>>> >>>>> Are you going to look into that Mikael? >>>> >>>> It's complicated... >>>> >>>> We may be able to switch, but there is (maybe) a subtle reason why >>>> the alignment check is in there: to avoid word tearing.. >>>> >>>> Think of two threads racing: >>>> >>>> * thread 1 is writing to the memory location X >>>> * thread 2 is reading from the same memory location X >>>> >>>> Will thread 2 always see a consistent value (either the original >>>> value or the fully updated value)? >>> >>> We're talking about internal VM loads and stores, right? For those we >>> need to use appropriate atomic routine if there are potential races. >>> But we should never be mixing these kind of accesses with Java level >>> field accesses - that would be very broken. >> >> That seems reasonable, but for my untrained eye it's not trivially true >> that relaxing the implementation is correct for all the uses of the >> get/put primitives. I am therefore a bit reluctant to do so without >> understanding the implications. > > If a Copy routine doesn't have Atomic in its name then I don't expect atomicity. Even then unaligned accesses are not atomic even in the Atomic routine! > > But I'm not clear exactly how all these routines get used. > >>> For classFileParser we should have no concurrency issues. >> >> That seems reasonable. What degree of certainty does your 'should' come >> with? :) > > Pretty high. We're parsing a stream of bytes and writing values into local structures that will eventually be passed across to a klass instance, which in turn will eventually be published via the SD as a loaded class. The actual parsing phase is purely single-threaded. 
> > David > >> Cheers, >> Mikael >> >>> >>> David >>> >>>> In the unaligned/memcpy case I think we can agree that there's >>>> nothing preventing the compiler from doing individual loads/stores of >>>> the bytes making up the data. Especially in something like slowdebug >>>> that becomes more or less obvious - memcpy most likely isn't >>>> intrinsified and is quite likely just copying a byte at a time. Given >>>> that the data is, in fact, unaligned, there is really no simple way >>>> to prevent word tearing, so I'm pretty sure that we never depend on >>>> it - if needed, we're likely to already have some higher level >>>> synchronization in place guarding the accesses. And the fact that the >>>> other, non-x86 platforms already do individual byte loads/stores when >>>> the pointer is unaligned is a further indication that >>>> that's the case. >>>> >>>> However, the aligned case is where stuff gets more interesting. I >>>> don't think the C/C++ spec guarantees that accessing a memory >>>> location using a pointer of type T will result in code which does a >>>> single load/store of size >= sizeof(T), but for all the compilers we >>>> *actually* use that's likely to be the case. If it's true that the >>>> compilers don't split the memory accesses, that means we won't have >>>> word tearing when using the Bytes::get/put methods with *aligned* >>>> pointers. >>>> >>>> If I switch to always using memcpy, there's a risk that it introduces >>>> tearing problems where earlier we had none. Two questions come to mind: >>>> >>>> * For the cases where the get/put methods get used *today*, is that a >>>> problem? >>>> * What happens if somebody in the *future* decides that put_Java_u4 >>>> seems like a great thing to use to write to a Java int field on the >>>> Java heap, and a Java thread is racing to read that same data? >>>> >>>> >>>> All that said though, I think this is worth exploring and it may well 
Also, I believe >>>> there may be opportunities to further clean up this code and perhaps >>>> unify it a bit across the various platforms. >>>> >>>> And *that* said, I think the change as it stands is still an >>>> improvement, so I?m leaning towards pushing it and filing an >>>> enhancement and following up on it separately. Let me know if you >>>> strongly feel that this should be looked into and addressed now and I >>>> may reconsider :) >>>> >>>> Cheers, >>>> Mikael >>>> >>>>> >>>>> /Robbin >>>>> >>>>>> ------------------------------------------------------------------------------ >> From john.r.rose at oracle.com Fri May 26 00:04:33 2017 From: john.r.rose at oracle.com (John Rose) Date: Thu, 25 May 2017 17:04:33 -0700 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> Message-ID: <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> On May 17, 2017, at 9:01 AM, coleen.phillimore at oracle.com wrote: > > Summary: Add a Java type called ResolvedMethodName which is immutable and can be stored in a hashtable, that is weakly collected by gc I'm looking at the 8174749.03/webrev version of your changes. A few comments: In the JVMCI changes, this line appears to be incorrect on 32-bit machines: + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(long.class)); (It's a pre-existing condition, and I'm not sure if it is a problem.) In the new hash table file, the parameter names seem like they could be made more consistent. 85 oop ResolvedMethodTable::basic_add(Method* method, oop vmtarget) { 114 oop ResolvedMethodTable::add_method(Handle mem_name_target) { I think vmtarget and mem_name_target are the same kind of thing. Consider renaming them to "entry_to_add" or something more aligned with the rest of the code. I don't think that MethodHandles::init_field_MemberName needs TRAPS. 
Also, MethodHandles::init_method_MemberName could omit TRAPS if it were passed the RMN pointer first. Suggestion: Remove TRAPS from both *and* add a trapping function which does the info->RMN step. static oop init_method_MemberName(Handle mname_h, CallInfo& info, oop resolved_method); static oop init_method_MemberName(Handle mname_h, CallInfo& info, TRAPS); Then the trapping overloading can pick up the RMN immediately from the info, and call the non-trapping overloading. The reason to do something indirect like this is that the existing code for init_method_MemberName is (a) complex and (b) non-trapping. Promoting it all to trapping makes it harder to work with. In other words, non-TRAPS code is (IMO) easier to read and reason about, so converting a big method to TRAPS for one line is something I'd like to avoid. At least, that's the way I thought about this particular code when I first wrote it. Better: Since init_m_MN is joined at the hip with CallInfo, consider adding the trapping operation to CallInfo. See patch below. I think that preserves CI's claim to be the Source of Truth for call sites, even in methodHandles.cpp. Thank you very much for this fix. I know it's been years since we started talking about it. I'm glad you let it bother you enough to fix it! I looked at everything else and didn't find anything out of place. Reviewed. -- 
John diff --git a/src/share/vm/interpreter/linkResolver.hpp b/src/share/vm/interpreter/linkResolver.hpp --- a/src/share/vm/interpreter/linkResolver.hpp +++ b/src/share/vm/interpreter/linkResolver.hpp @@ -56,6 +56,7 @@ int _call_index; // vtable or itable index of selected class method (if any) Handle _resolved_appendix; // extra argument in constant pool (if CPCE::has_appendix) Handle _resolved_method_type; // MethodType (for invokedynamic and invokehandle call sites) + Handle _resolved_method_name; // optional ResolvedMethodName object for java.lang.invoke void set_static(KlassHandle resolved_klass, const methodHandle& resolved_method, TRAPS); void set_interface(KlassHandle resolved_klass, KlassHandle selected_klass, @@ -97,6 +98,7 @@ methodHandle selected_method() const { return _selected_method; } Handle resolved_appendix() const { return _resolved_appendix; } Handle resolved_method_type() const { return _resolved_method_type; } + Handle resolved_method_name() const { return _resolved_method_name; } BasicType result_type() const { return selected_method()->result_type(); } CallKind call_kind() const { return _call_kind; } @@ -117,6 +119,12 @@ return _call_index; } + oop find_resolved_method_name(TRAPS) { + if (_resolved_method_name.is_null()) + java_lang_invoke_ResolvedMethodName::find_resolved_method(_resolved_method, CHECK_NULL); + return _resolved_method_name; + } + // debugging #ifdef ASSERT bool has_vtable_index() const { return _call_index >= 0 && _call_kind != CallInfo::itable_call; }
From david.holmes at oracle.com Fri May 26 00:39:19 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 10:39:19 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: Hi Dan, Thanks very much for the review. I will apply all the (mostly inherited) grammatical fixes to the comments, etc. What do you think about Robbin's suggested refactoring of the to_abstime logic? (http://cr.openjdk.java.net/~rehn/8174231/webrev/) Thanks, David On 26/05/2017 2:48 AM, Daniel D. Daugherty wrote: > On 5/18/17 12:25 AM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >> >> webrevs: >> >> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ > > General comment(s): > - Sometimes you've updated the copyright year for the file and > sometimes you haven't. 
Please check before pushing. > > > common/autoconf/flags.m4 > No comments. > > common/autoconf/generated-configure.sh > No comments. > > >> hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ > > src/os/aix/vm/os_aix.cpp > No comments; did not try to compare deleted code with os_posix.cpp. > > src/os/aix/vm/os_aix.hpp > No comments; did not try to compare deleted code with os_posix.hpp. > > src/os/bsd/vm/os_bsd.cpp > No comments; compared deleted code with os_posix.cpp version; nothing > jumped out as wrong. > > src/os/bsd/vm/os_bsd.hpp > No comments; compared deleted code with os_posix.hpp version; nothing > jumped out as wrong. > > src/os/linux/vm/os_linux.cpp > No comments; compared deleted code with os_posix.cpp version; nothing > jumped out as wrong. > > src/os/linux/vm/os_linux.hpp > No comments; compared deleted code with os_posix.hpp version; nothing > jumped out as wrong. > > src/os/posix/vm/os_posix.cpp > L1401: // Not currently usable by Solaris > L1408: // time-of-day clock > nit - needs period at end of the sentence > > L1433: // build time support then there can not be > typo - "can not" -> "cannot" > > L1435: // int or int64_t. > typo - needs a ')' before the period. > > L1446: // determine what POSIX API's are present and do appropriate > L1447: // configuration > nits - 'determine' -> 'Determine' > - needs period at end of the sentence > > L1455: // 1. 
Check for CLOCK_MONOTONIC support > nit - needs period at end of the sentence > > L1462: // we do dlopen's in this particular order due to bug in > linux > L1463: // dynamical loader (see 6348968) leading to crash on exit > nits - 'we' -> 'We' > - needs period at end of the sentence > > typo - 'dynamical' -> 'dynamic' > > L1481: // we assume that if both clock_gettime and clock_getres > support > L1482: // CLOCK_MONOTONIC then the OS provides true high-res > monotonic clock > nits - 'we' -> 'We' > - needs period at end of the sentence > > L1486: clock_gettime_func(CLOCK_MONOTONIC, &tp) == 0) { > nit - extra space before '==' > > L1487: // yes, monotonic clock is supported > nits - 'yes' -> 'Yes' > - needs period at end of the sentence > > L1491: // close librt if there is no monotonic clock > nits - 'close' -> 'Close' > - needs period at end of the sentence > > L1499: // 2. Check for pthread_condattr_setclock support > L1503: // libpthread is already loaded > L1511: // Now do general initialization > nit - needs period at end of the sentence > > L1591: if (timeout < 0) > L1592: timeout = 0; > nit - missing braces > > L1609: // More seconds than we can add, so pin to max_secs > L1658: // More seconds than we can add, so pin to max_secs > nit - needs period at end of the sentence > > L1643: // Absolue seconds exceeds allow max, so pin to > max_secs > typo - 'Absolue' -> 'Absolute' > nit - needs period at end of the sentence > > src/os/posix/vm/os_posix.hpp > L149: ~PlatformEvent() { guarantee(0, "invariant"); } > L185: ~PlatformParker() { guarantee(0, "invariant"); } > nit - '0' should be 'false' or just call fatal() > > src/os/solaris/vm/os_solaris.cpp > No comments. > > src/os/solaris/vm/os_solaris.hpp > No comments. > > > As Robbin said, this is very hard to review and be sure that everything > is relocated correctly. I tried to look at this code a couple of different > ways and nothing jumped out at me as wrong. 
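The "pin to max_secs" comments flagged above concern the overflow clamping done when a relative timeout is converted into an absolute deadline for pthread_cond_timedwait. A simplified sketch of that conversion (illustrative names and a hypothetical MAX_SECS value; the real code also chooses between CLOCK_MONOTONIC and wall-clock time, and handles absolute deadlines):

```cpp
#include <ctime>

static const long long NANOSECS_PER_SEC = 1000000000LL;

// Hypothetical clamp bound: ~3 years of seconds. A deadline that far
// out is effectively "wait forever", and pinning avoids time_t
// overflow when adding a huge timeout to the current time.
static const time_t MAX_SECS = 100000000;

// Convert a relative timeout in nanoseconds into an absolute deadline
// measured from 'now'.
static void to_abstime(timespec* abstime, const timespec& now,
                       long long timeout_nanos) {
  if (timeout_nanos < 0) timeout_nanos = 0;   // negative means "now"

  long long secs = timeout_nanos / NANOSECS_PER_SEC;
  if (secs >= (long long)MAX_SECS) {
    // More seconds than we can add, so pin to max_secs.
    abstime->tv_sec = now.tv_sec + MAX_SECS;
    abstime->tv_nsec = 0;
  } else {
    abstime->tv_sec = now.tv_sec + (time_t)secs;
    long long nanos = now.tv_nsec + (timeout_nanos % NANOSECS_PER_SEC);
    if (nanos >= NANOSECS_PER_SEC) {  // carry into the seconds field
      abstime->tv_sec += 1;
      nanos -= NANOSECS_PER_SEC;
    }
    abstime->tv_nsec = (long)nanos;
  }
}
```

The resulting timespec can be handed to pthread_cond_timedwait, which returns ETIMEDOUT once the absolute deadline passes.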
> > I did my usual crawl style review through posix.cpp and posix.hpp. I only > found nits and typos that you can choose to ignore since you're on a time > crunch here. > > Thumbs up! > > Dan > > > >> >> First a big thank you to Thomas Stuefe for testing various versions of >> this on AIX. >> >> This is primarily a refactoring and cleanup exercise (ie lots of >> deleted duplicated code!). >> >> I have taken the PlatformEvent, PlatformParker and Parker::* code, out >> of os_linux and moved it into os_posix for use by Linux, OSX, BSD, AIX >> and perhaps one day Solaris (more on that later). >> >> The Linux code was the most functionally complete, dealing with >> correct use of CLOCK_MONOTONIC for relative timed waits, and the >> default wall-clock for absolute timed waits. That functionality is >> not, unfortunately, supported by all our POSIX platforms so there are >> some configure time build checks to set some #defines, and then some >> dynamic lookup at runtime**. We allow for the runtime environment to >> be less capable than the build environment, but not the other way >> around (without build time support we don't know the runtime types >> needed to make library calls). >> >> ** There is some duplication of dynamic lookup code on Linux but this >> can be cleaned up in future work if we refactor the time/clock code >> into os_posix as well. 
>> >> The cleanup covers a number of things: >> - removal of linux anachronisms that got "ported" into the other >> platforms >> - eg EINTR can not be returned from the wait methods >> - removal of solaris anachronisms that got ported into the linux code >> and then on to other platforms >> - eg ETIMEDOUT is what we expect never ETIME >> - removal of the ancient/obsolete os::*::allowdebug_blocked_signals() >> from the Parker methods >> - consolidation of unpackTime and compute_abstime into one utility >> function >> - use statics for things completely private to the implementation >> rather than making them part of the os* API (eg access to condAttr >> objects) >> - clean up commentary and style within methods of the same class >> - clean up coding style in places eg not using Names that start with >> capitals. >> >> I have not tried to cleanup every single oddity, nor tried to >> reconcile differences between the very similar in places PlatformEvent >> and Park methods. For example PlatformEvent still examines the >> FilterSpuriousWakeups** flag, and Parker still ignores it. >> >> ** Perhaps a candidate for deprecation and future removal. >> >> There is one mini "enhancement" slipped in this. I now explicitly >> initialize mutexes with a mutexAttr object with its type set to >> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >> PTHREAD_MUTEX_DEFAULT. On FreeBSD the default is not "normal" but >> "error checking" and so is slow. On all other current platforms there >> is no effective change. >> >> Finally, Solaris is excluded from all this (other than the debug >> signal blocking cleanup) because it potentially supports three >> different low-level sync subsystems: UI thr*, Pthread, and direct LWP >> sync. Solaris cleanup would be a separate RFE. >> >> No doubt I've overlooked mentioning something that someone will spot. 
:) >> >> Thanks, >> David >> > From david.holmes at oracle.com Fri May 26 00:47:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 10:47:24 +1000 Subject: [XS] RFR : 8180945 : vmError.cpp : adjust dup and fclose In-Reply-To: <22d5e5a59fc7417eafef35c0468dafea@sap.com> References: <22d5e5a59fc7417eafef35c0468dafea@sap.com> Message-ID: <90a26f0c-c8b8-0c75-4c5d-5b447cefd5e2@oracle.com> Hi Matthias, On 25/05/2017 1:42 AM, Baesken, Matthias wrote: > Hello, > could I please have a review for the following small change. > In vmError.cpp there is a part where the dup return code in case of an error is not handled, and additionally fclose might be called with parameter NULL. The fix looks good to me. I will sponsor this for you. I think this constitutes a trivial fix so I will apply the one Reviewer rule. Thanks, David > The change adjusts this. > > Bug : > > https://bugs.openjdk.java.net/browse/JDK-8180945 > > webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8180945/ > > > Thanks, Matthias > From david.holmes at oracle.com Fri May 26 01:04:56 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 11:04:56 +1000 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> Message-ID: On 26/05/2017 4:55 AM, Martin Buchholz wrote: > [+jdk8u-dev] > > We've been hunting the elusive spurious NPEs as well; the following seems > to be working for us (but we don't have any small repro recipe); something > like this should be put into jdk8: In other words you want a backport of: 8061964: Insufficient compiler barriers for GCC in OrderAccess functions ? https://bugs.openjdk.java.net/browse/JDK-8061964 IIRC what stopped this from being an 'automatic' backport candidate was the potential problem of older gcc's needing to be validated. 
Cheers, David ----- > --- hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp 2016-11-22 > 15:30:39.000000000 -0800 > +++ hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp 2017-04-27 > 18:12:33.000000000 -0700 > @@ -32,6 +32,11 @@ > > // Implementation of class OrderAccess. > > +// A compiler barrier, forcing the C++ compiler to invalidate all memory > assumptions > +static inline void compiler_barrier() { > + __asm__ volatile ("" : : : "memory"); > +} > + > inline void OrderAccess::loadload() { acquire(); } > inline void OrderAccess::storestore() { release(); } > inline void OrderAccess::loadstore() { acquire(); } > @@ -47,9 +52,7 @@ > } > > inline void OrderAccess::release() { > - // Avoid hitting the same cache-line from > - // different threads. > - volatile jint local_dummy = 0; > + compiler_barrier(); > } > > inline void OrderAccess::fence() { > @@ -63,34 +66,34 @@ > } > } > > -inline jbyte OrderAccess::load_acquire(volatile jbyte* p) { return > *p; } > -inline jshort OrderAccess::load_acquire(volatile jshort* p) { return > *p; } > -inline jint OrderAccess::load_acquire(volatile jint* p) { return > *p; } > -inline jlong OrderAccess::load_acquire(volatile jlong* p) { return > Atomic::load(p); } > -inline jubyte OrderAccess::load_acquire(volatile jubyte* p) { return > *p; } > -inline jushort OrderAccess::load_acquire(volatile jushort* p) { return > *p; } > -inline juint OrderAccess::load_acquire(volatile juint* p) { return > *p; } > -inline julong OrderAccess::load_acquire(volatile julong* p) { return > Atomic::load((volatile jlong*)p); } > -inline jfloat OrderAccess::load_acquire(volatile jfloat* p) { return > *p; } > -inline jdouble OrderAccess::load_acquire(volatile jdouble* p) { return > jdouble_cast(Atomic::load((volatile jlong*)p)); } > - > -inline intptr_t OrderAccess::load_ptr_acquire(volatile intptr_t* p) { > return *p; } > -inline void* OrderAccess::load_ptr_acquire(volatile void* p) { > return *(void* volatile *)p; } > 
-inline void* OrderAccess::load_ptr_acquire(const volatile void* p) { > return *(void* const volatile *)p; } > - > -inline void OrderAccess::release_store(volatile jbyte* p, jbyte v) > { *p = v; } > -inline void OrderAccess::release_store(volatile jshort* p, jshort v) > { *p = v; } > -inline void OrderAccess::release_store(volatile jint* p, jint v) > { *p = v; } > -inline void OrderAccess::release_store(volatile jlong* p, jlong v) > { Atomic::store(v, p); } > -inline void OrderAccess::release_store(volatile jubyte* p, jubyte v) > { *p = v; } > -inline void OrderAccess::release_store(volatile jushort* p, jushort v) > { *p = v; } > -inline void OrderAccess::release_store(volatile juint* p, juint v) > { *p = v; } > -inline void OrderAccess::release_store(volatile julong* p, julong v) > { Atomic::store((jlong)v, (volatile jlong*)p); } > -inline void OrderAccess::release_store(volatile jfloat* p, jfloat v) > { *p = v; } > +inline jbyte OrderAccess::load_acquire(volatile jbyte* p) { jbyte v > = *p; compiler_barrier(); return v; } > +inline jshort OrderAccess::load_acquire(volatile jshort* p) { jshort v > = *p; compiler_barrier(); return v; } > +inline jint OrderAccess::load_acquire(volatile jint* p) { jint v > = *p; compiler_barrier(); return v; } > +inline jlong OrderAccess::load_acquire(volatile jlong* p) { jlong v > = Atomic::load(p); compiler_barrier(); return v; } > +inline jubyte OrderAccess::load_acquire(volatile jubyte* p) { jubyte v > = *p; compiler_barrier(); return v; } > +inline jushort OrderAccess::load_acquire(volatile jushort* p) { jushort v > = *p; compiler_barrier(); return v; } > +inline juint OrderAccess::load_acquire(volatile juint* p) { juint v > = *p; compiler_barrier(); return v; } > +inline julong OrderAccess::load_acquire(volatile julong* p) { julong v > = Atomic::load((volatile jlong*)p); compiler_barrier(); return v; } > +inline jfloat OrderAccess::load_acquire(volatile jfloat* p) { jfloat v > = *p; compiler_barrier(); return v; } > +inline 
jdouble OrderAccess::load_acquire(volatile jdouble* p) { jdouble v > = jdouble_cast(Atomic::load((volatile jlong*)p)); compiler_barrier(); > return v; } > + > +inline intptr_t OrderAccess::load_ptr_acquire(volatile intptr_t* p) { > intptr_t v = *p; compiler_barrier(); return v; } > +inline void* OrderAccess::load_ptr_acquire(volatile void* p) { > void* v = *(void* volatile *)p; compiler_barrier(); return v; } > +inline void* OrderAccess::load_ptr_acquire(const volatile void* p) { > void* v = *(void* const volatile *)p; compiler_barrier(); return v; } > + > +inline void OrderAccess::release_store(volatile jbyte* p, jbyte v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile jshort* p, jshort v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile jint* p, jint v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile jlong* p, jlong v) > { compiler_barrier(); Atomic::store(v, p); } > +inline void OrderAccess::release_store(volatile jubyte* p, jubyte v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile jushort* p, jushort v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile juint* p, juint v) > { compiler_barrier(); *p = v; } > +inline void OrderAccess::release_store(volatile julong* p, julong v) > { compiler_barrier(); Atomic::store((jlong)v, (volatile jlong*)p); } > +inline void OrderAccess::release_store(volatile jfloat* p, jfloat v) > { compiler_barrier(); *p = v; } > inline void OrderAccess::release_store(volatile jdouble* p, jdouble v) > { release_store((volatile jlong *)p, jlong_cast(v)); } > > -inline void OrderAccess::release_store_ptr(volatile intptr_t* p, > intptr_t v) { *p = v; } > -inline void OrderAccess::release_store_ptr(volatile void* p, void* > v) { *(void* volatile *)p = v; } > +inline void OrderAccess::release_store_ptr(volatile intptr_t* p, > intptr_t v) { compiler_barrier(); *p = v; 
} > +inline void OrderAccess::release_store_ptr(volatile void* p, void* > v) { compiler_barrier(); *(void* volatile *)p = v; } > > inline void OrderAccess::store_fence(jbyte* p, jbyte v) { > __asm__ volatile ( "xchgb (%2),%0" > > > On Thu, May 25, 2017 at 9:25 AM, Andrew Dinn wrote: > >> Apologies but this RFR is retracted -- the problem only applies to jdk8. >> >> I will be posting a revised RFR to jdk8u. >> >> regards, >> >> >> Andrew Dinn >> ----------- >> Senior Principal Software Engineer >> Red Hat UK Ltd >> Registered in England and Wales under Company Registration No. 03798903 >> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander >> >> On 25/05/17 14:16, Andrew Dinn wrote: >>> Forwarding this to hotspot-dev which is probably the more appropriate >>> destination. >>> >>> >>> -------- Forwarded Message -------- >>> Subject: RFR: 8181085: Race condition in method resolution may produce >>> spurious NullPointerException >>> Date: Thu, 25 May 2017 14:12:53 +0100 >>> From: Andrew Dinn >>> To: jdk10-dev >>> >>> The following webrev fixes a race condition that is present in jdk10 and >>> also jdk9 and jdk8. It is caused by a misplaced volatile keyword that >>> failed to ensure correct ordering of writes by the compiler. Reviews >> welcome. >>> >>> http://cr.openjdk.java.net/~adinn/8181085/webrev.00/ >>> >>> Backporting: >>> This same fix is required in jdk9 and jdk8. >>> >>> Testing: >>> The reproducer posted with the original issue manifests the NPE reliably >>> on jdk8. It does not manifest on jdk9/10 but that is only thanks to >>> changes introduced into the resolution process in jdk9 which change the >>> timing of execution. However, without this fix the out-of-order write >>> problem is still present in jdk9/10, as can be seen by eyeballing the >>> compiled code for ConstantPoolCacheEntry::set_direct_or_vtable_call. >>> >>> The patch has been validated on jdk8 by running the reproducer. It stops >>> any resulting NPEs. 
>>> >>> The code for ConstantPoolCacheEntry::set_direct_or_vtable_call on >>> jdk8-10 has been eyeballed to ensure that post-patch the assignments now >>> occur in the correct order. >>> >>> regards, >>> >>> >>> Andrew Dinn >>> ----------- >>> Senior Principal Software Engineer >>> Red Hat UK Ltd >>> Registered in England and Wales under Company Registration No. 03798903 >>> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander >>> >>> >> >> From martinrb at google.com Fri May 26 01:22:44 2017 From: martinrb at google.com (Martin Buchholz) Date: Thu, 25 May 2017 18:22:44 -0700 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> Message-ID: On Thu, May 25, 2017 at 6:04 PM, David Holmes wrote: > On 26/05/2017 4:55 AM, Martin Buchholz wrote: > >> [+jdk8u-dev] >> >> We've been hunting the elusive spurious NPEs as well; the following seems >> to be working for us (but we don't have any small repro recipe); something >> like this should be put into jdk8: >> > > In other words you want a backport of: 8061964: Insufficient compiler > barriers for GCC in OrderAccess functions ? > > https://bugs.openjdk.java.net/browse/JDK-8061964 > > IIRC what stopped this from being an 'automatic' backport candidate was > the potential problem of older gcc's needing to be validated. > Sure, __asm__ is non-portable, and it's easy to break other people's toolchains. But "it works for us" ... 
From david.holmes at oracle.com Fri May 26 02:03:28 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 12:03:28 +1000 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> <6a0f369e-4d34-02dc-653d-90a8aa19b901@oracle.com> Message-ID: <04cc9432-8494-0d33-ddd3-1504d4256a50@oracle.com> On 26/05/2017 8:00 AM, Mikael Vidstedt wrote: > I've been spending the last few days going down a rabbit hole of what > turned out to be a totally unrelated performance issue. Long story > short: startup time is affected, in some cases significantly, by the > length of the path to the JDK in the file system. More on that in a > separate thread/at another time. https://bugs.openjdk.java.net/browse/JDK-7196911 ? > After having looked at generated code, and having run benchmarks > stressing class loading/startup time my conclusion is that this change > is performance neutral. For example, the alignment check introduced in > bytes_x86.hpp get_native/put_native collapses down to a single > unconditional load unless, of course, it's done in a loop in which case > it gets unrolled+vectorized. > > I also ran hs-tier2, which should more than cover the changes in > question, and there were no failures. > > With that in mind I would like to push the change in its current form[1] > and handle a few things as follow-up work (roughly in order): > > * Introduce typedefs in classFileParser for potentially unaligned > pointer types > * Always using memcpy to do the read - need to investigate how the > primitives are used wrt. tearing > * Unify the Bytes::* impl across platforms - need to investigate/verify > the implications on performance > > Reasonable? Reasonable. 
Cheers, David > Cheers, > Mikael > > [1] > http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > >> On May 18, 2017, at 5:18 PM, David Holmes > > wrote: >> >> On 19/05/2017 9:19 AM, Mikael Vidstedt wrote: >>> >>>> On May 18, 2017, at 3:50 PM, David Holmes >>> >>>> > wrote: >>>> >>>> Hi Mikael, >>>> >>>> On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: >>>>> >>>>>> On May 18, 2017, at 2:59 AM, Robbin Ehn >>>>> >>>>>> > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>>>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt >>>>>>>> >>>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Warning: It may be wise to stock up on coffee or tea before >>>>>>>> reading this. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>>>>>>> >>>>>>> Not a review, just a question. >>>>>>> ------------------------------------------------------------------------------ >>>>>>> src/cpu/x86/vm/bytes_x86.hpp >>>>>>> 40 template >>>>>>> 41 static inline T get_native(const void* p) { >>>>>>> 42 assert(p != NULL, "null pointer"); >>>>>>> 43 >>>>>>> 44 T x; >>>>>>> 45 >>>>>>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>>>>>> 47 x = *(T*)p; >>>>>>> 48 } else { >>>>>>> 49 memcpy(&x, p, sizeof(T)); >>>>>>> 50 } >>>>>>> 51 >>>>>>> 52 return x; >>>>>>> I'm looking at this and wondering if there's a good reason to not >>>>>>> just >>>>>>> unconditionally use memcpy here. gcc -O will generate a single move >>>>>>> instruction for that on x86_64. I'm not sure what happens on 32bit >>>>>>> with an 8 byte value, but I suspect it will do something similarly >>>>>>> sensible, e.g. 2 4 byte memory to memory transfers. >>>>>> >>>>>> Unconditionally memcpy would be nice! >>>>>> >>>>>> Are you going to look into that, Mikael? >>>>> >>>>> It's complicated... 
>>>>> >>>>> We may be able to switch, but there is (maybe) a subtle reason why >>>>> the alignment check is in there: to avoid word tearing. >>>>> >>>>> Think of two threads racing: >>>>> >>>>> * thread 1 is writing to the memory location X >>>>> * thread 2 is reading from the same memory location X >>>>> >>>>> Will thread 2 always see a consistent value (either the original >>>>> value or the fully updated value)? >>>> >>>> We're talking about internal VM loads and stores, right? For those we >>>> need to use an appropriate atomic routine if there are potential races. >>>> But we should never be mixing these kinds of accesses with Java level >>>> field accesses - that would be very broken. >>> >>> That seems reasonable, but for my untrained eye it's not trivially true >>> that relaxing the implementation is correct for all the uses of the >>> get/put primitives. I am therefore a bit reluctant to do so without >>> understanding the implications. >> >> If a Copy routine doesn't have Atomic in its name then I don't expect >> atomicity. Even then unaligned accesses are not atomic even in the >> Atomic routine! >> >> But I'm not clear exactly how all these routines get used. >> >>>> For classFileParser we should have no concurrency issues. >>> >>> That seems reasonable. What degree of certainty does your 'should' come >>> with? :) >> >> Pretty high. We're parsing a stream of bytes and writing values into >> local structures that will eventually be passed across to a klass >> instance, which in turn will eventually be published via the SD as a >> loaded class. The actual parsing phase is purely single-threaded. >> >> David >> >>> Cheers, >>> Mikael >>> >>>> >>>> David >>>> >>>>> In the unaligned/memcpy case I think we can agree that there's >>>>> nothing preventing the compiler from doing individual loads/stores of >>>>> the bytes making up the data. 
Especially in something like slowdebug >>>>> that becomes more or less obvious - memcpy most likely isn't >>>>> intrinsified and is quite likely just copying a byte at a time. Given >>>>> that the data is, in fact, unaligned, there is really no simple way >>>>> to prevent word tearing, so I'm pretty sure that we never depend on >>>>> it - if needed, we're likely to already have some higher level >>>>> synchronization in place guarding the accesses. And the fact that the >>>>> other, non-x86 platforms already do individual byte loads/stores when >>>>> the pointer is unaligned is a further indication that >>>>> that's the case. >>>>> >>>>> However, the aligned case is where stuff gets more interesting. I >>>>> don't think the C/C++ spec guarantees that accessing a memory >>>>> location using a pointer of type T will result in code which does a >>>>> single load/store of size >= sizeof(T), but for all the compilers we >>>>> *actually* use that's likely to be the case. If it's true that the >>>>> compilers don't split the memory accesses, that means we won't have >>>>> word tearing when using the Bytes::get/put methods with *aligned* >>>>> pointers. >>>>> >>>>> If I switch to always using memcpy, there's a risk that it introduces >>>>> tearing problems where earlier we had none. Two questions come to mind: >>>>> >>>>> * For the cases where the get/put methods get used *today*, is that a >>>>> problem? >>>>> * What happens if somebody in the *future* decides that put_Java_u4 >>>>> seems like a great thing to use to write to a Java int field on the >>>>> Java heap, and a Java thread is racing to read that same data? >>>>> >>>>> >>>>> All that said though, I think this is worth exploring and it may well >>>>> turn out that word tearing really isn't a problem. Also, I believe >>>>> there may be opportunities to further clean up this code and perhaps >>>>> unify it a bit across the various platforms. 
>>>>> >>>>> And *that* said, I think the change as it stands is still an >>>>> improvement, so I'm leaning towards pushing it and filing an >>>>> enhancement and following up on it separately. Let me know if you >>>>> strongly feel that this should be looked into and addressed now and I >>>>> may reconsider :) >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>>> >>>>>> /Robbin >>>>>> >>>>>>> ------------------------------------------------------------------------------ >>> > From daniel.daugherty at oracle.com Fri May 26 02:24:40 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 25 May 2017 20:24:40 -0600 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> Message-ID: <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> On 5/25/17 6:39 PM, David Holmes wrote: > > > Hi Dan, > > Thanks very much for the review. I will apply all the (mostly > inherited) grammatical fixes to the comments, etc. > > What do you think about Robbin's suggested refactoring of the > to_abstime logic? (http://cr.openjdk.java.net/~rehn/8174231/webrev/) It's cleaner code. It will make it more difficult to compare against the original, but one could easily argue that it's very difficult to compare the os_posix.[ch]pp code with the originals... Dan > > Thanks, > David > > On 26/05/2017 2:48 AM, Daniel D. Daugherty wrote: >> On 5/18/17 12:25 AM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>> >>> webrevs: >>> >>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >> >> General comment(s): >> - Sometimes you've updated the copyright year for the file and >> sometimes you haven't. Please check before pushing. >> >> >> common/autoconf/flags.m4 >> No comments. >> >> common/autoconf/generated-configure.sh >> No comments. 
>> >> >>> hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >> >> src/os/aix/vm/os_aix.cpp >> No comments; did not try to compare deleted code with os_posix.cpp. >> >> src/os/aix/vm/os_aix.hpp >> No comments; did not try to compare deleted code with os_posix.hpp. >> >> src/os/bsd/vm/os_bsd.cpp >> No comments; compared deleted code with os_posix.cpp version; >> nothing >> jumped out as wrong. >> >> src/os/bsd/vm/os_bsd.hpp >> No comments; compared deleted code with os_posix.hpp version; >> nothing >> jumped out as wrong. >> >> src/os/linux/vm/os_linux.cpp >> No comments; compared deleted code with os_posix.cpp version; >> nothing >> jumped out as wrong. >> >> src/os/linux/vm/os_linux.hpp >> No comments; compared deleted code with os_posix.hpp version; >> nothing >> jumped out as wrong. >> >> src/os/posix/vm/os_posix.cpp >> L1401: // Not currently usable by Solaris >> L1408: // time-of-day clock >> nit - needs period at end of the sentence >> >> L1433: // build time support then there can not be >> typo - "can not" -> "cannot" >> >> L1435: // int or int64_t. >> typo - needs a ')' before the period. >> >> L1446: // determine what POSIX API's are present and do appropriate >> L1447: // configuration >> nits - 'determine' -> 'Determine' >> - needs period at end of the sentence >> >> L1455: // 1. 
Check for CLOCK_MONOTONIC support >> nit - needs period at end of the sentence >> >> L1462: // we do dlopen's in this particular order due to bug >> in linux >> L1463: // dynamical loader (see 6348968) leading to crash on exit >> nits - 'we' -> 'We' >> - needs period at end of the sentence >> >> typo - 'dynamical' -> 'dynamic' >> >> L1481: // we assume that if both clock_gettime and >> clock_getres support >> L1482: // CLOCK_MONOTONIC then the OS provides true high-res >> monotonic clock >> nits - 'we' -> 'We' >> - needs period at end of the sentence >> >> L1486: clock_gettime_func(CLOCK_MONOTONIC, &tp) == 0) { >> nit - extra space before '==' >> >> L1487: // yes, monotonic clock is supported >> nits - 'yes' -> 'Yes' >> - needs period at end of the sentence >> >> L1491: // close librt if there is no monotonic clock >> nits - 'close' -> 'Close' >> - needs period at end of the sentence >> >> L1499: // 2. Check for pthread_condattr_setclock support >> L1503: // libpthread is already loaded >> L1511: // Now do general initialization >> nit - needs period at end of the sentence >> >> L1591: if (timeout < 0) >> L1592: timeout = 0; >> nit - missing braces >> >> L1609: // More seconds than we can add, so pin to max_secs >> L1658: // More seconds than we can add, so pin to max_secs >> nit - needs period at end of the sentence >> >> L1643: // Absolue seconds exceeds allow max, so pin to >> max_secs >> typo - 'Absolue' -> 'Absolute' >> nit - needs period at end of the sentence >> >> src/os/posix/vm/os_posix.hpp >> L149: ~PlatformEvent() { guarantee(0, "invariant"); } >> L185: ~PlatformParker() { guarantee(0, "invariant"); } >> nit - '0' should be 'false' or just call fatal() >> >> src/os/solaris/vm/os_solaris.cpp >> No comments. >> >> src/os/solaris/vm/os_solaris.hpp >> No comments. >> >> >> As Robbin said, this is very hard to review and be sure that everything >> is relocated correctly. 
I tried to look at this code a couple of >> different >> ways and nothing jumped out at me as wrong. >> >> I did my usual crawl style review through posix.cpp and posix.hpp. I >> only >> found nits and typos that you can choose to ignore since you're on a time >> crunch here. >> >> Thumbs up! >> >> Dan >> >> >> >>> >>> First a big thank you to Thomas Stuefe for testing various versions >>> of this on AIX. >>> >>> This is primarily a refactoring and cleanup exercise (ie lots of >>> deleted duplicated code!). >>> >>> I have taken the PlatformEvent, PlatformParker and Parker::* code, >>> out of os_linux and moved it into os_posix for use by Linux, OSX, >>> BSD, AIX and perhaps one day Solaris (more on that later). >>> >>> The Linux code was the most functionally complete, dealing with >>> correct use of CLOCK_MONOTONIC for relative timed waits, and the >>> default wall-clock for absolute timed waits. That functionality is >>> not, unfortunately, supported by all our POSIX platforms so there >>> are some configure time build checks to set some #defines, and then >>> some dynamic lookup at runtime**. We allow for the runtime >>> environment to be less capable than the build environment, but not >>> the other way around (without build time support we don't know the >>> runtime types needed to make library calls). >>> >>> ** There is some duplication of dynamic lookup code on Linux but >>> this can be cleaned up in future work if we refactor the time/clock >>> code into os_posix as well. 
>>> >>> The cleanup covers a number of things: >>> - removal of linux anachronisms that got "ported" into the other >>> platforms >>> - eg EINTR can not be returned from the wait methods >>> - removal of solaris anachronisms that got ported into the linux >>> code and then on to other platforms >>> - eg ETIMEDOUT is what we expect never ETIME >>> - removal of the ancient/obsolete >>> os::*::allowdebug_blocked_signals() from the Parker methods >>> - consolidation of unpackTime and compute_abstime into one utility >>> function >>> - use statics for things completely private to the implementation >>> rather than making them part of the os* API (eg access to condAttr >>> objects) >>> - clean up commentary and style within methods of the same class >>> - clean up coding style in places eg not using Names that start with >>> capitals. >>> >>> I have not tried to clean up every single oddity, nor tried to >>> reconcile differences between the very similar in places >>> PlatformEvent and Park methods. For example PlatformEvent still >>> examines the FilterSpuriousWakeups** flag, and Parker still ignores it. >>> >>> ** Perhaps a candidate for deprecation and future removal. >>> >>> There is one mini "enhancement" slipped in this. I now explicitly >>> initialize mutexes with a mutexAttr object with its type set to >>> PTHREAD_MUTEX_NORMAL, instead of relying on the definition of >>> PTHREAD_MUTEX_DEFAULT. On FreeBSD the default is not "normal" but >>> "error checking" and so is slow. On all other current platforms >>> there is no effective change. >>> >>> Finally, Solaris is excluded from all this (other than the debug >>> signal blocking cleanup) because it potentially supports three >>> different low-level sync subsystems: UI thr*, Pthread, and direct >>> LWP sync. Solaris cleanup would be a separate RFE. >>> >>> No doubt I've overlooked mentioning something that someone will >>> spot. 
:) >>> >>> Thanks, >>> David >>> >> From david.holmes at oracle.com Fri May 26 02:29:36 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 12:29:36 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> Message-ID: On 26/05/2017 12:24 PM, Daniel D. Daugherty wrote: > On 5/25/17 6:39 PM, David Holmes wrote: >> >> >> Hi Dan, >> >> Thanks very much for the review. I will apply all the (mostly >> inherited) grammatical fixes to the comments, etc. >> >> What do you think about Robbin's suggested refactoring of the >> to_abstime logic? (http://cr.openjdk.java.net/~rehn/8174231/webrev/) > > It's cleaner code. It will make it more difficult to compare against > the original, but one could easily argue that it's very difficult to > compare the os_posix.[ch]pp code with the originals... Okay I'll make the change. I still want to run some more tests anyway. Thanks, David > Dan > > >> >> Thanks, >> David >> >> On 26/05/2017 2:48 AM, Daniel D. Daugherty wrote: >>> On 5/18/17 12:25 AM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8174231 >>>> >>>> webrevs: >>>> >>>> Build-related: http://cr.openjdk.java.net/~dholmes/8174231/webrev.top/ >>> >>> General comment(s): >>> - Sometimes you've updated the copyright year for the file and >>> sometimes you haven't. Please check before pushing. >>> >>> >>> common/autoconf/flags.m4 >>> No comments. >>> >>> common/autoconf/generated-configure.sh >>> No comments. >>> >>> >>>> hotspot: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot/ >>> >>> src/os/aix/vm/os_aix.cpp >>> No comments; did not try to compare deleted code with os_posix.cpp. >>> >>> src/os/aix/vm/os_aix.hpp >>> No comments; did not try to compare deleted code with os_posix.hpp. 
>>> [...] 
:) >>>> >>>> Thanks, >>>> David >>>> >>> > From david.holmes at oracle.com Fri May 26 07:27:00 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 17:27:00 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> Message-ID: Robbin, Dan, Below is a modified version of the refactored to_abstime code that Robbin suggested. Robbin: there were a couple of issues with your version. For relative time the timeout is always in nanoseconds - the "unit" only tells you what form the "now_part_sec" is - nanos or micros. And the calc_abs_time always has a deadline in millis. So I simplified and did a little renaming, and tracked max_secs in debug_only instead of returning it. Please let me know what you think. Thanks, David ----- // Calculate a new absolute time that is "timeout" nanoseconds from "now". // "unit" indicates the unit of "now_part_sec" (may be nanos or micros depending // on which clock is being used). static void calc_rel_time(timespec* abstime, jlong timeout, jlong now_sec, jlong now_part_sec, jlong unit) { time_t max_secs = now_sec + MAX_SECS; jlong seconds = timeout / NANOUNITS; timeout %= NANOUNITS; // remaining nanos if (seconds >= MAX_SECS) { // More seconds than we can add, so pin to max_secs. abstime->tv_sec = max_secs; abstime->tv_nsec = 0; } else { abstime->tv_sec = now_sec + seconds; long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; if (nanos >= NANOUNITS) { // overflow abstime->tv_sec += 1; nanos -= NANOUNITS; } abstime->tv_nsec = nanos; } } // Unpack the given deadline in milliseconds since the epoch, into the given timespec. // The current time in seconds is also passed in to enforce an upper bound as discussed above. 
static void unpack_abs_time(timespec* abstime, jlong deadline, jlong now_sec) { time_t max_secs = now_sec + MAX_SECS; jlong seconds = deadline / MILLIUNITS; jlong millis = deadline % MILLIUNITS; if (seconds >= max_secs) { // Absolute seconds exceeds allowed max, so pin to max_secs. abstime->tv_sec = max_secs; abstime->tv_nsec = 0; } else { abstime->tv_sec = seconds; abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); } } static void to_abstime(timespec* abstime, jlong timeout, bool isAbsolute) { DEBUG_ONLY(int max_secs = MAX_SECS;) if (timeout < 0) { timeout = 0; } #ifdef SUPPORTS_CLOCK_MONOTONIC if (_use_clock_monotonic_condattr && !isAbsolute) { struct timespec now; int status = _clock_gettime(CLOCK_MONOTONIC, &now); assert_status(status == 0, status, "clock_gettime"); calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); DEBUG_ONLY(max_secs += now.tv_sec;) } else { #else { // Match the block scope. #endif // SUPPORTS_CLOCK_MONOTONIC // Time-of-day clock is all we can reliably use. 
struct timeval now; int status = gettimeofday(&now, NULL); assert(status == 0, "gettimeofday"); if (isAbsolute) { unpack_abs_time(abstime, timeout, now.tv_sec); } else { calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); } DEBUG_ONLY(max_secs += now.tv_sec;) } assert(abstime->tv_sec >= 0, "tv_sec < 0"); assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= nanos_per_sec"); } From john.r.rose at oracle.com Fri May 26 07:42:07 2017 From: john.r.rose at oracle.com (John Rose) Date: Fri, 26 May 2017 00:42:07 -0700 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> Message-ID: On May 18, 2017, at 3:15 PM, Mikael Vidstedt wrote: > > I don't think the C/C++ spec guarantees that accessing a memory location using a pointer of type T will result in code which does a single load/store of size >= sizeof(T), but for all the compilers we *actually* use that's likely to be the case. If it's true that the compilers don't split the memory accesses, that means we won't have word tearing when using the Bytes::get/put methods with *aligned* pointers. This is true I think and is the main reason memcpy isn't a trustworthy replacement for any racy code. We used it for primitive arraycopy long ago but it bit us with hard-to-reproduce bugs. That's why we have Copy now, including the explicit atomic versions. For explicitly nonatomic copies memcpy is ok but I'd want to wrap it in an API that makes the nonatomicity explicit. --
John From adinn at redhat.com Fri May 26 08:17:55 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 May 2017 09:17:55 +0100 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> Message-ID: On 25/05/17 20:47, Kim Barrett wrote: >> On May 25, 2017, at 11:54 AM, Andrew Dinn wrote: >> >> Ok, so I have several interesting things to report. > > Thanks for trying this out, and apologies for the blunders. No problem (and no reason to describe any of it as blunders). . . . > So here's the new webrev. The only changes are in > orderAccess_linux_aarch64.inline.hpp, which I can't test. Hopefully > I've not made any more blunders. > > http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.01/ Yes, that builds and runs ok on AArch64. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Fri May 26 08:29:32 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 May 2017 09:29:32 +0100 Subject: Fwd: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <7b268254-e17b-482f-071a-cb4a3e4b19f5@redhat.com> Message-ID: On 26/05/17 02:04, David Holmes wrote: > On 26/05/2017 4:55 AM, Martin Buchholz wrote: >> [+jdk8u-dev] >> >> We've been hunting the elusive spurious NPEs as well; the following seems >> to be working for us (but we don't have any small repro recipe); >> something >> like this should be put into jdk8: > > In other words you want a backport of: 8061964: Insufficient compiler > barriers for GCC in OrderAccess functions ? 
> > https://bugs.openjdk.java.net/browse/JDK-8061964 Well, yes, that sounds like the 'correct' solution but ... > IIRC what stopped this from being an 'automatic' backport candidate was > the potential problem of older gcc's needing to be validated. If the 'correct' fix fails because of legacy compilers then I think my proposed change to the volatile declaration for _f1 will be sufficient to sort that out on x86 (I am assuming none of the legacy compilers will re-order volatile stores :-). Of course, that's not enough in itself on AArch64 jdk8 (which Red Hat maintain downstream) nor on ppc jdk8 but it does no harm. This may not matter though. These two platforms *must* employ a compiler that is able to implement OrderAccess::release_store with the correct memory semantics because they don't provide TSO. So, I guess the legacy issue only applies to x86 in which case maybe my patch will be good enough. What does everyone think? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From doug.simon at oracle.com Fri May 26 09:48:29 2017 From: doug.simon at oracle.com (Doug Simon) Date: Fri, 26 May 2017 11:48:29 +0200 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> Message-ID: <069D9444-72EA-45DE-A101-81A9A438BEB6@oracle.com> > On 26 May 2017, at 02:04, John Rose wrote: > > On May 17, 2017, at 9:01 AM, coleen.phillimore at oracle.com wrote: >> >> Summary: Add a Java type called ResolvedMethodName which is immutable and can be stored in a hashtable, that is weakly collected by gc > > I'm looking at the 8174749.03/webrev version of your changes.
> > A few comments: > > In the JVMCI changes, this line appears to be incorrect on 32-bit machines: > > + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(long.class)); > > (It's a pre-existing condition, and I'm not sure if it is a problem.) Given that JVMCI does not support any 32-bit platforms currently, it should not be a problem in practice. The field lookup would also fail-fast when adding 32-bit support. That said, we can take this opportunity to make it portable by replacing: + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(long.class)); with: + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(HotSpotJVMCIRuntime.getHostWordKind().toJavaClass())); -Doug From erik.osterlund at oracle.com Fri May 26 12:04:36 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 26 May 2017 14:04:36 +0200 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: Message-ID: <592819D4.9070209@oracle.com> Hi Kim, Thanks for doing this. It looks good to me. /Erik On 2017-05-25 01:42, Kim Barrett wrote: > Please review this change to Atomic::load and OrderAccess::load_acquire > overloads to make their source const qualified, e.g. instead of > "volatile T*" make them "const volatile T*". This eliminates the need > for casting away const when, for example, applying one of these > operations to a member variable when in a const-qualified method. > > There are probably places that previously required casting away const > but now do not. Similarly, there are probably places where values > couldn't be const or member functions couldn't be const qualified, but > now can be. I did a little searching and found a few candidates, but > none that were otherwise trivial to add to this change, so haven't > included any.
> > This change touches platform-specific code for non-Oracle supported > platforms that I can't test, so I'd like reviews from the respective > platform owners. > > Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't > const-qualify the source argument; that seems like a bug. Or maybe > they are, but not documented that way. And I wonder why the aarch64 > port uses __atomic_load rather than __atomic_load_n. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166651 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 > > Testing: > JPRT > From zgu at redhat.com Fri May 26 13:14:17 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 26 May 2017 09:14:17 -0400 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 Message-ID: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> Hi, There is a corner case that still failed after JDK-8175813. The system shows that it has multiple NUMA nodes, but only one is configured. Under this scenario, the numa_interleave_memory() call will result in an "mbind: Invalid argument" message.
Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ The system NUMA configuration: Architecture: ppc64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 8 On-line CPU(s) list: 0-7 Thread(s) per core: 4 Core(s) per socket: 1 Socket(s): 2 NUMA node(s): 2 Model: 2.1 (pvr 003f 0201) Model name: POWER7 (architected), altivec supported L1d cache: 32K L1i cache: 32K NUMA node0 CPU(s): 0-7 NUMA node1 CPU(s): Thanks, -Zhengyu From coleen.phillimore at oracle.com Fri May 26 13:37:04 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 May 2017 09:37:04 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> Message-ID: Hi John, Thank you for these comments and for your help with this bug fix/RFE. On 5/25/17 8:04 PM, John Rose wrote: > On May 17, 2017, at 9:01 AM, coleen.phillimore at oracle.com > wrote: >> >> Summary: Add a Java type called ResolvedMethodName which is immutable >> and can be stored in a hashtable, that is weakly collected by gc > > I'm looking at the 8174749.03/webrev version of your changes. > > A few comments: > > In the JVMCI changes, this line appears to be incorrect on 32-bit > machines: > > + vmtargetField = (HotSpotResolvedJavaField) > findFieldInClass(methodType, "vmtarget", resolveType(long.class)); > > (It's a pre-existing condition, and I'm not sure if it is a problem.) I'll add a comment. I don't know if Graal supports 32 bits (I'd guess no). > > In the new hash table file, the parameter names > seem like they could be made more consistent. > > 85 oop ResolvedMethodTable::basic_add(Method* method, oop vmtarget) { > > 114 oop ResolvedMethodTable::add_method(Handle mem_name_target) { > > I think vmtarget and mem_name_target are the same kind of thing. 
> Consider renaming them to "entry_to_add" or something more aligned > with the rest of the code. This code didn't get updated properly from all the renaming. I've changed to using "method" when the parameter is Method* and using rmethod_name when the parameter is an oop representing ResolvedMethodName. > > I don't think that MethodHandles::init_field_MemberName needs TRAPS. Yes, removed TRAPS and reverted the use of the default parameter. > Also, MethodHandles::init_method_MemberName could omit TRAPS if > it were passed the RMN pointer first. Suggestion: Remove TRAPS from > both *and* add a trapping function which does the info->RMN step. > > static oop init_method_MemberName(Handle mname_h, CallInfo& info, > oop resolved_method); > static oop init_method_MemberName(Handle mname_h, CallInfo& info, > TRAPS); > > Then the trapping overloading can pick up the RMN immediately from the > info, > and call the non-trapping overloading. The reason to do something > indirect > like this is that the existing code for init_method_MemberName is (a) > complex > and (b) non-trapping. Promoting it all to trapping makes it harder to > work with. > > In other words, non-TRAPS code is (IMO) easier to read and reason about, > so converting a big method to TRAPS for one line is something I'd like to > avoid. At least, that's the way I thought about this particular code when > I first wrote it. > > Better: Since init_m_MN is joined at the hip with CallInfo, consider > adding the > trapping operation to CallInfo. See patch below. I think that > preserves CI's > claim to be the Source of Truth for call sites, even in methodHandles.cpp. > This is quite a nice change! I'll do this and rerun the tests over the weekend and send out a new version next week. > Thank you very much for this fix. I know it's been years since we started > talking about it. I'm glad you let it bother you enough to fix it! We kept running into this, so it was time. 
> > I looked at everything else and didn't find anything out of place. Thank you! Coleen > > Reviewed. > > -- John > > diff --git a/src/share/vm/interpreter/linkResolver.hpp > b/src/share/vm/interpreter/linkResolver.hpp > --- a/src/share/vm/interpreter/linkResolver.hpp > +++ b/src/share/vm/interpreter/linkResolver.hpp > @@ -56,6 +56,7 @@ > int _call_index; // vtable or itable index of > selected class method (if any) > Handle _resolved_appendix; // extra argument in constant > pool (if CPCE::has_appendix) > Handle _resolved_method_type; // MethodType (for > invokedynamic and invokehandle call sites) > + Handle _resolved_method_name; // optional > ResolvedMethodName object for java.lang.invoke > void set_static(KlassHandle resolved_klass, const methodHandle& > resolved_method, TRAPS); > void set_interface(KlassHandle resolved_klass, KlassHandle > selected_klass, > @@ -97,6 +98,7 @@ > methodHandle selected_method() const { return _selected_method; } > Handle resolved_appendix() const { return > _resolved_appendix; } > Handle resolved_method_type() const { return > _resolved_method_type; } > + Handle resolved_method_name() const { return > _resolved_method_name; } > BasicType result_type() const { return > selected_method()->result_type(); } > CallKind call_kind() const { return _call_kind; } > @@ -117,6 +119,12 @@ > return _call_index; > } > + oop find_resolved_method_name(TRAPS) { > + if (_resolved_method_name.is_null()) > + > java_lang_invoke_ResolvedMethodName::find_resolved_method(_resolved_method, > CHECK_NULL); > + return _resolved_method_name; > + } > + > // debugging > #ifdef ASSERT > bool has_vtable_index() const { return _call_index >= > 0 && _call_kind != CallInfo::itable_call; } > > > From zgu at redhat.com Fri May 26 13:41:49 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 26 May 2017 09:41:49 -0400 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> Message-ID: This is a quick way to kill the symptom (or low risk?). I am not sure if disabling NUMA is a better solution for this circumstance? does 1 NUMA node = UMA? Thanks, -Zhengyu On 05/26/2017 09:14 AM, Zhengyu Gu wrote: > Hi, > > There is a corner case that still failed after JDK-8175813.
> > The system shows that it has multiple NUMA nodes, but only one is > configured. Under this scenario, the numa_interleave_memory() call will > result in an "mbind: Invalid argument" message. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 > Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ > > > The system NUMA configuration: > > Architecture: ppc64 > CPU op-mode(s): 32-bit, 64-bit > Byte Order: Big Endian > CPU(s): 8 > On-line CPU(s) list: 0-7 > Thread(s) per core: 4 > Core(s) per socket: 1 > Socket(s): 2 > NUMA node(s): 2 > Model: 2.1 (pvr 003f 0201) > Model name: POWER7 (architected), altivec supported > L1d cache: 32K > L1i cache: 32K > NUMA node0 CPU(s): 0-7 > NUMA node1 CPU(s): > > Thanks, > > -Zhengyu From zoltan.majo at oracle.com Fri May 26 14:17:21 2017 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 26 May 2017 16:17:21 +0200 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError Message-ID: Hi, when backporting 8160551, I also backported a test that is relevant only for class files with version >= 53. As JDK 8 supports only class files with version < 53, having the test in the JDK 8u test base does not make sense. This changeset proposes to remove the test. https://bugs.openjdk.java.net/browse/JDK-8180934 http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ I executed all hotspot/runtime tests with the changeset (using JDK 8u122), no problems have shown up. JPRT testing is in progress. Please note that this fix is a JDK 8u-specific fix (not a backport of some existing fix in JDK 9). Thank you!
Best regards, Zoltan From sean.coffey at oracle.com Fri May 26 14:38:42 2017 From: sean.coffey at oracle.com (=?UTF-8?Q?Se=c3=a1n_Coffey?=) Date: Fri, 26 May 2017 15:38:42 +0100 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: References: Message-ID: <5f215f01-27b1-bc39-9540-6d03d6adf9a9@oracle.com> looks good. Please add the 9-na and noreg-self labels to the bug report. Approved for jdk8u-dev. Regards, Sean. On 26/05/17 15:17, Zoltán Majó wrote: > Hi, > > when backporting 8160551, I also backported a test that is relevant > only for class files with version >= 53. As JDK 8 supports only class > files with version < 53, having the test in the JDK 8u test base does > not make sense. This changeset proposes to remove the test. > > https://bugs.openjdk.java.net/browse/JDK-8180934 > http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ > > I executed all hotspot/runtime tests with the changeset (using JDK > 8u122), no problems have shown up. JPRT testing is in progress. > > Please note that this fix is a JDK 8u-specific fix (not a backport of > some existing fix in JDK 9). > > Thank you! > > Best regards, > > > Zoltan > From harold.seigel at oracle.com Fri May 26 14:44:31 2017 From: harold.seigel at oracle.com (harold seigel) Date: Fri, 26 May 2017 10:44:31 -0400 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: References: Message-ID: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> Hi Zoltan, Instead of deleting the test, can the class file version of Bad.jasm be changed to 52 for JDK-8u? Thanks, Harold On 5/26/2017 10:17 AM, Zoltán Majó wrote: > Hi, > > when backporting 8160551, I also backported a test that is relevant > only for class files with version >= 53. As JDK 8 supports only class > files with version < 53, having the test in the JDK 8u test base does > not make sense.
This changeset proposes to remove the test. > > https://bugs.openjdk.java.net/browse/JDK-8180934 > http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ > > I executed all hotspot/runtime tests with the changeset (using JDK > 8u122), no problems have shown up. JPRT testing is in progress. > > Please note that this fix is a JDK 8u-specific fix (not a backport of > some existing fix in JDK 9). > > Thank you! > > Best regards, > > > Zoltan > From coleen.phillimore at oracle.com Fri May 26 14:52:49 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 May 2017 10:52:49 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <069D9444-72EA-45DE-A101-81A9A438BEB6@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <069D9444-72EA-45DE-A101-81A9A438BEB6@oracle.com> Message-ID: <287fa0c4-84bc-405e-acb0-a355082b5264@oracle.com> On 5/26/17 5:48 AM, Doug Simon wrote: >> On 26 May 2017, at 02:04, John Rose wrote: >> >> On May 17, 2017, at 9:01 AM, coleen.phillimore at oracle.com wrote: >>> Summary: Add a Java type called ResolvedMethodName which is immutable and can be stored in a hashtable, that is weakly collected by gc >> I'm looking at the 8174749.03/webrev version of your changes. >> >> A few comments: >> >> In the JVMCI changes, this line appears to be incorrect on 32-bit machines: >> >> + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(long.class)); >> >> (It's a pre-existing condition, and I'm not sure if it is a problem.) > Given that JVMCI does not support any 32-bit platforms currently, it should not be a problem in practice. The field lookup would also fail-fast when adding 32-bit support. 
> > That said, we can take this opportunity to make it portable by replacing: > > + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(long.class)); > > with: > > + vmtargetField = (HotSpotResolvedJavaField) findFieldInClass(methodType, "vmtarget", resolveType(HotSpotJVMCIRuntime.getHostWordKind().toJavaClass())); I can make this change. I hope it works after I cut and paste this :) Coleen > > -Doug > From coleen.phillimore at oracle.com Fri May 26 17:47:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 May 2017 13:47:19 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> Message-ID: <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> Hi, I made the changes below, which turned out very nice. It didn't take that long to retest. See: open webrev at http://cr.openjdk.java.net/~coleenp/8174749.04/webrev open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.04/webrev I don't know how to do delta webrevs, so just look at linkResolver.cpp/hpp and methodHandles.cpp Thanks, Coleen On 5/26/17 9:37 AM, coleen.phillimore at oracle.com wrote: > > Hi John, > Thank you for these comments and for your help with this bug fix/RFE. > > On 5/25/17 8:04 PM, John Rose wrote: >> On May 17, 2017, at 9:01 AM, coleen.phillimore at oracle.com >> wrote: >>> >>> Summary: Add a Java type called ResolvedMethodName which is >>> immutable and can be stored in a hashtable, that is weakly collected >>> by gc >> >> I'm looking at the 8174749.03/webrev version of your changes.
>> >> A few comments: >> >> In the JVMCI changes, this line appears to be incorrect on 32-bit >> machines: >> >> + vmtargetField = (HotSpotResolvedJavaField) >> findFieldInClass(methodType, "vmtarget", resolveType(long.class)); >> >> (It's a pre-existing condition, and I'm not sure if it is a problem.) > > I'll add a comment. I don't know if Graal supports 32 bits (I'd guess > no). >> >> In the new hash table file, the parameter names >> seem like they could be made more consistent. >> >> 85 oop ResolvedMethodTable::basic_add(Method* method, oop vmtarget) { >> >> 114 oop ResolvedMethodTable::add_method(Handle mem_name_target) { >> >> I think vmtarget and mem_name_target are the same kind of thing. >> Consider renaming them to "entry_to_add" or something more aligned >> with the rest of the code. > > This code didn't get updated properly from all the renaming. I've > changed to using "method" when the parameter is Method* and using > rmethod_name when the parameter is an oop representing > ResolvedMethodName. > >> >> I don't think that MethodHandles::init_field_MemberName needs TRAPS. > > Yes, removed TRAPS and reverted the use of the default parameter. >> Also, MethodHandles::init_method_MemberName could omit TRAPS if >> it were passed the RMN pointer first. Suggestion: Remove TRAPS from >> both *and* add a trapping function which does the info->RMN step. >> >> static oop init_method_MemberName(Handle mname_h, CallInfo& info, >> oop resolved_method); >> static oop init_method_MemberName(Handle mname_h, CallInfo& info, >> TRAPS); >> >> Then the trapping overloading can pick up the RMN immediately from >> the info, >> and call the non-trapping overloading. The reason to do something >> indirect >> like this is that the existing code for init_method_MemberName is (a) >> complex >> and (b) non-trapping. Promoting it all to trapping makes it harder >> to work with. 
>> >> In other words, non-TRAPS code is (IMO) easier to read and reason about, >> so converting a big method to TRAPS for one line is something I'd >> like to >> avoid. At least, that's the way I thought about this particular code >> when >> I first wrote it. >> >> Better: Since init_m_MN is joined at the hip with CallInfo, consider >> adding the >> trapping operation to CallInfo. See patch below. I think that >> preserves CI's >> claim to be the Source of Truth for call sites, even in >> methodHandles.cpp. >> > > This is quite a nice change! I'll do this and rerun the tests over > the weekend and send out a new version next week. > >> Thank you very much for this fix. I know it's been years since we >> started >> talking about it. I'm glad you let it bother you enough to fix it! > > We kept running into this, so it was time. > >> >> I looked at everything else and didn't find anything out of place. > > Thank you! > Coleen > >> >> Reviewed. >> >> -- John >> >> diff --git a/src/share/vm/interpreter/linkResolver.hpp >> b/src/share/vm/interpreter/linkResolver.hpp >> --- a/src/share/vm/interpreter/linkResolver.hpp >> +++ b/src/share/vm/interpreter/linkResolver.hpp >> @@ -56,6 +56,7 @@ >> int _call_index; // vtable or itable index of >> selected class method (if any) >> Handle _resolved_appendix; // extra argument in >> constant pool (if CPCE::has_appendix) >> Handle _resolved_method_type; // MethodType (for >> invokedynamic and invokehandle call sites) >> + Handle
resolved_method_name() const { return >> _resolved_method_name; } >> BasicType result_type() const { return >> selected_method()->result_type(); } >> CallKind call_kind() const { return _call_kind; } >> @@ -117,6 +119,12 @@ >> return _call_index; >> } >> + oop find_resolved_method_name(TRAPS) { >> + if (_resolved_method_name.is_null()) >> + >> java_lang_invoke_ResolvedMethodName::find_resolved_method(_resolved_method, >> CHECK_NULL); >> + return _resolved_method_name; >> + } >> + >> // debugging >> #ifdef ASSERT >> bool has_vtable_index() const { return _call_index >= >> 0 && _call_kind != CallInfo::itable_call; } >> >> >> > From daniel.daugherty at oracle.com Fri May 26 18:19:48 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 26 May 2017 12:19:48 -0600 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> Message-ID: <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> On 5/26/17 1:27 AM, David Holmes wrote: > Robbin, Dan, > > Below is a modified version of the refactored to_abstime code that > Robbin suggested. > > Robbin: there were a couple of issues with your version. For relative > time the timeout is always in nanoseconds - the "unit" only tells you > what form the "now_part_sec" is - nanos or micros. And the > calc_abs_time always has a deadline in millis. So I simplified and did > a little renaming, and tracked max_secs in debug_only instead of > returning it. > > Please let me know what you think. Looks OK to me. Nit comments below... > > Thanks, > David > ----- > > > // Calculate a new absolute time that is "timeout" nanoseconds from > "now". > // "unit" indicates the unit of "now_part_sec" (may be nanos or micros > depending > // on which clock is being used). > static void calc_rel_time(timespec* abstime, jlong timeout, jlong > now_sec, > jlong now_part_sec, jlong unit) { > time_t max_secs = now_sec + MAX_SECS; > > jlong seconds = timeout / NANOUNITS; > timeout %= NANOUNITS; // remaining nanos > > if (seconds >= MAX_SECS) { > // More seconds than we can add, so pin to max_secs.
> abstime->tv_sec = max_secs; > abstime->tv_nsec = 0; > } else { > abstime->tv_sec = now_sec + seconds; > long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; > if (nanos >= NANOUNITS) { // overflow > abstime->tv_sec += 1; > nanos -= NANOUNITS; > } > abstime->tv_nsec = nanos; > } > } > > // Unpack the given deadline in milliseconds since the epoch, into the > given timespec. > // The current time in seconds is also passed in to enforce an upper > bound as discussed above. > static void unpack_abs_time(timespec* abstime, jlong deadline, jlong > now_sec) { > time_t max_secs = now_sec + MAX_SECS; > > jlong seconds = deadline / MILLIUNITS; > jlong millis = deadline % MILLIUNITS; > > if (seconds >= max_secs) { > // Absolute seconds exceeds allowed max, so pin to max_secs. > abstime->tv_sec = max_secs; > abstime->tv_nsec = 0; > } else { > abstime->tv_sec = seconds; > abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); > } > } > > > static void to_abstime(timespec* abstime, jlong timeout, bool > isAbsolute) { There's an extra blank line here. > > DEBUG_ONLY(int max_secs = MAX_SECS;) > > if (timeout < 0) { > timeout = 0; > } > > #ifdef SUPPORTS_CLOCK_MONOTONIC > > if (_use_clock_monotonic_condattr && !isAbsolute) { > struct timespec now; > int status = _clock_gettime(CLOCK_MONOTONIC, &now); > assert_status(status == 0, status, "clock_gettime"); > calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); > DEBUG_ONLY(max_secs += now.tv_sec;) > } else { > > #else > > { // Match the block scope. > > #endif // SUPPORTS_CLOCK_MONOTONIC > > // Time-of-day clock is all we can reliably use. > struct timeval now; > int status = gettimeofday(&now, NULL); > assert(status == 0, "gettimeofday"); assert_status() is used above, but assert() is used here. Why? > if (isAbsolute) { > unpack_abs_time(abstime, timeout, now.tv_sec); > } > else { Inconsistent "else-branch" formatting. 
I believe HotSpot style is "} else {" > calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); > } > DEBUG_ONLY(max_secs += now.tv_sec;) > } > > assert(abstime->tv_sec >= 0, "tv_sec < 0"); > assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); > assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); > assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= > nanos_per_sec"); Why does the assert mesg have "nanos_per_sec" instead of "NANOSECS_PER_SEC"? There's an extra blank line here. > > } Definitely looks and reads much cleaner. Dan From john.r.rose at oracle.com Fri May 26 20:01:48 2017 From: john.r.rose at oracle.com (John Rose) Date: Fri, 26 May 2017 13:01:48 -0700 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> Message-ID: <614A148E-57D5-495B-9FD4-80451A65C208@oracle.com> One more comment. I am not sure about this change: int vmindex = java_lang_invoke_MemberName::vmindex(mname()); - if (vmtarget == NULL) { + bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); + + if (!have_defc) { THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand"); } The point of the expander is that if a MN has *only* a RMN, *then* it can recover the rest of its guts. That's why the old code checked vmtarget, and new code should, I think, check the new 'method' for non-null. The point is that if you are holding an RMN, you can't "peek" into it unless you wrap a blank MN around it and request "expand". I hesitated to mention this earlier because I assume there was another reason for changing the logic of this method, but now after a full review I don't see the reason. Notice that RMN is now a completely opaque Java type, which is very good. 
But there needs to be a Java API for unpacking an RMN, and MN.expand is that API. This is where we came out in the end, and I think the have_defc changes are now just an artifact of an intermediate state. So, I suggest walking back the changes related to "have_defc", unless I've missed something here. ? John From coleen.phillimore at oracle.com Fri May 26 20:32:19 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 May 2017 16:32:19 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <614A148E-57D5-495B-9FD4-80451A65C208@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> <614A148E-57D5-495B-9FD4-80451A65C208@oracle.com> Message-ID: <8104781a-8493-8d02-b6de-4de05e4d346a@oracle.com> On 5/26/17 4:01 PM, John Rose wrote: > One more comment. I am not sure about this change: > > int vmindex = java_lang_invoke_MemberName::vmindex(mname()); > - if (vmtarget == NULL) { > + bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); > + > + if (!have_defc) { > THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand"); > } > > > The point of the expander is that if a MN has *only* a RMN, *then* it > can recover the rest of its guts. > > That's why the old code checked vmtarget, and new code should, I > think, check the new 'method' for non-null. I made this change so that field MemberName doesn't have to have a vmtarget, since now it's a ResolvedMethodName. For field MemberName, the "method" field is null. I found that the clazz field is properly initialized in init_MemberName and never changes and is never null here, so that's why I check the clazz. > > The point is that if you are holding an RMN, you can't "peek" into it > unless you wrap a blank MN around it and request "expand". 
> > I hesitated to mention this earlier because I assume there was another > reason for changing the logic of this method, but now after a full > review I don't see the reason. > > Notice that RMN is now a completely opaque Java type, which is very > good. But there needs to be a Java API for unpacking an RMN, and > MN.expand is that API. This is where we came out in the end, and I > think the have_defc changes are now just an artifact of an > intermediate state. > > So, I suggest walking back the changes related to "have_defc", unless > I've missed something here. I thought we should add a different API for expanding a MemberName when you are holding a RMN, that isn't applicable for field MemberName. That's why I made this change. Coleen > > ? John From john.r.rose at oracle.com Fri May 26 20:48:10 2017 From: john.r.rose at oracle.com (John Rose) Date: Fri, 26 May 2017 13:48:10 -0700 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> Message-ID: <3B8B3D31-09F3-41AC-9B26-A14B4DB87082@oracle.com> On May 26, 2017, at 10:47 AM, coleen.phillimore at oracle.com wrote: > > Hi, I made the changes below, which turned out very nice. It didn't take that long to retest. See: > > open webrev at http://cr.openjdk.java.net/~coleenp/8174749.04/webrev > open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.04/webrev > > I don't know how to do delta webrevs, so just look at linkResolver.cpp/hpp and methodHandles.cpp Re-reviewed. See previous message for a late-breaking comment on expand. See below for a sketch of what I mean by keeping "have_defc" as is. (Another reviewer commented about a dead mode bit. The purpose of that stuff is to allow us to tweak the JDK API. 
I don't care much either way about GC-ing unused mode bits but I do want to keep the expander capability so we can prototype stuff in the JDK without having to edit the JVM. So on balance, I'd give the mode bits the benefit of the doubt. They can be used from the JDK, even if they aren't at the moment.) I also like how this CallInfo change turned out. Notice how now the function java_lang_invoke_ResolvedMethodName::find_resolved_method has only one usage, from the inside of CallInfo. This feels right. It also means you can take javaClasses.cpp out of the loop here, and just have CallInfo call directly into SystemDictionary and ResolvedMethodTable. It seems just as reasonable to me that linkResolver.cpp would do that job, than that it would be to delegate via javaClasses.cpp. I also think the patch will get a little smaller if you cut javaClasses.cpp out of that loop. Thanks, ? John P.S. As a step after this fix, if we loosen the coupling of the JVM with MemberName, I think we will want to get rid of MN::vmtarget and just have MN::method. In the code of MHN_getMemberVMInfo, the unchanged line "x = mname()" really wants to be "x = method" where method is the RMN. The JDK code expects a MN at that point, but it should really be the RMN now. The only JDK change would be in MemberName.java: - assert(vmtarget instanceof MemberName) : vmtarget + " in " + this; + assert(vmtarget instanceof ResolvedMethodName) : vmtarget + " in " + this; I wouldn't object if you anticipated this in the present change set, but it's OK to do it later. P.P.S. Here's a sketch of what I mean by walking back some of the "have_defc" changes. Maybe I'm missing something, but I think this version makes more sense than the current version: git a/src/share/vm/prims/methodHandles.cpp b/src/share/vm/prims/methodHandles.cpp --- a/src/share/vm/prims/methodHandles.cpp +++ b/src/share/vm/prims/methodHandles.cpp @@ -794,11 +794,6 @@ // which refers directly to JVM internals. 
void MethodHandles::expand_MemberName(Handle mname, int suppress, TRAPS) { assert(java_lang_invoke_MemberName::is_instance(mname()), ""); - Metadata* vmtarget = java_lang_invoke_MemberName::vmtarget(mname()); - int vmindex = java_lang_invoke_MemberName::vmindex(mname()); - if (vmtarget == NULL) { - THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand"); - } bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); bool have_name = (java_lang_invoke_MemberName::name(mname()) != NULL); @@ -817,10 +812,14 @@ case IS_METHOD: case IS_CONSTRUCTOR: { + Method* vmtarget = java_lang_invoke_MemberName::vmtarget(method()); + if (vmtarget == NULL) { + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand"); + } assert(vmtarget->is_method(), "method or constructor vmtarget is Method*"); methodHandle m(THREAD, (Method*)vmtarget); DEBUG_ONLY(vmtarget = NULL); // safety - if (m.is_null()) break; + assert(m.not_null(), "checked above"); if (!have_defc) { InstanceKlass* defc = m->method_holder(); java_lang_invoke_MemberName::set_clazz(mname(), defc->java_mirror()); @@ -838,17 +837,16 @@ } case IS_FIELD: { - assert(vmtarget->is_klass(), "field vmtarget is Klass*"); - if (!((Klass*) vmtarget)->is_instance_klass()) break; - instanceKlassHandle defc(THREAD, (Klass*) vmtarget); - DEBUG_ONLY(vmtarget = NULL); // safety + oop clazz = java_lang_invoke_MemberName::clazz(mname()); + if (clazz == NULL) { + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing to expand (as field)"); + } + InstanceKlass* defc = InstanceKlass::cast(java_lang_Class::as_Klass(clazz)); + DEBUG_ONLY(clazz = NULL); // safety bool is_static = ((flags & JVM_ACC_STATIC) != 0); fieldDescriptor fd; // find_field initializes fd if found if (!defc->find_field_from_offset(vmindex, is_static, &fd)) break; // cannot expand - if (!have_defc) { - java_lang_invoke_MemberName::set_clazz(mname(), defc->java_mirror()); - } if (!have_name) { //not 
java_lang_String::create_from_symbol; let's intern member names Handle name = StringTable::intern(fd.name(), CHECK); @@ -1389,6 +1387,39 @@ From john.r.rose at oracle.com Fri May 26 20:50:39 2017 From: john.r.rose at oracle.com (John Rose) Date: Fri, 26 May 2017 13:50:39 -0700 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <8104781a-8493-8d02-b6de-4de05e4d346a@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> <614A148E-57D5-495B-9FD4-80451A65C208@oracle.com> <8104781a-8493-8d02-b6de-4de05e4d346a@oracle.com> Message-ID: On May 26, 2017, at 1:32 PM, coleen.phillimore at oracle.com wrote: > > I thought we should add a different API for expanding a MemberName when you are holding a RMN, that isn't applicable for field MemberName. That's why I made this change. I see. Well, that would in fact replace the whole MN.expand thing, since that's what it's for. Until we have the different API, let's keep the old one intact. Note the differing treatment of fields and methods in the patch I suggested; I think that makes sense. ? John From kim.barrett at oracle.com Fri May 26 21:27:36 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 May 2017 17:27:36 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: <592819D4.9070209@oracle.com> References: <592819D4.9070209@oracle.com> Message-ID: > On May 26, 2017, at 8:04 AM, Erik ?sterlund wrote: > > Hi Kim, > > Thanks for doing this. It looks good to me. Thanks. > > /Erik > > On 2017-05-25 01:42, Kim Barrett wrote: >> Please review this change to Atomic::load and OrderAccess::load_acquire >> overloads to make their source const qualified, e.g. instead of >> "volatile T*" make them "const volatile T*". 
This eliminates the need >> for casting away const when, for example, applying one of these >> operations to a member variable when in a const-qualified method. >> >> There are probably places that previously required casting away const >> but now do not. Similarly, there are probably places where values >> couldn't be const or member functions couldn't be const qualified, but >> now can be. I did a little searching and found a few candidates, but >> none that were otherwise trivial to add to this change, so haven't >> included any. >> >> This change touches platform-specific code for non-Oracle supported >> platforms that I can't test, so I'd like reviews from the respective >> platform owners. >> >> Aside regarding aarch64: I don't know why gcc's __atomic_load doesn't >> const-qualify the source argument; that seems like a bug. Or maybe >> they are, but not documented that way. And I wonder why the aarch64 >> port uses __atomic_load rather than __atomic_load_n. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166651 >> >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.00 >> >> Testing: >> JPRT From kim.barrett at oracle.com Fri May 26 21:31:08 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 May 2017 17:31:08 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> Message-ID: > On May 26, 2017, at 4:17 AM, Andrew Dinn wrote: > > On 25/05/17 20:47, Kim Barrett wrote: >>> On May 25, 2017, at 11:54 AM, Andrew Dinn wrote: >>> >>> Ok, so I have several interesting things to report. >> >> Thanks for trying this out, and apologies for the blunders. > > No problem (and no reason to describe any of it as blunders). They sure felt like blunders :) > . . . >> So here's the new webrev. The only changes are in >> orderAccess_linux_aarch64.inline.hpp, which I can't test. 
Hopefully >> I've not made any more blunders. >> >> http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.01/ > > Yes, that builds and runs ok on AArch64. Yay! Thanks for your help with this. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From coleen.phillimore at oracle.com Fri May 26 21:48:35 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 May 2017 17:48:35 -0400 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: <3B8B3D31-09F3-41AC-9B26-A14B4DB87082@oracle.com> References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> <3B8B3D31-09F3-41AC-9B26-A14B4DB87082@oracle.com> Message-ID: On 5/26/17 4:48 PM, John Rose wrote: > On May 26, 2017, at 10:47 AM, coleen.phillimore at oracle.com > wrote: >> >> Hi, I made the changes below, which turned out very nice. It didn't >> take that long to retest. See: >> >> open webrev athttp://cr.openjdk.java.net/~coleenp/8174749.04/webrev >> >> open webrev >> athttp://cr.openjdk.java.net/~coleenp/8174749.jdk.04/webrev >> >> >> I don't know how to do delta webrevs, so just look at >> linkResolver.cpp/hpp and methodHandles.cpp > > Re-reviewed. > > See previous message for a late-breaking comment on expand. > See below for a sketch of what I mean by keeping "have_defc" as is. Hi John, I was just thinking of this change below, it makes sense to treat field and method MemberName differently as you have below. The field needs the clazz present to be expanded but method MemberName does not. Yes, this makes sense. > > (Another reviewer commented about a dead mode bit. The purpose of that > stuff is to allow us to tweak the JDK API. 
I don't care much either
> way about
> GC-ing unused mode bits but I do want to keep the expander capability
> so we
> can prototype stuff in the JDK without having to edit the JVM. So on
> balance,
> I'd give the mode bits the benefit of the doubt. They can be used
> from the JDK,
> even if they aren't at the moment.)
>
> I also like how this CallInfo change turned out. Notice how now the
> function
> java_lang_invoke_ResolvedMethodName::find_resolved_method has only
> one usage, from the inside of CallInfo. This feels right. It also
> means you
> can take javaClasses.cpp out of the loop here, and just have CallInfo call
> directly into SystemDictionary and ResolvedMethodTable. It seems just
> as reasonable to me that linkResolver.cpp would do that job, than that it
> would be to delegate via javaClasses.cpp. I also think the patch will get
> a little smaller if you cut javaClasses.cpp out of that loop.

JavaClasses is in the loop because it knows which fields to assign and how to create a ResolvedMethodName. I think this makes sense to isolate it like this, and I appreciated only changing javaClasses.cpp when I kept changing the names of the fields.

>
> Thanks,
> -- John
>
> P.S. As a step after this fix, if we loosen the coupling of the JVM
> with MemberName,
> I think we will want to get rid of MN::vmtarget and just have MN::method.
> In the code of MHN_getMemberVMInfo, the unchanged line "x = mname()"
> really wants to be "x = method" where method is the RMN. The JDK code
> expects a MN at that point, but it should really be the RMN now. The only
> JDK change would be in MemberName.java:
>
> -            assert(vmtarget instanceof MemberName) : vmtarget + " in " + this;
> +            assert(vmtarget instanceof ResolvedMethodName) : vmtarget + " in " + this;
>
> I wouldn't object if you anticipated this in the present change set,
> but it's OK
> to do it later.

Yes, it has to be later.
I'm going to file a couple of RFE's after this that we discussed so that RMN can be used instead of MN. And believe it or not, large changes make me anxious. :) > > P.P.S. Here's a sketch of what I mean by walking back some of the > "have_defc" > changes. Maybe I'm missing something, but I think this version makes more > sense than the current version: Done. Passes java/lang/invoke tests (as sanity). http://cr.openjdk.java.net/~coleenp/8174749.05/webrev (wish I could do incremental webrevs because full webrevs take forever). Thank you for all your help and comments. Coleen > > git a/src/share/vm/prims/methodHandles.cpp > b/src/share/vm/prims/methodHandles.cpp > --- a/src/share/vm/prims/methodHandles.cpp > +++ b/src/share/vm/prims/methodHandles.cpp > @@ -794,11 +794,6 @@ > // which refers directly to JVM internals. > void MethodHandles::expand_MemberName(Handle mname, int suppress, > TRAPS) { > assert(java_lang_invoke_MemberName::is_instance(mname()), ""); > - Metadata* vmtarget = java_lang_invoke_MemberName::vmtarget(mname()); > - int vmindex = java_lang_invoke_MemberName::vmindex(mname()); > - if (vmtarget == NULL) { > - THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing > to expand"); > - } > bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != NULL); > bool have_name = (java_lang_invoke_MemberName::name(mname()) != NULL); > @@ -817,10 +812,14 @@ > case IS_METHOD: > case IS_CONSTRUCTOR: > { > + Method* vmtarget = java_lang_invoke_MemberName::vmtarget(method()); > + if (vmtarget == NULL) { > + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing > to expand"); > + } > assert(vmtarget->is_method(), "method or constructor vmtarget > is Method*"); > methodHandle m(THREAD, (Method*)vmtarget); > DEBUG_ONLY(vmtarget = NULL); // safety > - if (m.is_null()) break; > + assert(m.not_null(), "checked above"); > if (!have_defc) { > InstanceKlass* defc = m->method_holder(); > java_lang_invoke_MemberName::set_clazz(mname(), 
defc->java_mirror()); > @@ -838,17 +837,16 @@ > } > case IS_FIELD: > { > - assert(vmtarget->is_klass(), "field vmtarget is Klass*"); > - if (!((Klass*) vmtarget)->is_instance_klass()) break; > - instanceKlassHandle defc(THREAD, (Klass*) vmtarget); > - DEBUG_ONLY(vmtarget = NULL); // safety > + oop clazz = java_lang_invoke_MemberName::clazz(mname()); > + if (clazz == NULL) { > + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "nothing > to expand (as field)"); > + } > + InstanceKlass* defc = > InstanceKlass::cast(java_lang_Class::as_Klass(clazz)); > + DEBUG_ONLY(clazz = NULL); // safety > bool is_static = ((flags & JVM_ACC_STATIC) != 0); > fieldDescriptor fd; // find_field initializes fd if found > if (!defc->find_field_from_offset(vmindex, is_static, &fd)) > break; // cannot expand > - if (!have_defc) { > - java_lang_invoke_MemberName::set_clazz(mname(), defc->java_mirror()); > - } > if (!have_name) { > //not java_lang_String::create_from_symbol; let's intern > member names > Handle name = StringTable::intern(fd.name(), CHECK); > @@ -1389,6 +1387,39 @@ > From kim.barrett at oracle.com Fri May 26 22:37:09 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 May 2017 18:37:09 -0400 Subject: RFR: 8166651: OrderAccess::load_acquire &etc should have const parameters In-Reply-To: References: <5ec3ca7f-f4cd-337c-63d5-3b00fd2839a7@redhat.com> <702f7e19-da8d-6ca2-8277-185a9468ef2a@redhat.com> Message-ID: Looking over the changes again, I realized there was a problem with the changes for zero. The added const qualifier to Atomic::load would run afoul of a non-const-qualified source for os::atomic_copy64. I've updated all three definitions of os::atomic_copy64. Two were in zero-specific files. One was in os_linux_aarch64.hpp. Unfortunately, I wasn't able to test these additional changes, as building zero is already broken in jdk10/hs for other reasons (JDK-8181158). 
New webrev:
full: http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.02/
incr: http://cr.openjdk.java.net/~kbarrett/8166651/hotspot.02.inc/

From gromero at linux.vnet.ibm.com  Sat May 27 00:34:50 2017
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Fri, 26 May 2017 21:34:50 -0300
Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813
In-Reply-To: 
References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com>
Message-ID: <5928C9AA.6030004@linux.vnet.ibm.com>

Hi Zhengyu,

Thanks a lot for taking care of this corner case on PPC64.

On 26-05-2017 10:41, Zhengyu Gu wrote:
> This is a quick way to kill the symptom (or low risk?). I am not sure if disabling NUMA is a better solution for this circumstance? does 1 NUMA node = UMA?

On PPC64, one configured NUMA node does not necessarily imply UMA. On the POWER7 machine where you found the corner case (I copy below the data you provided in the JBS - thanks for the additional information):

$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus:
node 1 size: 7680 MB
node 1 free: 1896 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

CPUs in node0 have no alternative besides allocating memory from node1. In that case CPUs in node0 are always accessing remote memory from node1 at a constant distance (40), so in that case we could say that 1 configured NUMA node == UMA. Nonetheless, if you add CPUs in node1 (by filling up the other socket present in the board) you will end up with CPUs at different distances from the node that has configured memory (in that case, node1), so it yields a configuration where 1 configured NUMA node != UMA (i.e. distances are not always equal to a single value).

On the other hand, the POWER7 machine configuration in question is bad (and rare). It is indeed hurting whole-system performance, and it would be reasonable to open the machine and move the memory module from the bank associated with node1 to the bank associated with node0, because all CPUs are accessing remote memory without any apparent necessity. Once that is changed, all CPUs will have local memory (distance = 10).

> Thanks,
>
> -Zhengyu
>
> On 05/26/2017 09:14 AM, Zhengyu Gu wrote:
>> Hi,
>>
>> There is a corner case that still failed after JDK-8175813.
>>
>> The system shows that it has multiple NUMA nodes, but only one is
>> configured. Under this scenario, numa_interleave_memory() call will
>> result "mbind: Invalid argument" message.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055
>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/

It looks like even for that rare POWER7 NUMA topology numa_interleave_memory() should succeed without "mbind: Invalid argument", since the 'mask' argument should already be a mask with only nodes from which memory can be allocated, i.e. only a mask of configured nodes (even if the mask contains only one configured node, as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt).

Inspecting a little more, it looks like the problem boils down to the fact that the JVM is passing to numa_interleave_memory() 'numa_all_nodes' [1] in Linux::numa_interleave_memory().
One would expect that 'numa_all_nodes' (which is api v1) would track the same information as 'numa_all_nodes_ptr' (api v2) [2], however there is a subtle but important difference: 'numa_all_nodes' is constructed assuming a consecutive node distribution [3]: 100 max = numa_num_configured_nodes(); 101 for (i = 0; i < max; i++) 102 nodemask_set_compat((nodemask_t *)&numa_all_nodes, i); whilst 'numa_all_nodes_ptr' is constructed parsing /proc/self/status [4]: 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { 500 numprocnode = read_mask(mask, numa_all_nodes_ptr); Thus for a topology like: available: 4 nodes (0-1,16-17) node 0 cpus: 0 8 16 24 32 node 0 size: 130706 MB node 0 free: 145 MB node 1 cpus: 40 48 56 64 72 node 1 size: 0 MB node 1 free: 0 MB node 16 cpus: 80 88 96 104 112 node 16 size: 130630 MB node 16 free: 529 MB node 17 cpus: 120 128 136 144 152 node 17 size: 0 MB node 17 free: 0 MB node distances: node 0 1 16 17 0: 10 20 40 40 1: 20 10 40 40 16: 40 40 10 20 17: 40 40 20 10 numa_all_nodes=0x3 => 0b11 (node0 and node1) numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) (Please, see details in the following gdb log: http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) In that case passing node0 and node1, although being suboptimal, does not bother mbind() since the following is satisfied: "[nodemask] must contain at least one node that is on-line, allowed by the process's current cpuset context, and contains memory." So back to the POWER7 case, I suppose that for: available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 6 7 node 0 size: 0 MB node 0 free: 0 MB node 1 cpus: node 1 size: 7680 MB node 1 free: 1896 MB node distances: node 0 1 0: 10 40 1: 40 10 numa_all_nodes=0x1 => 0b01 (node0) numa_all_nodes_ptr=0x2 => 0b10 (node1) and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which contains indeed no memory. 
That said, I don't know for sure if passing just node1 in the 'nodemask' will satisfy mbind(), since in that case there are no CPUs available in node1.

Summing up, it looks like the root cause is not that numa_interleave_memory() rejects only one configured node, but that the configured node being passed is the wrong one. I could not find a similar NUMA topology in my pool of machines to test further, but it might be worth trying to write a small test using api v2 and 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how numa_interleave_memory() goes on that machine :) If it behaves well, updating to api v2 would be a solution.

HTH

Regards,
Gustavo

[1] http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274
[2] from libnuma.c:608, numa_all_nodes_ptr: "it only tracks nodes with memory from which the calling process can allocate."
[3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102
[4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500

>>
>> The system NUMA configuration:
>>
>> Architecture:          ppc64
>> CPU op-mode(s):        32-bit, 64-bit
>> Byte Order:            Big Endian
>> CPU(s):                8
>> On-line CPU(s) list:   0-7
>> Thread(s) per core:    4
>> Core(s) per socket:    1
>> Socket(s):             2
>> NUMA node(s):          2
>> Model:                 2.1 (pvr 003f 0201)
>> Model name:            POWER7 (architected), altivec supported
>> L1d cache:             32K
>> L1i cache:             32K
>> NUMA node0 CPU(s):     0-7
>> NUMA node1 CPU(s):
>>
>> Thanks,
>>
>> -Zhengyu
>

From john.r.rose at oracle.com  Sat May 27 02:48:42 2017
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 26 May 2017 19:48:42 -0700
Subject: RFR (L) 8174749: Use hash table/oops for MemberName table
In-Reply-To: 
References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> <3B8B3D31-09F3-41AC-9B26-A14B4DB87082@oracle.com>
Message-ID: 

On May 26, 2017, at 2:48 PM, coleen.phillimore at oracle.com wrote:
>
> On 5/26/17 4:48 PM,
John Rose wrote: >> On May 26, 2017, at 10:47 AM, coleen.phillimore at oracle.com wrote: >>> >>> Hi, I made the changes below, which turned out very nice. It didn't take that long to retest. See: >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.04/webrev >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.jdk.04/webrev >>> >>> I don't know how to do delta webrevs, so just look at linkResolver.cpp/hpp and methodHandles.cpp >> >> Re-reviewed. >> >> See previous message for a late-breaking comment on expand. >> See below for a sketch of what I mean by keeping "have_defc" as is. > Hi John, > > I was just thinking of this change below, it makes sense to treat field and method MemberName differently as you have below. The field needs the clazz present to be expanded but method MemberName does not. > > Yes, this makes sense. Excellent. I reviewed the change in the 05 version of the webrev and it looks good to go. >> >> ...It seems just >> as reasonable to me that linkResolver.cpp would do that job, than that it >> would be to delegate via javaClasses.cpp. I also think the patch will get >> a little smaller if you cut javaClasses.cpp out of that loop. > > JavaClasses is in the loop because it knows which fields to assign and how to create a ResolvedMethodName. I think this makes sense to isolate it like this an appreciated only changing javaClasses.cpp when I kept changing the names of the fields. OK, then keep it as is. The allocate_instance call on RMN seems at home in jC.cpp. >> I wouldn't object if you anticipated this in the present change set, but it's OK >> to do it later. > > Yes, it has to be later. I'm going to file a couple of RFE's after this that we discussed so that RMN can be used instead of MN. And believe it or not, large changes make me anxious. :) Yep. An RFE to track this makes me happy. >> >> P.P.S. Here's a sketch of what I mean by walking back some of the "have_defc" >> changes. 
Maybe I'm missing something, but I think this version makes more >> sense than the current version: > > Done. Passes java/lang/invoke tests (as sanity). > > http://cr.openjdk.java.net/~coleenp/8174749.05/webrev > > (wish I could do incremental webrevs because full webrevs take forever). > > Thank you for all your help and comments. Re-reviewed. Even better, and still good to go. - John From robbin.ehn at oracle.com Sat May 27 11:53:08 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Sat, 27 May 2017 13:53:08 +0200 Subject: RFR(L): 8180032: Unaligned pointer dereference in ClassFileParser In-Reply-To: References: <921CF167-7883-479C-A2A2-55C9D72C25C4@oracle.com> <086D5C93-E0F0-4B1C-BAE5-1B910F4004DF@oracle.com> <440931cb-f4a4-6edd-fbd3-82bbfe162b81@oracle.com> <6a0f369e-4d34-02dc-653d-90a8aa19b901@oracle.com> Message-ID: <7738fb21-dd06-58ad-479e-c71ba0ce8600@oracle.com> Hi Mikael, I see you have pushed this, good! Sorry for the late response. On 2017-05-26 00:00, Mikael Vidstedt wrote: > > Reasonable? +1, thanks! /Robbin > > Cheers, > Mikael > > [1] http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.01/hotspot/webrev/ > >> On May 18, 2017, at 5:18 PM, David Holmes > > wrote: >> >> On 19/05/2017 9:19 AM, Mikael Vidstedt wrote: >>> >>>> On May 18, 2017, at 3:50 PM, David Holmes >>> >>>> > wrote: >>>> >>>> Hi Mikael, >>>> >>>> On 19/05/2017 8:15 AM, Mikael Vidstedt wrote: >>>>> >>>>>> On May 18, 2017, at 2:59 AM, Robbin Ehn >>>>> >>>>>> > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On 05/17/2017 03:46 AM, Kim Barrett wrote: >>>>>>>> On May 9, 2017, at 6:40 PM, Mikael Vidstedt >>>>>>>> >>>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Warning: It may be wise to stock up on coffee or tea before >>>>>>>> reading this. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8180032 >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8180032/webrev.00/hotspot/webrev/ >>>>>>>> >>>>>>> Not a review, just a question.
>>>>>>> ------------------------------------------------------------------------------ >>>>>>> src/cpu/x86/vm/bytes_x86.hpp >>>>>>> 40 template <typename T> >>>>>>> 41 static inline T get_native(const void* p) { >>>>>>> 42 assert(p != NULL, "null pointer"); >>>>>>> 43 >>>>>>> 44 T x; >>>>>>> 45 >>>>>>> 46 if (is_ptr_aligned(p, sizeof(T))) { >>>>>>> 47 x = *(T*)p; >>>>>>> 48 } else { >>>>>>> 49 memcpy(&x, p, sizeof(T)); >>>>>>> 50 } >>>>>>> 51 >>>>>>> 52 return x; >>>>>>> I'm looking at this and wondering if there's a good reason to not >>>>>>> just >>>>>>> unconditionally use memcpy here. gcc -O will generate a single move >>>>>>> instruction for that on x86_64. I'm not sure what happens on 32bit >>>>>>> with an 8 byte value, but I suspect it will do something similarly >>>>>>> sensible, e.g. 2 4 byte memory to memory transfers. >>>>>> >>>>>> Unconditionally memcpy would be nice! >>>>>> >>>>>> Are you going to look into that Mikael? >>>>> >>>>> It's complicated... >>>>> >>>>> We may be able to switch, but there is (maybe) a subtle reason why >>>>> the alignment check is in there: to avoid word tearing.. >>>>> >>>>> Think of two threads racing: >>>>> >>>>> * thread 1 is writing to the memory location X >>>>> * thread 2 is reading from the same memory location X >>>>> >>>>> Will thread 2 always see a consistent value (either the original >>>>> value or the fully updated value)? >>>> >>>> We're talking about internal VM loads and stores, right? For those we >>>> need to use the appropriate atomic routine if there are potential races. >>>> But we should never be mixing these kinds of accesses with Java level >>>> field accesses - that would be very broken. >>> >>> That seems reasonable, but for my untrained eye it's not trivially true >>> that relaxing the implementation is correct for all the uses of the >>> get/put primitives. I am therefore a bit reluctant to do so without >>> understanding the implications.
>> If a Copy routine doesn't have Atomic in its name then I don't expect >> atomicity. Even then unaligned accesses are not atomic even in the >> Atomic routine! >> >> But I'm not clear exactly how all these routines get used. >> >>>> For classFileParser we should have no concurrency issues. >>> >>> That seems reasonable. What degree of certainty does your 'should' come >>> with? :) >> >> Pretty high. We're parsing a stream of bytes and writing values into >> local structures that will eventually be passed across to a klass >> instance, which in turn will eventually be published via the SD as a >> loaded class. The actual parsing phase is purely single-threaded. >> >> David >> >>> Cheers, >>> Mikael >>> >>>> >>>> David >>>> >>>>> In the unaligned/memcpy case I think we can agree that there's >>>>> nothing preventing the compiler from doing individual loads/stores of >>>>> the bytes making up the data. Especially in something like slowdebug >>>>> that becomes more or less obvious - memcpy most likely isn't >>>>> intrinsified and is quite likely just copying a byte at a time. Given >>>>> that the data is, in fact, unaligned, there is really no simple way >>>>> to prevent word tearing, so I'm pretty sure that we never depend on >>>>> it - if needed, we're likely to already have some higher level >>>>> synchronization in place guarding the accesses. And the fact that the >>>>> other, non-x86 platforms already do individual byte loads/stores when >>>>> the pointer is unaligned is a further indication that >>>>> that's the case. >>>>> >>>>> However, the aligned case is where stuff gets more interesting. I >>>>> don't think the C/C++ spec guarantees that accessing a memory >>>>> location using a pointer of type T will result in code which does a >>>>> single load/store of size >= sizeof(T), but for all the compilers we >>>>> *actually* use that's likely to be the case.
If it's true that the >>>>> compilers don't split the memory accesses, that means we won't have >>>>> word tearing when using the Bytes::get/put methods with *aligned* >>>>> pointers. >>>>> >>>>> If I switch to always using memcpy, there's a risk that it introduces >>>>> tearing problems where earlier we had none. Two questions come to mind: >>>>> >>>>> * For the cases where the get/put methods get used *today*, is that a >>>>> problem? >>>>> * What happens if somebody in the *future* decides that put_Java_u4 >>>>> seems like a great thing to use to write to a Java int field on the >>>>> Java heap, and a Java thread is racing to read that same data? >>>>> >>>>> >>>>> All that said though, I think this is worth exploring and it may well >>>>> turn out that word tearing really isn't a problem. Also, I believe >>>>> there may be opportunities to further clean up this code and perhaps >>>>> unify it a bit across the various platforms. >>>>> >>>>> And *that* said, I think the change as it stands is still an >>>>> improvement, so I'm leaning towards pushing it and filing an >>>>> enhancement and following up on it separately. Let me know if you >>>>> strongly feel that this should be looked into and addressed now and I >>>>> may reconsider :) >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>>> >>>>>> /Robbin >>>>>> >>>>>>> ------------------------------------------------------------------------------ >>> > From robbin.ehn at oracle.com Sat May 27 12:06:28 2017 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Sat, 27 May 2017 14:06:28 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> Message-ID: <181abf06-8225-a3d9-4e10-ba9821aa5cd9@oracle.com> Thanks David, looks good!
/Robbin On 2017-05-26 20:19, Daniel D. Daugherty wrote: > On 5/26/17 1:27 AM, David Holmes wrote: >> Robbin, Dan, >> >> Below is a modified version of the refactored to_abstime code that >> Robbin suggested. >> >> Robbin: there were a couple of issues with your version. For relative >> time the timeout is always in nanoseconds - the "unit" only tells you >> what form the "now_part_sec" is - nanos or micros. And the >> calc_abs_time always has a deadline in millis. So I simplified and did >> a little renaming, and tracked max_secs in debug_only instead of >> returning it. >> >> Please let me know what you think. > > Looks OK to me. Nit comments below... > > >> >> Thanks, >> David >> ----- >> >> >> // Calculate a new absolute time that is "timeout" nanoseconds from >> "now". >> // "unit" indicates the unit of "now_part_sec" (may be nanos or micros >> depending >> // on which clock is being used). >> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >> now_sec, >> jlong now_part_sec, jlong unit) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = timeout / NANOUNITS; >> timeout %= NANOUNITS; // remaining nanos >> >> if (seconds >= MAX_SECS) { >> // More seconds than we can add, so pin to max_secs. >> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = now_sec + seconds; >> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >> if (nanos >= NANOUNITS) { // overflow >> abstime->tv_sec += 1; >> nanos -= NANOUNITS; >> } >> abstime->tv_nsec = nanos; >> } >> } >> >> // Unpack the given deadline in milliseconds since the epoch, into the >> given timespec. >> // The current time in seconds is also passed in to enforce an upper >> bound as discussed above. 
>> static void unpack_abs_time(timespec* abstime, jlong deadline, jlong >> now_sec) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = deadline / MILLIUNITS; >> jlong millis = deadline % MILLIUNITS; >> >> if (seconds >= max_secs) { >> // Absolute seconds exceeds allowed max, so pin to max_secs. >> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = seconds; >> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >> } >> } >> >> >> static void to_abstime(timespec* abstime, jlong timeout, bool >> isAbsolute) { > > There's an extra blank line here. > >> >> DEBUG_ONLY(int max_secs = MAX_SECS;) >> >> if (timeout < 0) { >> timeout = 0; >> } >> >> #ifdef SUPPORTS_CLOCK_MONOTONIC >> >> if (_use_clock_monotonic_condattr && !isAbsolute) { >> struct timespec now; >> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >> assert_status(status == 0, status, "clock_gettime"); >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } else { >> >> #else >> >> { // Match the block scope. >> >> #endif // SUPPORTS_CLOCK_MONOTONIC >> >> // Time-of-day clock is all we can reliably use. >> struct timeval now; >> int status = gettimeofday(&now, NULL); >> assert(status == 0, "gettimeofday"); > > assert_status() is used above, but assert() is used here. Why? > > >> if (isAbsolute) { >> unpack_abs_time(abstime, timeout, now.tv_sec); >> } >> else { > > Inconsistent "else-branch" formatting. > I believe HotSpot style is "} else {" > > >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >> } >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } >> >> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >> nanos_per_sec"); > > Why does the assert mesg have "nanos_per_sec" instead of > "NANOSECS_PER_SEC"? 
> > There's an extra blank line here. > >> >> } > > Definitely looks and reads much cleaner. > > Dan > From david.holmes at oracle.com Mon May 29 00:29:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 10:29:22 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> Message-ID: <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: > On 5/26/17 1:27 AM, David Holmes wrote: >> Robbin, Dan, >> >> Below is a modified version of the refactored to_abstime code that >> Robbin suggested. >> >> Robbin: there were a couple of issues with your version. For relative >> time the timeout is always in nanoseconds - the "unit" only tells you >> what form the "now_part_sec" is - nanos or micros. And the >> calc_abs_time always has a deadline in millis. So I simplified and did >> a little renaming, and tracked max_secs in debug_only instead of >> returning it. >> >> Please let me know what you think. > > Looks OK to me. Nit comments below... Thanks Dan - more below. >> >> >> // Calculate a new absolute time that is "timeout" nanoseconds from >> "now". >> // "unit" indicates the unit of "now_part_sec" (may be nanos or micros >> depending >> // on which clock is being used). >> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >> now_sec, >> jlong now_part_sec, jlong unit) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = timeout / NANOUNITS; >> timeout %= NANOUNITS; // remaining nanos >> >> if (seconds >= MAX_SECS) { >> // More seconds than we can add, so pin to max_secs. 
>> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = now_sec + seconds; >> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >> if (nanos >= NANOUNITS) { // overflow >> abstime->tv_sec += 1; >> nanos -= NANOUNITS; >> } >> abstime->tv_nsec = nanos; >> } >> } >> >> // Unpack the given deadline in milliseconds since the epoch, into the >> given timespec. >> // The current time in seconds is also passed in to enforce an upper >> bound as discussed above. >> static void unpack_abs_time(timespec* abstime, jlong deadline, jlong >> now_sec) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = deadline / MILLIUNITS; >> jlong millis = deadline % MILLIUNITS; >> >> if (seconds >= max_secs) { >> // Absolute seconds exceeds allowed max, so pin to max_secs. >> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = seconds; >> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >> } >> } >> >> >> static void to_abstime(timespec* abstime, jlong timeout, bool >> isAbsolute) { > > There's an extra blank line here. Fixed. >> >> DEBUG_ONLY(int max_secs = MAX_SECS;) >> >> if (timeout < 0) { >> timeout = 0; >> } >> >> #ifdef SUPPORTS_CLOCK_MONOTONIC >> >> if (_use_clock_monotonic_condattr && !isAbsolute) { >> struct timespec now; >> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >> assert_status(status == 0, status, "clock_gettime"); >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } else { >> >> #else >> >> { // Match the block scope. >> >> #endif // SUPPORTS_CLOCK_MONOTONIC >> >> // Time-of-day clock is all we can reliably use. >> struct timeval now; >> int status = gettimeofday(&now, NULL); >> assert(status == 0, "gettimeofday"); > > assert_status() is used above, but assert() is used here. Why? Historical. 
assert_status was introduced for the pthread* and other posix funcs that return the error value rather than returning -1 and setting errno. gettimeofday is not one of those so still has the old assert. However, as someone pointed out a while ago you can use assert_status with these and pass errno as the "status". So I did that. > >> if (isAbsolute) { >> unpack_abs_time(abstime, timeout, now.tv_sec); >> } >> else { > > Inconsistent "else-branch" formatting. > I believe HotSpot style is "} else {" Fixed. >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >> } >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } >> >> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >> nanos_per_sec"); > > Why does the assert mesg have "nanos_per_sec" instead of > "NANOSECS_PER_SEC"? No reason. Actually that should now refer to NANOUNITS. Hmmm I can not recall why we have NANOUNITS and NANOSECS_PER_SEC ... possibly an oversight. > There's an extra blank line here. Fixed. Will send out complete updated webrev soon. Thanks, David >> >> } > > Definitely looks and reads much cleaner. > > Dan > From david.holmes at oracle.com Mon May 29 01:51:27 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 11:51:27 +1000 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> Message-ID: On 27/05/2017 12:44 AM, harold seigel wrote: > Hi Zoltan, > > Instead of deleting the test, can the class file version of Bad.jasm be > changed to 52 for JDK-8u? I concur. As I just wrote in the bug report I don't see how the changes can be backported to 8u but the test is somehow invalid for 8u!
Also as a point of order: an RFR and an RFA are distinct and should be posted separately: the RFR on hotspot-xxx-dev (as appropriate) and the RFA on jdk8u-dev. Thanks, David > Thanks, Harold > > > On 5/26/2017 10:17 AM, Zoltán Majó wrote: >> Hi, >> >> >> when backporting 8160551, I also backported a test that is relevant >> only for class files with version >= 53. As JDK 8 supports only class >> files with version < 53, having the test in the JDK 8u test base does >> not make sense. This changeset proposes to remove the test. >> >> https://bugs.openjdk.java.net/browse/JDK-8180934 >> http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ >> >> I executed all hotspot/runtime tests with the changeset (using JDK >> 8u122), no problems have shown up. JPRT testing is in progress. >> >> Please note that this fix is a JDK 8u-specific fix (not a backport of >> some existing fix in JDK 9). >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> > From zgu at redhat.com Mon May 29 02:08:41 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Sun, 28 May 2017 22:08:41 -0400 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <5928C9AA.6030004@linux.vnet.ibm.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> <5928C9AA.6030004@linux.vnet.ibm.com> Message-ID: <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com> Hi Gustavo, Thanks for the detailed analysis and suggestion. I did not realize the difference between bitmask and nodemask. As you suggested, numa_interleave_memory_v2 works under this configuration. Please see the updated webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ Thanks, -Zhengyu On 05/26/2017 08:34 PM, Gustavo Romero wrote: > Hi Zhengyu, > > Thanks a lot for taking care of this corner case on PPC64. > > On 26-05-2017 10:41, Zhengyu Gu wrote: >> This is a quick way to kill the symptom (or low risk?). I am not sure if disabling NUMA is a better solution for this circumstance? does 1 NUMA node = UMA?
> > On PPC64, 1 (configured) NUMA does not necessarily imply UMA. In the POWER7 > machine you found the corner case (I copy below the data you provided in the > JBS - thanks for the additional information): > > $ numactl -H > available: 2 nodes (0-1) > node 0 cpus: 0 1 2 3 4 5 6 7 > node 0 size: 0 MB > node 0 free: 0 MB > node 1 cpus: > node 1 size: 7680 MB > node 1 free: 1896 MB > node distances: > node 0 1 > 0: 10 40 > 1: 40 10 > > CPUs in node0 have no other alternative besides allocating memory from node1. In > that case CPUs in node0 are always accessing remote memory from node1 in a constant > distance (40), so in that case we could say that 1 NUMA (configured) node == UMA. > Nonetheless, if you add CPUs in node1 (by filling up the other socket present in > the board) you will end up with CPUs with different distances from the node that > has configured memory (in that case, node1), so it yields a configuration where > 1 NUMA (configured) != UMA (i.e. distances are not always equal to a single > value). > > On the other hand, the POWER7 machine configuration in question is bad (and > rare). It's indeed impacting the whole system performance and it would be > reasonable to open the machine and move the memory module from bank related to > node1 to bank related to node0, because all CPUs are accessing remote memory > without any apparent necessity. Once you change it all CPUs will have local > memory (distance = 10). > > >> Thanks, >> >> -Zhengyu >> >> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>> Hi, >>> >>> There is a corner case that still failed after JDK-8175813. >>> >>> The system shows that it has multiple NUMA nodes, but only one is >>> configured. Under this scenario, numa_interleave_memory() call will >>> result "mbind: Invalid argument" message. 
>>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ > > Looks like that even for that POWER7 rare numa topology numa_interleave_memory() > should succeed without "mbind: Invalid argument" since the 'mask' argument > should be already a mask with only nodes from which memory can be allocated, i.e. > only a mask of configured nodes (even if mask contains only one configured node, > as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). > > Inspecting a little bit more, it looks like that the problem boils down to the > fact that the JVM is passing to numa_interleave_memory() 'numa_all_nodes' [1] in > Linux::numa_interleave_memory(). > > One would expect that 'numa_all_nodes' (which is api v1) would track the same > information as 'numa_all_nodes_ptr' (api v2) [2], however there is a subtle but > important difference: > > 'numa_all_nodes' is constructed assuming a consecutive node distribution [3]: > > 100 max = numa_num_configured_nodes(); > 101 for (i = 0; i < max; i++) > 102 nodemask_set_compat((nodemask_t *)&numa_all_nodes, i); > > > whilst 'numa_all_nodes_ptr' is constructed parsing /proc/self/status [4]: > > 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { > 500 numprocnode = read_mask(mask, numa_all_nodes_ptr); > > Thus for a topology like: > > available: 4 nodes (0-1,16-17) > node 0 cpus: 0 8 16 24 32 > node 0 size: 130706 MB > node 0 free: 145 MB > node 1 cpus: 40 48 56 64 72 > node 1 size: 0 MB > node 1 free: 0 MB > node 16 cpus: 80 88 96 104 112 > node 16 size: 130630 MB > node 16 free: 529 MB > node 17 cpus: 120 128 136 144 152 > node 17 size: 0 MB > node 17 free: 0 MB > node distances: > node 0 1 16 17 > 0: 10 20 40 40 > 1: 20 10 40 40 > 16: 40 40 10 20 > 17: 40 40 20 10 > > numa_all_nodes=0x3 => 0b11 (node0 and node1) > numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) > > (Please, see details in the following gdb log: 
http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) > > In that case passing node0 and node1, although being suboptimal, does not bother > mbind() since the following is satisfied: > > "[nodemask] must contain at least one node that is on-line, allowed by the > process's current cpuset context, and contains memory." > > So back to the POWER7 case, I suppose that for: > > available: 2 nodes (0-1) > node 0 cpus: 0 1 2 3 4 5 6 7 > node 0 size: 0 MB > node 0 free: 0 MB > node 1 cpus: > node 1 size: 7680 MB > node 1 free: 1896 MB > node distances: > node 0 1 > 0: 10 40 > 1: 40 10 > > numa_all_nodes=0x1 => 0b01 (node0) > numa_all_nodes_ptr=0x2 => 0b10 (node1) > > and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which contains > indeed no memory. That said, I don't know for sure if passing just node1 in the > 'nodemask' will satisfy mbind() as in that case there are no cpus available in > node1. > > In summing up, looks like that the root cause is not that numa_interleave_memory() > does not accept only one configured node, but that the configured node being > passed is wrong. I could not find a similar numa topology in my poll to test > more, but it might be worth trying to write a small test using api v2 and > 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how numa_interleave_memory() > goes in that machine :) If it behaves well, updating to api v2 would be a > solution. > > HTH > > Regards, > Gustavo > > > [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 > [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes with memory from which the calling process can allocate." 
> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 > [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 > > >>> >>> The system NUMA configuration: >>> >>> Architecture: ppc64 >>> CPU op-mode(s): 32-bit, 64-bit >>> Byte Order: Big Endian >>> CPU(s): 8 >>> On-line CPU(s) list: 0-7 >>> Thread(s) per core: 4 >>> Core(s) per socket: 1 >>> Socket(s): 2 >>> NUMA node(s): 2 >>> Model: 2.1 (pvr 003f 0201) >>> Model name: POWER7 (architected), altivec supported >>> L1d cache: 32K >>> L1i cache: 32K >>> NUMA node0 CPU(s): 0-7 >>> NUMA node1 CPU(s): >>> >>> Thanks, >>> >>> -Zhengyu >> > From david.holmes at oracle.com Mon May 29 02:17:13 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 12:17:13 +1000 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: References: Message-ID: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> Hi Adrian, cc'ing hotspot-dev and bcc'ing build-dev as these are not issues with the build files, but hotspot sources. First, thank you for taking the time and effort to contribute to OpenJDK. However ... The status of linux-sparc as a port in OpenJDK 9 (or 8u) is unclear. As you have found, those bits have bit-rotted as it is not a supported Oracle JDK platform, and no-one else seems to have cared until now. Your patches may be acceptable, but we have no way to validate them. And the very next push may break things again. Or indeed once the hotspot issues are resolved you may find issues on the JDK library side. The process to get them into JDK 9 now is somewhat more difficult given where we are in the JDK 9 release. But without a commitment from some part of the community to maintain this port it may be better as something that is maintained downstream. It would be interesting to hear from the Hotspot group leads on this. Thanks, David On 27/05/2017 10:28 PM, John Paul Adrian Glaubitz wrote: > Hello! > > openjdk-9 currently fails to build from source on Debian/sparc64 [1].
> > One of the reasons it fails is because of a misnamed header filename: > > In file included from /<>/src/hotspot/src/share/vm/memory/allocation.inline.hpp:28:0, > from /<>/src/hotspot/src/share/vm/utilities/array.hpp:29, > from /<>/src/hotspot/src/share/vm/memory/universe.hpp:29, > from /<>/src/hotspot/src/share/vm/code/oopRecorder.hpp:28, > from /<>/src/hotspot/src/share/vm/asm/codeBuffer.hpp:28, > from /<>/src/hotspot/src/share/vm/asm/assembler.hpp:28, > from /<>/src/hotspot/src/share/vm/asm/macroAssembler.hpp:28, > from /<>/src/hotspot/src/cpu/sparc/vm/nativeInst_sparc.hpp:28, > from /<>/src/hotspot/src/share/vm/code/nativeInst.hpp:30, > from ad_sparc.hpp:32, > from ad_sparc_misc.cpp:28: > /<>/src/hotspot/src/share/vm/runtime/atomic.hpp:126:31: fatal error: atomic_linux_sparc.hpp: No such file or directory > #include OS_CPU_HEADER(atomic) > > While fixing this, I also discovered some other problems with the build on Debian/sparc64. > > I'm attaching a set of three patches which address all the issues I have discovered with > openjdk-9 on Debian/sparc64 so far. They should apply to all builds on 64-bit Linux SPARC. > > Please note, this is my first attempt ever trying to get patches into OpenJDK, so I will > definitely need a little guidance. I have already asked in the #openjdk channel and after > signing the OCA and sending it in, I was told to post my patches here. So here they are :). > > Let's hope I can get them into the proper shape for getting them merged. 
> > All patches are: > > Signed-off by: John Paul Adrian Glaubitz > > Thanks, > Adrian > >> [1] https://buildd.debian.org/status/fetch.php?pkg=openjdk-9&arch=sparc64&ver=9%7Eb170-2&stamp=1495173966&raw=0 > From david.holmes at oracle.com Mon May 29 04:19:35 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 14:19:35 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> Message-ID: <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Dan, Robbin, Thomas, Okay here is the final ready to push version: http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ this fixes all Dan's nits and refactors the time calculation code as suggested by Robbin. Thomas: if you are around and able, it would be good to get a final sanity check on AIX. Thanks. Testing: - JPRT: -testset hotspot -testset core - manual: - jtreg:java/util/concurrent - various little test programs that try to validate sleep/wait times to show early returns or unexpected delays Thanks again for the reviews. David On 29/05/2017 10:29 AM, David Holmes wrote: > On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >> On 5/26/17 1:27 AM, David Holmes wrote: >>> Robbin, Dan, >>> >>> Below is a modified version of the refactored to_abstime code that >>> Robbin suggested. >>> >>> Robbin: there were a couple of issues with your version. For relative >>> time the timeout is always in nanoseconds - the "unit" only tells you >>> what form the "now_part_sec" is - nanos or micros. And the >>> calc_abs_time always has a deadline in millis. So I simplified and >>> did a little renaming, and tracked max_secs in debug_only instead of >>> returning it. 
>>> >>> Please let me know what you think. >> >> Looks OK to me. Nit comments below... > > Thanks Dan - more below. > >>> >>> >>> // Calculate a new absolute time that is "timeout" nanoseconds from >>> "now". >>> // "unit" indicates the unit of "now_part_sec" (may be nanos or >>> micros depending >>> // on which clock is being used). >>> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >>> now_sec, >>> jlong now_part_sec, jlong unit) { >>> time_t max_secs = now_sec + MAX_SECS; >>> >>> jlong seconds = timeout / NANOUNITS; >>> timeout %= NANOUNITS; // remaining nanos >>> >>> if (seconds >= MAX_SECS) { >>> // More seconds than we can add, so pin to max_secs. >>> abstime->tv_sec = max_secs; >>> abstime->tv_nsec = 0; >>> } else { >>> abstime->tv_sec = now_sec + seconds; >>> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >>> if (nanos >= NANOUNITS) { // overflow >>> abstime->tv_sec += 1; >>> nanos -= NANOUNITS; >>> } >>> abstime->tv_nsec = nanos; >>> } >>> } >>> >>> // Unpack the given deadline in milliseconds since the epoch, into >>> the given timespec. >>> // The current time in seconds is also passed in to enforce an upper >>> bound as discussed above. >>> static void unpack_abs_time(timespec* abstime, jlong deadline, jlong >>> now_sec) { >>> time_t max_secs = now_sec + MAX_SECS; >>> >>> jlong seconds = deadline / MILLIUNITS; >>> jlong millis = deadline % MILLIUNITS; >>> >>> if (seconds >= max_secs) { >>> // Absolute seconds exceeds allowed max, so pin to max_secs. >>> abstime->tv_sec = max_secs; >>> abstime->tv_nsec = 0; >>> } else { >>> abstime->tv_sec = seconds; >>> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >>> } >>> } >>> >>> >>> static void to_abstime(timespec* abstime, jlong timeout, bool >>> isAbsolute) { >> >> There's an extra blank line here. > > Fixed. 
> >>> >>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>> >>> if (timeout < 0) { >>> timeout = 0; >>> } >>> >>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>> >>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>> struct timespec now; >>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>> assert_status(status == 0, status, "clock_gettime"); >>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); >>> DEBUG_ONLY(max_secs += now.tv_sec;) >>> } else { >>> >>> #else >>> >>> { // Match the block scope. >>> >>> #endif // SUPPORTS_CLOCK_MONOTONIC >>> >>> // Time-of-day clock is all we can reliably use. >>> struct timeval now; >>> int status = gettimeofday(&now, NULL); >>> assert(status == 0, "gettimeofday"); >> >> assert_status() is used above, but assert() is used here. Why? > > Historical. assert_status was introduced for the pthread* and other > posix funcs that return the error value rather than returning -1 and > setting errno. gettimeofday is not one of those so still has the old > assert. However, as someone pointed out a while ago you can use > assert_status with these and pass errno as the "status". So I did that. > >> >>> if (isAbsolute) { >>> unpack_abs_time(abstime, timeout, now.tv_sec); >>> } >>> else { >> >> Inconsistent "else-branch" formatting. >> I believe HotSpot style is "} else {" > > Fixed. > >>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >>> } >>> DEBUG_ONLY(max_secs += now.tv_sec;) >>> } >>> >>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >>> nanos_per_sec"); >> >> Why does the assert mesg have "nanos_per_sec" instead of >> "NANOSECS_PER_SEC"? > > No reason. Actually that should now refer to NANOUNITS. Hmmm I can not > recall why we have NANOUNITS and NANAOSECS_PER_SEC ... possibly an > oversight. > >> There's an extra blank line here. > > Fixed. 
> 
> Will send out complete updated webrev soon.
> 
> Thanks,
> David
> 
>>> 
>>> }
>> 
>> Definitely looks and reads much cleaner.
>> 
>> Dan
>> 

From david.holmes at oracle.com  Mon May 29 04:34:28 2017
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 29 May 2017 14:34:28 +1000
Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813
In-Reply-To: <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com>
References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com>
 <5928C9AA.6030004@linux.vnet.ibm.com>
 <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com>
Message-ID: <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com>

Hi Zhengyu,

On 29/05/2017 12:08 PM, Zhengyu Gu wrote:
> Hi Gustavo,
> 
> Thanks for the detailed analysis and suggestion. I did not realize the
> difference between bitmask and nodemask.
> 
> As you suggested, numa_interleave_memory_v2 works under this configuration.
> 
> Updated webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.01/

The addition of support for the "v2" API seems okay. Though I think this 
comment needs some clarification for the existing code:

2837 // If we are running with libnuma version > 2, then we should
2838 // be trying to use symbols with versions 1.1
2839 // If we are running with earlier version, which did not have symbol versions,
2840 // we should use the base version.
2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) {

given that we now explicitly load the v1.2 symbol if present.

Gustavo: can you vouch for the suitability of using the v2 API in all 
cases, if it exists?

I'm running this through JPRT now.

Thanks,
David

> 
> Thanks,
> 
> -Zhengyu
> 
> 
> 
> On 05/26/2017 08:34 PM, Gustavo Romero wrote:
>> Hi Zhengyu,
>>
>> Thanks a lot for taking care of this corner case on PPC64.
>>
>> On 26-05-2017 10:41, Zhengyu Gu wrote:
>>> This is a quick way to kill the symptom (or low risk?). I am not sure
>>> if disabling NUMA is a better solution for this circumstance? does 1
>>> NUMA node = UMA?
>> >> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. In the >> POWER7 >> machine you found the corner case (I copy below the data you provided >> in the >> JBS - thanks for the additional information): >> >> $ numactl -H >> available: 2 nodes (0-1) >> node 0 cpus: 0 1 2 3 4 5 6 7 >> node 0 size: 0 MB >> node 0 free: 0 MB >> node 1 cpus: >> node 1 size: 7680 MB >> node 1 free: 1896 MB >> node distances: >> node 0 1 >> 0: 10 40 >> 1: 40 10 >> >> CPUs in node0 have no other alternative besides allocating memory from >> node1. In >> that case CPUs in node0 are always accessing remote memory from node1 >> in a constant >> distance (40), so in that case we could say that 1 NUMA (configured) >> node == UMA. >> Nonetheless, if you add CPUs in node1 (by filling up the other socket >> present in >> the board) you will end up with CPUs with different distances from the >> node that >> has configured memory (in that case, node1), so it yields a >> configuration where >> 1 NUMA (configured) != UMA (i.e. distances are not always equal to a >> single >> value). >> >> On the other hand, the POWER7 machine configuration in question is bad >> (and >> rare). It's indeed impacting the whole system performance and it would be >> reasonable to open the machine and move the memory module from bank >> related to >> node1 to bank related to node0, because all CPUs are accessing remote >> memory >> without any apparent necessity. Once you change it all CPUs will have >> local >> memory (distance = 10). >> >> >>> Thanks, >>> >>> -Zhengyu >>> >>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>> Hi, >>>> >>>> There is a corner case that still failed after JDK-8175813. >>>> >>>> The system shows that it has multiple NUMA nodes, but only one is >>>> configured. Under this scenario, numa_interleave_memory() call will >>>> result "mbind: Invalid argument" message. 
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ >> >> Looks like that even for that POWER7 rare numa topology >> numa_interleave_memory() >> should succeed without "mbind: Invalid argument" since the 'mask' >> argument >> should be already a mask with only nodes from which memory can be >> allocated, i.e. >> only a mask of configured nodes (even if mask contains only one >> configured node, >> as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). >> >> Inspecting a little bit more, it looks like that the problem boils >> down to the >> fact that the JVM is passing to numa_interleave_memory() >> 'numa_all_nodes' [1] in >> Linux::numa_interleave_memory(). >> >> One would expect that 'numa_all_nodes' (which is api v1) would track >> the same >> information as 'numa_all_nodes_ptr' (api v2) [2], however there is a >> subtle but >> important difference: >> >> 'numa_all_nodes' is constructed assuming a consecutive node >> distribution [3]: >> >> 100 max = numa_num_configured_nodes(); >> 101 for (i = 0; i < max; i++) >> 102 nodemask_set_compat((nodemask_t *)&numa_all_nodes, >> i); >> >> >> whilst 'numa_all_nodes_ptr' is constructed parsing /proc/self/status [4]: >> >> 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { >> 500 numprocnode = read_mask(mask, >> numa_all_nodes_ptr); >> >> Thus for a topology like: >> >> available: 4 nodes (0-1,16-17) >> node 0 cpus: 0 8 16 24 32 >> node 0 size: 130706 MB >> node 0 free: 145 MB >> node 1 cpus: 40 48 56 64 72 >> node 1 size: 0 MB >> node 1 free: 0 MB >> node 16 cpus: 80 88 96 104 112 >> node 16 size: 130630 MB >> node 16 free: 529 MB >> node 17 cpus: 120 128 136 144 152 >> node 17 size: 0 MB >> node 17 free: 0 MB >> node distances: >> node 0 1 16 17 >> 0: 10 20 40 40 >> 1: 20 10 40 40 >> 16: 40 40 10 20 >> 17: 40 40 20 10 >> >> numa_all_nodes=0x3 => 0b11 (node0 and node1) >> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 
and node16) >> >> (Please, see details in the following gdb log: >> http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) >> >> In that case passing node0 and node1, although being suboptimal, does >> not bother >> mbind() since the following is satisfied: >> >> "[nodemask] must contain at least one node that is on-line, allowed by >> the >> process's current cpuset context, and contains memory." >> >> So back to the POWER7 case, I suppose that for: >> >> available: 2 nodes (0-1) >> node 0 cpus: 0 1 2 3 4 5 6 7 >> node 0 size: 0 MB >> node 0 free: 0 MB >> node 1 cpus: >> node 1 size: 7680 MB >> node 1 free: 1896 MB >> node distances: >> node 0 1 >> 0: 10 40 >> 1: 40 10 >> >> numa_all_nodes=0x1 => 0b01 (node0) >> numa_all_nodes_ptr=0x2 => 0b10 (node1) >> >> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which >> contains >> indeed no memory. That said, I don't know for sure if passing just >> node1 in the >> 'nodemask' will satisfy mbind() as in that case there are no cpus >> available in >> node1. >> >> In summing up, looks like that the root cause is not that >> numa_interleave_memory() >> does not accept only one configured node, but that the configured node >> being >> passed is wrong. I could not find a similar numa topology in my poll >> to test >> more, but it might be worth trying to write a small test using api v2 and >> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how >> numa_interleave_memory() >> goes in that machine :) If it behaves well, updating to api v2 would be a >> solution. >> >> HTH >> >> Regards, >> Gustavo >> >> >> [1] >> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 >> >> [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes with >> memory from which the calling process can allocate." 
>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 >> [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 >> >> >>>> >>>> The system NUMA configuration: >>>> >>>> Architecture: ppc64 >>>> CPU op-mode(s): 32-bit, 64-bit >>>> Byte Order: Big Endian >>>> CPU(s): 8 >>>> On-line CPU(s) list: 0-7 >>>> Thread(s) per core: 4 >>>> Core(s) per socket: 1 >>>> Socket(s): 2 >>>> NUMA node(s): 2 >>>> Model: 2.1 (pvr 003f 0201) >>>> Model name: POWER7 (architected), altivec supported >>>> L1d cache: 32K >>>> L1i cache: 32K >>>> NUMA node0 CPU(s): 0-7 >>>> NUMA node1 CPU(s): >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>> >> From david.holmes at oracle.com Mon May 29 05:31:18 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 15:31:18 +1000 Subject: [8u] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used In-Reply-To: <59258B49.9080602@linux.vnet.ibm.com> References: <59258B49.9080602@linux.vnet.ibm.com> Message-ID: Hi Gustavo, This looks like an accurate backport. Thanks, David ----- On 24/05/2017 11:31 PM, Gustavo Romero wrote: > Hi, > > Could this backport of 8175813 for jdk8u be reviewed, please? > > It applies cleanly to jdk8u except for a chunk in os::Linux::libnuma_init(), but > it's just due to an indentation change introduced with cleanup [1]. > > It improves JVM NUMA node detection on PPC64. > > Currently there is no Linux distros that package only libnuma v1, so libnuma API > v2 used in that change is always available. > > webrev : http://cr.openjdk.java.net/~gromero/8175813/backport/ > bug : https://bugs.openjdk.java.net/browse/JDK-8175813 > review thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-May/026788.html > > Thank you. 
> 
> Regards,
> Gustavo
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8057107
> 

From david.holmes at oracle.com  Mon May 29 08:52:14 2017
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 29 May 2017 18:52:14 +1000
Subject: [PATCH]: linux-sparc build fixes
In-Reply-To: <20170529081312.GA7132@physik.fu-berlin.de>
References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com>
 <20170529081312.GA7132@physik.fu-berlin.de>
Message-ID: <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com>

On 29/05/2017 6:13 PM, John Paul Adrian Glaubitz wrote:
> On Mon, May 29, 2017 at 12:17:13PM +1000, David Holmes wrote:
>> Hi Adrian,
> 
> Hi!
> 
>> cc'ing hotspot-dev and bcc'ing build-dev as these are not issues with the
>> build files, but hotspot sources.
> 
> Thanks.
> 
>> First, thank you for taking the time and effort to contribute to OpenJDK.
>> However ...
> 
> You're welcome.
> 
>> The status of linux-sparc as a port in OpenJDK 9 (or 8u) is unclear. As you
>> have found those bits have bit-rotted as it is not a supported Oracle JDK
>> platform, and no-one else seems to have cared until now. Your patches may be
>> acceptable, but we have no way to validate them.
> 
> That's surprising because depending on where your office is, it may
> just be a matter of walking down an aisle and knocking on a colleague's
> door to get access to the necessary test setup given the fact that
> Oracle is officially shipping and supporting Linux for SPARC [1,2].

That doesn't necessarily translate into the OpenJDK or Oracle JDK 
supporting the Linux-sparc platform. It was previously supported in JDK 
7 and early 8 IIRC but is no longer provided in 8u and is not a platform 
supported in 9.

>> And the very next push may break things again. Or indeed once the
>> hotspot issues are resolved you may find issues on the JDK library
>> side.
> 
> Well, I'm not so sure. openjdk-9 did build in the past (b88), so I
> don't think the code is completely bit-rotten [3].
Sure it built in the past but b88 was some time ago and as I said once you get past the hotspot build problem you may find other problems. >> The process to get them into JDK 9 now is somewhat more difficult >> given where we are in the JDK 9 release. > > Even for minimal build fixes like these? Yes. The bar is high for any changes at the moment: http://openjdk.java.net/projects/jdk9/rdp-2 >> But without a commitment from some part of the community to maintain >> this port it may be better as something that is maintained >> downstream. > > I actually expect the Linux SPARC port to be picked up by Oracle > themselves in the near future. I can't really imagine that Oracle > would not be willing to support on of their one major software > products on one of their major platforms. I really couldn't say. Cheers, David > Thanks, > Adrian > >> [1] http://www.oracle.com/technetwork/server-storage/linux/downloads/oracle-linux-sparc-3665558.html >> [2] https://oss.oracle.com/projects/linux-sparc/ >> [3] https://buildd.debian.org/status/fetch.php?pkg=openjdk-9&arch=sparc64&ver=9%7Eb88-1&stamp=1448869573&raw=0 > From zoltan.majo at oracle.com Mon May 29 09:08:43 2017 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 29 May 2017 11:08:43 +0200 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> Message-ID: Hi, On 05/29/2017 03:51 AM, David Holmes wrote: > On 27/05/2017 12:44 AM, harold seigel wrote: >> Hi Zoltan, >> >> Instead of deleting the test, can the class file version of Bad.jasm >> be changed to 52 for JDK-8u? > > I concur. As I just wrote in the bug report I don't see how the > changes can be backported to 8u but the test is somehow invalid for 8u! I agree -- thank you, Harold and David, for pointing that out. 
I mis-read the test and thought to be related to the final field updates handled differently in 8 and 9 (functionality added by JDK-8157181 and JDK-8161987, respectively). Here is the updated webrev: http://cr.openjdk.java.net/~zmajo/8180934/webrev.01/ > > Also as a point of order: a RFR and a RFA are distinct and should be > posted separately: the RFR on hotspot-xxx-dev (as appropriate) and the > RFA on jdk8u-dev. Thanks, I noted that. Should I re-send the RFA and RFR for this issue, or does it suffice if I do it the next time (and onwards)? Best regards, Zoltan > > Thanks, > David > >> Thanks, Harold >> >> >> On 5/26/2017 10:17 AM, Zolt?n Maj? wrote: >>> Hi, >>> >>> >>> when backporting 8160551, I also backported a test that is relevant >>> only for class files with version >= 53. As JDK 8 supports only >>> class files with version < 53, having the test in the JDK 8u test >>> base does not make sense. This changeset proposes to remove the test. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8180934 >>> http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ >>> >>> I executed all hotspot/runtime tests with the changeset (using JDK >>> 8u122), no problems have shown up. JPRT testing is in progress. >>> >>> Please note that this fix is a JDK 8u-specific fix (not a backport >>> of some existing fix in JDK 9). >>> >>> Thank you! >>> >>> Best regards, >>> >>> >>> Zoltan >>> >> From david.holmes at oracle.com Mon May 29 11:23:24 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 May 2017 21:23:24 +1000 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> Message-ID: <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> On 29/05/2017 7:08 PM, Zolt?n Maj? 
wrote: > Hi, > > > On 05/29/2017 03:51 AM, David Holmes wrote: >> On 27/05/2017 12:44 AM, harold seigel wrote: >>> Hi Zoltan, >>> >>> Instead of deleting the test, can the class file version of Bad.jasm >>> be changed to 52 for JDK-8u? >> >> I concur. As I just wrote in the bug report I don't see how the >> changes can be backported to 8u but the test is somehow invalid for 8u! > > I agree -- thank you, Harold and David, for pointing that out. I > mis-read the test and thought to be related to the final field updates > handled differently in 8 and 9 (functionality added by JDK-8157181 and > JDK-8161987, respectively). > > Here is the updated webrev: > http://cr.openjdk.java.net/~zmajo/8180934/webrev.01/ Looks good. >> >> Also as a point of order: a RFR and a RFA are distinct and should be >> posted separately: the RFR on hotspot-xxx-dev (as appropriate) and the >> RFA on jdk8u-dev. > > Thanks, I noted that. Should I re-send the RFA and RFR for this issue, > or does it suffice if I do it the next time (and onwards)? That's up to Sean :) Thanks, David > Best regards, > > > Zoltan > > >> >> Thanks, >> David >> >>> Thanks, Harold >>> >>> >>> On 5/26/2017 10:17 AM, Zolt?n Maj? wrote: >>>> Hi, >>>> >>>> >>>> when backporting 8160551, I also backported a test that is relevant >>>> only for class files with version >= 53. As JDK 8 supports only >>>> class files with version < 53, having the test in the JDK 8u test >>>> base does not make sense. This changeset proposes to remove the test. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8180934 >>>> http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ >>>> >>>> I executed all hotspot/runtime tests with the changeset (using JDK >>>> 8u122), no problems have shown up. JPRT testing is in progress. >>>> >>>> Please note that this fix is a JDK 8u-specific fix (not a backport >>>> of some existing fix in JDK 9). >>>> >>>> Thank you! 
>>>> >>>> Best regards, >>>> >>>> >>>> Zoltan >>>> >>> > From sean.coffey at oracle.com Mon May 29 11:32:17 2017 From: sean.coffey at oracle.com (=?UTF-8?Q?Se=c3=a1n_Coffey?=) Date: Mon, 29 May 2017 12:32:17 +0100 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> Message-ID: <8895d5e0-ab6c-6cf6-5a18-cc9611103b37@oracle.com> This still looks fine for jdk8u-dev! Approved. Regards, Sean. On 29/05/17 12:23, David Holmes wrote: >> Here is the updated webrev: >> http://cr.openjdk.java.net/~zmajo/8180934/webrev.01/ > > Looks good. > >>> >>> Also as a point of order: a RFR and a RFA are distinct and should be >>> posted separately: the RFR on hotspot-xxx-dev (as appropriate) and >>> the RFA on jdk8u-dev. >> >> Thanks, I noted that. Should I re-send the RFA and RFR for this >> issue, or does it suffice if I do it the next time (and onwards)? > > That's up to Sean :) From zoltan.majo at oracle.com Mon May 29 13:29:30 2017 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 29 May 2017 15:29:30 +0200 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: <8895d5e0-ab6c-6cf6-5a18-cc9611103b37@oracle.com> References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> <8895d5e0-ab6c-6cf6-5a18-cc9611103b37@oracle.com> Message-ID: Thank you, Se?n! Best regards, Zolt?n On 05/29/2017 01:32 PM, Se?n Coffey wrote: > > This still looks fine for jdk8u-dev! Approved. > > Regards, > Sean. > On 29/05/17 12:23, David Holmes wrote: >>> Here is the updated webrev: >>> http://cr.openjdk.java.net/~zmajo/8180934/webrev.01/ >> >> Looks good. 
>> >>>> >>>> Also as a point of order: a RFR and a RFA are distinct and should >>>> be posted separately: the RFR on hotspot-xxx-dev (as appropriate) >>>> and the RFA on jdk8u-dev. >>> >>> Thanks, I noted that. Should I re-send the RFA and RFR for this >>> issue, or does it suffice if I do it the next time (and onwards)? >> >> That's up to Sean :) > From glaubitz at physik.fu-berlin.de Mon May 29 08:13:12 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Mon, 29 May 2017 10:13:12 +0200 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> Message-ID: <20170529081312.GA7132@physik.fu-berlin.de> On Mon, May 29, 2017 at 12:17:13PM +1000, David Holmes wrote: > Hi Adrian, Hi! > cc'ing hotspot-dev and bcc'ing build-dev as these are not issues with the > build files, but hotspot sources. Thanks. > First, than you for taking the time and effort to contribute to OpenJDK. > However ... You're welcome. > The status of linux-sparc as a port in OpenJDK 9 (or 8u) is unclear. As you > have found those bits have bit-rotted as it is not a supported Oracle JDK > platform, and no-one else seems to have cared until now. Your patches may be > acceptable, but we have no way to validate them. That's surprising because depending on where your office is, it may just be a matter of walking down an aisle and knocking on a colleague's door to get access to the necessary test setup given the fact that Oracle is officially shipping and supporting Linux for SPARC [1,2]. > And the very next push may break things again. Or indeed once the > hotspot issues are resolved you may find issues on the JDK library > side. Well, I'm not so sure. openjdk-9 did build in the past (b88), so I don't think the code is completely bit-rotten [3]. > The process to get them into JDK 9 now is somewhat more difficult > given where we are in the JDK 9 release. 
Even for minimal build fixes like these?

> But without a commitment from some part of the community to maintain
> this port it may be better as something that is maintained
> downstream.

I actually expect the Linux SPARC port to be picked up by Oracle
themselves in the near future. I can't really imagine that Oracle
would not be willing to support one of their major software
products on one of their major platforms.

Thanks,
Adrian

> [1] http://www.oracle.com/technetwork/server-storage/linux/downloads/oracle-linux-sparc-3665558.html
> [2] https://oss.oracle.com/projects/linux-sparc/
> [3] https://buildd.debian.org/status/fetch.php?pkg=openjdk-9&arch=sparc64&ver=9%7Eb88-1&stamp=1448869573&raw=0

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz at debian.org
`. `'   Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

From robbin.ehn at oracle.com  Mon May 29 15:33:53 2017
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 29 May 2017 17:33:53 +0200
Subject: output of jstack command
In-Reply-To: <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com>
References: <5c8d22b5-5c6c-1aca-c57f-2b28733efe3f@oracle.com>
 <14CF8360-5840-4204-9F2D-6A123A5F9858@gmail.com>
Message-ID: <27af994c-aa16-f503-d527-d2b7976c85e0@oracle.com>

Hi,

The text stream originates from:

void Threads::print_on(outputStream* st, bool print_stacks, bool 
internal_format, bool print_concurrent_locks) {

in hotspot/src/share/vm/runtime/thread.cpp L4491 and jstack only 
forwards that to your terminal.

/Robbin

On 05/25/2017 05:07 PM, Kirk Pepperdine wrote:
> Hi Ramki,
> 
> The source for jstack is in openJDK. Feel free to create your own copy of jstack where you can output the information in any format you like. If you are suggesting that the existing format be changed, do be aware that there are many tools that expect the current format. These have been adjusted to a change in format that was introduced with Java 8.
> I don't see any reason why the format shouldn't include information that is currently missing and is relevant. However I'd want to make sure that it is relevant and important before breaking the tool chain once again.
> 
> I believe thread ids are already in the header. Certainly thread names are there. Not sure what you mean by types of threads.
> 
> Kind regards,
> Kirk
>> On May 25, 2017, at 4:59 PM, Daniel D. Daugherty wrote:
>>
>> Adding serviceability-dev at ... since jstack is a Serviceability tool.
>>
>> I believe jstack is experimental which means the output format can
>> change at any time...
>>
>> Dan
>>
>> On 5/25/17 8:35 AM, Ram Krishnan wrote:
>>> Hi,
>>>
>>> I would like to leverage the output of jstack command for extracting
>>> additional information about the type of threads, thread ids etc. Since I
>>> will be parsing the output, I need the precise format. Is there any
>>> documentation on jstack output format changes and the openjdk release(s)
>>> where the changes happened?
>>>
>>> Thanks in advance.
>>>
>> 
> 

From gromero at linux.vnet.ibm.com  Mon May 29 23:06:37 2017
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Mon, 29 May 2017 20:06:37 -0300
Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813
In-Reply-To: <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com>
References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com>
 <5928C9AA.6030004@linux.vnet.ibm.com>
 <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com>
 <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com>
Message-ID: <592CA97D.4000802@linux.vnet.ibm.com>

Hi David,

On 29-05-2017 01:34, David Holmes wrote:
> Hi Zhengyu,
> 
> On 29/05/2017 12:08 PM, Zhengyu Gu wrote:
>> Hi Gustavo,
>>
>> Thanks for the detailed analysis and suggestion. I did not realize the difference between bitmask and nodemask.
>>
>> As you suggested, numa_interleave_memory_v2 works under this configuration.
>> >> Please updated Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ > > The addition of support for the "v2" API seems okay. Though I think this comment needs some clarification for the existing code: > > 2837 // If we are running with libnuma version > 2, then we should > 2838 // be trying to use symbols with versions 1.1 > 2839 // If we are running with earlier version, which did not have symbol versions, > 2840 // we should use the base version. > 2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) { > > given that we now explicitly load the v1.2 symbol if present. > > Gustavo: can you vouch for the suitability of using the v2 API in all cases, if it exists? My understanding is that in the transition to API v2 only the usage of numa_node_to_cpus() by the JVM will have to be adapted in os::Linux::rebuild_cpu_to_node_map(). The remaining functions (excluding numa_interleave_memory() as Zhengyu already addressed it) preserve the same functionality and signatures [1]. Currently JVM NUMA API requires the following libnuma functions: 1. numa_node_to_cpus v1 != v2 (using v1, JVM has to adapt) 2. numa_max_node v1 == v2 (using v1, transition is straightforward) 3. numa_num_configured_nodes v2 (added by gromero: 8175813) 4. numa_available v1 == v2 (using v1, transition is straightforward) 5. numa_tonode_memory v1 == v2 (using v1, transition is straightforward) 6. numa_interleave_memory v1 != v2 (updated by zhengyu: 8181055. Default use of v2, fallback to v1) 7. numa_set_bind_policy v1 == v2 (using v1, transition is straightforward) 8. numa_bitmask_isbitset v2 (added by gromero: 8175813) 9. numa_distance v1 == v2 (added by gromero: 8175813. Using v1, transition is straightforward) v1 != v2: function signature in version 1 is different from version 2 v1 == v2: function signature in version 1 is equal to version 2 v2 : function is only present in API v2 Thus, to the best of my knowledge, except for case 1. 
(which JVM need to adapt to) all other cases are suitable to use v2 API and we could use a fallback mechanism as proposed by Zhengyu or update directly to API v2 (risky?), given that I can't see how v2 API would not be available on current (not-EOL) Linux distro releases. Regarding the comment, I agree, it needs an update since we are not tied anymore to version 1.1 (we are in effect already using v2 for some functions). We could delete the comment atop libnuma_dlsym() and add something like: "Handle request to load libnuma symbol version 1.1 (API v1). If it fails load symbol from base version instead." and to libnuma_v2_dlsym() add: "Handle request to load libnuma symbol version 1.2 (API v2) only. If it fails no symbol from any other version - even if present - is loaded." I've opened a bug to track the transitions to API v2 (I also discussed that with Volker): https://bugs.openjdk.java.net/browse/JDK-8181196 Regards, Gustavo [1] API v1 vs API v2: API v1 ====== int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen); int numa_max_node(void); - int numa_num_configured_nodes(void); int numa_available(void); void numa_tonode_memory(void *start, size_t size, int node); void numa_interleave_memory(void *start, size_t size, nodemask_t *nodemask); void numa_set_bind_policy(int strict); - int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); int numa_distance(int node1, int node2); API v2 ====== int numa_node_to_cpus(int node, struct bitmask *mask); int numa_max_node(void); int numa_num_configured_nodes(void); int numa_available(void); void numa_tonode_memory(void *start, size_t size, int node); void numa_interleave_memory(void *start, size_t size, struct bitmask *nodemask); void numa_set_bind_policy(int strict) int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); int numa_distance(int node1, int node2); > I'm running this through JPRT now. 
> > Thanks, > David > >> >> Thanks, >> >> -Zhengyu >> >> >> >> On 05/26/2017 08:34 PM, Gustavo Romero wrote: >>> Hi Zhengyu, >>> >>> Thanks a lot for taking care of this corner case on PPC64. >>> >>> On 26-05-2017 10:41, Zhengyu Gu wrote: >>>> This is a quick way to kill the symptom (or low risk?). I am not sure if disabling NUMA is a better solution for this circumstance? does 1 NUMA node = UMA? >>> >>> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. In the POWER7 >>> machine you found the corner case (I copy below the data you provided in the >>> JBS - thanks for the additional information): >>> >>> $ numactl -H >>> available: 2 nodes (0-1) >>> node 0 cpus: 0 1 2 3 4 5 6 7 >>> node 0 size: 0 MB >>> node 0 free: 0 MB >>> node 1 cpus: >>> node 1 size: 7680 MB >>> node 1 free: 1896 MB >>> node distances: >>> node 0 1 >>> 0: 10 40 >>> 1: 40 10 >>> >>> CPUs in node0 have no other alternative besides allocating memory from node1. In >>> that case CPUs in node0 are always accessing remote memory from node1 in a constant >>> distance (40), so in that case we could say that 1 NUMA (configured) node == UMA. >>> Nonetheless, if you add CPUs in node1 (by filling up the other socket present in >>> the board) you will end up with CPUs with different distances from the node that >>> has configured memory (in that case, node1), so it yields a configuration where >>> 1 NUMA (configured) != UMA (i.e. distances are not always equal to a single >>> value). >>> >>> On the other hand, the POWER7 machine configuration in question is bad (and >>> rare). It's indeed impacting the whole system performance and it would be >>> reasonable to open the machine and move the memory module from bank related to >>> node1 to bank related to node0, because all CPUs are accessing remote memory >>> without any apparent necessity. Once you change it all CPUs will have local >>> memory (distance = 10). 
>>> >>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>>> Hi, >>>>> >>>>> There is a corner case that still failed after JDK-8175813. >>>>> >>>>> The system shows that it has multiple NUMA nodes, but only one is >>>>> configured. Under this scenario, numa_interleave_memory() call will >>>>> result "mbind: Invalid argument" message. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ >>> >>> Looks like that even for that POWER7 rare numa topology numa_interleave_memory() >>> should succeed without "mbind: Invalid argument" since the 'mask' argument >>> should be already a mask with only nodes from which memory can be allocated, i.e. >>> only a mask of configured nodes (even if mask contains only one configured node, >>> as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). >>> >>> Inspecting a little bit more, it looks like that the problem boils down to the >>> fact that the JVM is passing to numa_interleave_memory() 'numa_all_nodes' [1] in >>> Linux::numa_interleave_memory(). 
>>> >>> One would expect that 'numa_all_nodes' (which is api v1) would track the same >>> information as 'numa_all_nodes_ptr' (api v2) [2], however there is a subtle but >>> important difference: >>> >>> 'numa_all_nodes' is constructed assuming a consecutive node distribution [3]: >>> >>> 100 max = numa_num_configured_nodes(); >>> 101 for (i = 0; i < max; i++) >>> 102 nodemask_set_compat((nodemask_t *)&numa_all_nodes, i); >>> >>> >>> whilst 'numa_all_nodes_ptr' is constructed parsing /proc/self/status [4]: >>> >>> 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { >>> 500 numprocnode = read_mask(mask, numa_all_nodes_ptr); >>> >>> Thus for a topology like: >>> >>> available: 4 nodes (0-1,16-17) >>> node 0 cpus: 0 8 16 24 32 >>> node 0 size: 130706 MB >>> node 0 free: 145 MB >>> node 1 cpus: 40 48 56 64 72 >>> node 1 size: 0 MB >>> node 1 free: 0 MB >>> node 16 cpus: 80 88 96 104 112 >>> node 16 size: 130630 MB >>> node 16 free: 529 MB >>> node 17 cpus: 120 128 136 144 152 >>> node 17 size: 0 MB >>> node 17 free: 0 MB >>> node distances: >>> node 0 1 16 17 >>> 0: 10 20 40 40 >>> 1: 20 10 40 40 >>> 16: 40 40 10 20 >>> 17: 40 40 20 10 >>> >>> numa_all_nodes=0x3 => 0b11 (node0 and node1) >>> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) >>> >>> (Please, see details in the following gdb log: http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) >>> >>> In that case passing node0 and node1, although being suboptimal, does not bother >>> mbind() since the following is satisfied: >>> >>> "[nodemask] must contain at least one node that is on-line, allowed by the >>> process's current cpuset context, and contains memory." 
>>> >>> So back to the POWER7 case, I suppose that for: >>> >>> available: 2 nodes (0-1) >>> node 0 cpus: 0 1 2 3 4 5 6 7 >>> node 0 size: 0 MB >>> node 0 free: 0 MB >>> node 1 cpus: >>> node 1 size: 7680 MB >>> node 1 free: 1896 MB >>> node distances: >>> node 0 1 >>> 0: 10 40 >>> 1: 40 10 >>> >>> numa_all_nodes=0x1 => 0b01 (node0) >>> numa_all_nodes_ptr=0x2 => 0b10 (node1) >>> >>> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which contains >>> indeed no memory. That said, I don't know for sure if passing just node1 in the >>> 'nodemask' will satisfy mbind() as in that case there are no cpus available in >>> node1. >>> >>> In summing up, looks like that the root cause is not that numa_interleave_memory() >>> does not accept only one configured node, but that the configured node being >>> passed is wrong. I could not find a similar numa topology in my poll to test >>> more, but it might be worth trying to write a small test using api v2 and >>> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how numa_interleave_memory() >>> goes in that machine :) If it behaves well, updating to api v2 would be a >>> solution. >>> >>> HTH >>> >>> Regards, >>> Gustavo >>> >>> >>> [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 >>> [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes with memory from which the calling process can allocate." 
>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 >>> [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 >>> >>> >>>>> >>>>> The system NUMA configuration: >>>>> >>>>> Architecture: ppc64 >>>>> CPU op-mode(s): 32-bit, 64-bit >>>>> Byte Order: Big Endian >>>>> CPU(s): 8 >>>>> On-line CPU(s) list: 0-7 >>>>> Thread(s) per core: 4 >>>>> Core(s) per socket: 1 >>>>> Socket(s): 2 >>>>> NUMA node(s): 2 >>>>> Model: 2.1 (pvr 003f 0201) >>>>> Model name: POWER7 (architected), altivec supported >>>>> L1d cache: 32K >>>>> L1i cache: 32K >>>>> NUMA node0 CPU(s): 0-7 >>>>> NUMA node1 CPU(s): >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>> >>> > From zoltan.majo at oracle.com Tue May 30 07:07:23 2017 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Tue, 30 May 2017 09:07:23 +0200 Subject: [8u] Request for approval and review: 8180934 (XS): PutfieldError failed with UnsupportedClassVersionError In-Reply-To: <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> References: <96127e2a-0059-8ef1-0ed8-2f152fe0e415@oracle.com> <9f2e08c0-e0f5-3653-c772-d35ed1aa477b@oracle.com> Message-ID: <5769e8c2-2b2c-1c90-50ac-304b8cab3f1c@oracle.com> Hi David, On 05/29/2017 01:23 PM, David Holmes wrote: > On 29/05/2017 7:08 PM, Zoltán Majó wrote: >> [...] >> >> Here is the updated webrev: >> http://cr.openjdk.java.net/~zmajo/8180934/webrev.01/ > > Looks good. Thank you for the review! > >>> >>> Also as a point of order: an RFR and an RFA are distinct and should be >>> posted separately: the RFR on hotspot-xxx-dev (as appropriate) and >>> the RFA on jdk8u-dev. >> >> Thanks, I noted that. Should I re-send the RFA and RFR for this >> issue, or does it suffice if I do it the next time (and onwards)? > > That's up to Sean :) OK, thank you. Best regards, Zoltan > > Thanks, > David > >> Best regards, >> >> >> Zoltan >> >> >>> >>> Thanks, >>> David >>> >>>> Thanks, Harold >>>> >>>> >>>> On 5/26/2017 10:17 AM, Zoltán Majó
wrote: >>>>> Hi, >>>>> >>>>> >>>>> when backporting 8160551, I also backported a test that is >>>>> relevant only for class files with version >= 53. As JDK 8 >>>>> supports only class files with version < 53, having the test in >>>>> the JDK 8u test base does not make sense. This changeset proposes >>>>> to remove the test. >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8180934 >>>>> http://cr.openjdk.java.net/~zmajo/8180934/webrev.00/ >>>>> >>>>> I executed all hotspot/runtime tests with the changeset (using JDK >>>>> 8u122), no problems have shown up. JPRT testing is in progress. >>>>> >>>>> Please note that this fix is a JDK 8u-specific fix (not a backport >>>>> of some existing fix in JDK 9). >>>>> >>>>> Thank you! >>>>> >>>>> Best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>>> >> From thomas.stuefe at gmail.com Tue May 30 09:53:39 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 30 May 2017 11:53:39 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Message-ID: Hi David, works fine on AIX, as it did before. Had a look at the change, some very small nits: - We now carry pthread_.. types in os_posix.hpp (pthread_mutex_t, pthread_cond_t). Are the rules not that each header should be self-contained? If yes, should os_posix.hpp not include the relevant OS headers for the pthread_... types? - the coding around dlopen/dlsym would be a bit more readable if you were to define types for the function pointers, that would save you from spelling them out each time. e.g.: typedef int (*clock_getres_func_t)(clockid_t, struct timespec *); ... 
static clock_getres_func_t _clock_getres_func = (clock_getres_func_t) dlsym(...); ...Thomas On Mon, May 29, 2017 at 6:19 AM, David Holmes wrote: > Dan, Robbin, Thomas, > > Okay here is the final ready to push version: > > http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ > > this fixes all Dan's nits and refactors the time calculation code as > suggested by Robbin. > > Thomas: if you are around and able, it would be good to get a final sanity > check on AIX. Thanks. > > Testing: > - JPRT: -testset hotspot > -testset core > > - manual: > - jtreg:java/util/concurrent > - various little test programs that try to validate sleep/wait times to > show early returns or unexpected delays > > Thanks again for the reviews. > > David > > > On 29/05/2017 10:29 AM, David Holmes wrote: > >> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >> >>> On 5/26/17 1:27 AM, David Holmes wrote: >>> >>>> Robbin, Dan, >>>> >>>> Below is a modified version of the refactored to_abstime code that >>>> Robbin suggested. >>>> >>>> Robbin: there were a couple of issues with your version. For relative >>>> time the timeout is always in nanoseconds - the "unit" only tells you what >>>> form the "now_part_sec" is - nanos or micros. And the calc_abs_time always >>>> has a deadline in millis. So I simplified and did a little renaming, and >>>> tracked max_secs in debug_only instead of returning it. >>>> >>>> Please let me know what you think. >>>> >>> >>> Looks OK to me. Nit comments below... >>> >> >> Thanks Dan - more below. >> >> >>>> >>>> // Calculate a new absolute time that is "timeout" nanoseconds from >>>> "now". >>>> // "unit" indicates the unit of "now_part_sec" (may be nanos or micros >>>> depending >>>> // on which clock is being used). 
>>>> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >>>> now_sec, >>>> jlong now_part_sec, jlong unit) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = timeout / NANOUNITS; >>>> timeout %= NANOUNITS; // remaining nanos >>>> >>>> if (seconds >= MAX_SECS) { >>>> // More seconds than we can add, so pin to max_secs. >>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = now_sec + seconds; >>>> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >>>> if (nanos >= NANOUNITS) { // overflow >>>> abstime->tv_sec += 1; >>>> nanos -= NANOUNITS; >>>> } >>>> abstime->tv_nsec = nanos; >>>> } >>>> } >>>> >>>> // Unpack the given deadline in milliseconds since the epoch, into the >>>> given timespec. >>>> // The current time in seconds is also passed in to enforce an upper >>>> bound as discussed above. >>>> static void unpack_abs_time(timespec* abstime, jlong deadline, jlong >>>> now_sec) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = deadline / MILLIUNITS; >>>> jlong millis = deadline % MILLIUNITS; >>>> >>>> if (seconds >= max_secs) { >>>> // Absolute seconds exceeds allowed max, so pin to max_secs. >>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = seconds; >>>> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >>>> } >>>> } >>>> >>>> >>>> static void to_abstime(timespec* abstime, jlong timeout, bool >>>> isAbsolute) { >>>> >>> >>> There's an extra blank line here. >>> >> >> Fixed. 
>> >> >>>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>>> >>>> if (timeout < 0) { >>>> timeout = 0; >>>> } >>>> >>>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>>> >>>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>>> struct timespec now; >>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>> assert_status(status == 0, status, "clock_gettime"); >>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, NANOUNITS); >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } else { >>>> >>>> #else >>>> >>>> { // Match the block scope. >>>> >>>> #endif // SUPPORTS_CLOCK_MONOTONIC >>>> >>>> // Time-of-day clock is all we can reliably use. >>>> struct timeval now; >>>> int status = gettimeofday(&now, NULL); >>>> assert(status == 0, "gettimeofday"); >>>> >>> >>> assert_status() is used above, but assert() is used here. Why? >>> >> >> Historical. assert_status was introduced for the pthread* and other posix >> funcs that return the error value rather than returning -1 and setting >> errno. gettimeofday is not one of those so still has the old assert. >> However, as someone pointed out a while ago you can use assert_status with >> these and pass errno as the "status". So I did that. >> >> >>> if (isAbsolute) { >>>> unpack_abs_time(abstime, timeout, now.tv_sec); >>>> } >>>> else { >>>> >>> >>> Inconsistent "else-branch" formatting. >>> I believe HotSpot style is "} else {" >>> >> >> Fixed. >> >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >>>> } >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } >>>> >>>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>>> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >>>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >>>> nanos_per_sec"); >>>> >>> >>> Why does the assert mesg have "nanos_per_sec" instead of >>> "NANOSECS_PER_SEC"? >>> >> >> No reason. Actually that should now refer to NANOUNITS. 
Hmmm I can not >> recall why we have NANOUNITS and NANOSECS_PER_SEC ... possibly an >> oversight. >> >> There's an extra blank line here. >>> >> >> Fixed. >> >> Will send out complete updated webrev soon. >> >> Thanks, >> David >> >> >>>> } >>>> >>> >>> Definitely looks and reads much cleaner. >>> >>> Dan >>> >>> From david.holmes at oracle.com Tue May 30 11:22:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 May 2017 21:22:40 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Message-ID: Hi Thomas, On 30/05/2017 7:53 PM, Thomas Stüfe wrote: > Hi David, > > works fine on AIX, as it did before. Great - thanks for confirming that again. > Had a look at the change, some very small nits: > > - We now carry pthread_.. types in os_posix.hpp (pthread_mutex_t, > pthread_cond_t). Are the rules not that each header should be > self-contained? If yes, should os_posix.hpp not include the relevant OS > headers for the pthread_... types? Yes. I'll fix that - and check how the pthread types are currently getting seen. Thanks. > - the coding around dlopen/dlsym would be a bit more readable if you > were to define types for the function pointers, that would save you from > spelling them out each time. I'll save this for the next cleanup if we share all of the "clock" related stuff. Thanks again, David ----- > e.g.: > > typedef int (*clock_getres_func_t)(clockid_t, struct timespec *); > ...
> static clock_getres_func_t _clock_getres_func = (clock_getres_func_t) > dlsym(...); > > ...Thomas > > > On Mon, May 29, 2017 at 6:19 AM, David Holmes > wrote: > > Dan, Robbin, Thomas, > > Okay here is the final ready to push version: > > http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ > > > this fixes all Dan's nits and refactors the time calculation code as > suggested by Robbin. > > Thomas: if you are around and able, it would be good to get a final > sanity check on AIX. Thanks. > > Testing: > - JPRT: -testset hotspot > -testset core > > - manual: > - jtreg:java/util/concurrent > - various little test programs that try to validate sleep/wait > times to show early returns or unexpected delays > > Thanks again for the reviews. > > David > > > On 29/05/2017 10:29 AM, David Holmes wrote: > > On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: > > On 5/26/17 1:27 AM, David Holmes wrote: > > Robbin, Dan, > > Below is a modified version of the refactored to_abstime > code that Robbin suggested. > > Robbin: there were a couple of issues with your version. > For relative time the timeout is always in nanoseconds - > the "unit" only tells you what form the "now_part_sec" > is - nanos or micros. And the calc_abs_time always has a > deadline in millis. So I simplified and did a little > renaming, and tracked max_secs in debug_only instead of > returning it. > > Please let me know what you think. > > > Looks OK to me. Nit comments below... > > > Thanks Dan - more below. > > > > // Calculate a new absolute time that is "timeout" > nanoseconds from "now". > // "unit" indicates the unit of "now_part_sec" (may be > nanos or micros depending > // on which clock is being used). 
> static void calc_rel_time(timespec* abstime, jlong > timeout, jlong now_sec, > jlong now_part_sec, jlong unit) { > time_t max_secs = now_sec + MAX_SECS; > > jlong seconds = timeout / NANOUNITS; > timeout %= NANOUNITS; // remaining nanos > > if (seconds >= MAX_SECS) { > // More seconds than we can add, so pin to max_secs. > abstime->tv_sec = max_secs; > abstime->tv_nsec = 0; > } else { > abstime->tv_sec = now_sec + seconds; > long nanos = (now_part_sec * (NANOUNITS / unit)) + > timeout; > if (nanos >= NANOUNITS) { // overflow > abstime->tv_sec += 1; > nanos -= NANOUNITS; > } > abstime->tv_nsec = nanos; > } > } > > // Unpack the given deadline in milliseconds since the > epoch, into the given timespec. > // The current time in seconds is also passed in to > enforce an upper bound as discussed above. > static void unpack_abs_time(timespec* abstime, jlong > deadline, jlong now_sec) { > time_t max_secs = now_sec + MAX_SECS; > > jlong seconds = deadline / MILLIUNITS; > jlong millis = deadline % MILLIUNITS; > > if (seconds >= max_secs) { > // Absolute seconds exceeds allowed max, so pin to > max_secs. > abstime->tv_sec = max_secs; > abstime->tv_nsec = 0; > } else { > abstime->tv_sec = seconds; > abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); > } > } > > > static void to_abstime(timespec* abstime, jlong timeout, > bool isAbsolute) { > > > There's an extra blank line here. > > > Fixed. > > > DEBUG_ONLY(int max_secs = MAX_SECS;) > > if (timeout < 0) { > timeout = 0; > } > > #ifdef SUPPORTS_CLOCK_MONOTONIC > > if (_use_clock_monotonic_condattr && !isAbsolute) { > struct timespec now; > int status = _clock_gettime(CLOCK_MONOTONIC, &now); > assert_status(status == 0, status, "clock_gettime"); > calc_rel_time(abstime, timeout, now.tv_sec, > now.tv_nsec, NANOUNITS); > DEBUG_ONLY(max_secs += now.tv_sec;) > } else { > > #else > > { // Match the block scope. > > #endif // SUPPORTS_CLOCK_MONOTONIC > > // Time-of-day clock is all we can reliably use. 
> struct timeval now; > int status = gettimeofday(&now, NULL); > assert(status == 0, "gettimeofday"); > > > assert_status() is used above, but assert() is used here. Why? > > > Historical. assert_status was introduced for the pthread* and > other posix funcs that return the error value rather than > returning -1 and setting errno. gettimeofday is not one of those > so still has the old assert. However, as someone pointed out a > while ago you can use assert_status with these and pass errno as > the "status". So I did that. > > > if (isAbsolute) { > unpack_abs_time(abstime, timeout, now.tv_sec); > } > else { > > > Inconsistent "else-branch" formatting. > I believe HotSpot style is "} else {" > > > Fixed. > > calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, > MICROUNITS); > } > DEBUG_ONLY(max_secs += now.tv_sec;) > } > > assert(abstime->tv_sec >= 0, "tv_sec < 0"); > assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); > assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); > assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec > >= nanos_per_sec"); > > > Why does the assert mesg have "nanos_per_sec" instead of > "NANOSECS_PER_SEC"? > > > No reason. Actually that should now refer to NANOUNITS. Hmmm I > can not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... > possibly an oversight. > > There's an extra blank line here. > > > Fixed. > > Will send out complete updated webrev soon. > > Thanks, > David > > > } > > > Definitely looks and reads much cleaner. 
> > Dan > > From david.holmes at oracle.com Tue May 30 11:28:25 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 May 2017 21:28:25 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Message-ID: <617a8f08-a660-d4e8-cbbc-278f193769b9@oracle.com> Correction ... On 30/05/2017 9:22 PM, David Holmes wrote: > Hi Thomas, > > On 30/05/2017 7:53 PM, Thomas St?fe wrote: >> Hi David, >> >> works fine on AIX, as it did before. > > Great - thanks for confirming that again. > >> Had a look at the change, some very small nits: >> >> - We now carry pthread_.. types in os_posix.hpp (pthread_mutex_t, >> pthread_cond_t). Are the rules not that each header should be >> self-contained? If yes, should os_posix.hpp not include the relevant >> OS headers for the pthread_... types? > > Yes. I'll fix that - and check how the pthread types are currently > getting seen. Thanks. So ... os_posix.hpp only #includes "runtime/os.hpp" and yet uses dozens of types defined in numerous other header files! So I could add pthread.h, but it would look very lonely. Dan: thoughts? Cheers, David >> - the coding around dlopen/dlsym would be a bit more readable if you >> were to define types for the function pointers, that would save you >> from spelling them out each time. > > I'll safe this for the next cleanup if we share all of the "clock" > related stuff. > > Thanks again, > David > ----- > >> e.g.: >> >> typedef int (*clock_getres_func_t)(clockid_t, struct timespec *); >> ... 
>> static clock_getres_func_t _clock_getres_func = (clock_getres_func_t) >> dlsym(...); >> >> ...Thomas >> >> >> On Mon, May 29, 2017 at 6:19 AM, David Holmes > > wrote: >> >> Dan, Robbin, Thomas, >> >> Okay here is the final ready to push version: >> >> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ >> >> >> this fixes all Dan's nits and refactors the time calculation code as >> suggested by Robbin. >> >> Thomas: if you are around and able, it would be good to get a final >> sanity check on AIX. Thanks. >> >> Testing: >> - JPRT: -testset hotspot >> -testset core >> >> - manual: >> - jtreg:java/util/concurrent >> - various little test programs that try to validate sleep/wait >> times to show early returns or unexpected delays >> >> Thanks again for the reviews. >> >> David >> >> >> On 29/05/2017 10:29 AM, David Holmes wrote: >> >> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >> >> On 5/26/17 1:27 AM, David Holmes wrote: >> >> Robbin, Dan, >> >> Below is a modified version of the refactored to_abstime >> code that Robbin suggested. >> >> Robbin: there were a couple of issues with your version. >> For relative time the timeout is always in nanoseconds - >> the "unit" only tells you what form the "now_part_sec" >> is - nanos or micros. And the calc_abs_time always has a >> deadline in millis. So I simplified and did a little >> renaming, and tracked max_secs in debug_only instead of >> returning it. >> >> Please let me know what you think. >> >> >> Looks OK to me. Nit comments below... >> >> >> Thanks Dan - more below. >> >> >> >> // Calculate a new absolute time that is "timeout" >> nanoseconds from "now". >> // "unit" indicates the unit of "now_part_sec" (may be >> nanos or micros depending >> // on which clock is being used). 
>> static void calc_rel_time(timespec* abstime, jlong >> timeout, jlong now_sec, >> jlong now_part_sec, jlong >> unit) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = timeout / NANOUNITS; >> timeout %= NANOUNITS; // remaining nanos >> >> if (seconds >= MAX_SECS) { >> // More seconds than we can add, so pin to max_secs. >> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = now_sec + seconds; >> long nanos = (now_part_sec * (NANOUNITS / unit)) + >> timeout; >> if (nanos >= NANOUNITS) { // overflow >> abstime->tv_sec += 1; >> nanos -= NANOUNITS; >> } >> abstime->tv_nsec = nanos; >> } >> } >> >> // Unpack the given deadline in milliseconds since the >> epoch, into the given timespec. >> // The current time in seconds is also passed in to >> enforce an upper bound as discussed above. >> static void unpack_abs_time(timespec* abstime, jlong >> deadline, jlong now_sec) { >> time_t max_secs = now_sec + MAX_SECS; >> >> jlong seconds = deadline / MILLIUNITS; >> jlong millis = deadline % MILLIUNITS; >> >> if (seconds >= max_secs) { >> // Absolute seconds exceeds allowed max, so pin to >> max_secs. >> abstime->tv_sec = max_secs; >> abstime->tv_nsec = 0; >> } else { >> abstime->tv_sec = seconds; >> abstime->tv_nsec = millis * (NANOUNITS / >> MILLIUNITS); >> } >> } >> >> >> static void to_abstime(timespec* abstime, jlong timeout, >> bool isAbsolute) { >> >> >> There's an extra blank line here. >> >> >> Fixed. >> >> >> DEBUG_ONLY(int max_secs = MAX_SECS;) >> >> if (timeout < 0) { >> timeout = 0; >> } >> >> #ifdef SUPPORTS_CLOCK_MONOTONIC >> >> if (_use_clock_monotonic_condattr && !isAbsolute) { >> struct timespec now; >> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >> assert_status(status == 0, status, "clock_gettime"); >> calc_rel_time(abstime, timeout, now.tv_sec, >> now.tv_nsec, NANOUNITS); >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } else { >> >> #else >> >> { // Match the block scope. 
>> >> #endif // SUPPORTS_CLOCK_MONOTONIC >> >> // Time-of-day clock is all we can reliably use. >> struct timeval now; >> int status = gettimeofday(&now, NULL); >> assert(status == 0, "gettimeofday"); >> >> >> assert_status() is used above, but assert() is used here. >> Why? >> >> >> Historical. assert_status was introduced for the pthread* and >> other posix funcs that return the error value rather than >> returning -1 and setting errno. gettimeofday is not one of those >> so still has the old assert. However, as someone pointed out a >> while ago you can use assert_status with these and pass errno as >> the "status". So I did that. >> >> >> if (isAbsolute) { >> unpack_abs_time(abstime, timeout, now.tv_sec); >> } >> else { >> >> >> Inconsistent "else-branch" formatting. >> I believe HotSpot style is "} else {" >> >> >> Fixed. >> >> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, >> MICROUNITS); >> } >> DEBUG_ONLY(max_secs += now.tv_sec;) >> } >> >> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >> assert(abstime->tv_sec <= max_secs, "tv_sec > >> max_secs"); >> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >> >= nanos_per_sec"); >> >> >> Why does the assert mesg have "nanos_per_sec" instead of >> "NANOSECS_PER_SEC"? >> >> >> No reason. Actually that should now refer to NANOUNITS. Hmmm I >> can not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... >> possibly an oversight. >> >> There's an extra blank line here. >> >> >> Fixed. >> >> Will send out complete updated webrev soon. >> >> Thanks, >> David >> >> >> } >> >> >> Definitely looks and reads much cleaner. 
>> >> Dan >> >> From zgu at redhat.com Tue May 30 11:59:33 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 30 May 2017 07:59:33 -0400 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <592CA97D.4000802@linux.vnet.ibm.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> <5928C9AA.6030004@linux.vnet.ibm.com> <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com> <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com> <592CA97D.4000802@linux.vnet.ibm.com> Message-ID: <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com> Hi David and Gustavo, Thanks for the review. Webrev is updated according to your comments: http://cr.openjdk.java.net/~zgu/8181055/webrev.02/ Thanks, -Zhengyu On 05/29/2017 07:06 PM, Gustavo Romero wrote: > Hi David, > > On 29-05-2017 01:34, David Holmes wrote: >> Hi Zhengyu, >> >> On 29/05/2017 12:08 PM, Zhengyu Gu wrote: >>> Hi Gustavo, >>> >>> Thanks for the detailed analysis and suggestion. I did not realize the difference between bitmask and nodemask. >>> >>> As you suggested, numa_interleave_memory_v2 works under this configuration. >>> >>> Please see the updated webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ >> >> The addition of support for the "v2" API seems okay. Though I think this comment needs some clarification for the existing code: >> >> 2837 // If we are running with libnuma version > 2, then we should >> 2838 // be trying to use symbols with versions 1.1 >> 2839 // If we are running with earlier version, which did not have symbol versions, >> 2840 // we should use the base version. >> 2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) { >> >> given that we now explicitly load the v1.2 symbol if present. >> >> Gustavo: can you vouch for the suitability of using the v2 API in all cases, if it exists? > > My understanding is that in the transition to API v2 only the usage of > numa_node_to_cpus() by the JVM will have to be adapted in os::Linux::rebuild_cpu_to_node_map().
> The remaining functions (excluding numa_interleave_memory() as Zhengyu already addressed it) > preserve the same functionality and signatures [1]. > > Currently JVM NUMA API requires the following libnuma functions: > > 1. numa_node_to_cpus v1 != v2 (using v1, JVM has to adapt) > 2. numa_max_node v1 == v2 (using v1, transition is straightforward) > 3. numa_num_configured_nodes v2 (added by gromero: 8175813) > 4. numa_available v1 == v2 (using v1, transition is straightforward) > 5. numa_tonode_memory v1 == v2 (using v1, transition is straightforward) > 6. numa_interleave_memory v1 != v2 (updated by zhengyu: 8181055. Default use of v2, fallback to v1) > 7. numa_set_bind_policy v1 == v2 (using v1, transition is straightforward) > 8. numa_bitmask_isbitset v2 (added by gromero: 8175813) > 9. numa_distance v1 == v2 (added by gromero: 8175813. Using v1, transition is straightforward) > > v1 != v2: function signature in version 1 is different from version 2 > v1 == v2: function signature in version 1 is equal to version 2 > v2 : function is only present in API v2 > > Thus, to the best of my knowledge, except for case 1. (which JVM need to adapt to) > all other cases are suitable to use v2 API and we could use a fallback mechanism as > proposed by Zhengyu or update directly to API v2 (risky?), given that I can't see > how v2 API would not be available on current (not-EOL) Linux distro releases. > > Regarding the comment, I agree, it needs an update since we are not tied anymore > to version 1.1 (we are in effect already using v2 for some functions). We could > delete the comment atop libnuma_dlsym() and add something like: > > "Handle request to load libnuma symbol version 1.1 (API v1). If it fails load symbol from base version instead." > > and to libnuma_v2_dlsym() add: > > "Handle request to load libnuma symbol version 1.2 (API v2) only. If it fails no symbol from any other version - even if present - is loaded." 
> > I've opened a bug to track the transitions to API v2 (I also discussed that with Volker): > https://bugs.openjdk.java.net/browse/JDK-8181196 > > > Regards, > Gustavo > > [1] API v1 vs API v2: > > API v1 > ====== > > int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen); > int numa_max_node(void); > - int numa_num_configured_nodes(void); > int numa_available(void); > void numa_tonode_memory(void *start, size_t size, int node); > void numa_interleave_memory(void *start, size_t size, nodemask_t *nodemask); > void numa_set_bind_policy(int strict); > - int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); > int numa_distance(int node1, int node2); > > > API v2 > ====== > > int numa_node_to_cpus(int node, struct bitmask *mask); > int numa_max_node(void); > int numa_num_configured_nodes(void); > int numa_available(void); > void numa_tonode_memory(void *start, size_t size, int node); > void numa_interleave_memory(void *start, size_t size, struct bitmask *nodemask); > void numa_set_bind_policy(int strict) > int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); > int numa_distance(int node1, int node2); > > >> I'm running this through JPRT now. >> >> Thanks, >> David >> >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> >>> >>> On 05/26/2017 08:34 PM, Gustavo Romero wrote: >>>> Hi Zhengyu, >>>> >>>> Thanks a lot for taking care of this corner case on PPC64. >>>> >>>> On 26-05-2017 10:41, Zhengyu Gu wrote: >>>>> This is a quick way to kill the symptom (or low risk?). I am not sure if disabling NUMA is a better solution for this circumstance? does 1 NUMA node = UMA? >>>> >>>> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. 
In the POWER7 >>>> machine you found the corner case (I copy below the data you provided in the >>>> JBS - thanks for the additional information): >>>> >>>> $ numactl -H >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>> node 0 size: 0 MB >>>> node 0 free: 0 MB >>>> node 1 cpus: >>>> node 1 size: 7680 MB >>>> node 1 free: 1896 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 40 >>>> 1: 40 10 >>>> >>>> CPUs in node0 have no other alternative besides allocating memory from node1. In >>>> that case CPUs in node0 are always accessing remote memory from node1 in a constant >>>> distance (40), so in that case we could say that 1 NUMA (configured) node == UMA. >>>> Nonetheless, if you add CPUs in node1 (by filling up the other socket present in >>>> the board) you will end up with CPUs with different distances from the node that >>>> has configured memory (in that case, node1), so it yields a configuration where >>>> 1 NUMA (configured) != UMA (i.e. distances are not always equal to a single >>>> value). >>>> >>>> On the other hand, the POWER7 machine configuration in question is bad (and >>>> rare). It's indeed impacting the whole system performance and it would be >>>> reasonable to open the machine and move the memory module from bank related to >>>> node1 to bank related to node0, because all CPUs are accessing remote memory >>>> without any apparent necessity. Once you change it all CPUs will have local >>>> memory (distance = 10). >>>> >>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>>>> Hi, >>>>>> >>>>>> There is a corner case that still failed after JDK-8175813. >>>>>> >>>>>> The system shows that it has multiple NUMA nodes, but only one is >>>>>> configured. Under this scenario, numa_interleave_memory() call will >>>>>> result "mbind: Invalid argument" message. 
>>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ >>>> >>>> Looks like that even for that POWER7 rare numa topology numa_interleave_memory() >>>> should succeed without "mbind: Invalid argument" since the 'mask' argument >>>> should be already a mask with only nodes from which memory can be allocated, i.e. >>>> only a mask of configured nodes (even if mask contains only one configured node, >>>> as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). >>>> >>>> Inspecting a little bit more, it looks like that the problem boils down to the >>>> fact that the JVM is passing to numa_interleave_memory() 'numa_all_nodes' [1] in >>>> Linux::numa_interleave_memory(). >>>> >>>> One would expect that 'numa_all_nodes' (which is api v1) would track the same >>>> information as 'numa_all_nodes_ptr' (api v2) [2], however there is a subtle but >>>> important difference: >>>> >>>> 'numa_all_nodes' is constructed assuming a consecutive node distribution [3]: >>>> >>>> 100 max = numa_num_configured_nodes(); >>>> 101 for (i = 0; i < max; i++) >>>> 102 nodemask_set_compat((nodemask_t *)&numa_all_nodes, i); >>>> >>>> >>>> whilst 'numa_all_nodes_ptr' is constructed parsing /proc/self/status [4]: >>>> >>>> 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { >>>> 500 numprocnode = read_mask(mask, numa_all_nodes_ptr); >>>> >>>> Thus for a topology like: >>>> >>>> available: 4 nodes (0-1,16-17) >>>> node 0 cpus: 0 8 16 24 32 >>>> node 0 size: 130706 MB >>>> node 0 free: 145 MB >>>> node 1 cpus: 40 48 56 64 72 >>>> node 1 size: 0 MB >>>> node 1 free: 0 MB >>>> node 16 cpus: 80 88 96 104 112 >>>> node 16 size: 130630 MB >>>> node 16 free: 529 MB >>>> node 17 cpus: 120 128 136 144 152 >>>> node 17 size: 0 MB >>>> node 17 free: 0 MB >>>> node distances: >>>> node 0 1 16 17 >>>> 0: 10 20 40 40 >>>> 1: 20 10 40 40 >>>> 16: 40 40 10 20 >>>> 17: 40 40 20 10 >>>> >>>> numa_all_nodes=0x3 => 0b11 (node0 
and node1) >>>> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) >>>> >>>> (Please, see details in the following gdb log: http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) >>>> >>>> In that case passing node0 and node1, although being suboptimal, does not bother >>>> mbind() since the following is satisfied: >>>> >>>> "[nodemask] must contain at least one node that is on-line, allowed by the >>>> process's current cpuset context, and contains memory." >>>> >>>> So back to the POWER7 case, I suppose that for: >>>> >>>> available: 2 nodes (0-1) >>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>> node 0 size: 0 MB >>>> node 0 free: 0 MB >>>> node 1 cpus: >>>> node 1 size: 7680 MB >>>> node 1 free: 1896 MB >>>> node distances: >>>> node 0 1 >>>> 0: 10 40 >>>> 1: 40 10 >>>> >>>> numa_all_nodes=0x1 => 0b01 (node0) >>>> numa_all_nodes_ptr=0x2 => 0b10 (node1) >>>> >>>> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which contains >>>> indeed no memory. That said, I don't know for sure if passing just node1 in the >>>> 'nodemask' will satisfy mbind() as in that case there are no cpus available in >>>> node1. >>>> >>>> In summing up, looks like that the root cause is not that numa_interleave_memory() >>>> does not accept only one configured node, but that the configured node being >>>> passed is wrong. I could not find a similar numa topology in my poll to test >>>> more, but it might be worth trying to write a small test using api v2 and >>>> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how numa_interleave_memory() >>>> goes in that machine :) If it behaves well, updating to api v2 would be a >>>> solution. >>>> >>>> HTH >>>> >>>> Regards, >>>> Gustavo >>>> >>>> >>>> [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 >>>> [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes with memory from which the calling process can allocate." 
>>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 >>>> [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 >>>> >>>> >>>>>> >>>>>> The system NUMA configuration: >>>>>> >>>>>> Architecture: ppc64 >>>>>> CPU op-mode(s): 32-bit, 64-bit >>>>>> Byte Order: Big Endian >>>>>> CPU(s): 8 >>>>>> On-line CPU(s) list: 0-7 >>>>>> Thread(s) per core: 4 >>>>>> Core(s) per socket: 1 >>>>>> Socket(s): 2 >>>>>> NUMA node(s): 2 >>>>>> Model: 2.1 (pvr 003f 0201) >>>>>> Model name: POWER7 (architected), altivec supported >>>>>> L1d cache: 32K >>>>>> L1i cache: 32K >>>>>> NUMA node0 CPU(s): 0-7 >>>>>> NUMA node1 CPU(s): >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>> >>>> >> > From thomas.stuefe at gmail.com Tue May 30 12:20:53 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 30 May 2017 14:20:53 +0200 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <617a8f08-a660-d4e8-cbbc-278f193769b9@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> <617a8f08-a660-d4e8-cbbc-278f193769b9@oracle.com> Message-ID: On Tue, May 30, 2017 at 1:28 PM, David Holmes wrote: > Correction ... > > On 30/05/2017 9:22 PM, David Holmes wrote: > >> Hi Thomas, >> >> On 30/05/2017 7:53 PM, Thomas Stüfe wrote: >> >>> Hi David, >>> >>> works fine on AIX, as it did before. >>> >> >> Great - thanks for confirming that again. >> >> Had a look at the change, some very small nits: >>> >>> - We now carry pthread_.. types in os_posix.hpp (pthread_mutex_t, >>> pthread_cond_t). Are the rules not that each header should be >>> self-contained? If yes, should os_posix.hpp not include the relevant OS >>> headers for the pthread_... types? >>> >> >> Yes. 
I'll fix that - and check how the pthread types are currently >> getting seen. Thanks. >> > > So ... os_posix.hpp only #includes "runtime/os.hpp" and yet uses dozens of > types defined in numerous other header files! So I could add pthread.h, but > it would look very lonely. > > Looking more closely, this seems not to be a real header, similar to all the other os_xxx.hpp files, but its only purpose seems to be to get included by os.hpp at that one specific place. So, the rule to be self-contained does not apply, right? Which also means the include of runtime/os.hpp is a bit pointless. ..Thomas > Dan: thoughts? > > Cheers, > David > > > - the coding around dlopen/dlsym would be a bit more readable if you were >>> to define types for the function pointers, that would save you from >>> spelling them out each time. >>> >> >> I'll save this for the next cleanup if we share all of the "clock" >> related stuff. >> >> Thanks again, >> David >> ----- >> >> e.g.: >>> >>> typedef int (*clock_getres_func_t)(clockid_t, struct timespec *); >>> ... >>> static clock_getres_func_t _clock_getres_func = (clock_getres_func_t) >>> dlsym(...); >>> >>> ...Thomas >>> >>> >>> On Mon, May 29, 2017 at 6:19 AM, David Holmes >> > wrote: >>> >>> Dan, Robbin, Thomas, >>> >>> Okay here is the final ready to push version: >>> >>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ >>> >>> >>> this fixes all Dan's nits and refactors the time calculation code as >>> suggested by Robbin. >>> >>> Thomas: if you are around and able, it would be good to get a final >>> sanity check on AIX. Thanks. >>> >>> Testing: >>> - JPRT: -testset hotspot >>> -testset core >>> >>> - manual: >>> - jtreg:java/util/concurrent >>> - various little test programs that try to validate sleep/wait >>> times to show early returns or unexpected delays >>> >>> Thanks again for the reviews. >>> >>> David >>> >>> >>> On 29/05/2017 10:29 AM, David Holmes wrote: >>> >>> On 27/05/2017 4:19 AM, Daniel D. 
Daugherty wrote: >>> >>> On 5/26/17 1:27 AM, David Holmes wrote: >>> >>> Robbin, Dan, >>> >>> Below is a modified version of the refactored to_abstime >>> code that Robbin suggested. >>> >>> Robbin: there were a couple of issues with your version. >>> For relative time the timeout is always in nanoseconds - >>> the "unit" only tells you what form the "now_part_sec" >>> is - nanos or micros. And the calc_abs_time always has a >>> deadline in millis. So I simplified and did a little >>> renaming, and tracked max_secs in debug_only instead of >>> returning it. >>> >>> Please let me know what you think. >>> >>> >>> Looks OK to me. Nit comments below... >>> >>> >>> Thanks Dan - more below. >>> >>> >>> >>> // Calculate a new absolute time that is "timeout" >>> nanoseconds from "now". >>> // "unit" indicates the unit of "now_part_sec" (may be >>> nanos or micros depending >>> // on which clock is being used). >>> static void calc_rel_time(timespec* abstime, jlong >>> timeout, jlong now_sec, >>> jlong now_part_sec, jlong >>> unit) { >>> time_t max_secs = now_sec + MAX_SECS; >>> >>> jlong seconds = timeout / NANOUNITS; >>> timeout %= NANOUNITS; // remaining nanos >>> >>> if (seconds >= MAX_SECS) { >>> // More seconds than we can add, so pin to max_secs. >>> abstime->tv_sec = max_secs; >>> abstime->tv_nsec = 0; >>> } else { >>> abstime->tv_sec = now_sec + seconds; >>> long nanos = (now_part_sec * (NANOUNITS / unit)) + >>> timeout; >>> if (nanos >= NANOUNITS) { // overflow >>> abstime->tv_sec += 1; >>> nanos -= NANOUNITS; >>> } >>> abstime->tv_nsec = nanos; >>> } >>> } >>> >>> // Unpack the given deadline in milliseconds since the >>> epoch, into the given timespec. >>> // The current time in seconds is also passed in to >>> enforce an upper bound as discussed above. 
>>> static void unpack_abs_time(timespec* abstime, jlong >>> deadline, jlong now_sec) { >>> time_t max_secs = now_sec + MAX_SECS; >>> >>> jlong seconds = deadline / MILLIUNITS; >>> jlong millis = deadline % MILLIUNITS; >>> >>> if (seconds >= max_secs) { >>> // Absolute seconds exceeds allowed max, so pin to >>> max_secs. >>> abstime->tv_sec = max_secs; >>> abstime->tv_nsec = 0; >>> } else { >>> abstime->tv_sec = seconds; >>> abstime->tv_nsec = millis * (NANOUNITS / >>> MILLIUNITS); >>> } >>> } >>> >>> >>> static void to_abstime(timespec* abstime, jlong timeout, >>> bool isAbsolute) { >>> >>> >>> There's an extra blank line here. >>> >>> >>> Fixed. >>> >>> >>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>> >>> if (timeout < 0) { >>> timeout = 0; >>> } >>> >>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>> >>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>> struct timespec now; >>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>> assert_status(status == 0, status, "clock_gettime"); >>> calc_rel_time(abstime, timeout, now.tv_sec, >>> now.tv_nsec, NANOUNITS); >>> DEBUG_ONLY(max_secs += now.tv_sec;) >>> } else { >>> >>> #else >>> >>> { // Match the block scope. >>> >>> #endif // SUPPORTS_CLOCK_MONOTONIC >>> >>> // Time-of-day clock is all we can reliably use. >>> struct timeval now; >>> int status = gettimeofday(&now, NULL); >>> assert(status == 0, "gettimeofday"); >>> >>> >>> assert_status() is used above, but assert() is used here. >>> Why? >>> >>> >>> Historical. assert_status was introduced for the pthread* and >>> other posix funcs that return the error value rather than >>> returning -1 and setting errno. gettimeofday is not one of those >>> so still has the old assert. However, as someone pointed out a >>> while ago you can use assert_status with these and pass errno as >>> the "status". So I did that. >>> >>> >>> if (isAbsolute) { >>> unpack_abs_time(abstime, timeout, now.tv_sec); >>> } >>> else { >>> >>> >>> Inconsistent "else-branch" formatting. 
>>> I believe HotSpot style is "} else {" >>> >>> Fixed. >>> >>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, >>> MICROUNITS); >>> } >>> DEBUG_ONLY(max_secs += now.tv_sec;) >>> } >>> >>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>> assert(abstime->tv_sec <= max_secs, "tv_sec > >>> max_secs"); >>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >>> >= nanos_per_sec"); >>> >>> >>> Why does the assert mesg have "nanos_per_sec" instead of >>> "NANOSECS_PER_SEC"? >>> >>> >>> No reason. Actually that should now refer to NANOUNITS. Hmmm I >>> can not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... >>> possibly an oversight. >>> >>> There's an extra blank line here. >>> >>> >>> Fixed. >>> >>> Will send out complete updated webrev soon. >>> >>> Thanks, >>> David >>> >>> >>> } >>> >>> >>> Definitely looks and reads much cleaner. >>> >>> Dan >>> >>> >>> From daniel.daugherty at oracle.com Tue May 30 13:58:27 2017 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 30 May 2017 07:58:27 -0600 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> <617a8f08-a660-d4e8-cbbc-278f193769b9@oracle.com> Message-ID: <8474ce10-71d4-6ad6-f988-72b9ec578c35@oracle.com> On 5/30/17 6:20 AM, Thomas Stüfe wrote: > On Tue, May 30, 2017 at 1:28 PM, David Holmes > wrote: > >> Correction ... >> >> On 30/05/2017 9:22 PM, David Holmes wrote: >> >>> Hi Thomas, >>> >>> On 30/05/2017 7:53 PM, Thomas Stüfe wrote: >>> >>>> Hi David, >>>> >>>> works fine on AIX, as it did before. >>>> >>> Great - thanks for confirming that again. 
>>> >>> Had a look at the change, some very small nits: >>>> - We now carry pthread_.. types in os_posix.hpp (pthread_mutex_t, >>>> pthread_cond_t). Are the rules not that each header should be >>>> self-contained? If yes, should os_posix.hpp not include the relevant OS >>>> headers for the pthread_... types? >>>> >>> Yes. I'll fix that - and check how the pthread types are currently >>> getting seen. Thanks. >>> >> So ... os_posix.hpp only #includes "runtime/os.hpp" and yet uses dozens of >> types defined in numerous other header files! So I could add pthread.h, but >> it would look very lonely. >> >> > Looking more closely, this seems not to be a real header, similar to all > the other os_xxx.hpp files, but its only purpose seems to be to get > included by os.hpp at that one specific place. So, the rule to be > self-contained does not apply, right? > > Which also means the include of runtime/os.hpp is a bit pointless. > > ..Thomas > > >> Dan: thoughts? Just starting my re-review now, but I'm inclined to agree os_posix.hpp is not meant to be a self-contained header... Dan >> >> Cheers, >> David >> >> >> - the coding around dlopen/dlsym would be a bit more readable if you were >>>> to define types for the function pointers, that would save you from >>>> spelling them out each time. >>>> >>> I'll save this for the next cleanup if we share all of the "clock" >>> related stuff. >>> >>> Thanks again, >>> David >>> ----- >>> >>> e.g.: >>>> typedef int (*clock_getres_func_t)(clockid_t, struct timespec *); >>>> ... >>>> static clock_getres_func_t _clock_getres_func = (clock_getres_func_t) >>>> dlsym(...); >>>> >>>> ...Thomas >>>> >>>> >>>> On Mon, May 29, 2017 at 6:19 AM, David Holmes >>>> > wrote: >>>> >>>> Dan, Robbin, Thomas, >>>> >>>> Okay here is the final ready to push version: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ >>>> >>>> >>>> this fixes all Dan's nits and refactors the time calculation code as >>>> suggested by Robbin. 
>>>> >>>> Thomas: if you are around and able, it would be good to get a final >>>> sanity check on AIX. Thanks. >>>> >>>> Testing: >>>> - JPRT: -testset hotspot >>>> -testset core >>>> >>>> - manual: >>>> - jtreg:java/util/concurrent >>>> - various little test programs that try to validate sleep/wait >>>> times to show early returns or unexpected delays >>>> >>>> Thanks again for the reviews. >>>> >>>> David >>>> >>>> >>>> On 29/05/2017 10:29 AM, David Holmes wrote: >>>> >>>> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >>>> >>>> On 5/26/17 1:27 AM, David Holmes wrote: >>>> >>>> Robbin, Dan, >>>> >>>> Below is a modified version of the refactored to_abstime >>>> code that Robbin suggested. >>>> >>>> Robbin: there were a couple of issues with your version. >>>> For relative time the timeout is always in nanoseconds - >>>> the "unit" only tells you what form the "now_part_sec" >>>> is - nanos or micros. And the calc_abs_time always has a >>>> deadline in millis. So I simplified and did a little >>>> renaming, and tracked max_secs in debug_only instead of >>>> returning it. >>>> >>>> Please let me know what you think. >>>> >>>> >>>> Looks OK to me. Nit comments below... >>>> >>>> >>>> Thanks Dan - more below. >>>> >>>> >>>> >>>> // Calculate a new absolute time that is "timeout" >>>> nanoseconds from "now". >>>> // "unit" indicates the unit of "now_part_sec" (may be >>>> nanos or micros depending >>>> // on which clock is being used). >>>> static void calc_rel_time(timespec* abstime, jlong >>>> timeout, jlong now_sec, >>>> jlong now_part_sec, jlong >>>> unit) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = timeout / NANOUNITS; >>>> timeout %= NANOUNITS; // remaining nanos >>>> >>>> if (seconds >= MAX_SECS) { >>>> // More seconds than we can add, so pin to max_secs. 
>>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = now_sec + seconds; >>>> long nanos = (now_part_sec * (NANOUNITS / unit)) + >>>> timeout; >>>> if (nanos >= NANOUNITS) { // overflow >>>> abstime->tv_sec += 1; >>>> nanos -= NANOUNITS; >>>> } >>>> abstime->tv_nsec = nanos; >>>> } >>>> } >>>> >>>> // Unpack the given deadline in milliseconds since the >>>> epoch, into the given timespec. >>>> // The current time in seconds is also passed in to >>>> enforce an upper bound as discussed above. >>>> static void unpack_abs_time(timespec* abstime, jlong >>>> deadline, jlong now_sec) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = deadline / MILLIUNITS; >>>> jlong millis = deadline % MILLIUNITS; >>>> >>>> if (seconds >= max_secs) { >>>> // Absolute seconds exceeds allowed max, so pin to >>>> max_secs. >>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = seconds; >>>> abstime->tv_nsec = millis * (NANOUNITS / >>>> MILLIUNITS); >>>> } >>>> } >>>> >>>> >>>> static void to_abstime(timespec* abstime, jlong timeout, >>>> bool isAbsolute) { >>>> >>>> >>>> There's an extra blank line here. >>>> >>>> >>>> Fixed. >>>> >>>> >>>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>>> >>>> if (timeout < 0) { >>>> timeout = 0; >>>> } >>>> >>>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>>> >>>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>>> struct timespec now; >>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>> assert_status(status == 0, status, "clock_gettime"); >>>> calc_rel_time(abstime, timeout, now.tv_sec, >>>> now.tv_nsec, NANOUNITS); >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } else { >>>> >>>> #else >>>> >>>> { // Match the block scope. >>>> >>>> #endif // SUPPORTS_CLOCK_MONOTONIC >>>> >>>> // Time-of-day clock is all we can reliably use. 
>>>> struct timeval now; >>>> int status = gettimeofday(&now, NULL); >>>> assert(status == 0, "gettimeofday"); >>>> >>>> >>>> assert_status() is used above, but assert() is used here. >>>> Why? >>>> >>>> >>>> Historical. assert_status was introduced for the pthread* and >>>> other posix funcs that return the error value rather than >>>> returning -1 and setting errno. gettimeofday is not one of those >>>> so still has the old assert. However, as someone pointed out a >>>> while ago you can use assert_status with these and pass errno as >>>> the "status". So I did that. >>>> >>>> >>>> if (isAbsolute) { >>>> unpack_abs_time(abstime, timeout, now.tv_sec); >>>> } >>>> else { >>>> >>>> >>>> Inconsistent "else-branch" formatting. >>>> I believe HotSpot style is "} else {" >>>> >>>> >>>> Fixed. >>>> >>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, >>>> MICROUNITS); >>>> } >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } >>>> >>>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>>> assert(abstime->tv_sec <= max_secs, "tv_sec > >>>> max_secs"); >>>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >>>> >= nanos_per_sec"); >>>> >>>> >>>> Why does the assert mesg have "nanos_per_sec" instead of >>>> "NANOSECS_PER_SEC"? >>>> >>>> >>>> No reason. Actually that should now refer to NANOUNITS. Hmmm I >>>> can not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... >>>> possibly an oversight. >>>> >>>> There's an extra blank line here. >>>> >>>> >>>> Fixed. >>>> >>>> Will send out complete updated webrev soon. >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> } >>>> >>>> >>>> Definitely looks and reads much cleaner. >>>> >>>> Dan >>>> >>>> >>>> From daniel.daugherty at oracle.com Tue May 30 15:11:08 2017 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 30 May 2017 09:11:08 -0600 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Message-ID: On 5/28/17 10:19 PM, David Holmes wrote: > Dan, Robbin, Thomas, > > Okay here is the final ready to push version: > > http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ General - Approaching the review differently than last round. This time I'm focused on the os_posix.[ch]pp changes as if this were all new code. - i.e., I'm going to assume that code deleted from the platform specific files is all appropriately represented in os_posix.[ch]pp. src/os/posix/vm/os_posix.hpp No comments. src/os/posix/vm/os_posix.cpp L1518: _use_clock_monotonic_condattr = true; L1522: _use_clock_monotonic_condattr = false; _use_clock_monotonic_condattr could briefly be observed as 'true' before being reset to 'false' due to the EINVAL. I think we are single threaded at this point so there should be no other thread running to be confused by this. An alternative would be to set _use_clock_monotonic_condattr to true only when _pthread_condattr_setclock() returns 0. L1581: // number of seconds, in abstime, is less than current_time + 100,000,000. L1582: // As it will be over 20 years before "now + 100000000" will overflow we can L1584: // of "now + 100,000,000". This places a limit on the timeout of about 3.17 nit - consistency of using ',' or not in 100000000. Personally, I would prefer no commas so the comments match MAX_SECS. L1703: if (Atomic::cmpxchg(v-1, &_event, v) == v) break; L1743: if (Atomic::cmpxchg(v-1, &_event, v) == v) break; nit - please add spaces around the '-' operator. 
L1749: to_abstime(&abst, millis * (NANOUNITS/MILLIUNITS), false); nit - please add spaces around the '/' operator. src/os/aix/vm/os_aix.hpp No comments. src/os/aix/vm/os_aix.cpp No comments. src/os/bsd/vm/os_bsd.hpp No comments. src/os/bsd/vm/os_bsd.cpp No comments. src/os/linux/vm/os_linux.hpp No comments. src/os/linux/vm/os_linux.cpp No comments. src/os/solaris/vm/os_solaris.hpp No comments. src/os/solaris/vm/os_solaris.cpp No comments. Thumbs up. Don't need to see another webrev if you choose to fix the bits... Dan > > this fixes all Dan's nits and refactors the time calculation code as > suggested by Robbin. > > Thomas: if you are around and able, it would be good to get a final > sanity check on AIX. Thanks. > > Testing: > - JPRT: -testset hotspot > -testset core > > - manual: > - jtreg:java/util/concurrent > - various little test programs that try to validate sleep/wait > times to show early returns or unexpected delays > > Thanks again for the reviews. > > David > > On 29/05/2017 10:29 AM, David Holmes wrote: >> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >>> On 5/26/17 1:27 AM, David Holmes wrote: >>>> Robbin, Dan, >>>> >>>> Below is a modified version of the refactored to_abstime code that >>>> Robbin suggested. >>>> >>>> Robbin: there were a couple of issues with your version. For >>>> relative time the timeout is always in nanoseconds - the "unit" >>>> only tells you what form the "now_part_sec" is - nanos or micros. >>>> And the calc_abs_time always has a deadline in millis. So I >>>> simplified and did a little renaming, and tracked max_secs in >>>> debug_only instead of returning it. >>>> >>>> Please let me know what you think. >>> >>> Looks OK to me. Nit comments below... >> >> Thanks Dan - more below. >> >>>> >>>> >>>> // Calculate a new absolute time that is "timeout" nanoseconds from >>>> "now". >>>> // "unit" indicates the unit of "now_part_sec" (may be nanos or >>>> micros depending >>>> // on which clock is being used). 
>>>> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >>>> now_sec, >>>> jlong now_part_sec, jlong unit) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = timeout / NANOUNITS; >>>> timeout %= NANOUNITS; // remaining nanos >>>> >>>> if (seconds >= MAX_SECS) { >>>> // More seconds than we can add, so pin to max_secs. >>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = now_sec + seconds; >>>> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >>>> if (nanos >= NANOUNITS) { // overflow >>>> abstime->tv_sec += 1; >>>> nanos -= NANOUNITS; >>>> } >>>> abstime->tv_nsec = nanos; >>>> } >>>> } >>>> >>>> // Unpack the given deadline in milliseconds since the epoch, into >>>> the given timespec. >>>> // The current time in seconds is also passed in to enforce an >>>> upper bound as discussed above. >>>> static void unpack_abs_time(timespec* abstime, jlong deadline, >>>> jlong now_sec) { >>>> time_t max_secs = now_sec + MAX_SECS; >>>> >>>> jlong seconds = deadline / MILLIUNITS; >>>> jlong millis = deadline % MILLIUNITS; >>>> >>>> if (seconds >= max_secs) { >>>> // Absolute seconds exceeds allowed max, so pin to max_secs. >>>> abstime->tv_sec = max_secs; >>>> abstime->tv_nsec = 0; >>>> } else { >>>> abstime->tv_sec = seconds; >>>> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >>>> } >>>> } >>>> >>>> >>>> static void to_abstime(timespec* abstime, jlong timeout, bool >>>> isAbsolute) { >>> >>> There's an extra blank line here. >> >> Fixed. 
>> >>>> >>>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>>> >>>> if (timeout < 0) { >>>> timeout = 0; >>>> } >>>> >>>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>>> >>>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>>> struct timespec now; >>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>> assert_status(status == 0, status, "clock_gettime"); >>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, >>>> NANOUNITS); >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } else { >>>> >>>> #else >>>> >>>> { // Match the block scope. >>>> >>>> #endif // SUPPORTS_CLOCK_MONOTONIC >>>> >>>> // Time-of-day clock is all we can reliably use. >>>> struct timeval now; >>>> int status = gettimeofday(&now, NULL); >>>> assert(status == 0, "gettimeofday"); >>> >>> assert_status() is used above, but assert() is used here. Why? >> >> Historical. assert_status was introduced for the pthread* and other >> posix funcs that return the error value rather than returning -1 and >> setting errno. gettimeofday is not one of those so still has the old >> assert. However, as someone pointed out a while ago you can use >> assert_status with these and pass errno as the "status". So I did that. >> >>> >>>> if (isAbsolute) { >>>> unpack_abs_time(abstime, timeout, now.tv_sec); >>>> } >>>> else { >>> >>> Inconsistent "else-branch" formatting. >>> I believe HotSpot style is "} else {" >> >> Fixed. >> >>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >>>> } >>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>> } >>>> >>>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>>> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >>>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >>>> nanos_per_sec"); >>> >>> Why does the assert mesg have "nanos_per_sec" instead of >>> "NANOSECS_PER_SEC"? >> >> No reason. Actually that should now refer to NANOUNITS. Hmmm I can >> not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... 
possibly >> an oversight. >> >>> There's an extra blank line here. >> >> Fixed. >> >> Will send out complete updated webrev soon. >> >> Thanks, >> David >> >>>> >>>> } >>> >>> Definitely looks and reads much cleaner. >>> >>> Dan >>> From daniel.smith at oracle.com Tue May 30 18:56:27 2017 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 30 May 2017 12:56:27 -0600 Subject: General Registration -- 2017 JVM Language Summit Message-ID: GENERAL REGISTRATION -- JVM LANGUAGE SUMMIT, JULY-AUGUST 2017 General registration for the 2017 JVM Language Summit is now open. The event will be held at Oracle's Santa Clara campus on July 31-August 2, 2017. The JVM Language Summit is an open technical collaboration among language designers, compiler writers, tool builders, runtime engineers, and VM architects. We will share our experiences as creators of both the JVM and programming languages for the JVM. We also welcome non-JVM developers of similar technologies to attend or speak on their runtime, VM, or language of choice. Presentations will be recorded and made available to the public. This event is being organized by language and JVM engineers - no marketers involved! So bring your slide rules and be prepared for some seriously geeky discussions. Format The summit is held in a single classroom-style room to support direct communication between participants. About 100-120 attendees are expected. The schedule consists of a single track of traditional presentations (about 6 each day) interspersed with less-formal multitrack "workshop" discussion groups (2-4 each day) and, possibly, impromptu "lightning talks." Workshops will be less structured than in the past, favoring an open discussion format with only a small amount of prepared material. Thus, rather than collecting workshop abstracts from speakers, we're asking each registrant to suggest a few topics of interest. After choosing the most popular topics, we'll ask some registrants if they'd like to act as discussion leaders. 
To register: register.jvmlangsummit.com For further information: jvmlangsummit.com Questions: inquire2017 at jvmlangsummit.com From david.holmes at oracle.com Tue May 30 20:50:51 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 May 2017 06:50:51 +1000 Subject: (10) (M) RFR: 8174231: Factor out and share PlatformEvent and Parker code for POSIX systems In-Reply-To: References: <3401d786-e657-35b4-cb0f-70848f5215b4@oracle.com> <368d99c5-a836-088e-b107-486cd4020b34@oracle.com> <96e65645-263f-e5a5-996d-efd7b4cfc01f@oracle.com> <29aef2eb-5870-5923-abe9-fd15f2c4b919@oracle.com> <25e10752-719a-237b-5797-16f225cdbd34@oracle.com> Message-ID: <78ee6517-fefb-1a08-e8c8-68bbdfcbca6a@oracle.com> Hi Dan, On 31/05/2017 1:11 AM, Daniel D. Daugherty wrote: > On 5/28/17 10:19 PM, David Holmes wrote: >> Dan, Robbin, Thomas, >> >> Okay here is the final ready to push version: >> >> http://cr.openjdk.java.net/~dholmes/8174231/webrev.hotspot.v2/ > > General > - Approaching the review differently than last round. This time I'm > focused on the os_posix.[ch]pp changes as if this were all new code. > - i.e., I'm going to assume that code deleted from the platform > specific files is all appropriately represented in os_posix.[ch]pp. Okay - thanks again. > src/os/posix/vm/os_posix.hpp > No comments. Okay I'm leaving the #includes as-is. > src/os/posix/vm/os_posix.cpp > L1518: _use_clock_monotonic_condattr = true; > L1522: _use_clock_monotonic_condattr = false; > _use_clock_monotonic_condattr could briefly be observed as 'true' > before being reset to 'false' due to the EINVAL. I think we are > single threaded at this point so there should be no other thread > running to be confused by this. Right this is single-threaded VM init. > An alternative would be to set _use_clock_monotonic_condattr > to true only when _pthread_condattr_setclock() returns 0. Yes - fixed. > L1581: // number of seconds, in abstime, is less than current_time > + 100,000,000. 
> L1582: // As it will be over 20 years before "now + 100000000" will > overflow we can > L1584: // of "now + 100,000,000". This places a limit on the > timeout of about 3.17 > nit - consistency of using ',' or not in 100000000. Personally, > I would prefer no commas so the comments match MAX_SECS. Fixed. > L1703: if (Atomic::cmpxchg(v-1, &_event, v) == v) break; > L1743: if (Atomic::cmpxchg(v-1, &_event, v) == v) break; > nit - please add spaces around the '-' operator. Fixed. > L1749: to_abstime(&abst, millis * (NANOUNITS/MILLIUNITS), false); > nit - please add spaces around the '/' operator. Fixed. > src/os/aix/vm/os_aix.hpp > No comments. > > src/os/aix/vm/os_aix.cpp > No comments. > > src/os/bsd/vm/os_bsd.hpp > No comments. > > src/os/bsd/vm/os_bsd.cpp > No comments. > > src/os/linux/vm/os_linux.hpp > No comments. > > src/os/linux/vm/os_linux.cpp > No comments. > > src/os/solaris/vm/os_solaris.hpp > No comments. > > src/os/solaris/vm/os_solaris.cpp > No comments. > > > Thumbs up. Don't need to see another webrev if you choose to fix > the bits... Thanks again. David > Dan > > >> >> this fixes all Dan's nits and refactors the time calculation code as >> suggested by Robbin. >> >> Thomas: if you are around and able, it would be good to get a final >> sanity check on AIX. Thanks. >> >> Testing: >> - JPRT: -testset hotspot >> -testset core >> >> - manual: >> - jtreg:java/util/concurrent >> - various little test programs that try to validate sleep/wait >> times to show early returns or unexpected delays >> >> Thanks again for the reviews. >> >> David >> >> On 29/05/2017 10:29 AM, David Holmes wrote: >>> On 27/05/2017 4:19 AM, Daniel D. Daugherty wrote: >>>> On 5/26/17 1:27 AM, David Holmes wrote: >>>>> Robbin, Dan, >>>>> >>>>> Below is a modified version of the refactored to_abstime code that >>>>> Robbin suggested. >>>>> >>>>> Robbin: there were a couple of issues with your version. 
For >>>>> relative time the timeout is always in nanoseconds - the "unit" >>>>> only tells you what form the "now_part_sec" is - nanos or micros. >>>>> And the calc_abs_time always has a deadline in millis. So I >>>>> simplified and did a little renaming, and tracked max_secs in >>>>> debug_only instead of returning it. >>>>> >>>>> Please let me know what you think. >>>> >>>> Looks OK to me. Nit comments below... >>> >>> Thanks Dan - more below. >>> >>>>> >>>>> >>>>> // Calculate a new absolute time that is "timeout" nanoseconds from >>>>> "now". >>>>> // "unit" indicates the unit of "now_part_sec" (may be nanos or >>>>> micros depending >>>>> // on which clock is being used). >>>>> static void calc_rel_time(timespec* abstime, jlong timeout, jlong >>>>> now_sec, >>>>> jlong now_part_sec, jlong unit) { >>>>> time_t max_secs = now_sec + MAX_SECS; >>>>> >>>>> jlong seconds = timeout / NANOUNITS; >>>>> timeout %= NANOUNITS; // remaining nanos >>>>> >>>>> if (seconds >= MAX_SECS) { >>>>> // More seconds than we can add, so pin to max_secs. >>>>> abstime->tv_sec = max_secs; >>>>> abstime->tv_nsec = 0; >>>>> } else { >>>>> abstime->tv_sec = now_sec + seconds; >>>>> long nanos = (now_part_sec * (NANOUNITS / unit)) + timeout; >>>>> if (nanos >= NANOUNITS) { // overflow >>>>> abstime->tv_sec += 1; >>>>> nanos -= NANOUNITS; >>>>> } >>>>> abstime->tv_nsec = nanos; >>>>> } >>>>> } >>>>> >>>>> // Unpack the given deadline in milliseconds since the epoch, into >>>>> the given timespec. >>>>> // The current time in seconds is also passed in to enforce an >>>>> upper bound as discussed above. >>>>> static void unpack_abs_time(timespec* abstime, jlong deadline, >>>>> jlong now_sec) { >>>>> time_t max_secs = now_sec + MAX_SECS; >>>>> >>>>> jlong seconds = deadline / MILLIUNITS; >>>>> jlong millis = deadline % MILLIUNITS; >>>>> >>>>> if (seconds >= max_secs) { >>>>> // Absolute seconds exceeds allowed max, so pin to max_secs. 
>>>>> abstime->tv_sec = max_secs; >>>>> abstime->tv_nsec = 0; >>>>> } else { >>>>> abstime->tv_sec = seconds; >>>>> abstime->tv_nsec = millis * (NANOUNITS / MILLIUNITS); >>>>> } >>>>> } >>>>> >>>>> >>>>> static void to_abstime(timespec* abstime, jlong timeout, bool >>>>> isAbsolute) { >>>> >>>> There's an extra blank line here. >>> >>> Fixed. >>> >>>>> >>>>> DEBUG_ONLY(int max_secs = MAX_SECS;) >>>>> >>>>> if (timeout < 0) { >>>>> timeout = 0; >>>>> } >>>>> >>>>> #ifdef SUPPORTS_CLOCK_MONOTONIC >>>>> >>>>> if (_use_clock_monotonic_condattr && !isAbsolute) { >>>>> struct timespec now; >>>>> int status = _clock_gettime(CLOCK_MONOTONIC, &now); >>>>> assert_status(status == 0, status, "clock_gettime"); >>>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_nsec, >>>>> NANOUNITS); >>>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>>> } else { >>>>> >>>>> #else >>>>> >>>>> { // Match the block scope. >>>>> >>>>> #endif // SUPPORTS_CLOCK_MONOTONIC >>>>> >>>>> // Time-of-day clock is all we can reliably use. >>>>> struct timeval now; >>>>> int status = gettimeofday(&now, NULL); >>>>> assert(status == 0, "gettimeofday"); >>>> >>>> assert_status() is used above, but assert() is used here. Why? >>> >>> Historical. assert_status was introduced for the pthread* and other >>> posix funcs that return the error value rather than returning -1 and >>> setting errno. gettimeofday is not one of those so still has the old >>> assert. However, as someone pointed out a while ago you can use >>> assert_status with these and pass errno as the "status". So I did that. >>> >>>> >>>>> if (isAbsolute) { >>>>> unpack_abs_time(abstime, timeout, now.tv_sec); >>>>> } >>>>> else { >>>> >>>> Inconsistent "else-branch" formatting. >>>> I believe HotSpot style is "} else {" >>> >>> Fixed. 
>>> >>>>> calc_rel_time(abstime, timeout, now.tv_sec, now.tv_usec, MICROUNITS); >>>>> } >>>>> DEBUG_ONLY(max_secs += now.tv_sec;) >>>>> } >>>>> >>>>> assert(abstime->tv_sec >= 0, "tv_sec < 0"); >>>>> assert(abstime->tv_sec <= max_secs, "tv_sec > max_secs"); >>>>> assert(abstime->tv_nsec >= 0, "tv_nsec < 0"); >>>>> assert(abstime->tv_nsec < NANOSECS_PER_SEC, "tv_nsec >= >>>>> nanos_per_sec"); >>>> >>>> Why does the assert mesg have "nanos_per_sec" instead of >>>> "NANOSECS_PER_SEC"? >>> >>> No reason. Actually that should now refer to NANOUNITS. Hmmm I can >>> not recall why we have NANOUNITS and NANAOSECS_PER_SEC ... possibly >>> an oversight. >>> >>>> There's an extra blank line here. >>> >>> Fixed. >>> >>> Will send out complete updated webrev soon. >>> >>> Thanks, >>> David >>> >>>>> >>>>> } >>>> >>>> Definitely looks and reads much cleaner. >>>> >>>> Dan >>>> > From david.holmes at oracle.com Tue May 30 21:30:16 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 May 2017 07:30:16 +1000 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> <5928C9AA.6030004@linux.vnet.ibm.com> <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com> <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com> <592CA97D.4000802@linux.vnet.ibm.com> <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com> Message-ID: <4dc1ac9e-f35f-209a-761f-96dc584f68a1@oracle.com> Looks fine to me. Thanks, David On 30/05/2017 9:59 PM, Zhengyu Gu wrote: > Hi David and Gustavo, > > Thanks for the review. 
> > Webrev is updated according to your comments: > > http://cr.openjdk.java.net/~zgu/8181055/webrev.02/ > > Thanks, > > -Zhengyu > > > On 05/29/2017 07:06 PM, Gustavo Romero wrote: >> Hi David, >> >> On 29-05-2017 01:34, David Holmes wrote: >>> Hi Zhengyu, >>> >>> On 29/05/2017 12:08 PM, Zhengyu Gu wrote: >>>> Hi Gustavo, >>>> >>>> Thanks for the detail analysis and suggestion. I did not realize the >>>> difference between from bitmask and nodemask. >>>> >>>> As you suggested, numa_interleave_memory_v2 works under this >>>> configuration. >>>> >>>> Please updated Webrev: >>>> http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ >>> >>> The addition of support for the "v2" API seems okay. Though I think >>> this comment needs some clarification for the existing code: >>> >>> 2837 // If we are running with libnuma version > 2, then we should >>> 2838 // be trying to use symbols with versions 1.1 >>> 2839 // If we are running with earlier version, which did not have >>> symbol versions, >>> 2840 // we should use the base version. >>> 2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) { >>> >>> given that we now explicitly load the v1.2 symbol if present. >>> >>> Gustavo: can you vouch for the suitability of using the v2 API in all >>> cases, if it exists? >> >> My understanding is that in the transition to API v2 only the usage of >> numa_node_to_cpus() by the JVM will have to be adapted in >> os::Linux::rebuild_cpu_to_node_map(). >> The remaining functions (excluding numa_interleave_memory() as Zhengyu >> already addressed it) >> preserve the same functionality and signatures [1]. >> >> Currently JVM NUMA API requires the following libnuma functions: >> >> 1. numa_node_to_cpus v1 != v2 (using v1, JVM has to adapt) >> 2. numa_max_node v1 == v2 (using v1, transition is >> straightforward) >> 3. numa_num_configured_nodes v2 (added by gromero: 8175813) >> 4. numa_available v1 == v2 (using v1, transition is >> straightforward) >> 5. 
numa_tonode_memory v1 == v2 (using v1, transition is >> straightforward) >> 6. numa_interleave_memory v1 != v2 (updated by zhengyu: 8181055. >> Default use of v2, fallback to v1) >> 7. numa_set_bind_policy v1 == v2 (using v1, transition is >> straightforward) >> 8. numa_bitmask_isbitset v2 (added by gromero: 8175813) >> 9. numa_distance v1 == v2 (added by gromero: 8175813. >> Using v1, transition is straightforward) >> >> v1 != v2: function signature in version 1 is different from version 2 >> v1 == v2: function signature in version 1 is equal to version 2 >> v2 : function is only present in API v2 >> >> Thus, to the best of my knowledge, except for case 1. (which JVM need >> to adapt to) >> all other cases are suitable to use v2 API and we could use a fallback >> mechanism as >> proposed by Zhengyu or update directly to API v2 (risky?), given that >> I can't see >> how v2 API would not be available on current (not-EOL) Linux distro >> releases. >> >> Regarding the comment, I agree, it needs an update since we are not >> tied anymore >> to version 1.1 (we are in effect already using v2 for some functions). >> We could >> delete the comment atop libnuma_dlsym() and add something like: >> >> "Handle request to load libnuma symbol version 1.1 (API v1). If it >> fails load symbol from base version instead." >> >> and to libnuma_v2_dlsym() add: >> >> "Handle request to load libnuma symbol version 1.2 (API v2) only. If >> it fails no symbol from any other version - even if present - is loaded." 
>> >> I've opened a bug to track the transitions to API v2 (I also discussed >> that with Volker): >> https://bugs.openjdk.java.net/browse/JDK-8181196 >> >> >> Regards, >> Gustavo >> >> [1] API v1 vs API v2: >> >> API v1 >> ====== >> >> int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen); >> int numa_max_node(void); >> - int numa_num_configured_nodes(void); >> int numa_available(void); >> void numa_tonode_memory(void *start, size_t size, int node); >> void numa_interleave_memory(void *start, size_t size, nodemask_t >> *nodemask); >> void numa_set_bind_policy(int strict); >> - int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >> int numa_distance(int node1, int node2); >> >> >> API v2 >> ====== >> >> int numa_node_to_cpus(int node, struct bitmask *mask); >> int numa_max_node(void); >> int numa_num_configured_nodes(void); >> int numa_available(void); >> void numa_tonode_memory(void *start, size_t size, int node); >> void numa_interleave_memory(void *start, size_t size, struct bitmask >> *nodemask); >> void numa_set_bind_policy(int strict) >> int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >> int numa_distance(int node1, int node2); >> >> >>> I'm running this through JPRT now. >>> >>> Thanks, >>> David >>> >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> >>>> >>>> On 05/26/2017 08:34 PM, Gustavo Romero wrote: >>>>> Hi Zhengyu, >>>>> >>>>> Thanks a lot for taking care of this corner case on PPC64. >>>>> >>>>> On 26-05-2017 10:41, Zhengyu Gu wrote: >>>>>> This is a quick way to kill the symptom (or low risk?). I am not >>>>>> sure if disabling NUMA is a better solution for this circumstance? >>>>>> does 1 NUMA node = UMA? >>>>> >>>>> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. 
In >>>>> the POWER7 >>>>> machine you found the corner case (I copy below the data you >>>>> provided in the >>>>> JBS - thanks for the additional information): >>>>> >>>>> $ numactl -H >>>>> available: 2 nodes (0-1) >>>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>>> node 0 size: 0 MB >>>>> node 0 free: 0 MB >>>>> node 1 cpus: >>>>> node 1 size: 7680 MB >>>>> node 1 free: 1896 MB >>>>> node distances: >>>>> node 0 1 >>>>> 0: 10 40 >>>>> 1: 40 10 >>>>> >>>>> CPUs in node0 have no other alternative besides allocating memory >>>>> from node1. In >>>>> that case CPUs in node0 are always accessing remote memory from >>>>> node1 in a constant >>>>> distance (40), so in that case we could say that 1 NUMA >>>>> (configured) node == UMA. >>>>> Nonetheless, if you add CPUs in node1 (by filling up the other >>>>> socket present in >>>>> the board) you will end up with CPUs with different distances from >>>>> the node that >>>>> has configured memory (in that case, node1), so it yields a >>>>> configuration where >>>>> 1 NUMA (configured) != UMA (i.e. distances are not always equal to >>>>> a single >>>>> value). >>>>> >>>>> On the other hand, the POWER7 machine configuration in question is >>>>> bad (and >>>>> rare). It's indeed impacting the whole system performance and it >>>>> would be >>>>> reasonable to open the machine and move the memory module from bank >>>>> related to >>>>> node1 to bank related to node0, because all CPUs are accessing >>>>> remote memory >>>>> without any apparent necessity. Once you change it all CPUs will >>>>> have local >>>>> memory (distance = 10). >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>>>>> Hi, >>>>>>> >>>>>>> There is a corner case that still failed after JDK-8175813. >>>>>>> >>>>>>> The system shows that it has multiple NUMA nodes, but only one is >>>>>>> configured. Under this scenario, numa_interleave_memory() call will >>>>>>> result "mbind: Invalid argument" message. 
>>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ >>>>> >>>>> Looks like that even for that POWER7 rare numa topology >>>>> numa_interleave_memory() >>>>> should succeed without "mbind: Invalid argument" since the 'mask' >>>>> argument >>>>> should be already a mask with only nodes from which memory can be >>>>> allocated, i.e. >>>>> only a mask of configured nodes (even if mask contains only one >>>>> configured node, >>>>> as in >>>>> http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). >>>>> >>>>> Inspecting a little bit more, it looks like that the problem boils >>>>> down to the >>>>> fact that the JVM is passing to numa_interleave_memory() >>>>> 'numa_all_nodes' [1] in >>>>> Linux::numa_interleave_memory(). >>>>> >>>>> One would expect that 'numa_all_nodes' (which is api v1) would >>>>> track the same >>>>> information as 'numa_all_nodes_ptr' (api v2) [2], however there is >>>>> a subtle but >>>>> important difference: >>>>> >>>>> 'numa_all_nodes' is constructed assuming a consecutive node >>>>> distribution [3]: >>>>> >>>>> 100 max = numa_num_configured_nodes(); >>>>> 101 for (i = 0; i < max; i++) >>>>> 102 nodemask_set_compat((nodemask_t >>>>> *)&numa_all_nodes, i); >>>>> >>>>> >>>>> whilst 'numa_all_nodes_ptr' is constructed parsing >>>>> /proc/self/status [4]: >>>>> >>>>> 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { >>>>> 500 numprocnode = read_mask(mask, >>>>> numa_all_nodes_ptr); >>>>> >>>>> Thus for a topology like: >>>>> >>>>> available: 4 nodes (0-1,16-17) >>>>> node 0 cpus: 0 8 16 24 32 >>>>> node 0 size: 130706 MB >>>>> node 0 free: 145 MB >>>>> node 1 cpus: 40 48 56 64 72 >>>>> node 1 size: 0 MB >>>>> node 1 free: 0 MB >>>>> node 16 cpus: 80 88 96 104 112 >>>>> node 16 size: 130630 MB >>>>> node 16 free: 529 MB >>>>> node 17 cpus: 120 128 136 144 152 >>>>> node 17 size: 0 MB >>>>> node 17 free: 0 MB >>>>> node distances: >>>>> node 0 1 16 17 
>>>>> 0: 10 20 40 40 >>>>> 1: 20 10 40 40 >>>>> 16: 40 40 10 20 >>>>> 17: 40 40 20 10 >>>>> >>>>> numa_all_nodes=0x3 => 0b11 (node0 and node1) >>>>> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) >>>>> >>>>> (Please, see details in the following gdb log: >>>>> http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) >>>>> >>>>> In that case passing node0 and node1, although being suboptimal, >>>>> does not bother >>>>> mbind() since the following is satisfied: >>>>> >>>>> "[nodemask] must contain at least one node that is on-line, allowed >>>>> by the >>>>> process's current cpuset context, and contains memory." >>>>> >>>>> So back to the POWER7 case, I suppose that for: >>>>> >>>>> available: 2 nodes (0-1) >>>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>>> node 0 size: 0 MB >>>>> node 0 free: 0 MB >>>>> node 1 cpus: >>>>> node 1 size: 7680 MB >>>>> node 1 free: 1896 MB >>>>> node distances: >>>>> node 0 1 >>>>> 0: 10 40 >>>>> 1: 40 10 >>>>> >>>>> numa_all_nodes=0x1 => 0b01 (node0) >>>>> numa_all_nodes_ptr=0x2 => 0b10 (node1) >>>>> >>>>> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), >>>>> which contains >>>>> indeed no memory. That said, I don't know for sure if passing just >>>>> node1 in the >>>>> 'nodemask' will satisfy mbind() as in that case there are no cpus >>>>> available in >>>>> node1. >>>>> >>>>> In summing up, looks like that the root cause is not that >>>>> numa_interleave_memory() >>>>> does not accept only one configured node, but that the configured >>>>> node being >>>>> passed is wrong. I could not find a similar numa topology in my >>>>> poll to test >>>>> more, but it might be worth trying to write a small test using api >>>>> v2 and >>>>> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how >>>>> numa_interleave_memory() >>>>> goes in that machine :) If it behaves well, updating to api v2 >>>>> would be a >>>>> solution. 
>>>>> >>>>> HTH >>>>> >>>>> Regards, >>>>> Gustavo >>>>> >>>>> >>>>> [1] >>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 >>>>> >>>>> [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes >>>>> with memory from which the calling process can allocate." >>>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 >>>>> [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 >>>>> >>>>> >>>>>>> >>>>>>> The system NUMA configuration: >>>>>>> >>>>>>> Architecture: ppc64 >>>>>>> CPU op-mode(s): 32-bit, 64-bit >>>>>>> Byte Order: Big Endian >>>>>>> CPU(s): 8 >>>>>>> On-line CPU(s) list: 0-7 >>>>>>> Thread(s) per core: 4 >>>>>>> Core(s) per socket: 1 >>>>>>> Socket(s): 2 >>>>>>> NUMA node(s): 2 >>>>>>> Model: 2.1 (pvr 003f 0201) >>>>>>> Model name: POWER7 (architected), altivec supported >>>>>>> L1d cache: 32K >>>>>>> L1i cache: 32K >>>>>>> NUMA node0 CPU(s): 0-7 >>>>>>> NUMA node1 CPU(s): >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -Zhengyu >>>>>> >>>>> >>> >> From serguei.spitsyn at oracle.com Wed May 31 00:21:18 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 30 May 2017 17:21:18 -0700 Subject: RFR (L) 8174749: Use hash table/oops for MemberName table In-Reply-To: References: <0da3d97a-304c-db23-423e-d57f45d2ffd7@oracle.com> <3CA63497-39C7-4944-A4F0-D83CB897DEA0@oracle.com> <41b557ce-dc84-2b4a-5af9-ce225254b332@oracle.com> <3B8B3D31-09F3-41AC-9B26-A14B4DB87082@oracle.com> Message-ID: <44da5a30-5569-9d85-4ce7-dc9d57a5e029@oracle.com> Hi Coleen, It looks good to me. At least, I do not see anything bad in the last update. Thanks, Serguei On 5/26/17 14:48, coleen.phillimore at oracle.com wrote: > > > On 5/26/17 4:48 PM, John Rose wrote: >> On May 26, 2017, at 10:47 AM, coleen.phillimore at oracle.com >> wrote: >>> >>> Hi, I made the changes below, which turned out very nice. It didn't >>> take that long to retest. 
See: >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8174749.04/webrev >>> >>> open webrev >>> at http://cr.openjdk.java.net/~coleenp/8174749.jdk.04/webrev >>> >>> >>> I don't know how to do delta webrevs, so just look at >>> linkResolver.cpp/hpp and methodHandles.cpp >> >> Re-reviewed. >> >> See previous message for a late-breaking comment on expand. >> See below for a sketch of what I mean by keeping "have_defc" as is. > Hi John, > > I was just thinking of this change below, it makes sense to treat > field and method MemberName differently as you have below. The field > needs the clazz present to be expanded but method MemberName does not. > > Yes, this makes sense. >> >> (Another reviewer commented about a dead mode bit. The purpose of that >> stuff is to allow us to tweak the JDK API. I don't care much either >> way about >> GC-ing unused mode bits but I do want to keep the expander capability >> so we >> can prototype stuff in the JDK without having to edit the JVM. So on >> balance, >> I'd give the mode bits the benefit of the doubt. They can be used >> from the JDK, >> even if they aren't at the moment.) >> >> I also like how this CallInfo change turned out. Notice how now the >> function >> java_lang_invoke_ResolvedMethodName::find_resolved_method has only >> one usage, from the inside of CallInfo. This feels right. It also >> means you >> can take javaClasses.cpp out of the loop here, and just have CallInfo >> call >> directly into SystemDictionary and ResolvedMethodTable. It seems just >> as reasonable to me that linkResolver.cpp would do that job, than >> that it >> would be to delegate via javaClasses.cpp. I also think the patch >> will get >> a little smaller if you cut javaClasses.cpp out of that loop. > JavaClasses is in the loop because it knows which fields to assign and > how to create a ResolvedMethodName. 
I think this makes sense to > isolate it like this an appreciated only changing javaClasses.cpp when > I kept changing the names of the fields. > >> >> Thanks, >> ? John >> >> P.S. As a step after this fix, if we loosen the coupling of the JVM >> with MemberName, >> I think we will want to get rid of MN::vmtarget and just have >> MN::method. >> In the code of MHN_getMemberVMInfo, the unchanged line "x = mname()" >> really wants to be "x = method" where method is the RMN. The JDK code >> expects a MN at that point, but it should really be the RMN now. The >> only >> JDK change would be in MemberName.java: >> >> - assert(vmtarget instanceof MemberName) : vmtarget + " in >> " + this; >> + assert(vmtarget instanceof ResolvedMethodName) : >> vmtarget + " in " + this; >> >> I wouldn't object if you anticipated this in the present change set, >> but it's OK >> to do it later. > > Yes, it has to be later. I'm going to file a couple of RFE's after > this that we discussed so that RMN can be used instead of MN. And > believe it or not, large changes make me anxious. :) >> >> P.P.S. Here's a sketch of what I mean by walking back some of the >> "have_defc" >> changes. Maybe I'm missing something, but I think this version makes >> more >> sense than the current version: > > Done. Passes java/lang/invoke tests (as sanity). > > http://cr.openjdk.java.net/~coleenp/8174749.05/webrev > > (wish I could do incremental webrevs because full webrevs take forever). > > Thank you for all your help and comments. > > Coleen > > >> >> git a/src/share/vm/prims/methodHandles.cpp >> b/src/share/vm/prims/methodHandles.cpp >> --- a/src/share/vm/prims/methodHandles.cpp >> +++ b/src/share/vm/prims/methodHandles.cpp >> @@ -794,11 +794,6 @@ >> // which refers directly to JVM internals. 
>> void MethodHandles::expand_MemberName(Handle mname, int suppress, >> TRAPS) { >> assert(java_lang_invoke_MemberName::is_instance(mname()), ""); >> - Metadata* vmtarget = java_lang_invoke_MemberName::vmtarget(mname()); >> - int vmindex = java_lang_invoke_MemberName::vmindex(mname()); >> - if (vmtarget == NULL) { >> - THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), >> "nothing to expand"); >> - } >> bool have_defc = (java_lang_invoke_MemberName::clazz(mname()) != >> NULL); >> bool have_name = (java_lang_invoke_MemberName::name(mname()) != >> NULL); >> @@ -817,10 +812,14 @@ >> case IS_METHOD: >> case IS_CONSTRUCTOR: >> { >> + Method* vmtarget = >> java_lang_invoke_MemberName::vmtarget(method()); >> + if (vmtarget == NULL) { >> + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), >> "nothing to expand"); >> + } >> assert(vmtarget->is_method(), "method or constructor vmtarget >> is Method*"); >> methodHandle m(THREAD, (Method*)vmtarget); >> DEBUG_ONLY(vmtarget = NULL); // safety >> - if (m.is_null()) break; >> + assert(m.not_null(), "checked above"); >> if (!have_defc) { >> InstanceKlass* defc = m->method_holder(); >> java_lang_invoke_MemberName::set_clazz(mname(), defc->java_mirror()); >> @@ -838,17 +837,16 @@ >> } >> case IS_FIELD: >> { >> - assert(vmtarget->is_klass(), "field vmtarget is Klass*"); >> - if (!((Klass*) vmtarget)->is_instance_klass()) break; >> - instanceKlassHandle defc(THREAD, (Klass*) vmtarget); >> - DEBUG_ONLY(vmtarget = NULL); // safety >> + oop clazz = java_lang_invoke_MemberName::clazz(mname()); >> + if (clazz == NULL) { >> + THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), >> "nothing to expand (as field)"); >> + } >> + InstanceKlass* defc = >> InstanceKlass::cast(java_lang_Class::as_Klass(clazz)); >> + DEBUG_ONLY(clazz = NULL); // safety >> bool is_static = ((flags & JVM_ACC_STATIC) != 0); >> fieldDescriptor fd; // find_field initializes fd if found >> if (!defc->find_field_from_offset(vmindex, is_static, &fd)) 
>> break; // cannot expand >> - if (!have_defc) { >> - java_lang_invoke_MemberName::set_clazz(mname(), defc->java_mirror()); >> - } >> if (!have_name) { >> //not java_lang_String::create_from_symbol; let's intern >> member names >> Handle name = StringTable::intern(fd.name(), CHECK); >> @@ -1389,6 +1387,39 @@ >> > From zgu at redhat.com Wed May 31 00:37:11 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 30 May 2017 20:37:11 -0400 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <4dc1ac9e-f35f-209a-761f-96dc584f68a1@oracle.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> <5928C9AA.6030004@linux.vnet.ibm.com> <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com> <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com> <592CA97D.4000802@linux.vnet.ibm.com> <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com> <4dc1ac9e-f35f-209a-761f-96dc584f68a1@oracle.com> Message-ID: <461d3048-88a2-c99d-818a-01de3813a29b@redhat.com> Hi David, Thanks for the review. Gustavo, might I count you as a reviewer? Thanks, -Zhengyu On 05/30/2017 05:30 PM, David Holmes wrote: > Looks fine to me. > > Thanks, > David > > On 30/05/2017 9:59 PM, Zhengyu Gu wrote: >> Hi David and Gustavo, >> >> Thanks for the review. >> >> Webrev is updated according to your comments: >> >> http://cr.openjdk.java.net/~zgu/8181055/webrev.02/ >> >> Thanks, >> >> -Zhengyu >> >> >> On 05/29/2017 07:06 PM, Gustavo Romero wrote: >>> Hi David, >>> >>> On 29-05-2017 01:34, David Holmes wrote: >>>> Hi Zhengyu, >>>> >>>> On 29/05/2017 12:08 PM, Zhengyu Gu wrote: >>>>> Hi Gustavo, >>>>> >>>>> Thanks for the detail analysis and suggestion. I did not realize >>>>> the difference between from bitmask and nodemask. >>>>> >>>>> As you suggested, numa_interleave_memory_v2 works under this >>>>> configuration. >>>>> >>>>> Please updated Webrev: >>>>> http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ >>>> >>>> The addition of support for the "v2" API seems okay. 
Though I think >>>> this comment needs some clarification for the existing code: >>>> >>>> 2837 // If we are running with libnuma version > 2, then we should >>>> 2838 // be trying to use symbols with versions 1.1 >>>> 2839 // If we are running with earlier version, which did not have >>>> symbol versions, >>>> 2840 // we should use the base version. >>>> 2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) { >>>> >>>> given that we now explicitly load the v1.2 symbol if present. >>>> >>>> Gustavo: can you vouch for the suitability of using the v2 API in >>>> all cases, if it exists? >>> >>> My understanding is that in the transition to API v2 only the usage of >>> numa_node_to_cpus() by the JVM will have to be adapted in >>> os::Linux::rebuild_cpu_to_node_map(). >>> The remaining functions (excluding numa_interleave_memory() as >>> Zhengyu already addressed it) >>> preserve the same functionality and signatures [1]. >>> >>> Currently JVM NUMA API requires the following libnuma functions: >>> >>> 1. numa_node_to_cpus v1 != v2 (using v1, JVM has to adapt) >>> 2. numa_max_node v1 == v2 (using v1, transition is >>> straightforward) >>> 3. numa_num_configured_nodes v2 (added by gromero: 8175813) >>> 4. numa_available v1 == v2 (using v1, transition is >>> straightforward) >>> 5. numa_tonode_memory v1 == v2 (using v1, transition is >>> straightforward) >>> 6. numa_interleave_memory v1 != v2 (updated by zhengyu: >>> 8181055. Default use of v2, fallback to v1) >>> 7. numa_set_bind_policy v1 == v2 (using v1, transition is >>> straightforward) >>> 8. numa_bitmask_isbitset v2 (added by gromero: 8175813) >>> 9. numa_distance v1 == v2 (added by gromero: 8175813. >>> Using v1, transition is straightforward) >>> >>> v1 != v2: function signature in version 1 is different from version 2 >>> v1 == v2: function signature in version 1 is equal to version 2 >>> v2 : function is only present in API v2 >>> >>> Thus, to the best of my knowledge, except for case 1. 
(which JVM need >>> to adapt to) >>> all other cases are suitable to use v2 API and we could use a >>> fallback mechanism as >>> proposed by Zhengyu or update directly to API v2 (risky?), given that >>> I can't see >>> how v2 API would not be available on current (not-EOL) Linux distro >>> releases. >>> >>> Regarding the comment, I agree, it needs an update since we are not >>> tied anymore >>> to version 1.1 (we are in effect already using v2 for some >>> functions). We could >>> delete the comment atop libnuma_dlsym() and add something like: >>> >>> "Handle request to load libnuma symbol version 1.1 (API v1). If it >>> fails load symbol from base version instead." >>> >>> and to libnuma_v2_dlsym() add: >>> >>> "Handle request to load libnuma symbol version 1.2 (API v2) only. If >>> it fails no symbol from any other version - even if present - is >>> loaded." >>> >>> I've opened a bug to track the transitions to API v2 (I also >>> discussed that with Volker): >>> https://bugs.openjdk.java.net/browse/JDK-8181196 >>> >>> >>> Regards, >>> Gustavo >>> >>> [1] API v1 vs API v2: >>> >>> API v1 >>> ====== >>> >>> int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen); >>> int numa_max_node(void); >>> - int numa_num_configured_nodes(void); >>> int numa_available(void); >>> void numa_tonode_memory(void *start, size_t size, int node); >>> void numa_interleave_memory(void *start, size_t size, nodemask_t >>> *nodemask); >>> void numa_set_bind_policy(int strict); >>> - int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >>> int numa_distance(int node1, int node2); >>> >>> >>> API v2 >>> ====== >>> >>> int numa_node_to_cpus(int node, struct bitmask *mask); >>> int numa_max_node(void); >>> int numa_num_configured_nodes(void); >>> int numa_available(void); >>> void numa_tonode_memory(void *start, size_t size, int node); >>> void numa_interleave_memory(void *start, size_t size, struct bitmask >>> *nodemask); >>> void numa_set_bind_policy(int 
strict) >>> int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >>> int numa_distance(int node1, int node2); >>> >>> >>>> I'm running this through JPRT now. >>>> >>>> Thanks, >>>> David >>>> >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> >>>>> >>>>> On 05/26/2017 08:34 PM, Gustavo Romero wrote: >>>>>> Hi Zhengyu, >>>>>> >>>>>> Thanks a lot for taking care of this corner case on PPC64. >>>>>> >>>>>> On 26-05-2017 10:41, Zhengyu Gu wrote: >>>>>>> This is a quick way to kill the symptom (or low risk?). I am not >>>>>>> sure if disabling NUMA is a better solution for this >>>>>>> circumstance? does 1 NUMA node = UMA? >>>>>> >>>>>> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. In >>>>>> the POWER7 >>>>>> machine you found the corner case (I copy below the data you >>>>>> provided in the >>>>>> JBS - thanks for the additional information): >>>>>> >>>>>> $ numactl -H >>>>>> available: 2 nodes (0-1) >>>>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>>>> node 0 size: 0 MB >>>>>> node 0 free: 0 MB >>>>>> node 1 cpus: >>>>>> node 1 size: 7680 MB >>>>>> node 1 free: 1896 MB >>>>>> node distances: >>>>>> node 0 1 >>>>>> 0: 10 40 >>>>>> 1: 40 10 >>>>>> >>>>>> CPUs in node0 have no other alternative besides allocating memory >>>>>> from node1. In >>>>>> that case CPUs in node0 are always accessing remote memory from >>>>>> node1 in a constant >>>>>> distance (40), so in that case we could say that 1 NUMA >>>>>> (configured) node == UMA. >>>>>> Nonetheless, if you add CPUs in node1 (by filling up the other >>>>>> socket present in >>>>>> the board) you will end up with CPUs with different distances from >>>>>> the node that >>>>>> has configured memory (in that case, node1), so it yields a >>>>>> configuration where >>>>>> 1 NUMA (configured) != UMA (i.e. distances are not always equal to >>>>>> a single >>>>>> value). >>>>>> >>>>>> On the other hand, the POWER7 machine configuration in question is >>>>>> bad (and >>>>>> rare). 
It's indeed impacting the whole system performance and it >>>>>> would be >>>>>> reasonable to open the machine and move the memory module from >>>>>> bank related to >>>>>> node1 to bank related to node0, because all CPUs are accessing >>>>>> remote memory >>>>>> without any apparent necessity. Once you change it all CPUs will >>>>>> have local >>>>>> memory (distance = 10). >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -Zhengyu >>>>>>> >>>>>>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> There is a corner case that still failed after JDK-8175813. >>>>>>>> >>>>>>>> The system shows that it has multiple NUMA nodes, but only one is >>>>>>>> configured. Under this scenario, numa_interleave_memory() call will >>>>>>>> result "mbind: Invalid argument" message. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055 >>>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/ >>>>>> >>>>>> Looks like that even for that POWER7 rare numa topology >>>>>> numa_interleave_memory() >>>>>> should succeed without "mbind: Invalid argument" since the 'mask' >>>>>> argument >>>>>> should be already a mask with only nodes from which memory can be >>>>>> allocated, i.e. >>>>>> only a mask of configured nodes (even if mask contains only one >>>>>> configured node, >>>>>> as in >>>>>> http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt). >>>>>> >>>>>> Inspecting a little bit more, it looks like that the problem boils >>>>>> down to the >>>>>> fact that the JVM is passing to numa_interleave_memory() >>>>>> 'numa_all_nodes' [1] in >>>>>> Linux::numa_interleave_memory(). 
>>>>>> >>>>>> One would expect that 'numa_all_nodes' (which is api v1) would >>>>>> track the same >>>>>> information as 'numa_all_nodes_ptr' (api v2) [2], however there is >>>>>> a subtle but >>>>>> important difference: >>>>>> >>>>>> 'numa_all_nodes' is constructed assuming a consecutive node >>>>>> distribution [3]: >>>>>> >>>>>> 100 max = numa_num_configured_nodes(); >>>>>> 101 for (i = 0; i < max; i++) >>>>>> 102 nodemask_set_compat((nodemask_t >>>>>> *)&numa_all_nodes, i); >>>>>> >>>>>> >>>>>> whilst 'numa_all_nodes_ptr' is constructed parsing >>>>>> /proc/self/status [4]: >>>>>> >>>>>> 499 if (strncmp(buffer,"Mems_allowed:",13) == 0) { >>>>>> 500 numprocnode = read_mask(mask, >>>>>> numa_all_nodes_ptr); >>>>>> >>>>>> Thus for a topology like: >>>>>> >>>>>> available: 4 nodes (0-1,16-17) >>>>>> node 0 cpus: 0 8 16 24 32 >>>>>> node 0 size: 130706 MB >>>>>> node 0 free: 145 MB >>>>>> node 1 cpus: 40 48 56 64 72 >>>>>> node 1 size: 0 MB >>>>>> node 1 free: 0 MB >>>>>> node 16 cpus: 80 88 96 104 112 >>>>>> node 16 size: 130630 MB >>>>>> node 16 free: 529 MB >>>>>> node 17 cpus: 120 128 136 144 152 >>>>>> node 17 size: 0 MB >>>>>> node 17 free: 0 MB >>>>>> node distances: >>>>>> node 0 1 16 17 >>>>>> 0: 10 20 40 40 >>>>>> 1: 20 10 40 40 >>>>>> 16: 40 40 10 20 >>>>>> 17: 40 40 20 10 >>>>>> >>>>>> numa_all_nodes=0x3 => 0b11 (node0 and node1) >>>>>> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16) >>>>>> >>>>>> (Please, see details in the following gdb log: >>>>>> http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt) >>>>>> >>>>>> In that case passing node0 and node1, although being suboptimal, >>>>>> does not bother >>>>>> mbind() since the following is satisfied: >>>>>> >>>>>> "[nodemask] must contain at least one node that is on-line, >>>>>> allowed by the >>>>>> process's current cpuset context, and contains memory." 
>>>>>> >>>>>> So back to the POWER7 case, I suppose that for: >>>>>> >>>>>> available: 2 nodes (0-1) >>>>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>>>> node 0 size: 0 MB >>>>>> node 0 free: 0 MB >>>>>> node 1 cpus: >>>>>> node 1 size: 7680 MB >>>>>> node 1 free: 1896 MB >>>>>> node distances: >>>>>> node 0 1 >>>>>> 0: 10 40 >>>>>> 1: 40 10 >>>>>> >>>>>> numa_all_nodes=0x1 => 0b01 (node0) >>>>>> numa_all_nodes_ptr=0x2 => 0b10 (node1) >>>>>> >>>>>> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), >>>>>> which contains >>>>>> indeed no memory. That said, I don't know for sure if passing just >>>>>> node1 in the >>>>>> 'nodemask' will satisfy mbind() as in that case there are no cpus >>>>>> available in >>>>>> node1. >>>>>> >>>>>> In summing up, looks like that the root cause is not that >>>>>> numa_interleave_memory() >>>>>> does not accept only one configured node, but that the configured >>>>>> node being >>>>>> passed is wrong. I could not find a similar numa topology in my >>>>>> poll to test >>>>>> more, but it might be worth trying to write a small test using api >>>>>> v2 and >>>>>> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how >>>>>> numa_interleave_memory() >>>>>> goes in that machine :) If it behaves well, updating to api v2 >>>>>> would be a >>>>>> solution. >>>>>> >>>>>> HTH >>>>>> >>>>>> Regards, >>>>>> Gustavo >>>>>> >>>>>> >>>>>> [1] >>>>>> http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274 >>>>>> >>>>>> [2] from libnuma.c:608 numa_all_nodes_ptr: "it only tracks nodes >>>>>> with memory from which the calling process can allocate." 
>>>>>> [3] >>>>>> https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102 >>>>>> [4] >>>>>> https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500 >>>>>> >>>>>> >>>>>>>> >>>>>>>> The system NUMA configuration: >>>>>>>> >>>>>>>> Architecture: ppc64 >>>>>>>> CPU op-mode(s): 32-bit, 64-bit >>>>>>>> Byte Order: Big Endian >>>>>>>> CPU(s): 8 >>>>>>>> On-line CPU(s) list: 0-7 >>>>>>>> Thread(s) per core: 4 >>>>>>>> Core(s) per socket: 1 >>>>>>>> Socket(s): 2 >>>>>>>> NUMA node(s): 2 >>>>>>>> Model: 2.1 (pvr 003f 0201) >>>>>>>> Model name: POWER7 (architected), altivec supported >>>>>>>> L1d cache: 32K >>>>>>>> L1i cache: 32K >>>>>>>> NUMA node0 CPU(s): 0-7 >>>>>>>> NUMA node1 CPU(s): >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -Zhengyu >>>>>>> >>>>>> >>>> >>> From glaubitz at physik.fu-berlin.de Wed May 31 10:55:51 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Wed, 31 May 2017 12:55:51 +0200 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com> References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> <20170529081312.GA7132@physik.fu-berlin.de> <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com> Message-ID: <20170531105551.GC14877@physik.fu-berlin.de> On Mon, May 29, 2017 at 06:52:14PM +1000, David Holmes wrote: > >That's surprising because depending on where your office is, it may > >just be a matter of walking down an aisle and knocking on a colleague's > >door to get access to the necessary test setup given the fact that > >Oracle is officially shipping and supporting Linux for SPARC [1,2]. > > That doesn't necessarily translate into the OpenJDK or Oracle JDK supporting > the Linux-sparc platform. It was previously supported in JDK 7 and early 8 > IIRC but is no longer provided in 8u and is not a platform supported in 9. It should translate, however :-). If I were to buy an Oracle server, I would expect Oracle software to run on it. > >Well, I'm not so sure.
openjdk-9 did build in the past (b88), so I > >don't think the code is completely bit-rotten [3]. > > Sure it built in the past but b88 was some time ago and as I said once you > get past the hotspot build problem you may find other problems. Which can also be fixed by additional patches. I also don't see why this is an argument to fix existing issues. They live inside the linux_sparc subfolder and therefore don't affect the release architectures. So, merging the fixes won't hurt, but it makes the work for downstream Linux distributions easier. Thanks, Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From erik.osterlund at oracle.com Wed May 31 11:07:44 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 31 May 2017 13:07:44 +0200 Subject: RFR: 8161145: The min/max macros make hotspot tests fail to build with GCC 6 In-Reply-To: <1471353948.2985.22.camel@redhat.com> References: <20160711184513.GA1485@redhat.com> <20D15A8D-1C0A-4D44-9E83-A99B38622A6C@oracle.com> <1655965499.4239256.1468347672234.JavaMail.zimbra@redhat.com> <27891E85-5736-4D44-8D79-1C44B01499EE@oracle.com> <1461764727.4609223.1468437153322.JavaMail.zimbra@redhat.com> <1471353948.2985.22.camel@redhat.com> Message-ID: <592EA400.9010004@oracle.com> Hi, I am bringing back this blast from the past. I also need a solution for this for the GC interface that wishes to use . I propose to do what Kim suggested - to #define max max and #define min min. This way the innocent "max" identifiers that have nothing to do with any macros will build as they should, and any accidental consumer of a potential max() macro will find a compiler error as intended. Is anyone left unhappy with this solution? 
Thanks, /Erik On 2016-08-16 15:25, Severin Gehwolf wrote: > On Fri, 2016-07-15 at 14:35 -0400, Kim Barrett wrote: >>>>> On Jul 13, 2016, at 3:12 PM, Andrew Hughes wrote: >>> >>> >>> ----- Original Message ----- >>>>> On Jul 12, 2016, at 2:21 PM, Andrew Hughes wrote: >>>>> The workaround that currently works for me is: >>>>> >>>>> diff -r b515beb3b4ad src/share/vm/utilities/globalDefinitions.hpp >>>>>>>>> --- a/src/share/vm/utilities/globalDefinitions.hpp Thu Jul 07 18:40:53 2016 >>>>> +0100 >>>>>>>>> +++ b/src/share/vm/utilities/globalDefinitions.hpp Tue Jul 12 19:13:51 2016 >>>>> +0100 >>>>> @@ -1163,8 +1163,10 @@ >>>>> #undef min >>>>> #endif >>>>> >>>>> +#ifndef _GLIBCXX_STDLIB_H >>>>> #define max(a,b) Do_not_use_max_use_MAX2_instead >>>>> #define min(a,b) Do_not_use_min_use_MIN2_instead >>>>> +#endif >>>>> >>>>> // It is necessary to use templates here. Having normal overloaded >>>>> // functions does not work because it is necessary to provide both 32- >>>>> >>>>> >>>>> _GLIBCXX_STDLIB_H only exists in GCC 6. Earlier versions use stdlib.h from >>>>> the >>>>> C library. Thus this seems to provide the solution of only disabling >>>>> those macros only on GCC >= 6 where they conflict with the functions >>>>> max and min defined by this C++ stdlib.h wrapper (see stdlib.h changes >>>>> in [0]) >>>> Since when does define / declare min or max? To me, that seems >>>> like a bug in this wrapper. >>> It doesn't; it's defined in . >>> >>> The stdlib.h C++ wrapper is new to GCC 6, and thus so is the define, so we >>> can use it to just disable these macros on that version. >>> >>> It also wouldn't surprise me that the error is down to one of these new >>> wrapper headers bringing in other C++ headers, including limits. I >>> can't find the exact path for such an inclusion, but this issue >>> only shows up on GCC 6, where the wrapper headers are present. >>> Including on earlier versions just uses the C >>> header, not . >> That seems a likely explanation. 
>> >> If my conjecture about these being intended to poison the windefs.h >> definitions is correct, then we could just move these definitions to >> globalDefinitions_visCPP.hpp. But I don't know if anyone could answer >> that definitively. >> >> We try pretty hard to avoid platform-specific #ifdefs and the like in >> otherwise generic code. I think that's a good thing, hence my >> reluctance to conditionalize on _GLIBCXX_STDLIB_H. >> >> After some experiments, my preferred solution would be the blue paint >> approach I suggested as a possibility earlier, e.g. >> >> #define max max >> #define min min >> >> A later attempt to (re)define without #undef would make the program >> ill-formed, and all the compilers Oracle supports report such. >> >> In the absence of any attempt to redefine, the macros expand to >> similarly named identifiers, e.g. it's as if the macro definitions >> didn't exist, except for defined(max) &etc being true. > Hi! > > How can we move forward with this? Fedora 24 has GCC 6 as default and > anybody on that platform will need patches to get OpenJDK building. > > Thanks, > Severin From dalibor.topic at oracle.com Wed May 31 11:28:50 2017 From: dalibor.topic at oracle.com (dalibor topic) Date: Wed, 31 May 2017 13:28:50 +0200 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: <20170531105551.GC14877@physik.fu-berlin.de> References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> <20170529081312.GA7132@physik.fu-berlin.de> <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com> <20170531105551.GC14877@physik.fu-berlin.de> Message-ID: On 31.05.2017 12:55, John Paul Adrian Glaubitz wrote: > It should translate, however :-). If I were to buy an Oracle server, I > would expect Oracle software to run on it. A preliminary list of Oracle JDK 9 Supported Platforms can be found at http://jdk.java.net/9/supported . Last time I checked, Linux on SPARC was not on that list. Please keep in mind that this list is subject to change throughout the release cycle. 
> So, merging the fixes won't hurt, but it makes the work > for downstream Linux distributions easier. Well, dealing with changes to unsupported platforms still requires reviewing those changes, shepherding them to the bug & build systems, through toolchain changes, etc. So it's usually preferable if such activity is done by platform experts. As such, the best way to add a new platform is to work on it in its own Project, and, once it works well, to bring it over to the mainline. That externalizes a part of the cost of adding new platforms onto the shoulders of the developers interested in them, without affecting work on the mainline, until their work is actually ready to be integrated. Whether something is ready is a bar that's at least as high as 'it builds and passes the usual jtreg tests and the TCK for the latest Java SE platform release', rather than 'it builds', though. If a port never ends up being ready to be integrated, then no one else needs to care about first reviewing and later removing the changes to support that platform from mainline, for example. In short, if there is one or more Linux distributions interested in creating and maintaining a port of OpenJDK to sparc-linux, I'd suggest creating a dedicated porting Project, in the same manner how PowerPC64 or s390x were added to OpenJDK mainline. I understand that can be a bit confusing, because 'the code is already there' in this case. ;) cheers, dalibor topic -- Dalibor Topic | Principal Product Manager Phone: +494089091214 | Mobile: +491737185961 ORACLE Deutschland B.V. & Co. KG | Kühnehöfe 5 | 22761 Hamburg ORACLE Deutschland B.V. & Co. KG Hauptverwaltung: Riesstr. 25, D-80992 München Registergericht: Amtsgericht München, HRA 95603 Komplementärin: ORACLE Deutschland Verwaltung B.V. Hertogswetering 163/167, 3543 AS Utrecht, Niederlande Handelsregister der Handelskammer Midden-Niederlande, Nr.
30143697 Geschäftsführer: Alexander van der Ven, Jan Schultheiss, Val Maher Oracle is committed to developing practices and products that help protect the environment From glaubitz at physik.fu-berlin.de Wed May 31 11:36:16 2017 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Wed, 31 May 2017 13:36:16 +0200 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> <20170529081312.GA7132@physik.fu-berlin.de> <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com> <20170531105551.GC14877@physik.fu-berlin.de> Message-ID: <20170531113615.GD14877@physik.fu-berlin.de> On Wed, May 31, 2017 at 01:28:50PM +0200, dalibor topic wrote: > On 31.05.2017 12:55, John Paul Adrian Glaubitz wrote: > >It should translate, however :-). If I were to buy an Oracle server, I > >would expect Oracle software to run on it. > > A preliminary list of Oracle JDK 9 Supported Platforms can be found at > http://jdk.java.net/9/supported . Last time I checked, Linux on SPARC was > not on that list. > > Please keep in mind that this list is subject to change throughout the > release cycle. Yes, I'm aware of that list. > >So, merging the fixes won't hurt, but it makes the work > >for downstream Linux distributions easier. > > Well, dealing with changes to unsupported platforms still requires reviewing > those changes, shepherding them to the bug & build systems, through > toolchain changes, etc. So it's usually preferable if such activity is done by > platform experts. If the platform is unsupported, why do the changes need in-depth review? If it breaks and someone complains, you refer them to the list of supported platforms. > In short, if there is one or more Linux distributions interested in creating > and maintaining a port of OpenJDK to sparc-linux, I'd suggest creating a > dedicated porting Project, in the same manner how PowerPC64 or s390x were > added to OpenJDK mainline.
I understand that can be a bit confusing, because > 'the code is already there' in this case. ;) Indeed. The code is already there and I just want to help fix it. A process that normally works very easily in most upstream projects. I have sent in patches to many other projects fixing platform support for Linux/sparcv9, but ironically it's Oracle's own project where that isn't easily possible. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. `' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From sgehwolf at redhat.com Wed May 31 11:55:07 2017 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 31 May 2017 13:55:07 +0200 Subject: RFR: 8161145: The min/max macros make hotspot tests fail to build with GCC 6 In-Reply-To: <592EA400.9010004@oracle.com> References: <20160711184513.GA1485@redhat.com> <20D15A8D-1C0A-4D44-9E83-A99B38622A6C@oracle.com> <1655965499.4239256.1468347672234.JavaMail.zimbra@redhat.com> <27891E85-5736-4D44-8D79-1C44B01499EE@oracle.com> <1461764727.4609223.1468437153322.JavaMail.zimbra@redhat.com> <1471353948.2985.22.camel@redhat.com> <592EA400.9010004@oracle.com> Message-ID: <1496231707.3749.4.camel@redhat.com> Hi Erik, On Wed, 2017-05-31 at 13:07 +0200, Erik Österlund wrote: > Hi, > > I am bringing back this blast from the past. I also need a solution for > this for the GC interface that wishes to use . > I propose to do what Kim suggested - to #define max max and #define min > min. This way the innocent "max" identifiers that have nothing to do > with any macros will build as they should, and any accidental consumer > of a potential max() macro will find a compiler error as intended. Is > anyone left unhappy with this solution? I'm not. It works fine on my end. Looking forward to having this finally fixed upstream.
Thanks, Severin From dalibor.topic at oracle.com Wed May 31 12:09:26 2017 From: dalibor.topic at oracle.com (dalibor topic) Date: Wed, 31 May 2017 14:09:26 +0200 Subject: [PATCH]: linux-sparc build fixes In-Reply-To: <20170531113615.GD14877@physik.fu-berlin.de> References: <4121e72b-25cb-a18f-dace-a44528cb622d@oracle.com> <20170529081312.GA7132@physik.fu-berlin.de> <49c3a38a-779a-ea5d-6ef1-a4f298feb18c@oracle.com> <20170531105551.GC14877@physik.fu-berlin.de> <20170531113615.GD14877@physik.fu-berlin.de> Message-ID: <220039a2-738d-d45a-70fc-0d639f8e4b83@oracle.com> On 31.05.2017 13:36, John Paul Adrian Glaubitz wrote: > If the platform is unsupported, why do the changes need in-depth > review? We don't want to have code that hasn't been reviewed enter the mainline JDK. > Indeed. The code is already there and I just want to help fix it. A > process that normally works very easily in most upstream projects. I > have sent in patches to many other projects fixing platform support > for Linux/sparcv9, but ironically it's Oracle's own project where that > isn't easily possible. I think the issue where semantics clash is what constitutes fixing a port. From my perspective, that entails not just getting OpenJDK to build on a given platform, but also making it work well, and creating a sustainable effort around it to maintain it, so that it doesn't bit rot over time. The Linux SPARC port seems to have received occasional bursts of activity (mostly from Oracle, afaict) that take care of the first two items, which is fine and well. Unfortunately, no specific long term effort seems to have formed around the port to maintain it across the Linux distributions, and that's the kind of problem where one-off build fixes are just a small piece of the puzzle. Of course, forming a Project to maintain a port is not a panacea, either.
But it provides a structure within which developers interested in a port can choose to collaborate (or not, this is open source, after all, and eternal balkanization of effort is its cute curse ;). cheers, dalibor topic -- Dalibor Topic | Principal Product Manager Phone: +494089091214 | Mobile: +491737185961 ORACLE Deutschland B.V. & Co. KG | Kühnehöfe 5 | 22761 Hamburg ORACLE Deutschland B.V. & Co. KG Hauptverwaltung: Riesstr. 25, D-80992 München Registergericht: Amtsgericht München, HRA 95603 Komplementärin: ORACLE Deutschland Verwaltung B.V. Hertogswetering 163/167, 3543 AS Utrecht, Niederlande Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697 Geschäftsführer: Alexander van der Ven, Jan Schultheiss, Val Maher Oracle is committed to developing practices and products that help protect the environment From gromero at linux.vnet.ibm.com Wed May 31 12:47:07 2017 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 31 May 2017 09:47:07 -0300 Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813 In-Reply-To: <461d3048-88a2-c99d-818a-01de3813a29b@redhat.com> References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com> <5928C9AA.6030004@linux.vnet.ibm.com> <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com> <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com> <592CA97D.4000802@linux.vnet.ibm.com> <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com> <4dc1ac9e-f35f-209a-761f-96dc584f68a1@oracle.com> <461d3048-88a2-c99d-818a-01de3813a29b@redhat.com> Message-ID: <592EBB4B.1020909@linux.vnet.ibm.com> Hi Zhengyu, On 30-05-2017 21:37, Zhengyu Gu wrote: > Hi David, > > Thanks for the review. > > Gustavo, might I count you as a reviewer? Formally speaking (according to the community Bylaws) I'm not a reviewer, so I guess no. Kind regards, Gustavo > Thanks, > > -Zhengyu > > > > On 05/30/2017 05:30 PM, David Holmes wrote: >> Looks fine to me.
>> >> Thanks, >> David >> >> On 30/05/2017 9:59 PM, Zhengyu Gu wrote: >>> Hi David and Gustavo, >>> >>> Thanks for the review. >>> >>> Webrev is updated according to your comments: >>> >>> http://cr.openjdk.java.net/~zgu/8181055/webrev.02/ >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> >>> On 05/29/2017 07:06 PM, Gustavo Romero wrote: >>>> Hi David, >>>> >>>> On 29-05-2017 01:34, David Holmes wrote: >>>>> Hi Zhengyu, >>>>> >>>>> On 29/05/2017 12:08 PM, Zhengyu Gu wrote: >>>>>> Hi Gustavo, >>>>>> >>>>>> Thanks for the detail analysis and suggestion. I did not realize >>>>>> the difference between from bitmask and nodemask. >>>>>> >>>>>> As you suggested, numa_interleave_memory_v2 works under this >>>>>> configuration. >>>>>> >>>>>> Please updated Webrev: >>>>>> http://cr.openjdk.java.net/~zgu/8181055/webrev.01/ >>>>> >>>>> The addition of support for the "v2" API seems okay. Though I think >>>>> this comment needs some clarification for the existing code: >>>>> >>>>> 2837 // If we are running with libnuma version > 2, then we should >>>>> 2838 // be trying to use symbols with versions 1.1 >>>>> 2839 // If we are running with earlier version, which did not have >>>>> symbol versions, >>>>> 2840 // we should use the base version. >>>>> 2841 void* os::Linux::libnuma_dlsym(void* handle, const char *name) { >>>>> >>>>> given that we now explicitly load the v1.2 symbol if present. >>>>> >>>>> Gustavo: can you vouch for the suitability of using the v2 API in >>>>> all cases, if it exists? >>>> >>>> My understanding is that in the transition to API v2 only the usage of >>>> numa_node_to_cpus() by the JVM will have to be adapted in >>>> os::Linux::rebuild_cpu_to_node_map(). >>>> The remaining functions (excluding numa_interleave_memory() as >>>> Zhengyu already addressed it) >>>> preserve the same functionality and signatures [1]. >>>> >>>> Currently JVM NUMA API requires the following libnuma functions: >>>> >>>> 1. numa_node_to_cpus v1 != v2 (using v1, JVM has to adapt) >>>> 2. 
numa_max_node v1 == v2 (using v1, transition is >>>> straightforward) >>>> 3. numa_num_configured_nodes v2 (added by gromero: 8175813) >>>> 4. numa_available v1 == v2 (using v1, transition is >>>> straightforward) >>>> 5. numa_tonode_memory v1 == v2 (using v1, transition is >>>> straightforward) >>>> 6. numa_interleave_memory v1 != v2 (updated by zhengyu: >>>> 8181055. Default use of v2, fallback to v1) >>>> 7. numa_set_bind_policy v1 == v2 (using v1, transition is >>>> straightforward) >>>> 8. numa_bitmask_isbitset v2 (added by gromero: 8175813) >>>> 9. numa_distance v1 == v2 (added by gromero: 8175813. >>>> Using v1, transition is straightforward) >>>> >>>> v1 != v2: function signature in version 1 is different from version 2 >>>> v1 == v2: function signature in version 1 is equal to version 2 >>>> v2 : function is only present in API v2 >>>> >>>> Thus, to the best of my knowledge, except for case 1. (which JVM need >>>> to adapt to) >>>> all other cases are suitable to use v2 API and we could use a >>>> fallback mechanism as >>>> proposed by Zhengyu or update directly to API v2 (risky?), given that >>>> I can't see >>>> how v2 API would not be available on current (not-EOL) Linux distro >>>> releases. >>>> >>>> Regarding the comment, I agree, it needs an update since we are not >>>> tied anymore >>>> to version 1.1 (we are in effect already using v2 for some >>>> functions). We could >>>> delete the comment atop libnuma_dlsym() and add something like: >>>> >>>> "Handle request to load libnuma symbol version 1.1 (API v1). If it >>>> fails load symbol from base version instead." >>>> >>>> and to libnuma_v2_dlsym() add: >>>> >>>> "Handle request to load libnuma symbol version 1.2 (API v2) only. If >>>> it fails no symbol from any other version - even if present - is >>>> loaded." 
>>>> >>>> I've opened a bug to track the transitions to API v2 (I also >>>> discussed that with Volker): >>>> https://bugs.openjdk.java.net/browse/JDK-8181196 >>>> >>>> >>>> Regards, >>>> Gustavo >>>> >>>> [1] API v1 vs API v2: >>>> >>>> API v1 >>>> ====== >>>> >>>> int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen); >>>> int numa_max_node(void); >>>> - int numa_num_configured_nodes(void); >>>> int numa_available(void); >>>> void numa_tonode_memory(void *start, size_t size, int node); >>>> void numa_interleave_memory(void *start, size_t size, nodemask_t >>>> *nodemask); >>>> void numa_set_bind_policy(int strict); >>>> - int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >>>> int numa_distance(int node1, int node2); >>>> >>>> >>>> API v2 >>>> ====== >>>> >>>> int numa_node_to_cpus(int node, struct bitmask *mask); >>>> int numa_max_node(void); >>>> int numa_num_configured_nodes(void); >>>> int numa_available(void); >>>> void numa_tonode_memory(void *start, size_t size, int node); >>>> void numa_interleave_memory(void *start, size_t size, struct bitmask >>>> *nodemask); >>>> void numa_set_bind_policy(int strict) >>>> int numa_bitmask_isbitset(const struct bitmask *bmp, unsigned int n); >>>> int numa_distance(int node1, int node2); >>>> >>>> >>>>> I'm running this through JPRT now. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>>>> >>>>>> >>>>>> On 05/26/2017 08:34 PM, Gustavo Romero wrote: >>>>>>> Hi Zhengyu, >>>>>>> >>>>>>> Thanks a lot for taking care of this corner case on PPC64. >>>>>>> >>>>>>> On 26-05-2017 10:41, Zhengyu Gu wrote: >>>>>>>> This is a quick way to kill the symptom (or low risk?). I am not >>>>>>>> sure if disabling NUMA is a better solution for this >>>>>>>> circumstance? does 1 NUMA node = UMA? >>>>>>> >>>>>>> On PPC64, 1 (configured) NUMA does not necessarily imply UMA. 
In >>>>>>> the POWER7 >>>>>>> machine you found the corner case (I copy below the data you >>>>>>> provided in the >>>>>>> JBS - thanks for the additional information): >>>>>>> >>>>>>> $ numactl -H >>>>>>> available: 2 nodes (0-1) >>>>>>> node 0 cpus: 0 1 2 3 4 5 6 7 >>>>>>> node 0 size: 0 MB >>>>>>> node 0 free: 0 MB >>>>>>> node 1 cpus: >>>>>>> node 1 size: 7680 MB >>>>>>> node 1 free: 1896 MB >>>>>>> node distances: >>>>>>> node 0 1 >>>>>>> 0: 10 40 >>>>>>> 1: 40 10 >>>>>>> >>>>>>> CPUs in node0 have no other alternative besides allocating memory >>>>>>> from node1. In >>>>>>> that case CPUs in node0 are always accessing remote memory from >>>>>>> node1 in a constant >>>>>>> distance (40), so in that case we could say that 1 NUMA >>>>>>> (configured) node == UMA. >>>>>>> Nonetheless, if you add CPUs in node1 (by filling up the other >>>>>>> socket present in >>>>>>> the board) you will end up with CPUs with different distances from >>>>>>> the node that >>>>>>> has configured memory (in that case, node1), so it yields a >>>>>>> configuration where >>>>>>> 1 NUMA (configured) != UMA (i.e. distances are not always equal to >>>>>>> a single >>>>>>> value). >>>>>>> >>>>>>> On the other hand, the POWER7 machine configuration in question is >>>>>>> bad (and >>>>>>> rare). It's indeed impacting the whole system performance and it >>>>>>> would be >>>>>>> reasonable to open the machine and move the memory module from >>>>>>> bank related to >>>>>>> node1 to bank related to node0, because all CPUs are accessing >>>>>>> remote memory >>>>>>> without any apparent necessity. Once you change it all CPUs will >>>>>>> have local >>>>>>> memory (distance = 10). >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -Zhengyu >>>>>>>> >>>>>>>> On 05/26/2017 09:14 AM, Zhengyu Gu wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> There is a corner case that still failed after JDK-8175813. >>>>>>>>> >>>>>>>>> The system shows that it has multiple NUMA nodes, but only one is >>>>>>>>> configured. 
>>>>>>>>> Under this scenario, a numa_interleave_memory() call will
>>>>>>>>> result in an "mbind: Invalid argument" message.
>>>>>>>>>
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8181055
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8181055/webrev.00/
>>>>>>>
>>>>>>> It looks like even for that rare POWER7 numa topology
>>>>>>> numa_interleave_memory() should succeed without "mbind: Invalid
>>>>>>> argument", since the 'mask' argument should already be a mask with
>>>>>>> only nodes from which memory can be allocated, i.e. only a mask of
>>>>>>> configured nodes (even if the mask contains only one configured node,
>>>>>>> as in http://cr.openjdk.java.net/~gromero/logs/numa_only_one_node.txt).
>>>>>>>
>>>>>>> Inspecting a little bit more, it looks like the problem boils down to
>>>>>>> the fact that the JVM is passing 'numa_all_nodes' [1] to
>>>>>>> numa_interleave_memory() in Linux::numa_interleave_memory().
>>>>>>>
>>>>>>> One would expect that 'numa_all_nodes' (which is API v1) would track
>>>>>>> the same information as 'numa_all_nodes_ptr' (API v2) [2], however
>>>>>>> there is a subtle but important difference:
>>>>>>>
>>>>>>> 'numa_all_nodes' is constructed assuming a consecutive node
>>>>>>> distribution [3]:
>>>>>>>
>>>>>>> 100         max = numa_num_configured_nodes();
>>>>>>> 101         for (i = 0; i < max; i++)
>>>>>>> 102                 nodemask_set_compat((nodemask_t *)&numa_all_nodes, i);
>>>>>>>
>>>>>>> whilst 'numa_all_nodes_ptr' is constructed by parsing
>>>>>>> /proc/self/status [4]:
>>>>>>>
>>>>>>> 499         if (strncmp(buffer,"Mems_allowed:",13) == 0) {
>>>>>>> 500                 numprocnode = read_mask(mask, numa_all_nodes_ptr);
>>>>>>>
>>>>>>> Thus for a topology like:
>>>>>>>
>>>>>>> available: 4 nodes (0-1,16-17)
>>>>>>> node 0 cpus: 0 8 16 24 32
>>>>>>> node 0 size: 130706 MB
>>>>>>> node 0 free: 145 MB
>>>>>>> node 1 cpus: 40 48 56 64 72
>>>>>>> node 1 size: 0 MB
>>>>>>> node 1 free: 0 MB
>>>>>>> node 16 cpus: 80 88 96 104 112
>>>>>>> node 16 size: 130630 MB
>>>>>>> node 16 free: 529 MB
>>>>>>> node 17 cpus: 120 128 136 144 152
>>>>>>> node 17 size: 0 MB
>>>>>>> node 17 free: 0 MB
>>>>>>> node distances:
>>>>>>> node   0   1  16  17
>>>>>>>   0:  10  20  40  40
>>>>>>>   1:  20  10  40  40
>>>>>>>  16:  40  40  10  20
>>>>>>>  17:  40  40  20  10
>>>>>>>
>>>>>>> numa_all_nodes=0x3         => 0b11 (node0 and node1)
>>>>>>> numa_all_nodes_ptr=0x10001 => 0b10000000000000001 (node0 and node16)
>>>>>>>
>>>>>>> (Please, see details in the following gdb log:
>>>>>>> http://cr.openjdk.java.net/~gromero/logs/numa_api_v1_vs_api_v2.txt)
>>>>>>>
>>>>>>> In that case passing node0 and node1, although suboptimal, does not
>>>>>>> bother mbind() since the following is satisfied:
>>>>>>>
>>>>>>> "[nodemask] must contain at least one node that is on-line, allowed
>>>>>>> by the process's current cpuset context, and contains memory."
>>>>>>>
>>>>>>> So back to the POWER7 case, I suppose that for:
>>>>>>>
>>>>>>> available: 2 nodes (0-1)
>>>>>>> node 0 cpus: 0 1 2 3 4 5 6 7
>>>>>>> node 0 size: 0 MB
>>>>>>> node 0 free: 0 MB
>>>>>>> node 1 cpus:
>>>>>>> node 1 size: 7680 MB
>>>>>>> node 1 free: 1896 MB
>>>>>>> node distances:
>>>>>>> node   0   1
>>>>>>>   0:  10  40
>>>>>>>   1:  40  10
>>>>>>>
>>>>>>> numa_all_nodes=0x1     => 0b01 (node0)
>>>>>>> numa_all_nodes_ptr=0x2 => 0b10 (node1)
>>>>>>>
>>>>>>> and hence numa_interleave_memory() gets nodemask = 0x1 (node0), which
>>>>>>> indeed contains no memory. That said, I don't know for sure whether
>>>>>>> passing just node1 in the 'nodemask' will satisfy mbind(), as in that
>>>>>>> case there are no cpus available in node1.
>>>>>>>
>>>>>>> Summing up, it looks like the root cause is not that
>>>>>>> numa_interleave_memory() does not accept only one configured node,
>>>>>>> but that the configured node being passed is wrong.
>>>>>>> I could not find a similar numa topology in my pool to test more,
>>>>>>> but it might be worth trying to write a small test using API v2 and
>>>>>>> 'numa_all_nodes_ptr' instead of 'numa_all_nodes' to see how
>>>>>>> numa_interleave_memory() goes on that machine :) If it behaves well,
>>>>>>> updating to API v2 would be a solution.
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>> Regards,
>>>>>>> Gustavo
>>>>>>>
>>>>>>>
>>>>>>> [1] http://hg.openjdk.java.net/jdk10/hs/hotspot/file/4b93e1b1d5b7/src/os/linux/vm/os_linux.hpp#l274
>>>>>>> [2] from libnuma.c:608, numa_all_nodes_ptr: "it only tracks nodes
>>>>>>>     with memory from which the calling process can allocate."
>>>>>>> [3] https://github.com/numactl/numactl/blob/master/libnuma.c#L100-L102
>>>>>>> [4] https://github.com/numactl/numactl/blob/master/libnuma.c#L499-L500
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> The system NUMA configuration:
>>>>>>>>>
>>>>>>>>> Architecture:          ppc64
>>>>>>>>> CPU op-mode(s):        32-bit, 64-bit
>>>>>>>>> Byte Order:            Big Endian
>>>>>>>>> CPU(s):                8
>>>>>>>>> On-line CPU(s) list:   0-7
>>>>>>>>> Thread(s) per core:    4
>>>>>>>>> Core(s) per socket:    1
>>>>>>>>> Socket(s):             2
>>>>>>>>> NUMA node(s):          2
>>>>>>>>> Model:                 2.1 (pvr 003f 0201)
>>>>>>>>> Model name:            POWER7 (architected), altivec supported
>>>>>>>>> L1d cache:             32K
>>>>>>>>> L1i cache:             32K
>>>>>>>>> NUMA node0 CPU(s):     0-7
>>>>>>>>> NUMA node1 CPU(s):
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Zhengyu
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>

From shade at redhat.com  Wed May 31 13:04:20 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 31 May 2017 15:04:20 +0200
Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813
In-Reply-To: <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com>
References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com>
 <5928C9AA.6030004@linux.vnet.ibm.com>
 <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com>
 <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com>
 <592CA97D.4000802@linux.vnet.ibm.com>
 <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com>
Message-ID: <97b44e65-efea-2118-6740-2e197cd72d6b@redhat.com>

On 05/30/2017 01:59 PM, Zhengyu Gu wrote:
> http://cr.openjdk.java.net/~zgu/8181055/webrev.02/

Looks fine to me too, given Gustavo's comments.

-Aleksey

From stuart.monteith at linaro.org  Wed May 31 13:19:41 2017
From: stuart.monteith at linaro.org (Stuart Monteith)
Date: Wed, 31 May 2017 14:19:41 +0100
Subject: RFR 8u backport: 8077608: [TESTBUG] Enable Hotspot jtreg tests to
 run in agentvm mode
Message-ID:

Hello,
    Currently the jdk8u codebase fails some JTreg Hotspot tests when
running in the -agentvm mode. This is because the ProcessTools class is
not passing the classpath. There are substantial time savings to be
gained using -agentvm over -othervm.

Fortunately, there was a fix for jdk9 (8077608) that has not been
backported to jdk8u. The details are as follows:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-April/017937.html
https://bugs.openjdk.java.net/browse/JDK-8077608
http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/af2a1e9f08f3

The patch just needed a slight change, to remove the change to the file
"test/compiler/uncommontrap/TestUnstableIfTrap.java", as that test
doesn't exist on jdk8u.
My colleague Ningsheng has kindly hosted the change here:
http://cr.openjdk.java.net/~njian/8077608/webrev.00

BR,
   Stuart

From zgu at redhat.com  Wed May 31 13:23:57 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 31 May 2017 09:23:57 -0400
Subject: RFR(XS) 8181055: "mbind: Invalid argument" still seen after 8175813
In-Reply-To: <97b44e65-efea-2118-6740-2e197cd72d6b@redhat.com>
References: <3a2a0ef7-5eac-b72c-5dc6-b7594dc70c07@redhat.com>
 <5928C9AA.6030004@linux.vnet.ibm.com>
 <38f323bc-7416-5c3d-c534-f5f17be4c7c6@redhat.com>
 <95147596-caf9-4e49-f954-29fa13df3a56@oracle.com>
 <592CA97D.4000802@linux.vnet.ibm.com>
 <937b01c9-5569-ce73-e7a3-ad38aed82ab3@redhat.com>
 <97b44e65-efea-2118-6740-2e197cd72d6b@redhat.com>
Message-ID: <13a703cc-9a82-0420-40c9-5c31290c78c4@redhat.com>

Hi David,

It has two reviewers now. Would you mind sponsoring this change?

I prepared the final patch:
http://cr.openjdk.java.net/~zgu/8181055/webrev.03/

Thanks,

-Zhengyu

On 05/31/2017 09:04 AM, Aleksey Shipilev wrote:
> On 05/30/2017 01:59 PM, Zhengyu Gu wrote:
>> http://cr.openjdk.java.net/~zgu/8181055/webrev.02/
>
> Looks fine to me too, given Gustavo's comments.
>
> -Aleksey
>
>

From erik.joelsson at oracle.com  Wed May 31 13:57:14 2017
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Wed, 31 May 2017 15:57:14 +0200
Subject: RFR (2xS): 8181318: Allow C++ library headers on Solaris Studio
In-Reply-To: <592EC8F6.5080605@oracle.com>
References: <592EC8F6.5080605@oracle.com>
Message-ID: <5edf009a-9055-a2a8-b546-d73094e05360@oracle.com>

(adding hotspot-dev)

Looks good from a build perspective.

/Erik

On 2017-05-31 15:45, Erik Österlund wrote:
> Hi,
>
> It would be desirable to be able to use harmless C++ standard library
> headers like in the code as long as it does not add any
> link-time dependencies to the standard library.
> This is possible on all supported platforms except the ones using the
> Solaris Studio compiler, where we enforce -library=%none in both CFLAGS
> and LDFLAGS.
> I propose to remove the restriction from CFLAGS but keep it on LDFLAGS.
>
> I have consulted with the Studio folks, and they think this is
> absolutely fine and thought that the choice of -library=stlport4
> should be fine for our CFLAGS and is indeed what is already used in
> the gtest launcher.
>
> Webrev for jdk10-hs top level repository:
> http://cr.openjdk.java.net/~eosterlund/8181318/webrev.00/
>
> Webrev for jdk10-hs hotspot repository:
> http://cr.openjdk.java.net/~eosterlund/8181318/webrev.01/
>
> Testing: JPRT.
>
> Will need a sponsor.
>
> Thanks,
> /Erik

From erik.osterlund at oracle.com  Wed May 31 15:01:48 2017
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Wed, 31 May 2017 17:01:48 +0200
Subject: RFR: 8161145: The min/max macros make hotspot tests fail to build
 with GCC 6
In-Reply-To: <1496231707.3749.4.camel@redhat.com>
References: <20160711184513.GA1485@redhat.com>
 <20D15A8D-1C0A-4D44-9E83-A99B38622A6C@oracle.com>
 <1655965499.4239256.1468347672234.JavaMail.zimbra@redhat.com>
 <27891E85-5736-4D44-8D79-1C44B01499EE@oracle.com>
 <1461764727.4609223.1468437153322.JavaMail.zimbra@redhat.com>
 <1471353948.2985.22.camel@redhat.com>
 <592EA400.9010004@oracle.com>
 <1496231707.3749.4.camel@redhat.com>
Message-ID: <592EDADC.8040709@oracle.com>

Hi,

Excellent. In that case I would like reviews on this patch that does
exactly that:

http://cr.openjdk.java.net/~eosterlund/8161145/webrev.00/

Testing: JPRT

Need a sponsor.

Thanks,
/Erik

On 2017-05-31 13:55, Severin Gehwolf wrote:
> Hi Erik,
>
> On Wed, 2017-05-31 at 13:07 +0200, Erik Österlund wrote:
>> Hi,
>>
>> I am bringing back this blast from the past. I also need a solution
>> for this for the GC interface that wishes to use .
>> I propose to do what Kim suggested - to #define max max and
>> #define min min.
>> This way the innocent "max" identifiers that have nothing to do
>> with any macros will build as they should, and any accidental consumer
>> of a potential max() macro will find a compiler error as intended. Is
>> anyone left unhappy with this solution?
>
> I'm not. It works fine on my end. Looking forward to having this
> finally fixed upstream.
>
> Thanks,
> Severin

From gromero at linux.vnet.ibm.com  Wed May 31 15:56:13 2017
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Wed, 31 May 2017 12:56:13 -0300
Subject: [8u] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when
 -XX:+UseNUMA is used
In-Reply-To:
References: <59258B49.9080602@linux.vnet.ibm.com>
Message-ID: <592EE79D.1020104@linux.vnet.ibm.com>

Hi David,

On 29-05-2017 02:31, David Holmes wrote:
> Hi Gustavo,
>
> This looks like an accurate backport.

Thanks for reviewing the change.

Does it need a second reviewer or should I proceed to request the
approval?

Regards,
Gustavo

> Thanks,
> David
> -----
>
> On 24/05/2017 11:31 PM, Gustavo Romero wrote:
>> Hi,
>>
>> Could this backport of 8175813 for jdk8u be reviewed, please?
>>
>> It applies cleanly to jdk8u except for a chunk in
>> os::Linux::libnuma_init(), but it's just due to an indentation change
>> introduced with cleanup [1].
>>
>> It improves JVM NUMA node detection on PPC64.
>>
>> Currently there are no Linux distros that package only libnuma v1, so
>> the libnuma API v2 used in that change is always available.
>>
>> webrev : http://cr.openjdk.java.net/~gromero/8175813/backport/
>> bug    : https://bugs.openjdk.java.net/browse/JDK-8175813
>> review thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-May/026788.html
>>
>> Thank you.
>>
>> Regards,
>> Gustavo
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8057107
>>
>

From zgu at redhat.com  Wed May 31 17:15:15 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 31 May 2017 13:15:15 -0400
Subject: [8u] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when
 -XX:+UseNUMA is used
In-Reply-To: <592EE79D.1020104@linux.vnet.ibm.com>
References: <59258B49.9080602@linux.vnet.ibm.com>
 <592EE79D.1020104@linux.vnet.ibm.com>
Message-ID: <1ef6fe36-5582-a041-2fde-19b2bf6c9c4f@redhat.com>

Hi Gustavo,

On 05/31/2017 11:56 AM, Gustavo Romero wrote:
> Hi David,
>
> On 29-05-2017 02:31, David Holmes wrote:
>> Hi Gustavo,
>>
>> This looks like an accurate backport.
>
> Thanks for reviewing the change.
>
> Does it need a second reviewer or should I proceed to request the
> approval?

You can add me as a reviewer, if needed.

Thanks for doing this backport.

-Zhengyu

> Regards,
> Gustavo
>
>> Thanks,
>> David
>> -----
>>
>> On 24/05/2017 11:31 PM, Gustavo Romero wrote:
>>> Hi,
>>>
>>> Could this backport of 8175813 for jdk8u be reviewed, please?
>>>
>>> It applies cleanly to jdk8u except for a chunk in
>>> os::Linux::libnuma_init(), but it's just due to an indentation change
>>> introduced with cleanup [1].
>>>
>>> It improves JVM NUMA node detection on PPC64.
>>>
>>> Currently there are no Linux distros that package only libnuma v1, so
>>> the libnuma API v2 used in that change is always available.
>>>
>>> webrev : http://cr.openjdk.java.net/~gromero/8175813/backport/
>>> bug    : https://bugs.openjdk.java.net/browse/JDK-8175813
>>> review thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-May/026788.html
>>>
>>> Thank you.
>>>
>>> Regards,
>>> Gustavo
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8057107
>>>
>>
>