From matthias.baesken at sap.com Mon Jul 1 06:52:18 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 1 Jul 2019 06:52:18 +0000 Subject: RFR [XS] : 8226943: compile error in libfollowref003.cpp with XCode 10.2 on macosx In-Reply-To: References: <770df6b7-7588-1030-f13e-e1500a63231e@oracle.com> Message-ID: Hello, thanks for the review ! > I'd suggest to fix it in 13 as it is the test fix. I'll push it then to 13 , fine with me ! Best regards, Matthias > > Hi Matthias, > > The fix is good. > It worked before because both JVMTI_REFERENCE_ARRAY_ELEMENT > and JVMTI_HEAP_REFERENCE_ARRAY_ELEMENT have the same value 3 > as Gary noticed. > > I'd suggest to fix it in 13 as it is the test fix. > I've added labels 'testbug' and 'noreg-self'. > > Thanks, > Serguei > > On 6/28/19 12:04 PM, David Holmes wrote: > > Hi Matthias, > > > > Dropped build-dev and added serviceability-dev as this is a > > serviceability test. > > > > On 28/06/2019 7:43 am, Baesken, Matthias wrote: > >> Hello, please review this small fix for a compile issue on OSX. > >> Today I compiled jdk/jdk on a machine with XCode 10.2. It > >> worked pretty well. > >> However, this small issue showed up. > >> > >> > >> In file included from > >> /open_jdk/jdk_just_clone/jdk/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref003/libfollowref003.cpp:33: > >> /open_jdk/jdk_just_clone/jdk/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref003/followref003.cpp:813:14: > >> error: > >> comparison of two values with different enumeration types in switch > >> statement ('jvmtiHeapReferenceKind' and 'jvmtiObjectReferenceKind') > >> [-Werror,-Wenum-compare-switch] > >> > >> > >> And here XCode 10 is correct: JVMTI_REFERENCE_ARRAY_ELEMENT is > >> from a different enumeration type and should be replaced with the > >> value from the correct enumeration type.
> >> > >> Bug / webrev : > >> > >> https://bugs.openjdk.java.net/browse/JDK-8226943 > >> > >> http://cr.openjdk.java.net/~mbaesken/webrevs/8226943.0/ > > > > The fix seems reasonable but the issue indicates a further problem > > with the test. If it expected JVMTI_HEAP_REFERENCE_ARRAY_ELEMENT but > > was checking for JVMTI_REFERENCE_ARRAY_ELEMENT then we should have hit > > the default clause and failed the test. That suggests the test doesn't > > actually expect JVMTI_HEAP_REFERENCE_ARRAY_ELEMENT in the first place. > > > > Cheers, > > David > > > >> > >> Thanks, Matthias > >> From martin.doerr at sap.com Mon Jul 1 10:06:05 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 1 Jul 2019 10:06:05 +0000 Subject: RFR(m): 8220351: Cross-modifying code In-Reply-To: <8736k5339e.fsf@oldenburg2.str.redhat.com> References: <4021e6fe-a2e7-7e66-bd54-1ea9d80863ae@oracle.com> <87fto536lo.fsf@oldenburg2.str.redhat.com> <8736k5339e.fsf@oldenburg2.str.redhat.com> Message-ID: Hi Florian, sorry for breaking 32-bit Linux. We don't build on this platform so we didn't notice. I believe "Compiler version last used for testing: gcc 4.8.2" is still correct for 64-bit Linux. Andrew's proposal looks reasonable. Best regards, Martin > -----Original Message----- > From: Florian Weimer > Sent: Wednesday, 19 June 2019 19:16 > To: Doerr, Martin > Cc: hotspot-dev at openjdk.java.net; aph at redhat.com > Subject: Re: RFR(m): 8220351: Cross-modifying code > > * Florian Weimer: > > > * Martin Doerr: > > > >> Not sure if the inline assembler code on x86 necessarily needs a "clobber memory" effect. > >> I don't know what a C++ compiler is allowed to do if it doesn't know that the code has some kind of memory effect. > >> > >> For ebx...edx, you could also use clobber if you want to make it shorter. > >> E.g.
with "+a" to use eax as input and output: > >> int idx = 0; > >> __asm__ volatile ("cpuid " : "+a" (idx) : : "ebx", "ecx", "edx", "memory"); > > > > ebx clobbers are not supported on older GCC versions. > > src/hotspot/os_cpu/linux_x86/orderAccess_linux_x86.hpp currently says > > this: > > > > // Compiler version last used for testing: gcc 4.8.2 > > > > But this is blatantly not true because GCC 4.8 cannot spill ebx in PIC > > mode. > > I got this patch from Andrew Haley, and the build works again with GCC > 4.8.5 (the system compiler on Red Hat Enterprise Linux 7): > > diff -r d7da94e6c169 > src/hotspot/os_cpu/linux_x86/orderAccess_linux_x86.hpp > --- a/src/hotspot/os_cpu/linux_x86/orderAccess_linux_x86.hpp Tue Jun 18 16:15:15 2019 +0100 > +++ b/src/hotspot/os_cpu/linux_x86/orderAccess_linux_x86.hpp Wed Jun 19 17:52:26 2019 +0100 > @@ -57,7 +57,11 @@ > > inline void OrderAccess::cross_modify_fence() { > int idx = 0; > +#ifdef AMD64 > __asm__ volatile ("cpuid " : "+a" (idx) : : "ebx", "ecx", "edx", "memory"); > +#else > + __asm__ volatile ("xchg %%esi, %%ebx; cpuid; xchg %%esi, %%ebx " : "+a" (idx) : : "esi", "ecx", "edx", "memory"); > +#endif > } > > template<> > > GCC can spill %esi without problems since forever, so this should work > everywhere. > > Thanks, > Florian From adam.farley at uk.ibm.com Mon Jul 1 12:27:11 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Mon, 1 Jul 2019 13:27:11 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN Message-ID: Hi All, The title says it all. If you pass in a value for sun.boot.library.path consisting of one or more paths that are too long, then the VM will fail to start because it can't load one of the libraries it needs (the zip library), despite the fact that the VM automatically prepends the default library path to the sun.boot.library.path property, using the correct separator to divide it from the user-specified path.
So we've got the right path, in the right place, at the right time, we just can't *use* it. I've fixed this by changing the relevant os.cpp code to ignore paths that are too long, and to attempt to locate the needed library on the other paths (if any are valid). I've also added functionality to handle the edge case of paths that are neeeeeeearly too long, only for a sub-path (or file name) to push us over the limit *after* the split_path function is done assessing the path length. I've also changed the code we're overriding, on the assumption that someone's still using it somewhere. Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ Thoughts and impressions welcome. Best Regards Adam Farley IBM Runtimes Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From aph at redhat.com Mon Jul 1 13:09:32 2019 From: aph at redhat.com (Andrew Haley) Date: Mon, 1 Jul 2019 14:09:32 +0100 Subject: RFR(m): 8220351: Cross-modifying code In-Reply-To: References: <4021e6fe-a2e7-7e66-bd54-1ea9d80863ae@oracle.com> <87fto536lo.fsf@oldenburg2.str.redhat.com> <8736k5339e.fsf@oldenburg2.str.redhat.com> Message-ID: <50cf980f-6ca9-4357-4ca5-5c6f5955162a@redhat.com> On 7/1/19 11:06 AM, Doerr, Martin wrote: > sorry for breaking 32 bit linux. We don't build on this platform so we didn't notice. > I believe "Compiler version last used for testing: gcc 4.8.2" is still correct for 64 bit linux. > > Andrew's proposal looks reasonable. https://bugs.openjdk.java.net/browse/JDK-8226525 I'm on it. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Mon Jul 1 13:12:23 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 1 Jul 2019 15:12:23 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic Message-ID: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Hi, Today it is up to callers of methods changing state on nmethods like make_not_entrant(), to know all other possible concurrent attempts to transition the nmethod, and know that there are no such attempts trying to make the nmethod more dead. There have been multiple occurrences of issues where the caller got it wrong due to the fragile nature of this code. This specific CR deals with a bug where an OSR nmethod was made not entrant (deopt) and made unloaded concurrently. The result of such a race can be that it is first made unloaded and then made not entrant, making the nmethod go backwards in its state machine, effectively resurrecting dead nmethods, causing a subsequent GC to feel awkward (crash). But I have seen other similar incidents with deopt racing with the sweeper. These non-monotonicity problems are unnecessary to have. So I intend to fix the bug by enforcing monotonicity of the nmethod state machine explicitly, instead of trying to reason about all callers of these make_* functions. I swapped the order of unloaded and zombie in the enum as zombies are strictly more dead than unloaded nmethods. All transitions change in the direction of increasing deadness and fail if the transition is not monotonically increasing. For ZGC I moved OSR nmethod unlinking to before the unlinking (where unlinking code belongs), instead of after the handshake (intended for deleting things safely unlinked). Strictly speaking, moving the OSR nmethod unlinking removes the racing between make_not_entrant and make_unloaded, but I still want the monotonicity guards to make this code more robust. 
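[Editor's note] The monotonic state machine described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual webrev code: the enum values and the try_transition() helper are invented for the example, and std::atomic stands in for HotSpot's Atomic class.

```cpp
#include <atomic>
#include <cassert>

// States ordered by increasing "deadness"; with zombie placed after
// unloaded, a zombie is strictly "more dead" than an unloaded nmethod.
enum NMethodState { in_use = 0, not_entrant = 1, unloaded = 2, zombie = 3 };

// Attempt a transition; refuse anything that does not increase deadness,
// so a racing caller can never move the state machine backwards and
// "resurrect" a dead nmethod.
bool try_transition(std::atomic<int>& state, NMethodState new_state) {
  int old_state = state.load();
  while (true) {
    if (old_state >= new_state) {
      return false;  // backwards (or no-op) transition: rejected
    }
    // On failure, compare_exchange_weak reloads old_state; loop re-checks
    // monotonicity against the freshly observed value.
    if (state.compare_exchange_weak(old_state, new_state)) {
      return true;
    }
  }
}
```

With a guard of this shape, a make_not_entrant() that races with make_unloaded() simply fails instead of silently undoing the more-dead state.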
I left AOT methods alone. Since they don't die, they don't have resurrection problems, and hence do not benefit from these guards in the same way. Bug: https://bugs.openjdk.java.net/browse/JDK-8224674 Webrev: http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/ Thanks, /Erik From aph at redhat.com Mon Jul 1 14:22:09 2019 From: aph at redhat.com (Andrew Haley) Date: Mon, 1 Jul 2019 15:22:09 +0100 Subject: RFR: 8226525: HotSpot compile-time error for x86-32 Message-ID: <469d68e3-e153-504a-6412-4bb4cc58dcaa@redhat.com> This asm statement: __asm__ volatile ("cpuid " : "+a" (idx) : : "ebx", "ecx", "edx", "memory") ... breaks on 32-bit systems because the GCC we use doesn't allow EBX to be clobbered. Fixed thusly: http://cr.openjdk.java.net/~aph/8226525/ There is some small overhead, but given that we're trashing the pipeline anyway the overhead is insignificant. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Mon Jul 1 14:46:32 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 1 Jul 2019 16:46:32 +0200 Subject: RFR: 8226525: HotSpot compile-time error for x86-32 In-Reply-To: <469d68e3-e153-504a-6412-4bb4cc58dcaa@redhat.com> References: <469d68e3-e153-504a-6412-4bb4cc58dcaa@redhat.com> Message-ID: <4988619d-5112-4000-3a9e-eb5554f99e03@redhat.com> On 7/1/19 4:22 PM, Andrew Haley wrote: > This asm statement: > > __asm__ volatile ("cpuid " : "+a" (idx) : : "ebx", "ecx", "edx", "memory") > > ... breaks on 32-bit systems because the GCC we use doesn't allow EBX to be > clobbered. Fixed thusly: > > http://cr.openjdk.java.net/~aph/8226525/ Looks okay to me. Put the comment, e.g.: // EBX is a reserved register on 32-bit Linux systems, cannot clobber it.
-Aleksey From kim.barrett at oracle.com Mon Jul 1 17:49:07 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 1 Jul 2019 13:49:07 -0400 Subject: RFR: 8226525: HotSpot compile-time error for x86-32 In-Reply-To: <4988619d-5112-4000-3a9e-eb5554f99e03@redhat.com> References: <469d68e3-e153-504a-6412-4bb4cc58dcaa@redhat.com> <4988619d-5112-4000-3a9e-eb5554f99e03@redhat.com> Message-ID: > On Jul 1, 2019, at 10:46 AM, Aleksey Shipilev wrote: > > On 7/1/19 4:22 PM, Andrew Haley wrote: >> This asm statement: >> >> __asm__ volatile ("cpuid " : "+a" (idx) : : "ebx", "ecx", "edx", "memory") >> >> ... breaks on 32-bit systems because the GCC we use doesn't allow EBX to be >> clobbered. Fixed thusly: >> >> http://cr.openjdk.java.net/~aph/8226525/ > > Looks okay to me. Put the comment, e.g.: > // EBX is a reserved register on 32-bit Linux systems, cannot clobber it. > > -Aleksey Looks good. +1 on the additional comment. From thomas.stuefe at gmail.com Mon Jul 1 18:56:46 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 1 Jul 2019 20:56:46 +0200 Subject: RFR(xs): 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak Message-ID: Hi all, may I please have reviews and opinions about the following patch: Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 cr: http://cr.openjdk.java.net/~stuefe/webrevs/8227041-rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html There is a memory leak in test_virtual_space_list_large_chunk(), called as part of the whitebox tests WB_RunMemoryUnitTests(). In this test metaspace allocation is tested by rapidly allocating and subsequently leaking a metachunk of ~512K. This is done by a number of threads in a tight loop for 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM killed. This test seems to be often excluded, which makes sense, since this leak makes its memory usage difficult to predict. It is also earmarked by Oracle for gtest-ification, see 8213269.
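[Editor's note] For a rough sense of scale, a back-of-envelope sketch using only the numbers quoted above (~512 KiB per leaked chunk, 15 seconds, 10-20 GB of rss); the helper name and everything else are invented for the illustration:

```cpp
#include <cassert>

// How many ~512 KiB metachunks must leak per second, summed over all
// threads, to produce a given rss growth within the 15-second run?
constexpr double kChunkBytes = 512.0 * 1024.0;
constexpr double kSeconds = 15.0;

constexpr double chunks_per_second(double total_gib) {
  return total_gib * 1024.0 * 1024.0 * 1024.0 / kChunkBytes / kSeconds;
}

// 10 GiB in 15 s works out to ~1365 chunks/s; 20 GiB to ~2731 chunks/s,
// i.e. the tight loops leak on the order of a thousand chunks per second
// combined -- which is why the rss is so hard to predict.
```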
This leak is not easy to fix, among other things because it is not clear what it is it wants to test. Meanwhile, time moved on and we have quite nice gtests to test metaspace allocation (see e.g. test_metaspace_allocation.cpp) and I rather would run those gtests concurrently. Which could be a future RFE. So I just removed this metaspace related test from WB_RunMemoryUnitTests() altogether, since to me it does nothing useful. Once you remove the leaking allocation, not much is left. Without this part RunUnitTestsConcurrently test runs smoothly through its other parts, and in that form it is still useful. What do you think? Cheers, Thomas From stefan.karlsson at oracle.com Mon Jul 1 19:06:46 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 1 Jul 2019 21:06:46 +0200 Subject: RFR(xs): 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: On 2019-07-01 20:56, Thomas St?fe wrote: > Hi all, > > may I please have reviews and opinions about the following patch: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > cr: > http://cr.openjdk.java.net/~stuefe/webrevs/8227041-rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html > > There is a memory leak in test_virtual_space_list_large_chunk(), called as > part of the whitebox tests WB_RunMemoryUnitTests(). In this test metaspace > allocation is tested by rapidly allocating and subsequently leaking a > metachunk of ~512K. This is done by a number of threads in a tight loop for > 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM killed. > > This test seems to be often excluded, which makes sense, since this leak > makes its memory usage difficult to predict. > > It is also earmarked by Oracle for gtest-ification, see 8213269. > > This leak is not easy to fix, among other things because it is not clear > what it is it wants to test. 
Meanwhile, time moved on and we have quite > nice gtests to test metaspace allocation (see e.g. > test_metaspace_allocation.cpp) and I rather would run those gtests > concurrently. Which could be a future RFE. > > So I just removed this metaspace related test from WB_RunMemoryUnitTests() > altogether, since to me it does nothing useful. Once you remove the leaking > allocation, not much is left. > > Without this part RunUnitTestsConcurrently test runs smoothly through its > other parts, and in that form it is still useful. > > What do you think? I think this makes sense and it looks good to me. Thanks, StefanK > > Cheers, Thomas From thomas.stuefe at gmail.com Mon Jul 1 19:07:42 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 1 Jul 2019 21:07:42 +0200 Subject: RFR(xs): 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: Thanks Stefan! On Mon, Jul 1, 2019, 21:06 Stefan Karlsson wrote: > On 2019-07-01 20:56, Thomas St?fe wrote: > > Hi all, > > > > may I please have reviews and opinions about the following patch: > > > > Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > > cr: > > > http://cr.openjdk.java.net/~stuefe/webrevs/8227041-rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html > > > > There is a memory leak in test_virtual_space_list_large_chunk(), called > as > > part of the whitebox tests WB_RunMemoryUnitTests(). In this test > metaspace > > allocation is tested by rapidly allocating and subsequently leaking a > > metachunk of ~512K. This is done by a number of threads in a tight loop > for > > 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM > killed. > > > > This test seems to be often excluded, which makes sense, since this leak > > makes its memory usage difficult to predict. > > > > It is also earmarked by Oracle for gtest-ification, see 8213269. 
> > > > This leak is not easy to fix, among other things because it is not clear > > what it is it wants to test. Meanwhile, time moved on and we have quite > > nice gtests to test metaspace allocation (see e.g. > > test_metaspace_allocation.cpp) and I rather would run those gtests > > concurrently. Which could be a future RFE. > > > > So I just removed this metaspace related test from > WB_RunMemoryUnitTests() > > altogether, since to me it does nothing useful. Once you remove the > leaking > > allocation, not much is left. > > > > Without this part RunUnitTestsConcurrently test runs smoothly through its > > other parts, and in that form it is still useful. > > > > What do you think? > > I think this makes sense and it looks good to me. > > Thanks, > StefanK > > > > > Cheers, Thomas > > From coleen.phillimore at oracle.com Mon Jul 1 19:13:21 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 1 Jul 2019 15:13:21 -0400 Subject: RFR(xs): 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: +1 Thank you for taking care of this! Coleen On 7/1/19 3:07 PM, Thomas St?fe wrote: > Thanks Stefan! > > On Mon, Jul 1, 2019, 21:06 Stefan Karlsson > wrote: > >> On 2019-07-01 20:56, Thomas St?fe wrote: >>> Hi all, >>> >>> may I please have reviews and opinions about the following patch: >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 >>> cr: >>> >> http://cr.openjdk.java.net/~stuefe/webrevs/8227041-rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html >>> There is a memory leak in test_virtual_space_list_large_chunk(), called >> as >>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test >> metaspace >>> allocation is tested by rapidly allocating and subsequently leaking a >>> metachunk of ~512K. This is done by a number of threads in a tight loop >> for >>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM >> killed. 
>>> This test seems to be often excluded, which makes sense, since this leak >>> makes its memory usage difficult to predict. >>> >>> It is also earmarked by Oracle for gtest-ification, see 8213269. >>> >>> This leak is not easy to fix, among other things because it is not clear >>> what it is it wants to test. Meanwhile, time moved on and we have quite >>> nice gtests to test metaspace allocation (see e.g. >>> test_metaspace_allocation.cpp) and I rather would run those gtests >>> concurrently. Which could be a future RFE. >>> >>> So I just removed this metaspace related test from >> WB_RunMemoryUnitTests() >>> altogether, since to me it does nothing useful. Once you remove the >> leaking >>> allocation, not much is left. >>> >>> Without this part RunUnitTestsConcurrently test runs smoothly through its >>> other parts, and in that form it is still useful. >>> >>> What do you think? >> I think this makes sense and it looks good to me. >> >> Thanks, >> StefanK >> >>> Cheers, Thomas >> From thomas.stuefe at gmail.com Mon Jul 1 19:18:52 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 1 Jul 2019 21:18:52 +0200 Subject: RFR(xs): 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: Thanks Coleen! On Mon, Jul 1, 2019, 21:14 wrote: > +1 > Thank you for taking care of this! > Coleen > > On 7/1/19 3:07 PM, Thomas St?fe wrote: > > Thanks Stefan! 
> > > > On Mon, Jul 1, 2019, 21:06 Stefan Karlsson > > wrote: > > > >> On 2019-07-01 20:56, Thomas St?fe wrote: > >>> Hi all, > >>> > >>> may I please have reviews and opinions about the following patch: > >>> > >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > >>> cr: > >>> > >> > http://cr.openjdk.java.net/~stuefe/webrevs/8227041-rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html > >>> There is a memory leak in test_virtual_space_list_large_chunk(), called > >> as > >>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test > >> metaspace > >>> allocation is tested by rapidly allocating and subsequently leaking a > >>> metachunk of ~512K. This is done by a number of threads in a tight loop > >> for > >>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM > >> killed. > >>> This test seems to be often excluded, which makes sense, since this > leak > >>> makes its memory usage difficult to predict. > >>> > >>> It is also earmarked by Oracle for gtest-ification, see 8213269. > >>> > >>> This leak is not easy to fix, among other things because it is not > clear > >>> what it is it wants to test. Meanwhile, time moved on and we have quite > >>> nice gtests to test metaspace allocation (see e.g. > >>> test_metaspace_allocation.cpp) and I rather would run those gtests > >>> concurrently. Which could be a future RFE. > >>> > >>> So I just removed this metaspace related test from > >> WB_RunMemoryUnitTests() > >>> altogether, since to me it does nothing useful. Once you remove the > >> leaking > >>> allocation, not much is left. > >>> > >>> Without this part RunUnitTestsConcurrently test runs smoothly through > its > >>> other parts, and in that form it is still useful. > >>> > >>> What do you think? > >> I think this makes sense and it looks good to me. 
> >> > >> Thanks, > >> StefanK > >> > >>> Cheers, Thomas > >> > > From david.holmes at oracle.com Mon Jul 1 21:10:45 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 2 Jul 2019 07:10:45 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: Message-ID: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> Hi Adam, On 1/07/2019 10:27 pm, Adam Farley8 wrote: > Hi All, > > The title say it all. > > If you pass in a value for sun.boot.library.path consisting > of one or more paths that are too long, then the vm will > fail to start because it can't load one of the libraries it > needs (the zip library), despite the fact that the VM > automatically prepends the default library path to the > sun.boot.library.path property, using the correct separator > to divide it from the user-specified path. > > So we've got the right path, in the right place, at the > right time, we just can't *use* it. > > I've fixed this by changing the relevant os.cpp code to > ignore paths that are too long, and to attempt to locate > the needed library on the other paths (if any are valid). As I just added to the bug report I have a different view of "correct" here. If you just ignore the long path and keep processing other short paths you may find the wrong library. There is a user error here and that error should be reported ASAP and in a way that leads to failure ASAP. Perhaps we should be more aggressive in aborting the VM when this is detected? David ----- > I've also added functionality to handle the edge case of > paths that are neeeeeeearly too long, only for a > sub-path (or file name) to push us over the limit *after* > the split_path function is done assessing the path length. > > I've also changed the code we're overriding, on the assumption > that someone's still using it somewhere. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 > Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ > > Thoughts and impressions welcome. > > Best Regards > > Adam Farley > IBM Runtimes > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > From coleen.phillimore at oracle.com Mon Jul 1 21:36:23 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 1 Jul 2019 17:36:23 -0400 Subject: RFR[13]: 8226366: Excessive ServiceThread wakeups for OopStorage cleanup In-Reply-To: References: Message-ID: http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/serviceThread.cpp.frames.html Do you have another bug to add the oopStorage for the ResolvedMethodTable to the list? http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/safepoint.cpp.frames.html I suppose you don't need is_safepoint_needed() to trigger this cleanup in the GuaranteedSafepointInterval because if there is no GC, there won't be any blocks to deallocate. http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/gc/shared/oopStorage.cpp.frames.html One nit. The rest of the implementations that do the same thing as this are called "trigger_concurrent_work". This is called differently from the safepoint cleanup tasks, but could you call it trigger_cleanup_if_needed() instead? Then I know it does the same/similar thing as the others without looking. 818 void OopStorage::request_cleanup_if_needed() { 819 MonitorLocker ml(Service_lock, Monitor::_no_safepoint_check_flag); 820 if (Atomic::load(&needs_cleanup_requested) && 821 !needs_cleanup_notified && 822 (os::javaTimeNanos() > cleanup_permit_time)) { 823 needs_cleanup_notified = true; 824 ml.notify_all(); 825 } 826 } The implementation looks good.
I think it's good that you don't have the safepoint cleanup task timer around this. Thanks, Coleen On 6/25/19 10:38 PM, Kim Barrett wrote: > Please review this change to OopStorage's notifications to the ServiceThread > to perform empty block deletion. The existing mechanism (introduced by > JDK-8210986) is driven by entry allocation, and may arbitrarily delay such > cleanup, or alternatively may be much too enthusiastic about waking up the > ServiceThread. > > The new mechanism does not depend on allocations. Instead, a new safepoint > cleanup task is used to (irregularly) check for pending requests and notify > the ServiceThread. That notification has a time-based throttle, and also > avoids duplicate notifications. Also, requests are now only recorded for > to-empty transitions and not for full to not-full transitions. > > Changed the work limit for delete_empty_blocks to have a small surplus to > avoid some common cases with small number of blocks leading to unnecessarily > spinning the ServiceThread. > > While making these changes, noticed and fixed a problem in block allocation > that could result in a mistaken report of allocation failure. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8226366 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/ > > Testing: > mach5 tier1-5 > > Locally ran gc/stress/TestReclaimStringsLeaksMemory.java with some extra > logging and verified that the number of ServiceThread notifications was > reduced by a *lot*, down to something reasonable. 
> From coppa at di.uniroma1.it Tue Jul 2 08:47:22 2019 From: coppa at di.uniroma1.it (Emilio Coppa) Date: Tue, 2 Jul 2019 10:47:22 +0200 Subject: MPLR 2019 - Deadline Extension Message-ID: =============================================== MPLR 2019 16th International Conference on Managed Programming Languages & Runtimes Co-located with SPLASH 2019 Athens, Greece, Oct 20-25, 2019 https://conf.researchr.org/home/mplr-2019 =============================================== The 16th International Conference on Managed Programming Languages & Runtimes (MPLR, formerly ManLang) is a premier forum for presenting and discussing novel results in all aspects of managed programming languages and runtime systems, which serve as building blocks for some of the most important computing systems around, ranging from small-scale (embedded and real-time systems) to large-scale (cloud-computing and big-data platforms) and anything in between (mobile, IoT, and wearable applications). This year, MPLR is co-located with SPLASH 2019 and sponsored by ACM. For more information, check out the conference website: https://conf.researchr.org/home/mplr-2019 # Topics Topics of interest include but are not limited to: * Languages and Compilers - Managed languages (e.g., Java, Scala, JavaScript, Python, Ruby, C#, F#, Clojure, Groovy, Kotlin, R, Smalltalk, Racket, Rust, Go, etc.) - Domain-specific languages - Language design - Compilers and interpreters - Type systems and program logics - Language interoperability - Parallelism, distribution, and concurrency * Virtual Machines - Managed runtime systems (e.g., JVM, Dalvik VM, Android Runtime (ART), LLVM, .NET CLR, RPython, etc.) 
- VM design and optimization - VMs for mobile and embedded devices - VMs for real-time applications - Memory management - Hardware/software co-design * Techniques, Tools, and Applications - Static and dynamic program analysis - Testing and debugging - Refactoring - Program understanding - Program synthesis - Security and privacy - Performance analysis and monitoring - Compiler and program verification # Submission Categories MPLR accepts four types of submissions: 1. Regular research papers, which describe novel contributions involving managed language platforms (up to 12 pages excluding bibliography and appendix). Research papers will be evaluated based on their relevance, novelty, technical rigor, and contribution to the state-of-the-art. 2. Work-in-progress research papers, which describe promising new ideas but yet have less maturity than full papers (up to 6 pages excluding bibliography and appendix). When evaluating work-in-progress papers, more emphasis will be placed on novelty and the potential of the new ideas than on technical rigor and experimental results. 3. Industry and tool papers, which present technical challenges and solutions for managed language platforms in the context of deployed applications and systems (up to 6 pages excluding bibliography and appendix). Industry and tool papers will be evaluated on their relevance, usefulness, and results. Suitability for demonstration and availability will also be considered for tool papers. 4. Posters, which can be accompanied by a one-page abstract and will be evaluated on similar criteria as Work-in-progress papers. Posters can accompany any submission as a way to provide additional demonstration and discussion opportunities. MPLR 2019 submissions must conform to the ACM Policy on Prior Publication and Simultaneous Submissions and to the SIGPLAN Republication Policy. 
# Important Dates and Organization Submission Deadline: ***Jul 15, 2019*** (extended) Author Notification: Aug 24, 2019 Camera Ready: Sep 12, 2019 Conference Dates: Oct 20-25, 2019 General Chair: Tony Hosking, Australian National University / Data61, Australia Program Chair: Irene Finocchi, Sapienza University of Rome, Italy Program Committee: * Edd Barrett, King's College London, United Kingdom * Steve Blackburn, Australian National University, Australia * Lubomir Bulej, Charles University, Czech Republic * Shigeru Chiba, University of Tokyo, Japan * Daniele Cono D'Elia, Sapienza University of Rome, Italy * Ana Lucia de Moura, Pontifical Catholic University of Rio de Janeiro, Brazil * Erik Ernst, Google, Denmark * Matthew Hertz, University at Buffalo, United States * Vivek Kumar, Indraprastha Institute of Information Technology, Delhi * Doug Lea, State University of New York (SUNY) Oswego, United States * Magnus Madsen, Aarhus University, Denmark * Hidehiko Masuhara, Tokyo Institute of Technology, Japan * Ana Milanova, Rensselaer Polytechnic Institute, United States * Matthew Parkinson, Microsoft Research, United Kingdom * Gregor Richards, University of Waterloo, Canada * Manuel Rigger, ETH Zurich, Switzerland * Andrea Rosa, University of Lugano, Switzerland * Guido Salvaneschi, TU Darmstadt, Germany * Lukas Stadler, Oracle Labs, Austria * Ben L. Titzer, Google, Germany From adam.farley at uk.ibm.com Tue Jul 2 09:44:04 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Tue, 2 Jul 2019 10:44:04 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> Message-ID: Hi David, Thanks for your thoughts. The user should absolutely have immediate feedback, yes, and I agree that "skipping" paths could lead to us loading the wrong library. Perhaps a compromise? 
We fire off a stderr warning if any of the paths are too long (without killing the VM), we ignore any path *after* (and including) the first too-long path, and we kill the VM if the first path is too long. Warning message example:

----
Warning: One or more sun.boot.library.path paths were too long
for this system, and it (along with all subsequent paths) has been
ignored.
----

Another addition could be to check the path lengths for the property sooner, thus aborting the VM faster if the default path is too long. Assuming we posit that the VM will always need to load libraries. Best Regards Adam Farley IBM Runtimes David Holmes wrote on 01/07/2019 22:10:45: > From: David Holmes > To: Adam Farley8 , hotspot-dev at openjdk.java.net > Date: 01/07/2019 22:12 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > On 1/07/2019 10:27 pm, Adam Farley8 wrote: > > Hi All, > > > > The title says it all. > > > > If you pass in a value for sun.boot.library.path consisting > > of one or more paths that are too long, then the vm will > > fail to start because it can't load one of the libraries it > > needs (the zip library), despite the fact that the VM > > automatically prepends the default library path to the > > sun.boot.library.path property, using the correct separator > > to divide it from the user-specified path. > > > > So we've got the right path, in the right place, at the > > right time, we just can't *use* it. > > > > I've fixed this by changing the relevant os.cpp code to > > ignore paths that are too long, and to attempt to locate > > the needed library on the other paths (if any are valid). > > As I just added to the bug report I have a different view of "correct" > here. If you just ignore the long path and keep processing other short > paths you may find the wrong library. There is a user error here and > that error should be reported ASAP and in a way that leads to failure > ASAP. 
Perhaps we should be more aggressive in aborting the VM when this > is detected? > > David > ----- > > > I've also added functionality to handle the edge case of > > paths that are neeeeeeearly too long, only for a > > sub-path (or file name) to push us over the limit *after* > > the split_path function is done assessing the path length. > > > > I've also changed the code we're overriding, on the assumption > > that someone's still using it somewhere. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 > > Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ > > > > Thoughts and impressions welcome. > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with number > > 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From martin.doerr at sap.com Tue Jul 2 10:19:42 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 Jul 2019 10:19:42 +0000 Subject: RFR: 8226238: Improve error output and fix elf issues in os::dll_load In-Reply-To: References: Message-ID: Hi Matthias, thanks for contributing this improvement. Please note that there are endianness macros available. You can use e.g. 
if (elf_head.e_ident[EI_DATA] != LITTLE_ENDIAN_ONLY(ELFDATA2LSB) BIG_ENDIAN_ONLY(ELFDATA2MSB)) {

I don't see why we need a variable "current_endianness". Besides this, change looks good to me. I don't need to see another webrev. Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Freitag, 28. Juni 2019 09:06 > To: Langer, Christoph ; 'hotspot- > dev at openjdk.java.net' > Subject: RE: RFR: 8226238: Improve error output and fix elf issues in > os::dll_load > > Hi Christoph, thanks for looking into it. > I did the changes you mentioned, here is my new webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8226238.4/ > > Would be good to get a second review . > > > Thanks and best regards, Matthias > > > > -----Original Message----- > > From: Langer, Christoph > > Sent: Donnerstag, 27. Juni 2019 16:59 > > To: Baesken, Matthias ; 'hotspot- > > dev at openjdk.java.net' > > Subject: RE: RFR: 8226238: Improve error output and fix elf issues in > > os::dll_load > > > > Hi Matthias, > > > > your change looks good overall. > > > > I only have a few style nits: > > > > src/hotspot/os/linux/os_linux.cpp, line 1751 (new): > > > > Can you convert
> >
> > unsigned char current_endianness = ELFDATA2MSB; // BE
> > #if defined(VM_LITTLE_ENDIAN)
> >   current_endianness = ELFDATA2LSB; // LE
> > #endif
> >
> > to
> >
> > #if defined(VM_LITTLE_ENDIAN)
> > unsigned char current_endianness = ELFDATA2LSB; // LE
> > #else
> > unsigned char current_endianness = ELFDATA2MSB; // BE
> > #endif
> >
> > And the same in line 1611 of src/hotspot/os/solaris/os_solaris.cpp. > > > > src/hotspot/os/linux/os_linux.cpp, line 1802: you could fix the indentation of > > ELFDATA2LSB for EM_ARM > > same for line 1580 of src/hotspot/os/solaris/os_solaris.cpp. 
From thomas.schatzl at oracle.com Tue Jul 2 11:00:09 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 02 Jul 2019 13:00:09 +0200 Subject: RFR[13]: 8226366: Excessive ServiceThread wakeups for OopStorage cleanup In-Reply-To: References: Message-ID: <4b178ed4b7c64e435d05619f95cf0421e5c23d6e.camel@oracle.com> Hi, On Mon, 2019-07-01 at 17:36 -0400, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/serviceThread.cpp.frames.html > > Do you have another bug to add the oopStorage for the > ResolvedMethodTable to the list? > > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/safepoint.cpp.frames.html > > I suppose you don't need is_safepoint_needed() to trigger this > cleanup in the GuaranteedSafepointInterval because if there is no GC, > there won't be any blocks to deallocate. > > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/gc/shared/oopStorage.cpp.frames.html > > One nit. The rest of the implementations that do the same thing as > this, are called "trigger_concurrent_work". This is called > differently from the safepoint cleanup tasks, but could you call it > trigger_cleanup_if_needed() instead? Then I know it does the > same/similar thing as the others without looking.
>
>  818 void OopStorage::request_cleanup_if_needed() {
>  819   MonitorLocker ml(Service_lock, Monitor::_no_safepoint_check_flag);
>  820   if (Atomic::load(&needs_cleanup_requested) &&
>  821       !needs_cleanup_notified &&
>  822       (os::javaTimeNanos() > cleanup_permit_time)) {
>  823     needs_cleanup_notified = true;
>  824     ml.notify_all();
>  825   }
>  826 }
>
Similar in serviceThread.cpp:136, it would be nice if the method were named "has_work()" like others instead of "test_and_clear_cleanup_request()". While the latter is technically better, it raises the question whether it is the correct thing to do here in some way when compared to others. 
Feel free to ignore this comment though. I *think* otherwise it is good, but I am kind of new to the OopStorage stuff. Thanks, Thomas From matthias.baesken at sap.com Tue Jul 2 12:54:39 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 2 Jul 2019 12:54:39 +0000 Subject: RFR: 8226238: Improve error output and fix elf issues in os::dll_load In-Reply-To: References: Message-ID: Hi Martin , thanks for the review . I followed your advice and removed the current_endianness - variable . http://cr.openjdk.java.net/~mbaesken/webrevs/8226238.5/ Best regards , Matthias > Hi Matthias, > > thanks for contributing this improvement. > > Please note that there are endianness macros available. You can use e.g. > if (elf_head.e_ident[EI_DATA] != LITTLE_ENDIAN_ONLY(ELFDATA2LSB) > BIG_ENDIAN_ONLY(ELFDATA2MSB)) { > I don't see why we need a variable "current_endianness". > > Besides this, change looks good to me. I don't need to see another webrev. > > Best regards, > Martin > From martin.doerr at sap.com Tue Jul 2 12:56:37 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 2 Jul 2019 12:56:37 +0000 Subject: RFR: 8226238: Improve error output and fix elf issues in os::dll_load In-Reply-To: References: Message-ID: Looks good. Thanks, Martin > -----Original Message----- > From: Baesken, Matthias > Sent: Dienstag, 2. Juli 2019 14:55 > To: Doerr, Martin ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > Subject: RE: RFR: 8226238: Improve error output and fix elf issues in > os::dll_load > > Hi Martin , thanks for the review . > > I followed your advice and removed the current_endianness - variable . > > http://cr.openjdk.java.net/~mbaesken/webrevs/8226238.5/ > > > Best regards , Matthias > > > > > Hi Matthias, > > > > thanks for contributing this improvement. > > > > Please note that there are endianness macros available. You can use e.g. 
> > if (elf_head.e_ident[EI_DATA] != LITTLE_ENDIAN_ONLY(ELFDATA2LSB) > > BIG_ENDIAN_ONLY(ELFDATA2MSB)) { > > I don't see why we need a variable "current_endianness". > > > > Besides this, change looks good to me. I don't need to see another webrev. > > > > Best regards, > > Martin > > From kim.barrett at oracle.com Tue Jul 2 18:36:32 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 2 Jul 2019 14:36:32 -0400 Subject: RFR[13]: 8226366: Excessive ServiceThread wakeups for OopStorage cleanup In-Reply-To: References: Message-ID: <18A29890-7336-485E-97E8-A0E5C6DE93E6@oracle.com> > On Jul 1, 2019, at 5:36 PM, coleen.phillimore at oracle.com wrote: > > > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/serviceThread.cpp.frames.html > > Do you have another bug to add the oopStorage for the ResolvedMethodTable to the list? JDK-8227053. Also JDK-8227054. > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/safepoint.cpp.frames.html > > I suppose you don't need is_safepoint_needed() to trigger this cleanup in the GuaranteedSafepointInterval because if there is no GC, there won't be any blocks to deallocate. s/is_safepoint_needed()/is_cleanup_needed()/ There doesn't seem to be a clear theory of what that function should check for. Some of the existing safepoint cleanups have checks there, some don't, and it's not always obvious why. This cleanup doesn't seem so urgent that if there are no other reasons to safepoint for a long(ish) time then we should force one for just this purpose. > http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/gc/shared/oopStorage.cpp.frames.html > > One nit. The rest of the implementations that do the same thing as this, are called "trigger_concurrent_work". This is called differently from the safepoint cleanup tasls, but could you call it trigger_cleanup_if_needed() instead? Then I know it does the same/similar thing as the others without looking. 
Renamed to trigger_cleanup_if_needed(). Also test_and_clear_cleanup_request() => has_cleanup_work_and_reset(). Thomas asked for "has_work", to be consistent with String/Symbol/ResolvedMethodTable, but I think that's too generic here; what kind of work? (In the case of the tables, it's not always "cleanup" work.) Coleen suggested the "and_reset" suffix, to follow an existing convention. I also made a few corresponding internal name changes. > The implementation looks good. I think it's good that you don't have the safepoint cleanup task timer around this. I added a comment about the lack of task timing, so it's clearly intentional and not simply forgotten. New webrevs: full: http://cr.openjdk.java.net/~kbarrett/8226366/open.01/ incr: http://cr.openjdk.java.net/~kbarrett/8226366/open.01.inc/ Testing: Local build and hotspot_tier1. From coleen.phillimore at oracle.com Tue Jul 2 21:12:47 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 2 Jul 2019 17:12:47 -0400 Subject: RFR[13]: 8226366: Excessive ServiceThread wakeups for OopStorage cleanup In-Reply-To: <18A29890-7336-485E-97E8-A0E5C6DE93E6@oracle.com> References: <18A29890-7336-485E-97E8-A0E5C6DE93E6@oracle.com> Message-ID: <44159f2d-18c0-bf39-1e41-1af3e32972a3@oracle.com> On 7/2/19 2:36 PM, Kim Barrett wrote: >> On Jul 1, 2019, at 5:36 PM, coleen.phillimore at oracle.com wrote: >> >> >> http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/serviceThread.cpp.frames.html >> >> Do you have another bug to add the oopStorage for the ResolvedMethodTable to the list? > JDK-8227053. Also JDK-8227054. Good. > >> http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/runtime/safepoint.cpp.frames.html >> >> I suppose you don't need is_safepoint_needed() to trigger this cleanup in the GuaranteedSafepointInterval because if there is no GC, there won't be any blocks to deallocate. 
> s/is_safepoint_needed()/is_cleanup_needed()/ > > There doesn't seem to be a clear theory of what that function should > check for. Some of the existing safepoint cleanups have checks there, > some don't, and it's not always obvious why. This cleanup doesn't seem > so urgent that if there are no other reasons to safepoint for a > long(ish) time then we should force one for just this purpose. Yes, it's not well defined. If something can run forever and needs a cleanup without a safepoint, it should go in the list. > >> http://cr.openjdk.java.net/~kbarrett/8226366/open.00/src/hotspot/share/gc/shared/oopStorage.cpp.frames.html >> >> One nit. The rest of the implementations that do the same thing as this, are called "trigger_concurrent_work". This is called differently from the safepoint cleanup tasks, but could you call it trigger_cleanup_if_needed() instead? Then I know it does the same/similar thing as the others without looking. > Renamed to trigger_cleanup_if_needed(). > > Also test_and_clear_cleanup_request() => has_cleanup_work_and_reset(). > Thomas asked for "has_work", to be consistent with > String/Symbol/ResolvedMethodTable, but I think that's too generic here; > what kind of work? (In the case of the tables, it's not always > "cleanup" work.) Coleen suggested the "and_reset" suffix, to follow an > existing convention. > > I also made a few corresponding internal name changes. > >> The implementation looks good. I think it's good that you don't have the safepoint cleanup task timer around this. > I added a comment about the lack of task timing, so it's clearly > intentional and not simply forgotten. > > New webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8226366/open.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8226366/open.01.inc/ > > Testing: Local build and hotspot_tier1. > Nice! 
Thanks, Coleen From kim.barrett at oracle.com Tue Jul 2 21:43:46 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 2 Jul 2019 17:43:46 -0400 Subject: RFR[13]: 8226366: Excessive ServiceThread wakeups for OopStorage cleanup In-Reply-To: <44159f2d-18c0-bf39-1e41-1af3e32972a3@oracle.com> References: <18A29890-7336-485E-97E8-A0E5C6DE93E6@oracle.com> <44159f2d-18c0-bf39-1e41-1af3e32972a3@oracle.com> Message-ID: <67CFC501-B51C-4251-8331-15D0E59C32E9@oracle.com> > On Jul 2, 2019, at 5:12 PM, coleen.phillimore at oracle.com wrote: >> >> New webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8226366/open.01/ >> incr: http://cr.openjdk.java.net/~kbarrett/8226366/open.01.inc/ >> >> Testing: Local build and hotspot_tier1. >> > > Nice! > Thanks, > Coleen Thanks. From david.holmes at oracle.com Wed Jul 3 07:36:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 3 Jul 2019 17:36:36 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> Message-ID: On 2/07/2019 7:44 pm, Adam Farley8 wrote: > Hi David, > > Thanks for your thoughts. > > The user should absolutely have immediate feedback, yes, and I agree > that "skipping" paths could lead to us loading the wrong library. > > Perhaps a compromise? We fire off a stderr warning if any of the paths > are too long (without killing the VM), we ignore any path *after* > (and including) the first too-long path, and we kill the VM if the > first path is too long. My first thought is why be so elaborate and not just fail immediately:

Error occurred during initialization of VM
One or more sun.boot.library.path elements is too long for this system.

---

? But AFAICS we don't do any sanity checking of those paths so this would have an impact on startup. I can't locate where we would detect the too-long path element, is it in hotspot or JDK code? 
Thanks, David ----- > Warning message example: > > ---- > Warning: One or more sun.boot.library.path paths were too long > for this system, and it (along with all subsequent paths) have been > ignored. > ---- > > Another addition could be to check the path lengths for the property > sooner, thus aborting the VM faster if the default path is too long. > > Assuming we posit that the VM will always need to load libraries. > > Best Regards > > Adam Farley > IBM Runtimes > > > David Holmes wrote on 01/07/2019 22:10:45: > >> From: David Holmes >> To: Adam Farley8 , hotspot-dev at openjdk.java.net >> Date: 01/07/2019 22:12 >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path >> paths are longer than JVM_MAXPATHLEN >> >> Hi Adam, >> >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: >> > Hi All, >> > >> > The title say it all. >> > >> > If you pass in a value for sun.boot.library.path consisting >> > of one or more paths that are too long, then the vm will >> > fail to start because it can't load one of the libraries it >> > needs (the zip library), despite the fact that the VM >> > automatically prepends the default library path to the >> > sun.boot.library.path property, using the correct separator >> > to divide it from the user-specified path. >> > >> > So we've got the right path, in the right place, at the >> > right time, we just can't *use* it. >> > >> > I've fixed this by changing the relevant os.cpp code to >> > ignore paths that are too long, and to attempt to locate >> > the needed library on the other paths (if any are valid). >> >> As I just added to the bug report I have a different view of "correct" >> here. If you just ignore the long path and keep processing other short >> paths you may find the wrong library. There is a user error here and >> that error should be reported ASAP and in a way that leads to failure >> ASAP. Perhaps we should be more aggressive in aborting the VM when this >> is detected? 
>> >> David >> ----- >> >> > I've also added functionality to handle the edge case of >> > paths that are neeeeeeearly too long, only for a >> > sub-path (or file name) to push us over the limit *after* >> > the split_path function is done assessing the path length. >> > >> > I've also changed the code we're overriding, on the assumption >> > that someone's still using it somewhere. >> > >> > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 >> > Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ >> > >> > Thoughts and impressions welcome. >> > >> > Best Regards >> > >> > Adam Farley >> > IBM Runtimes >> > >> > Unless stated otherwise above: >> > IBM United Kingdom Limited - Registered in England and Wales with number >> > 741598. >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> > >> > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From erik.osterlund at oracle.com Wed Jul 3 10:15:47 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 3 Jul 2019 12:15:47 +0200 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics Message-ID: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> Hi, The heap inspection API performs pointer chasing through dead memory. In particular, it populates a table with classes based on a heap walk using CollectedHeap::object_iterate(). 
That API can give you dead objects as well, whereas CollectedHeap::safe_object_iterate() gives you only live objects. The possibly dead objects have their possibly dead Klass* recorded. Then we call Klass::collect_statistics on said possibly dead recorded classes. In there, we follow the possibly dead mirror oop of the possibly dead Klass, and subsequently read the Klass* of the possibly dead mirror. Here is an attempted ASCII art explaining this

| Klass*      | -> |            |
| hi I am a   |    | hi I am a  |
| dead object |    | dead Klass |
                   |            |
                   | oop mirror | -> | Klass*      | -> | Class klass |
                                     | hi I am a   |
                                     | dead mirror |

So as you can see we pointer chase through this chain of possibly dead memory. What could possibly go wrong though? In CMS a crash can manifest in the following way: In a concurrent collection, both the object and its class die. They are dead once we reach final marking. But the memory is kept around and being swept by concurrent sweeping. The sweeping yields to young collections (controllable through the CMSYield flag). So what can happen is that the mirror is swept (which already is a problem in debug builds because we zap the free chunk of memory, but hold on there is a problem in product builds too) and gets added to a free list. In the yielded safepoint we may perform a young collection that promotes objects to the memory of the freed chunk (where the mirror used to be, except due to coalescing of freed chunks, there might not be a Klass* pointer where there used to be one for the mirror). And then, before sweeping finishes, the heap inspection API is called. That API sometimes tries to perform a STW GC first to get only live objects, but that GC may fail because of the JNI gc locker. And then it just goes ahead calling the unsafe object_iterate API anyway. 
The object_iterate API will pass in the object to the closure but not the mirror as it has been freed and reused. Buuut... since we pointer chase through the dead object to the stale reference to the dead mirror, we eventually find ourselves in an awkward situation where we try to read and use a Klass* that might really be a primitive value now (read: crash). The general rule of thumb is that pointer chasing through dead memory should NOT be done. We allow it in a few rare situations with the following constraints: 1) You have to use AS_NO_KEEPALIVE when reading dead oops, or things can blow up, 2) You may only read dead oops if you are the GC and hence can control that the memory it points at has not and will not be freed until your read finishes (...because you are the GC). Neither of these two constraints holds here. We read the mirrors without AS_NO_KEEPALIVE in the pointer chase, and we can not control that the memory it points at has not been freed. Therefore, this is an invalid use of pointer chasing through dead memory. The fix is simple: use the safe_object_iterate API instead, which only hands out live objects. I also sprinkled in no_keepalive decorators because it's good practice to not use that for such use cases where you clobber the whole heap (causing it to be marked in ZGC) but really just read some int or something from the oop, without publishing any references to the oop. I tested this with 100 kitchensink iterations without my fix (failed 2 times) and 100 kitchensink iterations with my fix (failed 0 times). 
Bug: https://bugs.openjdk.java.net/browse/JDK-8224531 Webrev: http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ Thanks, /Erik From coleen.phillimore at oracle.com Wed Jul 3 12:10:22 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Jul 2019 08:10:22 -0400 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> Message-ID: <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/src/hotspot/share/memory/heapInspection.cpp.frames.html There's another object_iterate() in this file with a comment to change it to safe_object_iterate(). Should you change that too? Did you run the jvmti tests? There used to be tests that failed if dead objects weren't found, but the tests may have been fixed. The rest of the change looks good. Thank you for figuring this out! Coleen On 7/3/19 6:15 AM, Erik Österlund wrote: > Hi, > > The heap inspection API performs pointer chasing through dead memory. > In particular, it populates a table with classes based on a heap walk > using CollectedHeap::object_iterate(). That API can give you dead > objects as well, whereas CollectedHeap::safe_object_iterate() gives > you only live objects. > The possibly dead objects have their possibly dead Klass* recorded. > Then we call Klass::collect_statistics on said possibly dead recorded > classes. In there, we follow the possibly dead mirror oop of the > possibly dead Klass, and subsequently read the Klass* of the possibly > dead mirror. > Here is an attempted ASCII art explaining this
>
> | Klass*      | -> |            |
> | hi I am a   |    | hi I am a  |
> | dead object |    | dead Klass |
>                    |            |
>                    | oop mirror | -> | Klass*      | -> | Class klass |
>                                      | hi I am a   |
>                                      | dead mirror |
>
> So as you can see we pointer chase through this chain of possibly dead > memory. What could possibly go wrong though? > In CMS a crash can manifest in the following way: > > In a concurrent collection, both the object and its class die. They > are dead once we reach final marking. But the memory is kept around > and being swept by concurrent sweeping. > The sweeping yields to young collections (controllable through the > CMSYield flag). So what can happen is that the mirror is swept (which > already is a problem in debug builds because we zap the free chunk of > memory, but hold on there is a problem in product builds too) and > gets added to a free list. In the yielded safepoint we may perform a > young collection that promotes objects to the memory of the freed > chunk (where the mirror used to be, except due to coalescing of freed > chunks, there might not be a Klass* pointer where there used to be one > for the mirror). And then, before sweeping finishes, the heap > inspection API is called. That API sometimes tries to perform a STW GC > first to get only live objects, but that GC may fail because of the > JNI gc locker. And then it just goes ahead calling the unsafe > object_iterate API anyway. The object_iterate API will pass in the > object to the closure but not the mirror as it has been freed and > reused. Buuut... since we pointer chase through the dead object to the > stale reference to the dead mirror, we eventually find ourselves in an > awkward situation where we try to read and use a Klass* that might > really be a primitive value now (read: crash). > > The general rule of thumb is that pointer chasing through dead memory > should NOT be done. 
We allow it in a few rare situations with the > following constraints: 1) You have to use AS_NO_KEEPALIVE when reading > dead oops, or things can blow up, 2) You may only read dead oops if > you are the GC and hence can control that the memory it points at has > not and will not be freed until your read finishes (...because you are > the GC). > Neither of these two constraints holds here. We read the mirrors > without AS_NO_KEEPALIVE in the pointer chase, and we can not control > that the memory it points at has not been freed. Therefore, this is an > invalid use of pointer chasing through dead memory. The fix is simple: > use the safe_object_iterate API instead, which only hands out live > objects. I also sprinkled in no_keepalive decorators because it's good > practice to not use that for such use cases where you clobber the whole > heap (causing it to be marked in ZGC) but really just read some int > or something from the oop, without publishing any references to the > oop. > > I tested this with 100 kitchensink iterations without my fix (failed 2 > times) and 100 kitchensink iterations with my fix (failed 0 times). > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8224531 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ > > Thanks, > /Erik From erik.osterlund at oracle.com Wed Jul 3 12:48:41 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 3 Jul 2019 14:48:41 +0200 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> Message-ID: <005dde7d-d2bb-fa19-318c-9f130fa4c2de@oracle.com> Hi Coleen, Thanks for the review. 
On 2019-07-03 14:10, coleen.phillimore at oracle.com wrote: > > http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/src/hotspot/share/memory/heapInspection.cpp.frames.html > > > There's another object_iterate() in this file with a comment to change > it to safe_object_iterate(). Should you change that too? It probably should change, but it does not seem to have the same behaviour and I don't know if there is a real bug in that unrelated code. But yeah I would prefer that to change too, but I think that should be a separate change. > Did you run the jvmti tests? There used to be tests that failed if > dead objects weren't found, but the tests may have been fixed. I will take it for a spin. Note though that behaviour relying on always getting dead objects will fail; the caller of the API will race with concurrent sweeping and either get or not get the dead objects depending on whether the heap iteration happened to kick in before or after sweeping. There is just no way you can rely on that. So if there is a test failure because of that, the test is wrong. Nevertheless, I will try to hunt down such tests. Thanks, /Erik > The rest of the change looks good. Thank you for figuring this out! > Coleen > > On 7/3/19 6:15 AM, Erik Österlund wrote: >> Hi, >> >> The heap inspection API performs pointer chasing through dead memory. >> In particular, it populates a table with classes based on a heap walk >> using CollectedHeap::object_iterate(). That API can give you dead >> objects as well, whereas CollectedHeap::safe_object_iterate() gives >> you only live objects. >> The possibly dead objects have their possibly dead Klass* recorded. >> Then we call Klass::collect_statistics on said possibly dead recorded >> classes. In there, we follow the possibly dead mirror oop of the >> possibly dead Klass, and subsequently read the Klass* of the possibly >> dead mirror. >> Here is an attempted ASCII art explaining this
>>
>> | Klass*      | -> |            |
>> | hi I am a   |    | hi I am a  |
>> | dead object |    | dead Klass |
>>                    |            |
>>                    | oop mirror | -> | Klass*      | -> | Class klass |
>>                                      | hi I am a   |
>>                                      | dead mirror |
>>
>> So as you can see we pointer chase through this chain of possibly >> dead memory. What could possibly go wrong though? >> In CMS a crash can manifest in the following way: >> >> In a concurrent collection, both the object and its class die. They >> are dead once we reach final marking. But the memory is kept around >> and being swept by concurrent sweeping. >> The sweeping yields to young collections (controllable through the >> CMSYield flag). So what can happen is that the mirror is swept (which >> already is a problem in debug builds because we zap the free chunk of >> memory, but hold on there is a problem in product builds too) and >> gets added to a free list. In the yielded safepoint we may perform a >> young collection that promotes objects to the memory of the freed >> chunk (where the mirror used to be, except due to coalescing of freed >> chunks, there might not be a Klass* pointer where there used to be >> one for the mirror). And then, before sweeping finishes, the heap >> inspection API is called. That API sometimes tries to perform a STW >> GC first to get only live objects, but that GC may fail because of >> the JNI gc locker. And then it just goes ahead calling the unsafe >> object_iterate API anyway. The object_iterate API will pass in the >> object to the closure but not the mirror as it has been freed and >> reused. Buuut... since we pointer chase through the dead object to >> the stale reference to the dead mirror, we eventually find ourselves >> in an awkward situation where we try to read and use a Klass* that >> might really be a primitive value now (read: crash). 
>> >> The general rule of thumb is that pointer chasing through dead memory >> should NOT be done. We allow it in a few rare situations with the >> following constraints: 1) You have to use AS_NO_KEEPALIVE when >> reading dead oops, or things can blow up, 2) You may only read dead >> oops if you are the GC and hence can control that the memory it >> points at has not and will not be freed until your read finishes >> (...because you are the GC). >> Neither of these two constraints hold here. We read the mirrors >> without AS_NO_KEEPALIVE in the pointer chase, and we can not control >> that the memory it points at has not been freed. Therefore, this is >> an invalid use of pointer chasing through dead memory. The fix is >> simple: use the safe_object_iterate API instead, which only hands out >> live objects. I also sprinkled in no_keepalive decorators on the >> mirrors because it's good practice to not use that for such use cases >> where you clobber the whole heap (causing it to be marked in ZGC) but >> really just read some int or something from the oop, without >> publishing any references to the oop. >> >> I tested this with 100 kitchensink iterations without my fix (failed >> 2 times) and 100 kitchensink iterations with my fix (failed 0 times).
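[Archive note: the object_iterate() versus safe_object_iterate() distinction discussed in this thread can be sketched in standalone form. The Heap and Obj types and the counting helpers below are simplified stand-ins invented for illustration; they are not the real HotSpot CollectedHeap API.]

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a heap object: a liveness flag plus some payload.
struct Obj {
  bool is_live;
  int  klass_id;
};

struct Heap {
  std::vector<Obj> objects;

  // Analogous to CollectedHeap::object_iterate(): visits every object,
  // including possibly dead ones whose metadata may already be freed.
  template <typename F>
  void object_iterate(F f) {
    for (Obj& o : objects) f(o);
  }

  // Analogous to CollectedHeap::safe_object_iterate(): hands out live
  // objects only, so a visitor never chases pointers through dead memory.
  template <typename F>
  void safe_object_iterate(F f) {
    for (Obj& o : objects) {
      if (o.is_live) f(o);
    }
  }
};

int count_visited_unsafe(Heap& h) {
  int n = 0;
  h.object_iterate([&](Obj&) { ++n; });
  return n;
}

int count_visited_safe(Heap& h) {
  int n = 0;
  h.safe_object_iterate([&](Obj&) { ++n; });
  return n;
}
```

[The point of the safe variant is that the closure is simply never handed an object whose referenced metadata could already have been freed and reused.]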
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8224531 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ >> >> Thanks, >> /Erik > From coleen.phillimore at oracle.com Wed Jul 3 12:50:02 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 3 Jul 2019 08:50:02 -0400 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <005dde7d-d2bb-fa19-318c-9f130fa4c2de@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> <005dde7d-d2bb-fa19-318c-9f130fa4c2de@oracle.com> Message-ID: <43de7b9d-8888-967d-8eeb-510e6b932cf7@oracle.com> On 7/3/19 8:48 AM, Erik ?sterlund wrote: > Hi Coleen, > > Thanks for the review. > > On 2019-07-03 14:10, coleen.phillimore at oracle.com wrote: >> >> http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/src/hotspot/share/memory/heapInspection.cpp.frames.html >> >> >> There's another object_iterate() in this file with a comment to >> change it to safe_object_iterate().? Should you change that too? > > It probably should change, but it does not seem to have the same > behaviour and I don't know if there is a real bug in that unrelated > code. But yeah I would prefer that to change too, but I think that > should be a separate change. That's fine. > >> Did you run the jvmti tests?? There used to be tests that failed if >> dead objects weren't found, but the tests may have been fixed. > > I will take it for a spin. Note though that behaviour relying on > always getting dead objects will fail; the caller of the API will race > with concurrent sweeping and either get or not get the dead objects > depending on whether the heap iteration happened to kick in before or > after sweeping. There is just no way you can rely on that. So if there > is a test failure because of that, the test is wrong. Nevertheless, I > will try to hunt down such tests. > Yes, please. 
Thanks, Coleen > Thanks, > /Erik > >> The rest of the change looks good. Thank you for figuring this out! >> Coleen >> >> On 7/3/19 6:15 AM, Erik Österlund wrote: >>> Hi, >>> >>> The heap inspection API performs pointer chasing through dead >>> memory. In particular, it populates a table with classes based on a >>> heap walk using CollectedHeap::object_iterate(). That API can give >>> you dead objects as well, whereas >>> CollectedHeap::safe_object_iterate() gives you only live objects. >>> The possibly dead objects have their possibly dead Klass* recorded. >>> Then we call Klass::collect_statistics on said possibly dead >>> recorded classes. In there, we follow the possibly dead mirror oop >>> of the possibly dead Klass, and subsequently read the Klass* of the >>> possibly dead mirror. >>> Here is an attempted ASCII art explaining this >>> >>> | Klass*      | -> |            | >>> | hi I am a   |    | hi I am a  | >>> | dead object |    | dead Klass | >>>                    |            | >>>                    | oop mirror | -> | Klass*      | -> | Class klass | >>>                                      | hi I am a   | >>>                                      | dead mirror | >>> >>> So as you can see we pointer chase through this chain of possibly >>> dead memory. What could possibly go wrong though? >>> In CMS a crash can manifest in the following way: >>> >>> In a concurrent collection, both the object and its class die. They >>> are dead once we reach final marking. But the memory is kept around >>> and being swept by concurrent sweeping. >>> The sweeping yields to young collections (controllable through the >>> CMSYield flag). So what can happen is that the mirror is swept >>> (which already is a problem in debug builds because we zap the free >>> chunk of memory, but hold on there is a problem in product builds >>> too) and gets added to a free list. In the yielded safepoint we may >>> perform a young collection that promotes objects to the memory of >>> the freed chunk (where the mirror used to be, except due to >>> coalescing of freed chunks, there might not be a Klass* pointer >>> where there used to be one for the mirror). And then, before >>> sweeping finishes, the heap inspection API is called. That API >>> sometimes tries to perform a STW GC first to get only live objects, >>> but that GC may fail because of the JNI gc locker. And then it just >>> goes ahead calling the unsafe object_iterate API anyway. The >>> object_iterate API will pass in the object to the closure but not >>> the mirror as it has been freed and reused. Buuut... since we >>> pointer chase through the dead object to the stale reference to the >>> dead mirror, we eventually find ourselves in an awkward situation >>> where we try to read and use a Klass* that might really be a >>> primitive value now (read: crash). >>> >>> The general rule of thumb is that pointer chasing through dead >>> memory should NOT be done. We allow it in a few rare situations with >>> the following constraints: 1) You have to use AS_NO_KEEPALIVE when >>> reading dead oops, or things can blow up, 2) You may only read dead >>> oops if you are the GC and hence can control that the memory it >>> points at has not and will not be freed until your read finishes >>> (...because you are the GC). >>> Neither of these two constraints hold here. We read the mirrors >>> without AS_NO_KEEPALIVE in the pointer chase, and we can not control >>> that the memory it points at has not been freed. Therefore, this is >>> an invalid use of pointer chasing through dead memory. The fix is >>> simple: use the safe_object_iterate API instead, which only hands >>> out live objects.
I also sprinkled in no_keepalive decorators on the >>> mirrors because it's good practice to not use that for such use >>> cases where you clobber the whole heap (causing it to be marked in >>> ZGC) but really just read some int or something from the oop, >>> without publishing any references to the oop. >>> >>> I tested this with 100 kitchensink iterations without my fix (failed >>> 2 times) and 100 kitchensink iterations with my fix (failed 0 times). >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8224531 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ >>> >>> Thanks, >>> /Erik >> > From stefan.karlsson at oracle.com Wed Jul 3 13:31:27 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 3 Jul 2019 15:31:27 +0200 Subject: RFR: 8227175: ZGC: ZHeapIterator visits potentially dead objects Message-ID: Hi all, (Sending this RFR to hotspot-dev since it changes CLD claiming.) Please review this patch to fix the ZHeapIterator to not visit potentially dead objects. https://cr.openjdk.java.net/~stefank/8227175/ https://bugs.openjdk.java.net/browse/JDK-8227175 It changes how all heap iterations are done in ZGC. Previously, the marking code visited only the strong CLDs and traced through metadata to find all other CLDs that should be considered alive. The verification code, serviceability heap iterations, and marking without class unloading, skipped the metadata tracing part and visited all CLDs instead. Now, with this patch, all these heap iterations start with the strong CLDs and trace through the object graph. One complication with that scheme is that non-GC heap iterations might be executing after concurrent marking has started, but before dead CLDs have been unlinked. To allow the GC marking code and one non-GC heap iteration at a time to run concurrently, I've introduced a new claim bit in the CLD claiming byte. I've called it "other", so now we have "strong", "finalizable", and "other".
The contract is that the "other" bits should only be used in a safepoint operation, and must be cleared before the operation ends. This way we get mutual exclusion between different users of the "other" bits. The patch also adds more precise verification of ZGC references. This patch was written a few weeks ago to make the verification of ZGC references more precise. I've been using it since then to get better verification of other patches and when hunting for bugs. This means I've been running it through tier 1-7 multiple times. The intent was to get this pushed to JDK 14, but now that we've seen that we have an actual bug because of the imprecise nature of the ZHeapIterator, I'd like to get this patch pushed to JDK 13. We considered trying to split this up into two parts, the first part that fixes the heap iterations and the second part that adds the extra ZGC verification, but we think that would take longer and be riskier. Thanks, StefanK From zgu at redhat.com Wed Jul 3 14:03:43 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 3 Jul 2019 10:03:43 -0400 Subject: RFR: 8227175: ZGC: ZHeapIterator visits potentially dead objects In-Reply-To: References: Message-ID: Hi Stefan, Runtime part looks good to me. I had the same thoughts when moving Shenandoah CLDG evacuation to concurrent phase, then heap dump started to interfere with concurrent CLDG iteration. Because Shenandoah heap dump uses a single thread to walk the CLDG, we used _claimed_none to avoid the problem. The change can potentially allow us to relax the restriction. Thanks, -Zhengyu On 7/3/19 9:31 AM, Stefan Karlsson wrote: > Hi all, > > (Sending this RFR to hotspot-dev since it changes CLD claiming.) > > Please review this patch to fix the ZHeapIterator to not visit > potentially dead objects. > > https://cr.openjdk.java.net/~stefank/8227175/ > https://bugs.openjdk.java.net/browse/JDK-8227175 > > It changes how all heap iterations are done in ZGC.
Previously, the > marking code visited only the strong CLDs and traced through metadata to > find all other CLDs that should be considered alive. The verification > code, serviceability heap iterations, and marking without class > unloading, skipped the metadata tracing part and visited all CLDs > instead. Now, with this patch, all these heap iterations starts with the > strong CLDs and trace through the object graph. > > One complication with that scheme is that non-GC heap iterations might > be executing after concurrent marking has started, but before dead CLDs > have been unlinked. To allow the GC marking code and one, at a time, > non-GC heap iteration to run at the same time, I've introduced a new > claim bit in the CLD claiming byte. I've called it "other", so now we > have "strong", "finalizable", and "other". The contract is that the > "other" bits should only be used in a safepoint operation, and must be > cleared before the operation ends. This way we get mutual exclusion > between different users of the "other" bits. > > The patch also adds more precise verification of ZGC references. > > This patch was written a few weeks ago to make the verification of ZGC > references more precise. I've been using it since then to get better > verification of other patches and when hunting for bugs. This means I've > been running it through tier 1-7 multiple times. The intent was to get > this pushed to JDK 14, but now that we've seen that we have an actual > bug because of the imprecise nature of the ZHeapIterator, I'd like to > get this patch pushed to JDK 13. We considered trying to split this up > into two parts, the first part that fixes the heap iterations and the > second part that adds the extra ZGC verification, but we thing that > would take longer time and be riskier. 
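[Archive note: the "other" claim-bit contract quoted above (claim only inside a safepoint operation, clear before the operation ends) can be sketched with a small standalone example. The bit values and the CLD struct below are invented for this sketch and do not match the real ClassLoaderData implementation.]

```cpp
#include <atomic>
#include <cassert>

// Illustrative claim bits; values are made up, not HotSpot's constants.
enum : unsigned char {
  CLAIM_STRONG      = 1,
  CLAIM_FINALIZABLE = 2,
  CLAIM_OTHER       = 4   // per the contract: set only inside a safepoint
                          // operation, cleared before the operation ends
};

struct CLD {
  std::atomic<unsigned char> claim_bits{0};

  // Returns true only for the first claimant of 'bit'; later attempts fail,
  // which is how concurrent visitors avoid processing the same CLD twice.
  bool try_claim(unsigned char bit) {
    unsigned char old = claim_bits.load();
    while ((old & bit) == 0) {
      if (claim_bits.compare_exchange_weak(old, (unsigned char)(old | bit))) {
        return true;
      }
      // on failure, 'old' was reloaded; loop re-checks whether the bit is set
    }
    return false;
  }

  void clear(unsigned char bit) {
    claim_bits.fetch_and((unsigned char)~bit);
  }
};
```

[Because each safepoint operation clears its "other" bits before finishing, the next operation starts from an unclaimed state, giving the mutual exclusion described in the review.]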
> > Thanks, > StefanK From erik.osterlund at oracle.com Wed Jul 3 14:18:39 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 3 Jul 2019 16:18:39 +0200 Subject: RFR: 8227175: ZGC: ZHeapIterator visits potentially dead objects In-Reply-To: References: Message-ID: <0a8696ee-241f-b9b1-4ee1-1045d0cc21c9@oracle.com> Hi Stefan, Looks good. Thanks, /Erik On 2019-07-03 15:31, Stefan Karlsson wrote: > Hi all, > > (Sending this RFR to hotspot-dev since it changes CLD claiming.) > > Please review this patch to fix the ZHeapIterator to not visit > potentially dead objects. > > https://cr.openjdk.java.net/~stefank/8227175/ > https://bugs.openjdk.java.net/browse/JDK-8227175 > > It changes how all heap iterations are done in ZGC. Previously, the > marking code visited only the strong CLDs and traced through metadata > to find all other CLDs that should be considered alive. The > verification code, serviceability heap iterations, and marking without > class unloading, skipped the metadata tracing part and visited all > CLDs instead. Now, with this patch, all these heap iterations starts > with the strong CLDs and trace through the object graph. > > One complication with that scheme is that non-GC heap iterations might > be executing after concurrent marking has started, but before dead > CLDs have been unlinked. To allow the GC marking code and one, at a > time, non-GC heap iteration to run at the same time, I've introduced a > new claim bit in the CLD claiming byte. I've called it "other", so now > we have "strong", "finalizable", and "other". The contract is that the > "other" bits should only be used in a safepoint operation, and must be > cleared before the operation ends. This way we get mutual exclusion > between different users of the "other" bits. > > The patch also adds more precise verification of ZGC references. > > This patch was written a few weeks ago to make the verification of ZGC > references more precise. 
I've been using it since then to get better > verification of other patches and when hunting for bugs. This means > I've been running it through tier 1-7 multiple times. The intent was > to get this pushed to JDK 14, but now that we've seen that we have an > actual bug because of the imprecise nature of the ZHeapIterator, I'd > like to get this patch pushed to JDK 13. We considered trying to split > this up into two parts, the first part that fixes the heap iterations > and the second part that adds the extra ZGC verification, but we think > that would take longer and be riskier. > > Thanks, > StefanK From adam.farley at uk.ibm.com Wed Jul 3 15:42:29 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 3 Jul 2019 16:42:29 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> Message-ID: Hi David, I figured it should be elaborate so we can avoid killing the VM if we don't have to. Ultimately, if we have a list of three paths and the last two are invalid, does it matter so long as all the libraries we need are in the first path? As to your question "is it in hotspot or JDK code", I presume you mean in the change set. I'm primarily referring to the hotspot code. Also, if we end up adopting a "kill the vm if any path is too long" approach, we still need to change the JDK code, as those currently seem to want to fail if the total length of the sun.boot.library.path property is longer than the maximum length of a single path. So if you pass in three 100 character paths on Windows, it'll fail because they add up to more than the 260 character path limit.
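[Archive note: the per-element check Adam describes above (validate each sun.boot.library.path element against the platform limit, rather than the combined property length) could look roughly like the following standalone sketch. The tiny MAX_PATH_ELEM limit, the ':' separator, and the helper name are made up to keep the example testable; the real limit would be JVM_MAXPATHLEN and the separator is platform-specific.]

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-element limit; deliberately tiny for the example.
const std::size_t MAX_PATH_ELEM = 10;

// Splits a search-path property and keeps only the elements that fit the
// limit, so one over-long element does not invalidate the whole property.
std::vector<std::string> split_valid_paths(const std::string& prop, char sep) {
  std::vector<std::string> out;
  std::size_t start = 0;
  while (start <= prop.size()) {
    std::size_t end = prop.find(sep, start);
    if (end == std::string::npos) end = prop.size();
    std::string elem = prop.substr(start, end - start);
    if (!elem.empty() && elem.size() <= MAX_PATH_ELEM) {
      out.push_back(elem);   // element fits: keep it
    }
    // elements that are empty or exceed the limit are dropped here;
    // a real implementation might instead warn or abort, per the thread
    start = end + 1;
  }
  return out;
}
```

[Whether over-long elements should be silently dropped, warned about, or treated as fatal is exactly the policy question being debated in this thread; the sketch only shows the element-by-element check itself.]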
Best Regards Adam Farley IBM Runtimes David Holmes wrote on 03/07/2019 08:36:36: > From: David Holmes > To: Adam Farley8 > Cc: hotspot-dev at openjdk.java.net > Date: 03/07/2019 08:36 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > On 2/07/2019 7:44 pm, Adam Farley8 wrote: > > Hi David, > > > > Thanks for your thoughts. > > > > The user should absolutely have immediate feedback, yes, and I agree > > that "skipping" paths could lead to us loading the wrong library. > > > > Perhaps a compromise? We fire off a stderr warning if any of the paths > > are too long (without killing the VM), we ignore any path *after* > > (and including) the first too-long path, and we kill the VM if the > > first path is too long. > > My first thought is why be so elaborate and not just fail immediately: > > Error occurred during initialization of VM > One or more sun.boot.library.path elements is too long for this system. > --- > > But AFAICS we don't do any sanity checking of those paths so this > would have an impact on startup. > > I can't locate where we would detect the too-long path element, is it in > hotspot or JDK code? > > Thanks, > David > ----- > > > Warning message example: > > > > ---- > > Warning: One or more sun.boot.library.path paths were too long > > for this system, and it (along with all subsequent paths) have been > > ignored. > > ---- > > > > Another addition could be to check the path lengths for the property > > sooner, thus aborting the VM faster if the default path is too long. > > > > Assuming we posit that the VM will always need to load libraries.
> > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > David Holmes wrote on 01/07/2019 22:10:45: > > > >> From: David Holmes > >> To: Adam Farley8 , hotspot-dev at openjdk.java.net > >> Date: 01/07/2019 22:12 > >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > >> paths are longer than JVM_MAXPATHLEN > >> > >> Hi Adam, > >> > >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: > >> > Hi All, > >> > > >> > The title say it all. > >> > > >> > If you pass in a value for sun.boot.library.path consisting > >> > of one or more paths that are too long, then the vm will > >> > fail to start because it can't load one of the libraries it > >> > needs (the zip library), despite the fact that the VM > >> > automatically prepends the default library path to the > >> > sun.boot.library.path property, using the correct separator > >> > to divide it from the user-specified path. > >> > > >> > So we've got the right path, in the right place, at the > >> > right time, we just can't *use* it. > >> > > >> > I've fixed this by changing the relevant os.cpp code to > >> > ignore paths that are too long, and to attempt to locate > >> > the needed library on the other paths (if any are valid). > >> > >> As I just added to the bug report I have a different view of "correct" > >> here. If you just ignore the long path and keep processing other short > >> paths you may find the wrong library. There is a user error here and > >> that error should be reported ASAP and in a way that leads to failure > >> ASAP. Perhaps we should be more aggressive in aborting the VM when this > >> is detected? > >> > >> David > >> ----- > >> > >> > I've also added functionality to handle the edge case of > >> > paths that are neeeeeeearly too long, only for a > >> > sub-path (or file name) to push us over the limit *after* > >> > the split_path function is done assessing the path length. 
> >> > > >> > I've also changed the code we're overriding, on the assumption > >> > that someone's still using it somewhere. > >> > > >> > Bug: https://urldefense.proofpoint.com/v2/url? > >> > u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8227021&d=DwICaQ&c=jf_iaSHvJObTbx- > >> siA1ZOg&r=P5m8KWUXJf- > >> > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=xZzQCnv68xd9hJyyK1obSim38eWSRmLPfuR__9ddZWg&e= > >> > Webrev: https://urldefense.proofpoint.com/v2/url? > >> > u=http-3A__cr.openjdk.java.net_-7Eafarley_8227021_webrev_&d=DwICaQ&c=jf_iaSHvJObTbx- > >> siA1ZOg&r=P5m8KWUXJf- > >> > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=- > >> hKU0zUd_0LDT08wTilexgI54EeSgt8xUk97i6V63Bk&e= > >> > > >> > Thoughts and impressions welcome. > >> > > >> > Best Regards > >> > > >> > Adam Farley > >> > IBM Runtimes > >> > > >> > Unless stated otherwise above: > >> > IBM United Kingdom Limited - Registered in England and Wales with number > >> > 741598. > >> > Registered office: PO Box 41, North Harbour, Portsmouth, > Hampshire PO6 3AU > >> > > >> > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with number > > 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From kim.barrett at oracle.com Wed Jul 3 18:55:57 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 3 Jul 2019 14:55:57 -0400 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> Message-ID: <39D8C321-D15C-41A4-86D9-403FDB75B224@oracle.com> > On Jul 3, 2019, at 6:15 AM, Erik ?sterlund wrote: > > [?] 
> > So as you can see we pointer chase through this chain of possibly dead memory. What could possibly go wrong though? Hahaha! Thanks for the detailed description. > Bug: > https://bugs.openjdk.java.net/browse/JDK-8224531 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ > > Thanks, > /Erik Looks good. From kim.barrett at oracle.com Wed Jul 3 18:57:46 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 3 Jul 2019 14:57:46 -0400 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <005dde7d-d2bb-fa19-318c-9f130fa4c2de@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> <005dde7d-d2bb-fa19-318c-9f130fa4c2de@oracle.com> Message-ID: <9BFB4618-02B1-47CD-9B83-76AC448B82C4@oracle.com> > On Jul 3, 2019, at 8:48 AM, Erik Österlund wrote: >> Did you run the jvmti tests? There used to be tests that failed if dead objects weren't found, but the tests may have been fixed. > > I will take it for a spin. Note though that behaviour relying on always getting dead objects will fail; the caller of the API will race with concurrent sweeping and either get or not get the dead objects depending on whether the heap iteration happened to kick in before or after sweeping. There is just no way you can rely on that. So if there is a test failure because of that, the test is wrong. Nevertheless, I will try to hunt down such tests. Good. If there are still such tests, they need to be fixed. That shouldn't hold back this change.
From thomas.schatzl at oracle.com Wed Jul 3 19:31:58 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 03 Jul 2019 21:31:58 +0200 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> Message-ID: Hi, On Wed, 2019-07-03 at 08:10 -0400, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/src/hotspot/share/memory/heapInspection.cpp.frames.html > > There's another object_iterate() in this file with a comment to > change it to safe_object_iterate(). Should you change that too? I think this is a different issue as Erik pointed out, this is iteration during a safepoint. Not that I think that this is much safer *and* there is already the comment there that this might not work with CMS either. Erik, can you file a CR? > > Did you run the jvmti tests? There used to be tests that failed if > dead objects weren't found, but the tests may have been fixed. It would be nice to at least know which jvmti tests iterate over dead objects before pushing this if possible. > > The rest of the change looks good. Thank you for figuring this out! Change looks good. Thanks, Thomas From david.holmes at oracle.com Thu Jul 4 06:57:14 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Jul 2019 16:57:14 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> Message-ID: <842ae03e-8574-593e-3ac2-5cc283832be9@oracle.com> Hi Adam, On 4/07/2019 1:42 am, Adam Farley8 wrote: > Hi David, > > I figured it should be elaborate so we can avoid killing the VM > if we don't have to. 
> > Ultimately, if we have a list of three paths and the last two > are invalid, does it matter so long as all the libraries we need > are in the first path? I prefer not to see the user's error ignored if we can reasonably detect it. They set the paths for a reason, and if the paths are invalid they probably would like to know. > As to your question "is it in hotspot or JDK code", I presume you > mean in the change set. I'm primarily referring to the hotspot code. No, I mean where in the current code will we detect that one of these path elements is too long? > Also, if we end up adopting a "kill the vm if any path is too long" > approach, we still need to change the JDK code, as those currently > seem to want to fail if the total length of the sun.boot.library.path > property is longer than the maximum length of a single path. > > So if you pass in three 100 character paths on Windows, it'll fail > because they add up to more than the 260 character path limit. That seems like a separate bug that should be addressed. :( Thanks, David > Best Regards > > Adam Farley > IBM Runtimes > > > David Holmes wrote on 03/07/2019 08:36:36: > >> From: David Holmes >> To: Adam Farley8 >> Cc: hotspot-dev at openjdk.java.net >> Date: 03/07/2019 08:36 >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path >> paths are longer than JVM_MAXPATHLEN >> >> On 2/07/2019 7:44 pm, Adam Farley8 wrote: >> > Hi David, >> > >> > Thanks for your thoughts. >> > >> > The user should absolutely have immediate feedback, yes, and I agree >> > that "skipping" paths could lead to us loading the wrong library. >> > >> > Perhaps a compromise? We fire off a stderr warning if any of the paths >> > are too long (without killing the VM), we ignore any path *after* >> > (and including) the first too-long path, and we kill the VM if the >> > first path is too long.
>> >> My first though is why be so elaborate and not just fail immediately: >> >> Error occurred during initialization of VM >> One or more sun.boot.library.path elements is too long for this system. >> --- >> >> ? But AFAICS we don't do any sanity checking of the those paths so this >> would have an impact on startup. >> >> I can't locate where we would detect the too-long path element, is it in >> hostpot or JDK code? >> >> Thanks, >> David >> ----- >> >> > Warning message example: >> > >> > ---- >> > Warning: One or more sun.boot.library.path paths were too long >> > for this system, and it (along with all subsequent paths) have been >> > ignored. >> > ---- >> > >> > Another addition could be to check the path lengths for the property >> > sooner, thus aborting the VM faster if the default path is too long. >> > >> > Assuming we posit that the VM will always need to load libraries. >> > >> > Best Regards >> > >> > Adam Farley >> > IBM Runtimes >> > >> > >> > David Holmes wrote on 01/07/2019 22:10:45: >> > >> >> From: David Holmes >> >> To: Adam Farley8 , ?hotspot-dev at openjdk.java.net >> >> Date: 01/07/2019 22:12 >> >> Subject: Re: RFR: JDK-8227021: ?VM fails if any sun.boot.library.path >> >> paths are longer than JVM_MAXPATHLEN >> >> >> >> Hi Adam, >> >> >> >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: >> >> > Hi All, >> >> > >> >> > The title say it all. >> >> > >> >> > If you pass in a value for sun.boot.library.path consisting >> >> > of one or more paths that are too long, then the vm will >> >> > fail to start because it can't load one of the libraries it >> >> > needs (the zip library), despite the fact that the VM >> >> > automatically prepends the default library path to the >> >> > sun.boot.library.path property, using the correct separator >> >> > to divide it from the user-specified path. >> >> > >> >> > So we've got the right path, in the right place, at the >> >> > right time, we just can't *use* it. 
>> >> > >> >> > I've fixed this by changing the relevant os.cpp code to >> >> > ignore paths that are too long, and to attempt to locate >> >> > the needed library on the other paths (if any are valid). >> >> >> >> As I just added to the bug report I have a different view of "correct" >> >> here. If you just ignore the long path and keep processing other short >> >> paths you may find the wrong library. There is a user error here and >> >> that error should be reported ASAP and in a way that leads to failure >> >> ASAP. Perhaps we should be more aggressive in aborting the VM when this >> >> is detected? >> >> >> >> David >> >> ----- >> >> >> >> > I've also added functionality to handle the edge case of >> >> > paths that are neeeeeeearly too long, only for a >> >> > sub-path (or file name) to push us over the limit *after* >> >> > the split_path function is done assessing the path length. >> >> > >> >> > I've also changed the code we're overriding, on the assumption >> >> > that someone's still using it somewhere. >> >> > >> >> > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 >> >> > Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ >> >> > >> >> > Thoughts and impressions welcome. >> >> > >> >> > Best Regards >> >> > >> >> > Adam Farley >> >> > IBM Runtimes >> >> > >> >> > Unless stated otherwise above: >> >> > IBM United Kingdom Limited - Registered in England and Wales with number >> >> > 741598.
>> >> > Registered office: PO Box 41, North Harbour, Portsmouth, >> Hampshire ?PO6 3AU >> >> > >> >> >> > >> > Unless stated otherwise above: >> > IBM United Kingdom Limited - Registered in England and Wales with number >> > 741598. >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Thu Jul 4 07:38:06 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 Jul 2019 17:38:06 +1000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: References: Message-ID: Hi Matthias, On 27/06/2019 6:56 pm, Baesken, Matthias wrote: > Hello, please review the following small patch . > It adds event logging to the UserHandler (user signal handler) calls . That seems reasonable. > (additionally it adds a function os::win32::get_signal_name > to get signal names for signal numbers ; this is similar to what we already had for posix ). If you add this then we don't need distinct POSIX and non-POSIX versions - the existing os::Posix::get_signal_name etc could all be hoisted into os.cpp and the os class - no? Aside: I spotted this in UserHandler: // 4511530 - sem_post is serialized and handled by the manager thread. When // the program is interrupted by Ctrl-C, SIGINT is sent to every thread. We // don't want to flood the manager thread with sem_post requests. if (sig == SIGINT && Atomic::add(1, &sigint_count) > 1) return; That's a LinuxThreads anachronism which has been copied, unnecessarily into the other OS implementations. I will file a RFE to get rid of it. 
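The hoisting suggested above would leave a single shared lookup instead of per-platform copies. As a rough illustration only — this is not the actual HotSpot os.cpp code, and the table and function shape here are assumptions — a platform-neutral signal-name lookup can be as simple as:

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstring>

// Hypothetical sketch of a shared get_signal_name, along the lines of
// hoisting os::Posix::get_signal_name into the common os class. The table
// covers only a few signals common to POSIX and Windows CRT; the real
// HotSpot tables are per platform and far more complete.
struct SignalEntry {
  int num;
  const char* name;
};

static const SignalEntry k_signal_table[] = {
  { SIGINT,  "SIGINT"  },
  { SIGTERM, "SIGTERM" },
  { SIGSEGV, "SIGSEGV" },
  { SIGABRT, "SIGABRT" },
};

// Returns the symbolic name for sig, or "UNKNOWN" if it is not in the table.
const char* get_signal_name(int sig) {
  for (size_t i = 0; i < sizeof(k_signal_table) / sizeof(k_signal_table[0]); i++) {
    if (k_signal_table[i].num == sig) {
      return k_signal_table[i].name;
    }
  }
  return "UNKNOWN";
}
```

A platform-specific is_valid_signal check, as mentioned later in the thread, could then wrap this shared core behind a small ifdef.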
Thanks, David > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8226816 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.0/ > > Thanks, Matthias > From stefan.karlsson at oracle.com Thu Jul 4 09:48:53 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 4 Jul 2019 11:48:53 +0200 Subject: RFR: 8227175: ZGC: ZHeapIterator visits potentially dead objects In-Reply-To: References: Message-ID: <003b5282-abf0-1820-4fb9-c9856850078b@oracle.com> Thanks for reviewing! StefanK On 2019-07-03 16:03, Zhengyu Gu wrote: > Hi Stefan, > > Runtime part looks good to me. > > I had the same thoughts when moving Shenandoah CLDG evacuation to > concurrent phase, then heap dump started to interfere concurrent CLDG > iteration. Because Shenandoah heap dump uses single-thread to walk CLDG, > we used _claimed_none to avoid the problem. The change can potentially > allow us to relax the restriction. > > Thanks, > > -Zhengyu > > > > On 7/3/19 9:31 AM, Stefan Karlsson wrote: >> Hi all, >> >> (Sending this RFR to hotspot-dev since it changes CLD claiming.) >> >> Please review this patch to fix the ZHeapIterator to not visit >> potentially dead objects. >> >> https://cr.openjdk.java.net/~stefank/8227175/ >> https://bugs.openjdk.java.net/browse/JDK-8227175 >> >> It changes how all heap iterations are done in ZGC. Previously, the >> marking code visited only the strong CLDs and traced through metadata >> to find all other CLDs that should be considered alive. The >> verification code, serviceability heap iterations, and marking without >> class unloading, skipped the metadata tracing part and visited all >> CLDs instead. Now, with this patch, all these heap iterations starts >> with the strong CLDs and trace through the object graph. >> >> One complication with that scheme is that non-GC heap iterations might >> be executing after concurrent marking has started, but before dead >> CLDs have been unlinked. 
To allow the GC marking code and one, at a >> time, non-GC heap iteration to run at the same time, I've introduced a >> new claim bit in the CLD claiming byte. I've called it "other", so now >> we have "strong", "finalizable", and "other". The contract is that the >> "other" bits should only be used in a safepoint operation, and must be >> cleared before the operation ends. This way we get mutual exclusion >> between different users of the "other" bits. >> >> The patch also adds more precise verification of ZGC references. >> >> This patch was written a few weeks ago to make the verification of ZGC >> references more precise. I've been using it since then to get better >> verification of other patches and when hunting for bugs. This means >> I've been running it through tier 1-7 multiple times. The intent was >> to get this pushed to JDK 14, but now that we've seen that we have an >> actual bug because of the imprecise nature of the ZHeapIterator, I'd >> like to get this patch pushed to JDK 13. We considered trying to split >> this up into two parts, the first part that fixes the heap iterations >> and the second part that adds the extra ZGC verification, but we thing >> that would take longer time and be riskier. >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Thu Jul 4 09:53:35 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 4 Jul 2019 11:53:35 +0200 Subject: RFR: 8227175: ZGC: ZHeapIterator visits potentially dead objects In-Reply-To: <0a8696ee-241f-b9b1-4ee1-1045d0cc21c9@oracle.com> References: <0a8696ee-241f-b9b1-4ee1-1045d0cc21c9@oracle.com> Message-ID: Thanks for reviewing! StefanK On 2019-07-03 16:18, wrote: > Hi Stefan, > > Looks good. > > Thanks, > /Erik > > On 2019-07-03 15:31, Stefan Karlsson wrote: >> Hi all, >> >> (Sending this RFR to hotspot-dev since it changes CLD claiming.) >> >> Please review this patch to fix the ZHeapIterator to not visit >> potentially dead objects. 
>> >> https://cr.openjdk.java.net/~stefank/8227175/ >> https://bugs.openjdk.java.net/browse/JDK-8227175 >> >> It changes how all heap iterations are done in ZGC. Previously, the >> marking code visited only the strong CLDs and traced through metadata >> to find all other CLDs that should be considered alive. The >> verification code, serviceability heap iterations, and marking without >> class unloading, skipped the metadata tracing part and visited all >> CLDs instead. Now, with this patch, all these heap iterations starts >> with the strong CLDs and trace through the object graph. >> >> One complication with that scheme is that non-GC heap iterations might >> be executing after concurrent marking has started, but before dead >> CLDs have been unlinked. To allow the GC marking code and one, at a >> time, non-GC heap iteration to run at the same time, I've introduced a >> new claim bit in the CLD claiming byte. I've called it "other", so now >> we have "strong", "finalizable", and "other". The contract is that the >> "other" bits should only be used in a safepoint operation, and must be >> cleared before the operation ends. This way we get mutual exclusion >> between different users of the "other" bits. >> >> The patch also adds more precise verification of ZGC references. >> >> This patch was written a few weeks ago to make the verification of ZGC >> references more precise. I've been using it since then to get better >> verification of other patches and when hunting for bugs. This means >> I've been running it through tier 1-7 multiple times. The intent was >> to get this pushed to JDK 14, but now that we've seen that we have an >> actual bug because of the imprecise nature of the ZHeapIterator, I'd >> like to get this patch pushed to JDK 13. We considered trying to split >> this up into two parts, the first part that fixes the heap iterations >> and the second part that adds the extra ZGC verification, but we thing >> that would take longer time and be riskier. 
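The claim-bit contract described above — each visitor atomically claims a CLD with its own bit so the CLD is traversed at most once per iteration, with the "other" bit reserved for a single safepoint-scoped iteration at a time and cleared before the operation ends — can be sketched in miniature. This is a toy model, not the real ClassLoaderData code; all names below are invented:

```cpp
#include <atomic>
#include <cassert>

// Simplified claim byte with three claim bits, mirroring the
// "strong"/"finalizable"/"other" scheme described in the RFR.
enum ClaimBits : int {
  claim_strong      = 1,
  claim_finalizable = 2,
  claim_other       = 4
};

struct FakeCLD {
  std::atomic<int> claim_bits{0};

  // Atomically set `bit`; returns true iff this caller won the claim.
  bool try_claim(int bit) {
    int old = claim_bits.load();
    while ((old & bit) == 0) {
      if (claim_bits.compare_exchange_weak(old, old | bit)) {
        return true;
      }
      // compare_exchange_weak reloads `old` on failure; loop re-checks.
    }
    return false; // someone else already holds this claim bit
  }

  // The "other" bit must be cleared before the safepoint operation ends,
  // giving mutual exclusion between successive non-GC iterations.
  void clear_claim(int bit) {
    claim_bits.fetch_and(~bit);
  }
};
```

The key property is that claiming with one bit does not block claiming with another, so a GC marking pass and one non-GC safepoint iteration can each visit every CLD exactly once.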
>> >> Thanks, >> StefanK > From erik.osterlund at oracle.com Thu Jul 4 12:55:27 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 4 Jul 2019 14:55:27 +0200 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: <39D8C321-D15C-41A4-86D9-403FDB75B224@oracle.com> References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <39D8C321-D15C-41A4-86D9-403FDB75B224@oracle.com> Message-ID: Hi Kim, Thanks for the review. /Erik On 2019-07-03 20:55, Kim Barrett wrote: >> On Jul 3, 2019, at 6:15 AM, Erik ?sterlund wrote: >> >> [?] >> >> So as you can see we pointer chase through this chain of possibly dead memory. What could possibly go wrong though? > Hahaha! > > Thanks for the detailed description. > >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8224531 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/ >> >> Thanks, >> /Erik > Looks good. > From erik.osterlund at oracle.com Thu Jul 4 12:59:11 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 4 Jul 2019 14:59:11 +0200 Subject: RFR[13]: 8224531: SEGV while collecting Klass statistics In-Reply-To: References: <479363ab-3157-03c0-c05f-b03b809cde91@oracle.com> <0208491e-37a3-6a84-0f80-d122d28ea4b7@oracle.com> Message-ID: Hi Thomas, Thanks for the review. On 2019-07-03 21:31, Thomas Schatzl wrote: > Hi, > > On Wed, 2019-07-03 at 08:10 -0400, coleen.phillimore at oracle.com wrote: > http://cr.openjdk.java.net/~eosterlund/8224531/webrev.00/src/hotspot/share/memory/heapInspection.cpp.frames.html >> There's another object_iterate() in this file with a comment to >> change it to safe_object_iterate(). Should you change that too? > I think this is a different issue as Erik pointed out, this is > iteration during a safepoint. Not that I think that this is much safer > *and* there is already the comment there that this might not work with > CMS either. > Erik, can you file a CR? Sure, can do. 
Thanks, /Erik >> Did you run the jvmti tests? There used to be tests that failed if >> dead objects weren't found, but the tests may have been fixed. > It would be nice to at least know which jvmti tests iterate over dead > objects before pushing this if possible. > >> The rest of the change looks good. Thank you for figuring this out! > Change looks good. > > Thanks, > Thomas > > From matthias.baesken at sap.com Thu Jul 4 13:06:27 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 4 Jul 2019 13:06:27 +0000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: References: Message-ID: Hi David, thanks for looking into this . > > If you add this then we don't need distinct POSIX and non-POSIX versions > - the existing os::Posix::get_signal_name etc could all be hoisted into > os.cpp and the os class - no? > Should I go for this ? The coding is still a little different (e.g. is_valid_signal (.. ) call in os_posix ) but I think it could be done without much trouble (maybe with a few small ifdefs ) . > That's a LinuxThreads anachronism which has been copied, unnecessarily > into the other OS implementations. I will file a RFE to get rid of it. Good catch ! Best regards, Matthias > > Hi Matthias, > > On 27/06/2019 6:56 pm, Baesken, Matthias wrote: > > Hello, please review the following small patch . > > It adds event logging to the UserHandler (user signal handler) calls . > > That seems reasonable. > > > (additionally it adds a function os::win32::get_signal_name > > to get signal names for signal numbers ; this is similar to what we already > had for posix ). > > If you add this then we don't need distinct POSIX and non-POSIX versions > - the existing os::Posix::get_signal_name etc could all be hoisted into > os.cpp and the os class - no? > > Aside: I spotted this in UserHandler: > > // 4511530 - sem_post is serialized and handled by the manager > thread. 
When > // the program is interrupted by Ctrl-C, SIGINT is sent to every > thread. We > // don't want to flood the manager thread with sem_post requests. > if (sig == SIGINT && Atomic::add(1, &sigint_count) > 1) > return; > > That's a LinuxThreads anachronism which has been copied, unnecessarily > into the other OS implementations. I will file a RFE to get rid of it. > > Thanks, > David > > > > > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8226816 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.0/ > > > > Thanks, Matthias > > From erik.osterlund at oracle.com Thu Jul 4 15:02:52 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 4 Jul 2019 17:02:52 +0200 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls Message-ID: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Hi, The i2c adapter sets a thread-local "callee_target" Method*, which is caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c call is "bad" (e.g. not_entrant). This error handler forwards execution to the callee c2i entry. If the SharedRuntime::handle_wrong_method method is called again due to the i2c2i call being still bad, then we will crash the VM in the following guarantee in SharedRuntime::handle_wrong_method: Method* callee = thread->callee_target(); guarantee(callee != NULL && callee->is_method(), "bad handshake"); Unfortunately, the c2i entry can indeed fail again if it, e.g., hits the new class initialization entry barrier of the c2i adapter. The solution is to simply not clear the thread-local "callee_target" after handling the first failure, as we can't really know there won't be another one. There is no reason to clear this value as nobody else reads it than the SharedRuntime::handle_wrong_method handler (and we really do want it to be able to read the value as many times as it takes until the call goes through). 
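The fix described above — read but do not clear the thread-local callee target, so the handler remains usable however many times the call fails — can be modeled in a few lines. This is a toy analogue, not HotSpot's SharedRuntime; FakeMethod, i2c_set_callee, and the clear_after flag are all illustrative:

```cpp
#include <cassert>
#include <cstddef>

// Toy model of the callee_target handshake between the i2c adapter and
// SharedRuntime::handle_wrong_method.
struct FakeMethod { int id; };

thread_local FakeMethod* g_callee_target = nullptr;

// The i2c adapter stores the callee before attempting the call.
void i2c_set_callee(FakeMethod* m) { g_callee_target = m; }

// With the old behaviour (clear_after = true), a second invocation of the
// handler for the same call finds a null callee — the situation that trips
// the "bad handshake" guarantee. Without clearing, retries keep working.
FakeMethod* handle_wrong_method(bool clear_after) {
  FakeMethod* callee = g_callee_target;
  if (clear_after) {
    g_callee_target = nullptr;
  }
  return callee; // forward execution to the callee's c2i entry
}
```

Nothing else reads the slot, so leaving it set is harmless, and the next successful i2c dispatch simply overwrites it.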
I found some confused clearing of this callee_target in JavaThread::oops_do(), with a comment saying this is a methodOop that we need to clear to make GC happy or something. Seems like old traces of perm gen. So I deleted that too. I caught this in ZGC where the timing window for hitting this issue seems to be wider due to concurrent code cache unloading. But it is equally problematic for all GCs. Bug: https://bugs.openjdk.java.net/browse/JDK-8227260 Webrev: http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ Thanks, /Erik From adam.farley at uk.ibm.com Thu Jul 4 16:41:23 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Thu, 4 Jul 2019 17:41:23 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <842ae03e-8574-593e-3ac2-5cc283832be9@oracle.com> References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> <842ae03e-8574-593e-3ac2-5cc283832be9@oracle.com> Message-ID: Hi David, To detect a too-long path when it's being passed in, the best option I can see is to check it in two places: 1) when it's being set initially with the location of libjvm.so, either: a)in hotspot/os/[os name]/os_[os name].cpp, right before the call to Arguments::set_dll_dir or b), in the Arguments::set_dll_dir function itself (ideally the latter) 2) when/if the extra paths are being passed in as a parameter, as they pass through hotspot/share/runtime/arguments.cpp, right after the line: --- else if (strcmp(key, "sun.boot.library.path") == 0)"); --- You're right in that this could slow down startup a little, with the length checking, and the potential looping over the -D value to check the length of each path. Not a major slowdown though. 
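The per-element length check outlined above could, schematically, look like the following. This is a hypothetical sketch, not the proposed os.cpp change: the function name, the limit parameter, and the skip-and-count policy (as opposed to aborting the VM outright) are all placeholders for the behaviour still under discussion in this thread:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a search-path property on `sep`, dropping any element longer than
// `max_len` (standing in for JVM_MAXPATHLEN) and counting how many were
// rejected so the caller can warn — or abort — as policy dictates.
std::vector<std::string> split_path_checked(const std::string& paths,
                                            char sep,
                                            size_t max_len,
                                            size_t* rejected) {
  std::vector<std::string> out;
  *rejected = 0;
  size_t start = 0;
  while (start <= paths.size()) {
    size_t end = paths.find(sep, start);
    if (end == std::string::npos) end = paths.size();
    std::string elem = paths.substr(start, end - start);
    if (!elem.empty()) {
      if (elem.size() > max_len) {
        ++*rejected;            // too long for this system; skipped
      } else {
        out.push_back(elem);
      }
    }
    start = end + 1;
  }
  return out;
}
```

Checking each element individually also avoids the separate problem mentioned later in the thread, where the combined length of several short paths is compared against the single-path limit.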
Best Regards Adam Farley IBM Runtimes David Holmes wrote on 04/07/2019 07:57:14: > From: David Holmes > To: Adam Farley8 > Cc: hotspot-dev at openjdk.java.net > Date: 04/07/2019 07:58 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > On 4/07/2019 1:42 am, Adam Farley8 wrote: > > Hi David, > > > > I figured it should be elaborate so we can avoid killing the VM > > if we don't have to. > > > > Ultimately, if we have a list of three paths and the last two > > are invalid, does it matter so long as all the libraries we need > > are in the first path? > > I prefer not see the users error ignored if we can reasonably detect it. > They set the paths for a reason, and if they paths are invalid they > probably would like to know. > > > As to your question "is it in hostpot or JDK code", I presume you > > mean in the change set. I'm primarily referring to the hotspot code. > > No I mean where in the current code will we detect that one of these > path elements is too long? > > > Also, if we end up adopting a "kill the vm if any path is too long" > > approach, we still need to change the JDK code, as those currently > > seem to want to fail if the total length of the sub.boot.library.path > > property is longer than the maximum length of a single path. > > > > So if you pass in three 100 character paths on Windows, it'll fail > > because they add up to more than the 260 character path limit. > > That seems like a separate bug that should be addressed. 
:( > > Thanks, > David > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > David Holmes wrote on 03/07/2019 08:36:36: > > > >> From: David Holmes > >> To: Adam Farley8 > >> Cc: hotspot-dev at openjdk.java.net > >> Date: 03/07/2019 08:36 > >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > >> paths are longer than JVM_MAXPATHLEN > >> > >> On 2/07/2019 7:44 pm, Adam Farley8 wrote: > >> > Hi David, > >> > > >> > Thanks for your thoughts. > >> > > >> > The user should absolutely have immediate feedback, yes, and I agree > >> > that "skipping" paths could lead to us loading the wrong library. > >> > > >> > Perhaps a compromise? We fire off a stderr warning if any of the paths > >> > are too long (without killing the VM), we ignore any path *after* > >> > (and including) the first too-long path, and we kill the VM if the > >> > first path is too long. > >> > >> My first though is why be so elaborate and not just fail immediately: > >> > >> Error occurred during initialization of VM > >> One or more sun.boot.library.path elements is too long for this system. > >> --- > >> > >> ? But AFAICS we don't do any sanity checking of the those paths so this > >> would have an impact on startup. > >> > >> I can't locate where we would detect the too-long path element, is it in > >> hostpot or JDK code? > >> > >> Thanks, > >> David > >> ----- > >> > >> > Warning message example: > >> > > >> > ---- > >> > Warning: One or more sun.boot.library.path paths were too long > >> > for this system, and it (along with all subsequent paths) have been > >> > ignored. > >> > ---- > >> > > >> > Another addition could be to check the path lengths for the property > >> > sooner, thus aborting the VM faster if the default path is too long. > >> > > >> > Assuming we posit that the VM will always need to load libraries. 
> >> > > >> > Best Regards > >> > > >> > Adam Farley > >> > IBM Runtimes > >> > > >> > > >> > David Holmes wrote on 01/07/2019 22:10:45: > >> > > >> >> From: David Holmes > >> >> To: Adam Farley8 , hotspot-dev at openjdk.java.net > >> >> Date: 01/07/2019 22:12 > >> >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > >> >> paths are longer than JVM_MAXPATHLEN > >> >> > >> >> Hi Adam, > >> >> > >> >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: > >> >> > Hi All, > >> >> > > >> >> > The title say it all. > >> >> > > >> >> > If you pass in a value for sun.boot.library.path consisting > >> >> > of one or more paths that are too long, then the vm will > >> >> > fail to start because it can't load one of the libraries it > >> >> > needs (the zip library), despite the fact that the VM > >> >> > automatically prepends the default library path to the > >> >> > sun.boot.library.path property, using the correct separator > >> >> > to divide it from the user-specified path. > >> >> > > >> >> > So we've got the right path, in the right place, at the > >> >> > right time, we just can't *use* it. > >> >> > > >> >> > I've fixed this by changing the relevant os.cpp code to > >> >> > ignore paths that are too long, and to attempt to locate > >> >> > the needed library on the other paths (if any are valid). > >> >> > >> >> As I just added to the bug report I have a different view of "correct" > >> >> here. If you just ignore the long path and keep processing other short > >> >> paths you may find the wrong library. There is a user error here and > >> >> that error should be reported ASAP and in a way that leads to failure > >> >> ASAP. Perhaps we should be more aggressive in aborting the VMwhen this > >> >> is detected? 
> >> >> > >> >> David > >> >> ----- > >> >> > >> >> > I've also added functionality to handle the edge case of > >> >> > paths that are neeeeeeearly too long, only for a > >> >> > sub-path (or file name) to push us over the limit *after* > >> >> > the split_path function is done assessing the path length. > >> >> > > >> >> > I've also changed the code we're overriding, on the assumption > >> >> > that someone's still using it somewhere. > >> >> > > >> >> > Bug: https://urldefense.proofpoint.com/v2/url? > >> >> > >> > u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8227021&d=DwICaQ&c=jf_iaSHvJObTbx- > >> >> siA1ZOg&r=P5m8KWUXJf- > >> >> > >> > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=xZzQCnv68xd9hJyyK1obSim38eWSRmLPfuR__9ddZWg&e= > >> >> > Webrev: https://urldefense.proofpoint.com/v2/url? > >> >> > >> > u=http-3A__cr.openjdk.java.net_-7Eafarley_8227021_webrev_&d=DwICaQ&c=jf_iaSHvJObTbx- > >> >> siA1ZOg&r=P5m8KWUXJf- > >> >> > >> > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=- > >> >> hKU0zUd_0LDT08wTilexgI54EeSgt8xUk97i6V63Bk&e= > >> >> > > >> >> > Thoughts and impressions welcome. > >> >> > > >> >> > Best Regards > >> >> > > >> >> > Adam Farley > >> >> > IBM Runtimes > >> >> > > >> >> > Unless stated otherwise above: > >> >> > IBM United Kingdom Limited - Registered in England and > Wales with number > >> >> > 741598. > >> >> > Registered office: PO Box 41, North Harbour, Portsmouth, > >> Hampshire PO6 3AU > >> >> > > >> >> > >> > > >> > Unless stated otherwise above: > >> > IBM United Kingdom Limited - Registered in England and Wales with number > >> > 741598. > >> > Registered office: PO Box 41, North Harbour, Portsmouth, > Hampshire PO6 3AU > >> > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with number > > 741598. 
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Thu Jul 4 21:14:06 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Jul 2019 07:14:06 +1000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: References: Message-ID: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> On 4/07/2019 11:06 pm, Baesken, Matthias wrote: > Hi David, thanks for looking into this . > >> >> If you add this then we don't need distinct POSIX and non-POSIX versions >> - the existing os::Posix::get_signal_name etc could all be hoisted into >> os.cpp and the os class - no? >> > > Should I go for this ? > The coding is still a little different (e.g. is_valid_signal (.. ) call in os_posix ) but I think it could be done without much trouble (maybe with a few small ifdefs ) . I think it's worth trying it. I have to apologize in advance though as I'm about to disappear on two weeks vacation so may not be able to follow through on this. Thanks, David > >> That's a LinuxThreads anachronism which has been copied, unnecessarily >> into the other OS implementations. I will file a RFE to get rid of it. > > Good catch ! > > Best regards, Matthias > >> >> Hi Matthias, >> >> On 27/06/2019 6:56 pm, Baesken, Matthias wrote: >>> Hello, please review the following small patch . >>> It adds event logging to the UserHandler (user signal handler) calls . >> >> That seems reasonable. >> >>> (additionally it adds a function os::win32::get_signal_name >>> to get signal names for signal numbers ; this is similar to what we already >> had for posix ). >> >> If you add this then we don't need distinct POSIX and non-POSIX versions >> - the existing os::Posix::get_signal_name etc could all be hoisted into >> os.cpp and the os class - no? 
>> >> Aside: I spotted this in UserHandler: >> >> // 4511530 - sem_post is serialized and handled by the manager >> thread. When >> // the program is interrupted by Ctrl-C, SIGINT is sent to every >> thread. We >> // don't want to flood the manager thread with sem_post requests. >> if (sig == SIGINT && Atomic::add(1, &sigint_count) > 1) >> return; >> >> That's a LinuxThreads anachronism which has been copied, unnecessarily >> into the other OS implementations. I will file a RFE to get rid of it. >> >> Thanks, >> David >> >>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8226816 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.0/ >>> >>> Thanks, Matthias >>> From david.holmes at oracle.com Thu Jul 4 21:21:59 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Jul 2019 07:21:59 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> <842ae03e-8574-593e-3ac2-5cc283832be9@oracle.com> Message-ID: <08a6c8a3-bd3e-25db-2460-cea7c8fbb3f3@oracle.com> Hi Adam, On 5/07/2019 2:41 am, Adam Farley8 wrote: > Hi David, > > To detect a too-long path when it's being passed in, the best option > I can see is to check it in two places: Right, but my outstanding question relates to the existing code today. Where will we detect that a path element is too long? I'm still not sure whether the VM has the right to dictate behaviour here or whether this belongs to core-libs. And we need to be very careful about any change in behaviour. > 1) when it's being set initially with the location of libjvm.so, either: > ? ? a)in hotspot/os/[os name]/os_[os name].cpp, right before the call > ?to Arguments::set_dll_dir > ? ? 
?or b), in the Arguments::set_dll_dirfunction itself (ideally the > latter) > > 2) when/if the extra paths are being passed in as a parameter, as they > pass through hotspot/share/runtime/arguments.cpp, right after the line: > > --- > else if (_strcmp_(key, "sun.boot.library.path") == 0)"); > --- > > You're right in that this could slow down startup a little, with > the length checking, and the potential looping over the -D value > to check the length of each path. Not a major slowdown though. I'm sure Claes would disagree :) Apologies in advance as I'm about to disappear for two weeks vacation. David ----- > Best Regards > > Adam Farley > IBM Runtimes > > > David Holmes wrote on 04/07/2019 07:57:14: > >> From: David Holmes >> To: Adam Farley8 >> Cc: hotspot-dev at openjdk.java.net >> Date: 04/07/2019 07:58 >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path >> paths are longer than JVM_MAXPATHLEN >> >> Hi Adam, >> >> On 4/07/2019 1:42 am, Adam Farley8 wrote: >> > Hi David, >> > >> > I figured it should be elaborate so we can avoid killing the VM >> > if we don't have to. >> > >> > Ultimately, if we have a list of three paths and the last two >> > are invalid, does it matter so long as all the libraries we need >> > are in the first path? >> >> I prefer not see the users error ignored if we can reasonably detect it. >> They set the paths for a reason, and if they paths are invalid they >> probably would like to know. >> >> > As to your question "is it in hostpot or JDK code", I presume you >> > mean in the change set. I'm primarily referring to the hotspot code. >> >> No I mean where in the current code will we detect that one of these >> path elements is too long? 
>> >> > Also, if we end up adopting a "kill the vm if any path is too long" >> > approach, we still need to change the JDK code, as those currently >> > seem to want to fail if the total length of the sub.boot.library.path >> > property is longer than the maximum length of a single path. >> > >> > So if you pass in three 100 character paths on Windows, it'll fail >> > because they add up to more than the 260 character path limit. >> >> That seems like a separate bug that should be addressed. :( >> >> Thanks, >> David >> >> > Best Regards >> > >> > Adam Farley >> > IBM Runtimes >> > >> > >> > David Holmes wrote on 03/07/2019 08:36:36: >> > >> >> From: David Holmes >> >> To: Adam Farley8 >> >> Cc: hotspot-dev at openjdk.java.net >> >> Date: 03/07/2019 08:36 >> >> Subject: Re: RFR: JDK-8227021: ?VM fails if any sun.boot.library.path >> >> paths are longer than JVM_MAXPATHLEN >> >> >> >> On 2/07/2019 7:44 pm, Adam Farley8 wrote: >> >> > Hi David, >> >> > >> >> > Thanks for your thoughts. >> >> > >> >> > The user should absolutely have immediate feedback, yes, and ?I agree >> >> > that "skipping" paths could lead to us loading the ?wrong library. >> >> > >> >> > Perhaps a compromise? We fire off a stderr warning if any of ?the paths >> >> > are too long (without killing the VM), we ignore any path *after* >> >> > (and including) the first too-long path, and we kill the VM if ?the >> >> > first path is too long. >> >> >> >> My first though is why be so elaborate and not just fail immediately: >> >> >> >> Error occurred during initialization of VM >> >> One or more sun.boot.library.path elements is too long for this system. >> >> --- >> >> >> >> ? But AFAICS we don't do any sanity checking of the those paths so ?this >> >> would have an impact on startup. >> >> >> >> I can't locate where we would detect the too-long path element, is ?it in >> >> hostpot or JDK code? 
>> >> >> >> Thanks, >> >> David >> >> ----- >> >> >> >> > Warning message example: >> >> > >> >> > ---- >> >> > Warning: One or more sun.boot.library.path paths were too long >> >> > for this system, and it (along with all subsequent paths) have ?been >> >> > ignored. >> >> > ---- >> >> > >> >> > Another addition could be to check the path lengths for the property >> >> > sooner, thus aborting the VM faster if the default path is too ?long. >> >> > >> >> > Assuming we posit that the VM will always need to load libraries. >> >> > >> >> > Best Regards >> >> > >> >> > Adam Farley >> >> > IBM Runtimes >> >> > >> >> > >> >> > David Holmes wrote on 01/07/2019 ?22:10:45: >> >> > >> >> >> From: David Holmes >> >> >> To: Adam Farley8 , ?hotspot-dev at openjdk.java.net >> >> >> Date: 01/07/2019 22:12 >> >> >> Subject: Re: RFR: JDK-8227021: ?VM fails if any sun.boot.library.path >> >> >> paths are longer than JVM_MAXPATHLEN >> >> >> >> >> >> Hi Adam, >> >> >> >> >> >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: >> >> >> > Hi All, >> >> >> > >> >> >> > The title say it all. >> >> >> > >> >> >> > If you pass in a value for sun.boot.library.path consisting >> >> >> > of one or more paths that are too long, then the vm ?will >> >> >> > fail to start because it can't load one of the libraries ?it >> >> >> > needs (the zip library), despite the fact that the VM >> >> >> > automatically prepends the default library path to the >> >> >> > sun.boot.library.path property, using the correct separator >> >> >> > to divide it from the user-specified path. >> >> >> > >> >> >> > So we've got the right path, in the right place, at ?the >> >> >> > right time, we just can't *use* it. >> >> >> > >> >> >> > I've fixed this by changing the relevant os.cpp code ?to >> >> >> > ignore paths that are too long, and to attempt to locate >> >> >> > the needed library on the other paths (if any are valid). 
>> >> >> >> >> >> As I just added to the bug report I have a different view ?of "correct" >> >> >> here. If you just ignore the long path and keep processing ?other short >> >> >> paths you may find the wrong library. There is a user error ?here and >> >> >> that error should be reported ASAP and in a way that leads ?to failure >> >> >> ASAP. Perhaps we should be more aggressive in aborting the ?VMwhen ?this >> >> >> is detected? >> >> >> >> >> >> David >> >> >> ----- >> >> >> >> >> >> > I've also added functionality to handle the edge case ?of >> >> >> > paths that are neeeeeeearly too long, only for a >> >> >> > sub-path (or file name) to push us over the limit *after* >> >> >> > the split_path function is done assessing the path length. >> >> >> > >> >> >> > I've also changed the code we're overriding, on the ?assumption >> >> >> > that someone's still using it somewhere. >> >> >> > >> >> >> > Bug: https://urldefense.proofpoint.com/v2/url? >> >> >> >> >> >> u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8227021&d=DwICaQ&c=jf_iaSHvJObTbx- >> >> >> siA1ZOg&r=P5m8KWUXJf- >> >> >> >> >> >> CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=xZzQCnv68xd9hJyyK1obSim38eWSRmLPfuR__9ddZWg&e= >> >> >> > Webrev: https://urldefense.proofpoint.com/v2/url? >> >> >> >> >> >> u=http-3A__cr.openjdk.java.net_-7Eafarley_8227021_webrev_&d=DwICaQ&c=jf_iaSHvJObTbx- >> >> >> siA1ZOg&r=P5m8KWUXJf- >> >> >> >> >> >> CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=cSTGBGkEsu5yl0haJ6it9egPSgixg7mRei6lBDB5Y3k&s=- >> >> >> hKU0zUd_0LDT08wTilexgI54EeSgt8xUk97i6V63Bk&e= >> >> >> > >> >> >> > Thoughts and impressions welcome. >> >> >> > >> >> >> > Best Regards >> >> >> > >> >> >> > Adam Farley >> >> >> > IBM Runtimes >> >> >> > >> >> >> > Unless stated otherwise above: >> >> >> > IBM United Kingdom Limited - Registered in England and >> Wales ?with number >> >> >> > 741598. 
>> >> >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> >> >> > From erik.osterlund at oracle.com Fri Jul 5 10:19:14 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 5 Jul 2019 12:19:14 +0200 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects Message-ID: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> Hi, In the HeapInspection::find_instances_at_safepoint function, the unsafe heap iteration API (which also walks dead objects) is used to find objects that are instance of a class, used for concurrent lock dumping where we find dead java.util.concurrent.locks.AbstractOwnableSynchronizer objects and pointer chase to its possibly dead owner threadObj. There is a comment saying that if this starts crashing because we use CMS, we should probably change to use the safe_object_iterate() API instead, which does not include dead objects. Arguably, whether CMS is observed to crash or not, we really should not be walking over dead objects and exposing them anyway. It's not safe... and it will crash sooner or later. For example, CMS yields to safepoints (including young GCs) while sweeping.
This means that both the AbstractOwnableSynchronizer and its owner thread might have died, but while sweeping, we could yield for a young GC that promotes objects overriding the memory of the dead thread object with random primitives, but not yet freeing the dead AbstractOwnableSynchronizer. A subsequent dumping operation could use the heap walker to find the dead AbstractOwnableSynchronizer, and pointer chase into its dead owner thread, which by now has been freed and had its memory clobbered with primitive data. This will all eventually end up in a glorious crash. So we shouldn't do this. Bug: https://bugs.openjdk.java.net/browse/JDK-8227277 Webrev: http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ Thanks, /Erik From erik.osterlund at oracle.com Fri Jul 5 10:33:55 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 5 Jul 2019 12:33:55 +0200 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls Message-ID: Hi, The i2c adapter sets a thread-local "callee_target" Method*, which is caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c call is "bad" (e.g. not_entrant). This error handler forwards execution to the callee c2i entry. If the SharedRuntime::handle_wrong_method method is called again due to the i2c2i call being still bad, then we will crash the VM in the following guarantee in SharedRuntime::handle_wrong_method: Method* callee = thread->callee_target(); guarantee(callee != NULL && callee->is_method(), "bad handshake"); Unfortunately, the c2i entry can indeed fail again if it, e.g., hits the new class initialization entry barrier. I think a solution to this problem should stop making assumptions about how many things can go wrong when calling a method from the interpreter. I caught this in ZGC where the timing window for hitting this issue seems to be wider due to concurrent code cache unloading. 
But it is equally problematic for all GCs. With ZGC, I could catch this failing in SPECjbb2015 where a static method is called from JNI. I could reliably (25% chance) reproduce it, and with the patch it no longer reproduces after 25 runs. I also tried hs-tier1-5, and it looked good. Webrev: http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8227260 Thanks, /Erik From vladimir.x.ivanov at oracle.com Fri Jul 5 11:14:11 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 5 Jul 2019 14:14:11 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: Thanks for diagnosing the issue, Erik! Can you elaborate, please, on relation to clinit barrier in c2i? I don't see how it is possible to hit clinit barrier during i2c2i transition. Template interpreter has clinit barrier as part of invokestatic handler [1], so by the time c2i is reached, proper checks should be already performed and all the conditions to pass the barrier should be met. If SR::handle_wrong_method is called from clinit barrier in c2i during i2c2i, it signals about a bug in the new clinit logic: somehow clinit barrier is bypassed in interpreter. Are you sure it is not related to upcalls from native code (caller_frame.is_entry_frame() [1])? Also, SR::handle_wrong_method() calls coming from clinit barriers shouldn't hit the fast path w/ callee_target(), because it bypasses the actual initialization check happening during call site re-resolution. Best regards, Vladimir Ivanov PS: regarding clearing JavaThread::_callee_target in JavaThread::oops_do(), I'd prefer to keep it and limit the exposure of a stale Method*. But it's just a matter of preference and I don't have a strong opinion here. 
[1] src/hotspot/share/runtime/sharedRuntime.cpp: JRT_BLOCK_ENTRY(address, SharedRuntime::handle_wrong_method(JavaThread* thread)) ... if (caller_frame.is_interpreted_frame() || caller_frame.is_entry_frame()) { On 04/07/2019 18:02, Erik ?sterlund wrote: > Hi, > > The i2c adapter sets a thread-local "callee_target" Method*, which is > caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c > call is "bad" (e.g. not_entrant). This error handler forwards execution > to the callee c2i entry. If the SharedRuntime::handle_wrong_method > method is called again due to the i2c2i call being still bad, then we > will crash the VM in the following guarantee in > SharedRuntime::handle_wrong_method: > > Method* callee = thread->callee_target(); > guarantee(callee != NULL && callee->is_method(), "bad handshake"); > > Unfortunately, the c2i entry can indeed fail again if it, e.g., hits the > new class initialization entry barrier of the c2i adapter. > The solution is to simply not clear the thread-local "callee_target" > after handling the first failure, as we can't really know there won't be > another one. There is no reason to clear this value as nobody else reads > it than the SharedRuntime::handle_wrong_method handler (and we really do > want it to be able to read the value as many times as it takes until the > call goes through). I found some confused clearing of this callee_target > in JavaThread::oops_do(), with a comment saying this is a methodOop that > we need to clear to make GC happy or something. Seems like old traces of > perm gen. So I deleted that too. > > I caught this in ZGC where the timing window for hitting this issue > seems to be wider due to concurrent code cache unloading. But it is > equally problematic for all GCs. 
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227260 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ > > Thanks, > /Erik From david.holmes at oracle.com Fri Jul 5 11:35:59 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Jul 2019 21:35:59 +1000 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> Message-ID: <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> Hi Erik, On 5/07/2019 8:19 pm, Erik ?sterlund wrote: > Hi, > > In the HeapInspection::find_instances_at_safepoint function, the unsafe > heap iteration API (which also walks dead objects) is used to find > objects that are instance of a class, used for concurrent lock dumping > where we find dead > java.util.concurrent.locks.AbstractOwnableSynchronizer objects and > pointer chase to its possibly dead owner threadObj. There is a comment > saying that if this starts crashing because we use CMS, we should > probably change to use the safe_object_iterate() API instead, which does > not include dead objects. > > Arguably, whether CMS is observed to crash or not, we really should not > be walking over dead objects and exposing them anyway. It's not safe... > and it will crash sooner or later. > > For example, CMS yields to safepoints (including young GCs) while > sweeping. This means that both the AbstractOwnableSynchronizer and its > owner thread might have died, but while sweeping, we could yield for a > young GC that promotes objects overriding the memory of the dead thread > object with random primitives, but not yet freeing the dead > AbstractOwnableSynchronizer. A subsequent dumping operation could use > the heap walker to find the dead AbstractOwnableSynchronizer, and > pointer chase into its dead owner thread, which by now has been freed > and had its memory clobbered with primitive data. 
> > This will all eventually end up in a glorious crash. So we shouldn't do > this. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227277 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ That seems eminently reasonable. :) Are there any valid uses for the (unsafe) object_iterate? Cheers, David > Thanks, > /Erik From matthias.baesken at sap.com Fri Jul 5 12:21:57 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 5 Jul 2019 12:21:57 +0000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> References: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> Message-ID: Hello David , here is another webrev with get_signal_name / get_signal_number moved to os.cpp : http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.1/ Best regards, Matthias > > On 4/07/2019 11:06 pm, Baesken, Matthias wrote: > > Hi David, thanks for looking into this . > > > >> > >> If you add this then we don't need distinct POSIX and non-POSIX versions > >> - the existing os::Posix::get_signal_name etc could all be hoisted into > >> os.cpp and the os class - no? > >> > > > > Should I go for this ? > > The coding is still a little different (e.g. is_valid_signal (.. ) call in os_posix ) > but I think it could be done without much trouble (maybe with a few small > ifdefs ) . > > I think it's worth trying it. > > I have to apologize in advance though as I'm about to disappear on two > weeks vacation so may not be able to follow through on this. 
> > Thanks, > David > From erik.osterlund at oracle.com Fri Jul 5 15:26:09 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 5 Jul 2019 17:26:09 +0200 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> Message-ID: On 2019-07-05 13:35, David Holmes wrote: > Hi Erik, > > On 5/07/2019 8:19 pm, Erik ?sterlund wrote: >> Hi, >> >> In the HeapInspection::find_instances_at_safepoint function, the >> unsafe heap iteration API (which also walks dead objects) is used to >> find objects that are instance of a class, used for concurrent lock >> dumping where we find dead >> java.util.concurrent.locks.AbstractOwnableSynchronizer objects and >> pointer chase to its possibly dead owner threadObj. There is a >> comment saying that if this starts crashing because we use CMS, we >> should probably change to use the safe_object_iterate() API instead, >> which does not include dead objects. >> >> Arguably, whether CMS is observed to crash or not, we really should >> not be walking over dead objects and exposing them anyway. It's not >> safe... and it will crash sooner or later. >> >> For example, CMS yields to safepoints (including young GCs) while >> sweeping. This means that both the AbstractOwnableSynchronizer and >> its owner thread might have died, but while sweeping, we could yield >> for a young GC that promotes objects overriding the memory of the >> dead thread object with random primitives, but not yet freeing the >> dead AbstractOwnableSynchronizer. A subsequent dumping operation >> could use the heap walker to find the dead >> AbstractOwnableSynchronizer, and pointer chase into its dead owner >> thread, which by now has been freed and had its memory clobbered with >> primitive data. 
>> >> This will all eventually end up in a glorious crash. So we shouldn't >> do this. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8227277 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ > > That seems eminently reasonable. :) Thanks! > Are there any valid uses for the (unsafe) object_iterate? Well... valid might be an overstatement, but I think it probably won't crash if you don't pointer chase through dead references in dead objects. We simply can't do that. Thanks, /Erik > Cheers, > David > >> Thanks, >> /Erik From david.holmes at oracle.com Fri Jul 5 20:54:10 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 6 Jul 2019 06:54:10 +1000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: References: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> Message-ID: <3fc5baee-e709-3294-c79d-f3c4f94c8a02@oracle.com> Hi Matthias, On 5/07/2019 10:21 pm, Baesken, Matthias wrote: > Hello David , here is another webrev with get_signal_name / get_signal_number moved to os.cpp : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.1/ That looks good - thanks. I'm running it through our test system. One query, in os.cpp: + #ifdef _WINDOWS + { SIGBREAK, "SIGBREAK" }, Can that be #ifdef SIGBREAK { SIGBREAK, "SIGBREAK" }, like the other cases? No need for an updated webrev if so. Thanks, David ----- > > Best regards, Matthias > >> >> On 4/07/2019 11:06 pm, Baesken, Matthias wrote: >>> Hi David, thanks for looking into this . >>> >>>> >>>> If you add this then we don't need distinct POSIX and non-POSIX versions >>>> - the existing os::Posix::get_signal_name etc could all be hoisted into >>>> os.cpp and the os class - no? >>>> >>> >>> Should I go for this ? >>> The coding is still a little different (e.g. is_valid_signal (.. ) call in os_posix ) >> but I think it could be done without much trouble (maybe with a few small >> ifdefs ) . >> >> I think it's worth trying it. 
>> >> I have to apologize in advance though as I'm about to disappear on two >> weeks vacation so may not be able to follow through on this. >> >> Thanks, >> David >> > From dean.long at oracle.com Fri Jul 5 21:46:51 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 5 Jul 2019 14:46:51 -0700 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: <1fb16eb4-59af-7f24-3fdd-56b4a892f82f@oracle.com> What is callee->is_method() doing? Like Vladimir, I'm concerned about pointers to stale metadata. dl On 7/4/19 8:02 AM, Erik Österlund wrote: > Hi, > > The i2c adapter sets a thread-local "callee_target" Method*, which is > caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c > call is "bad" (e.g. not_entrant). This error handler forwards > execution to the callee c2i entry. If the > SharedRuntime::handle_wrong_method method is called again due to the > i2c2i call being still bad, then we will crash the VM in the following > guarantee in SharedRuntime::handle_wrong_method: > > Method* callee = thread->callee_target(); > guarantee(callee != NULL && callee->is_method(), "bad handshake"); > > Unfortunately, the c2i entry can indeed fail again if it, e.g., hits > the new class initialization entry barrier of the c2i adapter. > The solution is to simply not clear the thread-local "callee_target" > after handling the first failure, as we can't really know there won't > be another one. There is no reason to clear this value as nobody else > reads it than the SharedRuntime::handle_wrong_method handler (and we > really do want it to be able to read the value as many times as it > takes until the call goes through).
I found some confused clearing of > this callee_target in JavaThread::oops_do(), with a comment saying > this is a methodOop that we need to clear to make GC happy or > something. Seems like old traces of perm gen. So I deleted that too. > > I caught this in ZGC where the timing window for hitting this issue > seems to be wider due to concurrent code cache unloading. But it is > equally problematic for all GCs. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227260 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ > > Thanks, > /Erik From kim.barrett at oracle.com Sat Jul 6 00:31:18 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 5 Jul 2019 20:31:18 -0400 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> Message-ID: <90927B2B-DB56-4EAA-A52F-75DF8D3EF98D@oracle.com> > On Jul 5, 2019, at 6:19 AM, Erik ?sterlund wrote: > > Hi, > > In the HeapInspection::find_instances_at_safepoint function, the unsafe heap iteration API (which also walks dead objects) is used to find objects that are instance of a class, used for concurrent lock dumping where we find dead java.util.concurrent.locks.AbstractOwnableSynchronizer objects and pointer chase to its possibly dead owner threadObj. There is a comment saying that if this starts crashing because we use CMS, we should probably change to use the safe_object_iterate() API instead, which does not include dead objects. > > Arguably, whether CMS is observed to crash or not, we really should not be walking over dead objects and exposing them anyway. It's not safe... and it will crash sooner or later. > > For example, CMS yields to safepoints (including young GCs) while sweeping. 
This means that both the AbstractOwnableSynchronizer and its owner thread might have died, but while sweeping, we could yield for a young GC that promotes objects overriding the memory of the dead thread object with random primitives, but not yet freeing the dead AbstractOwnableSynchronizer. A subsequent dumping operation could use the heap walker to find the dead AbstractOwnableSynchronizer, and pointer chase into its dead owner thread, which by now has been freed and had its memory clobbered with primitive data. > > This will all eventually end up in a glorious crash. So we shouldn't do this. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227277 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ > > Thanks, > /Erik Looks good. From david.holmes at oracle.com Sat Jul 6 02:28:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 6 Jul 2019 12:28:08 +1000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: <3fc5baee-e709-3294-c79d-f3c4f94c8a02@oracle.com> References: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> <3fc5baee-e709-3294-c79d-f3c4f94c8a02@oracle.com> Message-ID: <3af1d3d2-b243-95e5-9af7-08304ac15860@oracle.com> On 6/07/2019 6:54 am, David Holmes wrote: > Hi Matthias, > > On 5/07/2019 10:21 pm, Baesken, Matthias wrote: >> Hello David , here is another webrev with get_signal_name / >> get_signal_number moved to os.cpp : >> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.1/ > > That looks good - thanks. I'm running it through our test system. All passed. David ----- > One query, in os.cpp: > > + #ifdef _WINDOWS > + { SIGBREAK, "SIGBREAK" }, > > Can that be > > #ifdef SIGBREAK > { SIGBREAK, "SIGBREAK" }, > > like the other cases? > > No need for an updated webrev if so. > > Thanks, > David > ----- > >> >> Best regards, Matthias >> >>> >>> On 4/07/2019 11:06 pm, Baesken, Matthias wrote: >>>> Hi David, thanks for looking into this .
>>>> >>>>> >>>>> If you add this then we don't need distinct POSIX and non-POSIX >>>>> versions >>>>> - the existing os::Posix::get_signal_name etc could all be hoisted >>>>> into >>>>> os.cpp and the os class - no? >>>>> >>>> >>>> Should I go for this ? >>>> The coding is still a little different (e.g. is_valid_signal (.. ) >>>> call in os_posix ) >>> but I think it could be done without much trouble (maybe with a few >>> small >>> ifdefs ) . >>> >>> I think it's worth trying it. >>> >>> I have to apologize in advance though as I'm about to disappear on two >>> weeks vacation so may not be able to follow through on this. >>> >>> Thanks, >>> David >>> >> From igor.ignatyev at oracle.com Sat Jul 6 03:09:06 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 5 Jul 2019 20:09:06 -0700 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> Message-ID: <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> ping? -- Igor > On Jun 27, 2019, at 3:25 PM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >> 25 lines changed: 18 ins; 3 del; 4 mod; > > Hi all, > > could you please review this small patch which adds JTREG_RUN_PROBLEM_LISTS options to run-test framework? when JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem lists as values of -match: instead of -exclude, which effectively means it will run only problem listed tests. > > doc/building.html got changed when I ran update-build-docs, I can exclude it from the patch, but it seems it will keep changing every time we run update-build-docs, so I decided to at least bring it up.
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8226910 > webrev: http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html > > Thanks, > -- Igor From david.holmes at oracle.com Sat Jul 6 08:58:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 6 Jul 2019 18:58:16 +1000 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> Message-ID: <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> Hi Igor, On 6/07/2019 1:09 pm, Igor Ignatyev wrote: > ping? > > -- Igor > >> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>> 25 lines changed: 18 ins; 3 del; 4 mod; >> >> Hi all, >> >> could you please review this small patch which adds JTREG_RUN_PROBLEM_LISTS options to run-test framework? when JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem lists as values of -match: instead of -exclude, which effectively means it will run only problem listed tests. doc/testing.md + Set to `true` of `false`. typo: s/of/or/ Build changes seem okay - I can't attest to the operation of the flag. >> doc/building.html got changed when I ran update-build-docs, I can exclude it from the patch, but it seems it will keep changing every time we run update-build-docs, so I decided to at least bring it up. Weird it seems to have removed line-breaks in that paragraph. What platform did you build on? 
David ----- >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8226910 >> webrev: http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >> >> Thanks, >> -- Igor > From igor.ignatyev at oracle.com Sat Jul 6 18:50:16 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Sat, 6 Jul 2019 11:50:16 -0700 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> Message-ID: Hi David, > On Jul 6, 2019, at 1:58 AM, David Holmes wrote: > > Hi Igor, > > On 6/07/2019 1:09 pm, Igor Ignatyev wrote: >> ping? >> -- Igor >>> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev wrote: >>> >>> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>> 25 lines changed: 18 ins; 3 del; 4 mod; >>> >>> Hi all, >>> >>> could you please review this small patch which adds JTREG_RUN_PROBLEM_LISTS options to run-test framework? when JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem lists as values of -match: instead of -exclude, which effectively means it will run only problem listed tests. > > doc/testing.md > > + Set to `true` of `false`. > > typo: s/of/or/ fixed .md, regenerated .html. > > Build changes seem okay - I can't attest to the operation of the flag. here is how I verified that it does that it supposed to: $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" TEST=open/test/hotspot/jtreg/:hotspot_all lists 53 tests, the same command w/o RUN_PROBLEM_LISTS (or w/ RUN_PROBLEM_LISTS=false) lists 6698 tests. $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt lists 81 tests, the same command w/o RUN_PROBLEM_LISTS lists 6670 tests. 
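To make the selection mechanics concrete, here is a standalone sketch of how such a flag can flip the jtreg selection option (illustrative shell only — the variable names are invented here and this is not the actual RunTests.gmk logic):

```shell
# RUN_PROBLEM_LISTS=true turns the problem lists from an exclusion
# filter (-exclude:) into a selection filter (-match:).
RUN_PROBLEM_LISTS=true
PROBLEM_LISTS="ProblemList.txt ProblemList-aot.txt"

if [ "$RUN_PROBLEM_LISTS" = "true" ]; then
  SELECT_FLAG="-match:"
else
  SELECT_FLAG="-exclude:"
fi

JTREG_ARGS=""
for pl in $PROBLEM_LISTS; do
  JTREG_ARGS="$JTREG_ARGS ${SELECT_FLAG}${pl}"
done

# prints: jtreg -match:ProblemList.txt -match:ProblemList-aot.txt
echo "jtreg$JTREG_ARGS"
```

With the flag unset (or false) the same loop emits -exclude: options instead, which is why the listed test counts flip between the small and large numbers above.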
> >>> doc/building.html got changed when I ran update-build-docs, I can exclude it from the patch, but it seems it will keep changing every time we run update-build-docs, so I decided to at least bring it up. > > Weird it seems to have removed line-breaks in that paragraph. What platform did you build on? I built on macos. now when I wrote that, I remember pandoc used to produce different results on macos. so I've rerun it on linux on the source w/o my change, and doc/building.html still got changed in the exact same way. > David > ----- > >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226910 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>> >>> Thanks, >>> -- Igor From thomas.schatzl at oracle.com Mon Jul 8 07:00:44 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 08 Jul 2019 09:00:44 +0200 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> Message-ID: Hi, On Fri, 2019-07-05 at 12:19 +0200, Erik ?sterlund wrote: > Hi, > > In the HeapInspection::find_instances_at_safepoint function, the > unsafe heap iteration API (which also walks dead objects) is used to > find objects that are instance of a class, used for concurrent lock > dumping where we find > dead java.util.concurrent.locks.AbstractOwnableSynchronizer objects > and pointer chase to its possibly dead owner threadObj. There is a > comment saying that if this starts crashing because we use CMS, we > should probably change to use the safe_object_iterate() API instead, > which does not include dead objects. > > Arguably, whether CMS is observed to crash or not, we really should > not be walking over dead objects and exposing them anyway. It's not > safe... and it will crash sooner or later. [...] > This will all eventually end up in a glorious crash. So we shouldn't > do this. 
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227277 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ looks good. Thomas From matthias.baesken at sap.com Mon Jul 8 07:00:32 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 8 Jul 2019 07:00:32 +0000 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: <3af1d3d2-b243-95e5-9af7-08304ac15860@oracle.com> References: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> <3fc5baee-e709-3294-c79d-f3c4f94c8a02@oracle.com> <3af1d3d2-b243-95e5-9af7-08304ac15860@oracle.com> Message-ID: Thanks for looking into it and running the tests ! Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Samstag, 6. Juli 2019 04:28 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: RFR: 8226816: add UserHandler calls to event log > > On 6/07/2019 6:54 am, David Holmes wrote: > > Hi Matthias, > > > > On 5/07/2019 10:21 pm, Baesken, Matthias wrote: > >> Hello David , here is another webrev with get_signal_name / > >> get_signal_number moved to os.cpp : > >> > >> http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.1/ > > > > That looks good - thanks. I'm running it through our test system. > > All passed. > > David > ----- > > > One query, in os.cpp: > > > > + #ifdef _WINDOWS > > + { SIGBREAK, "SIGBREAK" }, > > > > Can that be > > > > #ifdef SIGBREAK > > { SIGBREAK, "SIGBREAK" }, > > > > like the other cases? > > > > No need for an updated webrev if so. > > > > Thanks, > > David > > ----- > > > >> > >> Best regards, Matthias > >> > >>> > >>> On 4/07/2019 11:06 pm, Baesken, Matthias wrote: > >>>> Hi David, thanks for looking into this . > >>>> > >>>>> > >>>>> If you add this then we don't need distinct POSIX and non-POSIX > >>>>> versions > >>>>> - the existing os::Posix::get_signal_name etc could all be hoisted > >>>>> into > >>>>> os.cpp and the os class - no? > >>>>> > >>>> > >>>> Should I go for this ?
> >>>> The coding is still a little different (e.g. is_valid_signal (.. ) > >>>> call in os_posix ) > >>> but I think it could be done without much trouble (maybe with a few > >>> small > >>> ifdefs ) . > >>> > >>> I think it's worth trying it. > >>> > >>> I have to apologize in advance though as I'm about to disappear on two > >>> weeks vacation so may not be able to follow through on this. > >>> > >>> Thanks, > >>> David > >>> > >> From erik.osterlund at oracle.com Mon Jul 8 10:07:52 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 8 Jul 2019 12:07:52 +0200 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <1fb16eb4-59af-7f24-3fdd-56b4a892f82f@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <1fb16eb4-59af-7f24-3fdd-56b4a892f82f@oracle.com> Message-ID: <4474b767-b53b-ac22-3d98-b013e1bdbd08@oracle.com> Hi Dean and Vladimir, the callee->is_method() in the guarantee is there probably to find corrupt memory. So the problem is specifically when performing upcalls from JNI. The call wrapper tries to "quack like an interpreter" and performs i2c calls, failing due to the nmethod being not entrant. Then the subsequent c2i attempt fails again due to clinit barriers. In the template interpreter calls, the clinit barriers have already been taken, but in the JNI upcall path, we don't perform that barrier. So as our current i2c calls can't actually deal with blocking at all (and no safepoints), the right solution seems to be sticking in some clinit barriers into the JavaCalls API, so that when the call is performed, we know the clinit barrier won't be hit. I still think that allowing only one thing to go wrong across an i2c2i call is pretty scary, and I'd love to remove that restriction. Anyway, Vladimir offered to find the right place to put the clinit barrier, so I'm handing this one over.
:) Thanks, /Erik On 2019-07-05 23:46, dean.long at oracle.com wrote: > What is callee->is_method() doing? Like Vladimir, I'm concerned about > pointers to stale metadata. > > dl > > On 7/4/19 8:02 AM, Erik Österlund wrote: >> Hi, >> >> The i2c adapter sets a thread-local "callee_target" Method*, which is >> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c >> call is "bad" (e.g. not_entrant). This error handler forwards >> execution to the callee c2i entry. If the >> SharedRuntime::handle_wrong_method method is called again due to the >> i2c2i call being still bad, then we will crash the VM in the >> following guarantee in SharedRuntime::handle_wrong_method: >> >> Method* callee = thread->callee_target(); >> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >> >> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >> the new class initialization entry barrier of the c2i adapter. >> The solution is to simply not clear the thread-local "callee_target" >> after handling the first failure, as we can't really know there won't >> be another one. There is no reason to clear this value as nobody else >> reads it than the SharedRuntime::handle_wrong_method handler (and we >> really do want it to be able to read the value as many times as it >> takes until the call goes through). I found some confused clearing of
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8227260 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >> >> Thanks, >> /Erik > From erik.osterlund at oracle.com Mon Jul 8 10:53:28 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 8 Jul 2019 12:53:28 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Message-ID: Any takers? /Erik On 2019-07-01 15:12, Erik Österlund wrote: > Hi, > > Today it is up to callers of methods changing state on nmethods like > make_not_entrant(), to know all other possible concurrent attempts to > transition the nmethod, and know that there are no such attempts > trying to make the nmethod more dead. > There have been multiple occurrences of issues where the caller got it > wrong due to the fragile nature of this code. This specific CR deals > with a bug where an OSR nmethod was made not entrant (deopt) and made > unloaded concurrently. > The result of such a race can be that it is first made unloaded and > then made not entrant, making the nmethod go backwards in its state > machine, effectively resurrecting dead nmethods, causing a subsequent > GC to feel awkward (crash). > But I have seen other similar incidents with deopt racing with the > sweeper. These non-monotonicity problems are unnecessary to have. So I > intend to fix the bug by enforcing monotonicity of the nmethod state > machine explicitly, instead of trying to reason about all callers of > these make_* functions. > I swapped the order of unloaded and zombie in the enum as zombies are > strictly more dead than unloaded nmethods. All transitions change in > the direction of increasing deadness and fail if the transition is not > monotonically increasing.
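[Editorial note] The monotonicity guard described in this thread can be illustrated with a small sketch. The state names and their numeric ordering below are assumptions for illustration (the real enum lives in HotSpot's nmethod code); only the compare-and-exchange pattern is the point:

```cpp
#include <atomic>

// Hypothetical ordering by increasing "deadness", mirroring the
// described enum reordering (zombie placed after unloaded).
enum NMethodState { in_use = 0, not_entrant = 1, unloaded = 2, zombie = 3 };

struct NMethodSketch {
  std::atomic<int> state{in_use};

  // A transition succeeds only if it strictly increases deadness, so a
  // racing make_not_entrant can no longer undo make_unloaded.
  bool try_transition(NMethodState next) {
    int cur = state.load();
    while (true) {
      if (next <= cur) {
        return false;  // would move backwards (or sideways): reject
      }
      if (state.compare_exchange_weak(cur, next)) {
        return true;   // cur is reloaded on CAS failure; loop retries
      }
    }
  }
};
```

With this guard, the race Erik describes resolves safely: whichever of the two transitions lands second is simply refused if it would lower the state.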
> > For ZGC I moved OSR nmethod unlinking to before the unlinking (where > unlinking code belongs), instead of after the handshake (intended for > deleting things safely unlinked). > Strictly speaking, moving the OSR nmethod unlinking removes the racing > between make_not_entrant and make_unloaded, but I still want the > monotonicity guards to make this code more robust. > > I left AOT methods alone. Since they don't die, they don't have > resurrection problems, and hence do not benefit from these guards in > the same way. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8224674 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/ > > Thanks, > /Erik From adam.farley at uk.ibm.com Mon Jul 8 12:45:28 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Mon, 8 Jul 2019 13:45:28 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <08a6c8a3-bd3e-25db-2460-cea7c8fbb3f3@oracle.com> References: <2c9e6acd-0e79-13c0-23ea-2cef402ee125@oracle.com> <842ae03e-8574-593e-3ac2-5cc283832be9@oracle.com> <08a6c8a3-bd3e-25db-2460-cea7c8fbb3f3@oracle.com> Message-ID: Hi David, David Holmes wrote on 04/07/2019 22:21:59: > From: David Holmes > To: Adam Farley8 > Cc: hotspot-dev at openjdk.java.net > Date: 04/07/2019 22:22 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > On 5/07/2019 2:41 am, Adam Farley8 wrote: > > Hi David, > > > > To detect a too-long path when it's being passed in, the best option > > I can see is to check it in two places: > > Right, but my outstanding question relates to the existing code today. > Where will we detect that a path element is too long? Ahh, right. Right now, the place checking the length of the paths is split_path, the method I modified in os.cpp.
Specifically this bit: ---- if (len > JVM_MAXPATHLEN) { return NULL; } ---- This seemed wrong to me at the time, because it meant that if *any* of the paths was too long, the method splitting up the paths string would return null, and fail to check any of the library locations. If we agree that the correct behaviour would be to throw an error and kill the vm if any of the paths is too long, then the simplest option would be to replace the "return NULL" in that "if" with an error. Something like this, perhaps? --- if (len > JVM_MAXPATHLEN) { vm_exit_during_initialization("java.lang.VirtualMachineError", "One or more of the sun.boot.library.path " "paths has exceeded the maximum path length " "for this system."); } --- With fewer new-lines, of course. ;) > > I'm still not sure whether the VM has the right to dictate behaviour > here or whether this belongs to core-libs. And we need to be very > careful about any change in behaviour. > > > 1) when it's being set initially with the location of libjvm.so, either: > > a) in hotspot/os/[os name]/os_[os name].cpp, right before the call > > to Arguments::set_dll_dir > > or b), in the Arguments::set_dll_dir function itself (ideally the > > latter) > > > > 2) when/if the extra paths are being passed in as a parameter, as they > > pass through hotspot/share/runtime/arguments.cpp, right after the line: > > > > --- > > else if (strcmp(key, "sun.boot.library.path") == 0) > > --- > > > > You're right in that this could slow down startup a little, with > > the length checking, and the potential looping over the -D value > > to check the length of each path. Not a major slowdown though. > > I'm sure Claes would disagree :) > > Apologies in advance as I'm about to disappear for two weeks vacation. > > David > ----- No worries.
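[Editorial note] The per-element check being discussed in this thread can be sketched in a self-contained way. The constant, separator handling, and function name below are invented for the demo; the real code is os::split_path() in os.cpp using JVM_MAXPATHLEN and the platform path separator. The point is that one oversized element gets reported to the caller instead of the whole list being silently dropped via "return NULL":

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

static const std::size_t kMaxPathLen = 16;  // tiny limit for the demo

// Splits 'path' on 'sep'. Returns false and reports the first oversized
// element via *too_long; the caller can then warn, skip, or exit the VM
// as debated above.
static bool split_and_check(const std::string& path, char sep,
                            std::vector<std::string>* out,
                            std::string* too_long) {
  std::size_t start = 0;
  while (start <= path.size()) {
    std::size_t end = path.find(sep, start);
    if (end == std::string::npos) {
      end = path.size();
    }
    std::string elem = path.substr(start, end - start);
    if (elem.size() > kMaxPathLen) {
      *too_long = elem;  // precise diagnostics instead of a blanket NULL
      return false;
    }
    out->push_back(elem);
    start = end + 1;
  }
  return true;
}
```

Whether the caller then warns or calls something like vm_exit_during_initialization() is exactly the policy question the thread is debating; the sketch only separates detection from policy.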
:) - Adam > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > David Holmes wrote on 04/07/2019 07:57:14: > > > >> From: David Holmes > >> To: Adam Farley8 > >> Cc: hotspot-dev at openjdk.java.net > >> Date: 04/07/2019 07:58 > >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > >> paths are longer than JVM_MAXPATHLEN > >> > >> Hi Adam, > >> > >> On 4/07/2019 1:42 am, Adam Farley8 wrote: > >> > Hi David, > >> > > >> > I figured it should be elaborate so we can avoid killing the VM > >> > if we don't have to. > >> > > >> > Ultimately, if we have a list of three paths and the last two > >> > are invalid, does it matter so long as all the libraries we need > >> > are in the first path? > >> > >> I prefer not to see the user's error ignored if we can reasonably detect it. > >> They set the paths for a reason, and if the paths are invalid they > >> probably would like to know. > >> > >> > As to your question "is it in hotspot or JDK code", I presume you > >> > mean in the change set. I'm primarily referring to the hotspot code. > >> > >> No I mean where in the current code will we detect that one of these > >> path elements is too long? > >> > >> > Also, if we end up adopting a "kill the vm if any path is too long" > >> > approach, we still need to change the JDK code, as those currently > >> > seem to want to fail if the total length of the sun.boot.library.path > >> > property is longer than the maximum length of a single path. > >> > > >> > So if you pass in three 100 character paths on Windows, it'll fail > >> > because they add up to more than the 260 character path limit. > >> > >> That seems like a separate bug that should be addressed.
:( > >> > >> Thanks, > >> David > >> > >> > Best Regards > >> > > >> > Adam Farley > >> > IBM Runtimes > >> > > >> > > >> > David Holmes wrote on 03/07/2019 08:36:36: > >> > > >> >> From: David Holmes > >> >> To: Adam Farley8 > >> >> Cc: hotspot-dev at openjdk.java.net > >> >> Date: 03/07/2019 08:36 > >> >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > >> >> paths are longer than JVM_MAXPATHLEN > >> >> > >> >> On 2/07/2019 7:44 pm, Adam Farley8 wrote: > >> >> > Hi David, > >> >> > > >> >> > Thanks for your thoughts. > >> >> > > >> >> > The user should absolutely have immediate feedback, yes, and I agree > >> >> > that "skipping" paths could lead to us loading the wrong library. > >> >> > > >> >> > Perhaps a compromise? We fire off a stderr warning if any > of the paths > >> >> > are too long (without killing the VM), we ignore any path *after* > >> >> > (and including) the first too-long path, and we kill the VM if the > >> >> > first path is too long. > >> >> > >> >> My first thought is why be so elaborate and not just fail immediately: > >> >> > >> >> Error occurred during initialization of VM > >> >> One or more sun.boot.library.path elements is too long for this system. > >> >> --- > >> >> > >> >> But AFAICS we don't do any sanity checking of those > paths so this > >> >> would have an impact on startup. > >> >> > >> >> I can't locate where we would detect the too-long path > element, is it in > >> >> hotspot or JDK code? > >> >> > >> >> Thanks, > >> >> David > >> >> ----- > >> >> > >> >> > Warning message example: > >> >> > > >> >> > ---- > >> >> > Warning: One or more sun.boot.library.path paths were too long > >> >> > for this system, and they (along with all subsequent paths) have been > >> >> > ignored. > >> >> > ---- > >> >> > > >> >> > Another addition could be to check the path lengths for the property > >> >> > sooner, thus aborting the VM faster if the default path is too long.
> >> >> > > >> >> > Assuming we posit that the VM will always need to load libraries. > >> >> > > >> >> > Best Regards > >> >> > > >> >> > Adam Farley > >> >> > IBM Runtimes > >> >> > > >> >> > > >> >> > David Holmes wrote on 01/07/2019 22:10:45: > >> >> > > >> >> >> From: David Holmes > >> >> >> To: Adam Farley8 , hotspot- > dev at openjdk.java.net > >> >> >> Date: 01/07/2019 22:12 > >> >> >> Subject: Re: RFR: JDK-8227021: VM fails if any > sun.boot.library.path > >> >> >> paths are longer than JVM_MAXPATHLEN > >> >> >> > >> >> >> Hi Adam, > >> >> >> > >> >> >> On 1/07/2019 10:27 pm, Adam Farley8 wrote: > >> >> >> > Hi All, > >> >> >> > > >> >> >> > The title says it all. > >> >> >> > > >> >> >> > If you pass in a value for sun.boot.library.path consisting > >> >> >> > of one or more paths that are too long, then the vm will > >> >> >> > fail to start because it can't load one of the libraries it > >> >> >> > needs (the zip library), despite the fact that the VM > >> >> >> > automatically prepends the default library path to the > >> >> >> > sun.boot.library.path property, using the correct separator > >> >> >> > to divide it from the user-specified path. > >> >> >> > > >> >> >> > So we've got the right path, in the right place, at the > >> >> >> > right time, we just can't *use* it. > >> >> >> > > >> >> >> > I've fixed this by changing the relevant os.cpp code to > >> >> >> > ignore paths that are too long, and to attempt to locate > >> >> >> > the needed library on the other paths (if any are valid). > >> >> >> > >> >> >> As I just added to the bug report I have a different view > of "correct" > >> >> >> here. If you just ignore the long path and keep processing > other short > >> >> >> paths you may find the wrong library. There is a user > error here and > >> >> >> that error should be reported ASAP and in a way that leads > to failure > >> >> >> ASAP. Perhaps we should be more aggressive in aborting the > VM when this > >> >> >> is detected?
> >> >> >> > >> >> >> David > >> >> >> ----- > >> >> >> > >> >> >> > I've also added functionality to handle the edge case of > >> >> >> > paths that are neeeeeeearly too long, only for a > >> >> >> > sub-path (or file name) to push us over the limit *after* > >> >> >> > the split_path function is done assessing the path length. > >> >> >> > > >> >> >> > I've also changed the code we're overriding, on the assumption > >> >> >> > that someone's still using it somewhere. > >> >> >> > > >> >> >> > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 > >> >> >> > Webrev: http://cr.openjdk.java.net/~afarley/8227021/webrev/ > >> >> >> > > >> >> >> > Thoughts and impressions welcome. > >> >> >> > > >> >> >> > Best Regards > >> >> >> > > >> >> >> > Adam Farley > >> >> >> > IBM Runtimes > >> >> >> > > >> >> >> > Unless stated otherwise above: > >> >> >> > IBM United Kingdom Limited - Registered in England and > >> Wales with number > >> >> >> > 741598. > >> >> >> > Registered office: PO Box 41, North Harbour, Portsmouth, > >> >> Hampshire PO6 3AU > >> >> >> > > >> >> >> > >> >> > > >> >> > Unless stated otherwise above: > >> >> > IBM United Kingdom Limited - Registered in England and > Wales with number > >> >> > 741598.
> >> >> > Registered office: PO Box 41, North Harbour, Portsmouth, > >> Hampshire PO6 3AU > >> >> > >> > > >> > Unless stated otherwise above: > >> > IBM United Kingdom Limited - Registered in England and Wales with number > >> > 741598. > >> > Registered office: PO Box 41, North Harbour, Portsmouth, > Hampshire PO6 3AU > >> > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with number > > 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From coleen.phillimore at oracle.com Mon Jul 8 15:11:42 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 11:11:42 -0400 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> Message-ID: <9b63a1dc-4bb2-7a07-a9a5-758f0a36e487@oracle.com> On 7/5/19 11:26 AM, Erik ?sterlund wrote: > > > On 2019-07-05 13:35, David Holmes wrote: >> Hi Erik, >> >> On 5/07/2019 8:19 pm, Erik ?sterlund wrote: >>> Hi, >>> >>> In the HeapInspection::find_instances_at_safepoint function, the >>> unsafe heap iteration API (which also walks dead objects) is used to >>> find objects that are instance of a class, used for concurrent lock >>> dumping where we find dead >>> java.util.concurrent.locks.AbstractOwnableSynchronizer objects and >>> pointer chase to its possibly dead owner threadObj. There is a >>> comment saying that if this starts crashing because we use CMS, we >>> should probably change to use the safe_object_iterate() API instead, >>> which does not include dead objects. 
>>> >>> Arguably, whether CMS is observed to crash or not, we really should >>> not be walking over dead objects and exposing them anyway. It's not >>> safe... and it will crash sooner or later. >>> >>> For example, CMS yields to safepoints (including young GCs) while >>> sweeping. This means that both the AbstractOwnableSynchronizer and >>> its owner thread might have died, but while sweeping, we could yield >>> for a young GC that promotes objects overriding the memory of the >>> dead thread object with random primitives, but not yet freeing the >>> dead AbstractOwnableSynchronizer. A subsequent dumping operation >>> could use the heap walker to find the dead >>> AbstractOwnableSynchronizer, and pointer chase into its dead owner >>> thread, which by now has been freed and had its memory clobbered >>> with primitive data. >>> >>> This will all eventually end up in a glorious crash. So we shouldn't >>> do this. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8227277 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ >> >> That seems eminently reasonable. :) > > Thanks! > >> Are there any valid uses for the (unsafe) object_iterate? > > Well... valid might be an overstatement, but I think it probably won't > crash if you don't pointer chase through dead references in dead > objects. We simply can't do that. This change looks good, but I have to echo David's question. It looks like we have the same thing in jvmtiTagMap, with some out of date comments. share/prims/jvmtiTagMap.cpp: // consider using safe_object_iterate() which avoids perm gen share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(_blk); share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(&blk); Should we eliminate all uses of Universe::heap() version of object_iterate? It looks like the GCs call it, and it's probably safe in those places, so should not be a virtual function for each GC?
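[Editorial note] The distinction under discussion (iterating every heap cell versus only live objects) can be modeled with a toy heap. This is purely illustrative and not HotSpot code; the real APIs are Universe::heap()->object_iterate() and safe_object_iterate():

```cpp
#include <cassert>
#include <vector>

// Toy heap cell: a liveness flag plus a payload. In the real VM,
// "dead" means unreachable memory that may already be clobbered.
struct ObjSketch {
  bool live;
  int payload;
};

// Unsafe walk: the closure sees every cell, live or dead. Pointer
// chasing from a dead cell is where the described crashes come from.
template <typename Fn>
void object_iterate(const std::vector<ObjSketch>& heap, Fn fn) {
  for (const ObjSketch& o : heap) {
    fn(o);
  }
}

// Safe walk: dead cells are never exposed to the closure.
template <typename Fn>
void safe_object_iterate(const std::vector<ObjSketch>& heap, Fn fn) {
  for (const ObjSketch& o : heap) {
    if (o.live) {
      fn(o);
    }
  }
}
```

The difference is just the liveness filter, but it is exactly what keeps a closure from dereferencing clobbered memory in the scenario Erik describes.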
Thanks, Coleen > > Thanks, > /Erik > >> Cheers, >> David >> >>> Thanks, >>> /Erik > From erik.osterlund at oracle.com Mon Jul 8 16:27:57 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Mon, 8 Jul 2019 18:27:57 +0200 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: <9b63a1dc-4bb2-7a07-a9a5-758f0a36e487@oracle.com> References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> <9b63a1dc-4bb2-7a07-a9a5-758f0a36e487@oracle.com> Message-ID: Hi Coleen, Thanks for the review. I am 100% for not using the unsafe API in shared code at all. Could we make that change for 14 though? Thanks, /Erik > On 8 Jul 2019, at 17:11, coleen.phillimore at oracle.com wrote: > > > >> On 7/5/19 11:26 AM, Erik ?sterlund wrote: >> >> >>> On 2019-07-05 13:35, David Holmes wrote: >>> Hi Erik, >>> >>>> On 5/07/2019 8:19 pm, Erik ?sterlund wrote: >>>> Hi, >>>> >>>> In the HeapInspection::find_instances_at_safepoint function, the unsafe heap iteration API (which also walks dead objects) is used to find objects that are instance of a class, used for concurrent lock dumping where we find dead java.util.concurrent.locks.AbstractOwnableSynchronizer objects and pointer chase to its possibly dead owner threadObj. There is a comment saying that if this starts crashing because we use CMS, we should probably change to use the safe_object_iterate() API instead, which does not include dead objects. >>>> >>>> Arguably, whether CMS is observed to crash or not, we really should not be walking over dead objects and exposing them anyway. It's not safe... and it will crash sooner or later. >>>> >>>> For example, CMS yields to safepoints (including young GCs) while sweeping. 
This means that both the AbstractOwnableSynchronizer and its owner thread might have died, but while sweeping, we could yield for a young GC that promotes objects overriding the memory of the dead thread object with random primitives, but not yet freeing the dead AbstractOwnableSynchronizer. A subsequent dumping operation could use the heap walker to find the dead AbstractOwnableSynchronizer, and pointer chase into its dead owner thread, which by now has been freed and had its memory clobbered with primitive data. >>>> >>>> This will all eventually end up in a glorious crash. So we shouldn't do this. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8227277 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ >>> >>> That seems eminently reasonable. :) >> >> Thanks! >> >>> Are there any valid uses for the (unsafe) object_iterate? >> >> Well... valid might be an overstatement, but I think it probably won't crash if you don't pointer chase through dead references in dead objects. We simply can't do that. > > This change looks good, but I have to echo David's question. It looks like we have the same thing in jvmtiTagMap, with some out of date comments. > > hare/prims/jvmtiTagMap.cpp: // consider using safe_object_iterate() which avoids perm gen > share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(_blk); > share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(&blk); > > Should we eliminate all uses of Universe::heap() version of object_iterate? It looks like the GCs call it, and it's probably safe in those places, so should not be a virtual function for each GC? 
> > Thanks, > Coleen > >> >> Thanks, >> /Erik >> >>> Cheers, >>> David >>> >>>> Thanks, >>>> /Erik From coleen.phillimore at oracle.com Mon Jul 8 18:44:26 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 14:44:26 -0400 Subject: RFR: 8226816: add UserHandler calls to event log In-Reply-To: References: <2ba2c9fa-3faa-2e16-59d6-d76412115515@oracle.com> <3fc5baee-e709-3294-c79d-f3c4f94c8a02@oracle.com> <3af1d3d2-b243-95e5-9af7-08304ac15860@oracle.com> Message-ID: <2c7786a5-f6c6-609a-1857-e946da054ba4@oracle.com> This change looks fine. Coleen On 7/8/19 3:00 AM, Baesken, Matthias wrote: > Thanks for looking into it and running the tests ! > > Best regards, Matthias > >> -----Original Message----- >> From: David Holmes >> Sent: Samstag, 6. Juli 2019 04:28 >> To: Baesken, Matthias ; 'hotspot- >> dev at openjdk.java.net' >> Subject: Re: RFR: 8226816: add UserHandler calls to event log >> >> On 6/07/2019 6:54 am, David Holmes wrote: >>> Hi Matthias, >>> >>> On 5/07/2019 10:21 pm, Baesken, Matthias wrote: >>>> Hello David , here is another webrev with get_signal_name / >>>> get_signal_number moved to os.cpp : >>>> >>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8226816.1/ >>> That looks good - thanks. I'm running it through our test system. >> All passed. >> >> David >> ----- >> >>> One query, in os.cpp: >>> >>> + #ifdef _WINDOWS >>> +   {  SIGBREAK,    "SIGBREAK" }, >>> >>> Can that be >>> >>> #ifdef SIGBREAK >>>   {  SIGBREAK,    "SIGBREAK" }, >>> >>> like the other cases? >>> >>> No need for an updated webrev if so. >>> >>> Thanks, >>> David >>> ----- >>> >>>> Best regards, Matthias >>>> >>>>> On 4/07/2019 11:06 pm, Baesken, Matthias wrote: >>>>>> Hi David, thanks for looking into this . >>>>>> >>>>>>> If you add this then we don't need distinct POSIX and non-POSIX >>>>>>> versions >>>>>>> - the existing os::Posix::get_signal_name etc could all be hoisted >>>>>>> into >>>>>>> os.cpp and the os class - no?
>>>>>>> >>>>>> Should I go for this ? >>>>>> The coding is still a little different?? (e.g. is_valid_signal (.. ) >>>>>> call? in os_posix ) >>>>> but I think it could be done without much trouble (maybe with a few >>>>> small >>>>> ifdefs ) . >>>>> >>>>> I think it's worth trying it. >>>>> >>>>> I have to apologize in advance though as I'm about to disappear on two >>>>> weeks vacation so may not be able to follow through on this. >>>>> >>>>> Thanks, >>>>> David >>>>> From coleen.phillimore at oracle.com Mon Jul 8 18:45:04 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 14:45:04 -0400 Subject: RFR[13]: 8227277: HeapInspection::find_instances_at_safepoint walks dead objects In-Reply-To: References: <8cb16d70-edc5-8aeb-a06c-7384ff6e55a3@oracle.com> <5904f2e1-9172-1d4f-aa85-54cf29b6cb52@oracle.com> <9b63a1dc-4bb2-7a07-a9a5-758f0a36e487@oracle.com> Message-ID: <744e055e-04b4-968a-3a99-704cea6b568a@oracle.com> On 7/8/19 12:27 PM, Erik Osterlund wrote: > Hi Coleen, > > Thanks for the review. I am 100% for not using the unsafe API in shared code at all. Could we make that change for 14 though? Oh, definitely! Coleen > > Thanks, > /Erik > >> On 8 Jul 2019, at 17:11, coleen.phillimore at oracle.com wrote: >> >> >> >>> On 7/5/19 11:26 AM, Erik ?sterlund wrote: >>> >>> >>>> On 2019-07-05 13:35, David Holmes wrote: >>>> Hi Erik, >>>> >>>>> On 5/07/2019 8:19 pm, Erik ?sterlund wrote: >>>>> Hi, >>>>> >>>>> In the HeapInspection::find_instances_at_safepoint function, the unsafe heap iteration API (which also walks dead objects) is used to find objects that are instance of a class, used for concurrent lock dumping where we find dead java.util.concurrent.locks.AbstractOwnableSynchronizer objects and pointer chase to its possibly dead owner threadObj. There is a comment saying that if this starts crashing because we use CMS, we should probably change to use the safe_object_iterate() API instead, which does not include dead objects. 
>>>>> >>>>> Arguably, whether CMS is observed to crash or not, we really should not be walking over dead objects and exposing them anyway. It's not safe... and it will crash sooner or later. >>>>> >>>>> For example, CMS yields to safepoints (including young GCs) while sweeping. This means that both the AbstractOwnableSynchronizer and its owner thread might have died, but while sweeping, we could yield for a young GC that promotes objects overriding the memory of the dead thread object with random primitives, but not yet freeing the dead AbstractOwnableSynchronizer. A subsequent dumping operation could use the heap walker to find the dead AbstractOwnableSynchronizer, and pointer chase into its dead owner thread, which by now has been freed and had its memory clobbered with primitive data. >>>>> >>>>> This will all eventually end up in a glorious crash. So we shouldn't do this. >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8227277 >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~eosterlund/8227277/webrev.00/ >>>> That seems eminently reasonable. :) >>> Thanks! >>> >>>> Are there any valid uses for the (unsafe) object_iterate? >>> Well... valid might be an overstatement, but I think it probably won't crash if you don't pointer chase through dead references in dead objects. We simply can't do that. >> This change looks good, but I have to echo David's question. It looks like we have the same thing in jvmtiTagMap, with some out of date comments. >> >> hare/prims/jvmtiTagMap.cpp: // consider using safe_object_iterate() which avoids perm gen >> share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(_blk); >> share/prims/jvmtiTagMap.cpp: Universe::heap()->object_iterate(&blk); >> >> Should we eliminate all uses of Universe::heap() version of object_iterate? It looks like the GCs call it, and it's probably safe in those places, so should not be a virtual function for each GC? 
>> >> Thanks, >> Coleen >> >>> Thanks, >>> /Erik >>> >>>> Cheers, >>>> David >>>> >>>>> Thanks, >>>>> /Erik From coleen.phillimore at oracle.com Mon Jul 8 19:34:27 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 15:34:27 -0400 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: One comment On 7/5/19 7:14 AM, Vladimir Ivanov wrote: > Thanks for diagnosing the issue, Erik! > > Can you elaborate, please, on relation to clinit barrier in c2i? > > I don't see how it is possible to hit clinit barrier during i2c2i > transition. Template interpreter has clinit barrier as part of > invokestatic handler [1], so by the time c2i is reached, proper checks > should be already performed and all the conditions to pass the barrier > should be met. > > If SR::handle_wrong_method is called from clinit barrier in c2i during > i2c2i, it signals about a bug in the new clinit logic: somehow clinit > barrier is bypassed in interpreter. > > Are you sure it is not related to upcalls from native code > (caller_frame.is_entry_frame() [1])? > > Also, SR::handle_wrong_method() calls coming from clinit barriers > shouldn't hit the fast path w/ callee_target(), because it bypasses > the actual initialization check happening during call site re-resolution. > > Best regards, > Vladimir Ivanov > > PS: regarding clearing JavaThread::_callee_target in > JavaThread::oops_do(), I'd prefer to keep it and limit the exposure of > a stale Method*. But it's just a matter of preference and I don't have > a strong opinion here. The callee_method in JavaThread::oops_do() won't keep the Method* alive.? I'm not sure what keeps it alive in the callee_method field.?? Can there be a GC now with some Method* that you need there? 
In that case, you should put the callee_method->method_holder()->java_mirror() in a new field in JavaThread::_callee_mirror or something, and have JavaThread::oops_do walk that.? Also, redefinition might have to keep the callee_method alive in the metadata walk, but you can file a separate bug for that if I'm not too confused. Coleen > > [1] > src/hotspot/share/runtime/sharedRuntime.cpp: > > JRT_BLOCK_ENTRY(address, > SharedRuntime::handle_wrong_method(JavaThread* thread)) > ... > ? if (caller_frame.is_interpreted_frame() || > ????? caller_frame.is_entry_frame()) { > > > On 04/07/2019 18:02, Erik ?sterlund wrote: >> Hi, >> >> The i2c adapter sets a thread-local "callee_target" Method*, which is >> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c >> call is "bad" (e.g. not_entrant). This error handler forwards >> execution to the callee c2i entry. If the >> SharedRuntime::handle_wrong_method method is called again due to the >> i2c2i call being still bad, then we will crash the VM in the >> following guarantee in SharedRuntime::handle_wrong_method: >> >> Method* callee = thread->callee_target(); >> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >> >> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >> the new class initialization entry barrier of the c2i adapter. >> The solution is to simply not clear the thread-local "callee_target" >> after handling the first failure, as we can't really know there won't >> be another one. There is no reason to clear this value as nobody else >> reads it than the SharedRuntime::handle_wrong_method handler (and we >> really do want it to be able to read the value as many times as it >> takes until the call goes through). I found some confused clearing of >> this callee_target in JavaThread::oops_do(), with a comment saying >> this is a methodOop that we need to clear to make GC happy or >> something. Seems like old traces of perm gen. So I deleted that too. 
>> >> I caught this in ZGC where the timing window for hitting this issue >> seems to be wider due to concurrent code cache unloading. But it is >> equally problematic for all GCs. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8227260 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >> >> Thanks, >> /Erik From coleen.phillimore at oracle.com Mon Jul 8 19:46:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 15:46:48 -0400 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <4474b767-b53b-ac22-3d98-b013e1bdbd08@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <1fb16eb4-59af-7f24-3fdd-56b4a892f82f@oracle.com> <4474b767-b53b-ac22-3d98-b013e1bdbd08@oracle.com> Message-ID: <228ee794-8f0d-1834-93d5-8ef8decf7811@oracle.com> On 7/8/19 6:07 AM, Erik ?sterlund wrote: > Hi Dean and Vladimir, > > the callee->is_method() in the guarantee is there probably to find > corrupt memory. > > So the problem is specifically when performing upcalls from JNI. The > call wrapper tries to "quack like an interpreter" and performs i2c > calls, failing due to the nmethod being not entrant. Then the > subsequent c2i attempt fails again due to clinit barriers. In the > template interpreter calls, the clinit barriers have already been > taken, but in the JNI upcall path, we don't perform that barrier. > > So as our current i2c calls can't actually deal with blocking at all > (and no safepoints), the right solution seems to be sticking in some > clinit barriers into the JavaCalls API, so that when the call is > performed, we know the clinit barrier won't be hit. Ok, you *cannot* block with callee_method in JavaThread.? Ignore my last mail!? That comment in oops_do was a leftover from permgen. 
Thanks, Coleen > > I still think that allowing only one thing to go wrong across an i2c2i > call is pretty scary, and I'd love to remove that restriction. > > Anyway, Vladimir offered to find the right place to put the clinit > barrier, so I'm handing this one over. :) > > Thanks, > /Erik > > On 2019-07-05 23:46, dean.long at oracle.com wrote: >> What is callee->is_method() doing? Like Vladimir, I'm concerned about >> pointers to stale metadata. >> >> dl >> >> On 7/4/19 8:02 AM, Erik ?sterlund wrote: >>> Hi, >>> >>> The i2c adapter sets a thread-local "callee_target" Method*, which >>> is caught (and cleared) by SharedRuntime::handle_wrong_method if the >>> i2c call is "bad" (e.g. not_entrant). This error handler forwards >>> execution to the callee c2i entry. If the >>> SharedRuntime::handle_wrong_method method is called again due to the >>> i2c2i call being still bad, then we will crash the VM in the >>> following guarantee in SharedRuntime::handle_wrong_method: >>> >>> Method* callee = thread->callee_target(); >>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>> >>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >>> the new class initialization entry barrier of the c2i adapter. >>> The solution is to simply not clear the thread-local "callee_target" >>> after handling the first failure, as we can't really know there >>> won't be another one. There is no reason to clear this value as >>> nobody else reads it than the SharedRuntime::handle_wrong_method >>> handler (and we really do want it to be able to read the value as >>> many times as it takes until the call goes through). I found some >>> confused clearing of this callee_target in JavaThread::oops_do(), >>> with a comment saying this is a methodOop that we need to clear to >>> make GC happy or something. Seems like old traces of perm gen. So I >>> deleted that too. 
>>> >>> I caught this in ZGC where the timing window for hitting this issue >>> seems to be wider due to concurrent code cache unloading. But it is >>> equally problematic for all GCs. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>> >>> Thanks, >>> /Erik >> > From coleen.phillimore at oracle.com Mon Jul 8 21:19:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 8 Jul 2019 17:19:12 -0400 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> Message-ID: Hi,? From offline discussions, I updated the code in Parse::do_exits() to make the method not compilable if the return types don't match.? Otherwise it would revert a change that Volker made to prevent infinite compilation loops.? It seems that the compiler code has been changed to no longer exercise this path (ShouldNotReachHere never reached), so keeping the conservative path seemed safest. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev I changed the comment Dean, it might need help rewording. Tested with tier1-8. Thanks, Coleen On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: > > Dean,? Thank you for reviewing and for your help and discussion of > this change. > > On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >> For the most part, this looks good.? I only have a couple concerns: >> >> 1) The distinction in both validate_compile_task_dependencies >> functions between "dependencies failed" and "dependencies invalid" is >> even more fuzzy after this change.? I suggest filing an RFE to remove >> this distinction. 
> > Yes, in jvmciRuntime I had to carefully preserve this logic or some > tests failed. I'll file an RFE for you. >> >> 2) In Parse::do_exits(), we don't know that concurrent class loading >> didn't cause the problem. We should be optimistic and allow the retry: >> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >> rather than more drastic >> C->record_method_not_compilable >> This is actually what the code did in an earlier revision. > > Erik and I were trying to guess which was the right answer. It seemed > too lucky that you'd do concurrent class loading in this time period, > so we picked the more drastic answer, but I tested both. So I'll > change it to the optimistic answer. > > Thanks! > Coleen >> >> dl >> >> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>> Summary: Remove SystemDictionary::modification_counter optimization >>> >>> See bug for more details. To avoid the assert in the bug report, >>> it's necessary to also increase the modification counter for class >>> unloading, which needs special code for concurrent class unloading. >>> The global counter is used to verify that validate_dependencies() >>> gets the same answer based on the subklass hierarchy, but provides a >>> quick exit in production mode. Removing it may allow more nmethods >>> to be created that don't depend on the classes that may be loaded >>> while the Method is being compiled. Performance testing was done on >>> this with no change in performance. Also investigated the >>> breakpoint setting code which incremented the modification counter. >>> Dependent compilations are invalidated using evol_method >>> dependencies, so updating the system dictionary modification counter >>> isn't necessary. >>> >>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>> test runs with -Xcomp.
>>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>> >>> Thanks, >>> Coleen >> > From tobias.hartmann at oracle.com Tue Jul 9 05:13:06 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 9 Jul 2019 07:13:06 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Message-ID: <611a0a23-70ca-6bf7-8b35-ab2c1872e360@oracle.com> Hi Erik, this looks reasonable to me but a second review would be good. Please test thoroughly before pushing. Best regards, Tobias On 01.07.19 15:12, Erik ?sterlund wrote: > Hi, > > Today it is up to callers of methods changing state on nmethods like make_not_entrant(), to know all > other possible concurrent attempts to transition the nmethod, and know that there are no such > attempts trying to make the nmethod more dead. > There have been multiple occurrences of issues where the caller got it wrong due to the fragile > nature of this code. This specific CR deals with a bug where an OSR nmethod was made not entrant > (deopt) and made unloaded concurrently. > The result of such a race can be that it is first made unloaded and then made not entrant, making > the nmethod go backwards in its state machine, effectively resurrecting dead nmethods, causing a > subsequent GC to feel awkward (crash). > But I have seen other similar incidents with deopt racing with the sweeper. These non-monotonicity > problems are unnecessary to have. So I intend to fix the bug by enforcing monotonicity of the > nmethod state machine explicitly, instead of trying to reason about all callers of these make_* > functions. > I swapped the order of unloaded and zombie in the enum as zombies are strictly more dead than > unloaded nmethods. 
All transitions change in the direction of increasing deadness and fail if the > transition is not monotonically increasing. > > For ZGC I moved OSR nmethod unlinking to before the unlinking (where unlinking code belongs), > instead of after the handshake (intended for deleting things safely unlinked). > Strictly speaking, moving the OSR nmethod unlinking removes the racing between make_not_entrant and > make_unloaded, but I still want the monotonicity guards to make this code more robust. > > I left AOT methods alone. Since they don't die, they don't have resurrection problems, and hence do > not benefit from these guards in the same way. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8224674 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/ > > Thanks, > /Erik From erik.osterlund at oracle.com Tue Jul 9 06:30:49 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Tue, 9 Jul 2019 08:30:49 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <611a0a23-70ca-6bf7-8b35-ab2c1872e360@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <611a0a23-70ca-6bf7-8b35-ab2c1872e360@oracle.com> Message-ID: Hi Tobias, Thanks for the review. /Erik > On 9 Jul 2019, at 07:13, Tobias Hartmann wrote: > > Hi Erik, > > this looks reasonable to me but a second review would be good. > > Please test thoroughly before pushing. > > Best regards, > Tobias > >> On 01.07.19 15:12, Erik ?sterlund wrote: >> Hi, >> >> Today it is up to callers of methods changing state on nmethods like make_not_entrant(), to know all >> other possible concurrent attempts to transition the nmethod, and know that there are no such >> attempts trying to make the nmethod more dead. >> There have been multiple occurrences of issues where the caller got it wrong due to the fragile >> nature of this code. This specific CR deals with a bug where an OSR nmethod was made not entrant >> (deopt) and made unloaded concurrently. 
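The monotonicity guard described above — states ordered by increasing "deadness" (with zombie reordered after unloaded), and transitions refused unless they increase it — could be sketched like this. The names and exact ordering are illustrative, not the actual nmethod declarations:

```cpp
#include <atomic>
#include <cassert>

// Illustrative sketch: with zombie ordered after unloaded, every legal
// transition strictly increases "deadness", so a racing make_not_entrant()
// can no longer resurrect an already-unloaded nmethod.
enum NMethodState { in_use = 0, not_entrant = 1, unloaded = 2, zombie = 3 };

class NMethodModel {
    std::atomic<int> _state{in_use};
public:
    // Returns false (and does nothing) if the transition would go backwards.
    bool try_transition(NMethodState new_state) {
        int cur = _state.load();
        while (true) {
            if (new_state <= cur) {
                return false;  // enforce monotonicity: refuse to become less dead
            }
            if (_state.compare_exchange_weak(cur, new_state)) {
                return true;   // on CAS failure, cur is reloaded and re-checked
            }
        }
    }
    NMethodState state() const { return NMethodState(_state.load()); }
};
```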
>> The result of such a race can be that it is first made unloaded and then made not entrant, making >> the nmethod go backwards in its state machine, effectively resurrecting dead nmethods, causing a >> subsequent GC to feel awkward (crash). >> But I have seen other similar incidents with deopt racing with the sweeper. These non-monotonicity >> problems are unnecessary to have. So I intend to fix the bug by enforcing monotonicity of the >> nmethod state machine explicitly, instead of trying to reason about all callers of these make_* >> functions. >> I swapped the order of unloaded and zombie in the enum as zombies are strictly more dead than >> unloaded nmethods. All transitions change in the direction of increasing deadness and fail if the >> transition is not monotonically increasing. >> >> For ZGC I moved OSR nmethod unlinking to before the unlinking (where unlinking code belongs), >> instead of after the handshake (intended for deleting things safely unlinked). >> Strictly speaking, moving the OSR nmethod unlinking removes the racing between make_not_entrant and >> make_unloaded, but I still want the monotonicity guards to make this code more robust. >> >> I left AOT methods alone. Since they don't die, they don't have resurrection problems, and hence do >> not benefit from these guards in the same way. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8224674 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/ >> >> Thanks, >> /Erik From erik.osterlund at oracle.com Tue Jul 9 09:15:18 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 9 Jul 2019 11:15:18 +0200 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> Message-ID: Hi Coleen, I like the counter removal. This looks good.
Thanks for digging into this and fixing it! /Erik On 2019-07-08 23:19, coleen.phillimore at oracle.com wrote: > > Hi,? From offline discussions, I updated the code in Parse::do_exits() > to make the method not compilable if the return types don't match.? > Otherwise it would revert a change that Volker made to prevent > infinite compilation loops.? It seems that the compiler code has been > changed to no longer exercise this path (ShouldNotReachHere never > reached), so keeping the conservative path seemed safest. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev > > I changed the comment Dean, it might need help rewording. > > Tested with tier1-8. > > Thanks, > Coleen > > On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >> >> Dean,? Thank you for reviewing and for your help and discussion of >> this change. >> >> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>> For the most part, this looks good.? I only have a couple concerns: >>> >>> 1) The distinction in both validate_compile_task_dependencies >>> functions between "dependencies failed" and "dependencies invalid" >>> is even more fuzzy after this change.? I suggest filing an RFE to >>> remove this distinction. >> >> Yes, in jvmciRuntime I had to carefully preserve this logic or some >> tests failed.?? I'll file an RFE for you. >>> >>> 2) In Parse::do_exits(), we don't know that concurrent class loading >>> didn't cause the problem.? We should be optimistic and allow the retry: >>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>> rather than more drastic >>> ??? C->record_method_not_compilable >>> This is actually what the code did in an earlier revision. >> >> Erik and I were trying to guess which was the right answer.? It >> seemed too lucky that you'd do concurrent class loading in this time >> period, so we picked the more drastic answer, but I tested both.? So >> I'll change it to the optimistic answer. >> >> Thanks! 
>> Coleen >>> >>> dl >>> >>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Remove SystemDictionary::modification_counter optimization >>>> >>>> See bug for more details.? To avoid the assert in the bug report, >>>> it's necessary to also increase the modification counter for class >>>> unloading, which needs special code for concurrent class unloading. >>>> The global counter is used to verify that validate_dependencies() >>>> gets the same answer based on the subklass hierarchy, but provides >>>> a quick exit in production mode.? Removing it may allow more >>>> nmethods to be created that don't depend on the classes that may be >>>> loaded while the Method is being compiled. Performance testing was >>>> done on this with no change in performance. Also investigated the >>>> breakpoint setting code which incremented the modification counter. >>>> Dependent compilations are invalidated using evol_method >>>> dependencies, so updating the system dictionary modification >>>> counter isn't unnecessary. >>>> >>>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>>> test runs with -Xcomp. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>> >>>> Thanks, >>>> Coleen >>> >> > From matthias.baesken at sap.com Tue Jul 9 11:36:25 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 9 Jul 2019 11:36:25 +0000 Subject: RFR: 8226816: add UserHandler calls to event log Message-ID: Hi Coleen, thanks for the review . We discussed a bit internally about this and still have some concerns in corner cases / "room for improvement" so I'll not push it immediately . 
Best regards, Matthias > > Message: 1 > Date: Mon, 8 Jul 2019 14:44:26 -0400 > From: coleen.phillimore at oracle.com > To: hotspot-dev at openjdk.java.net > Subject: Re: RFR: 8226816: add UserHandler calls to event log > Message-ID: <2c7786a5-f6c6-609a-1857-e946da054ba4 at oracle.com> > Content-Type: text/plain; charset=utf-8; format=flowed > > This change looks fine. > Coleen > > On 7/8/19 3:00 AM, Baesken, Matthias wrote: > > Thanks for looking into it and running the tests ! > > > > Best regards, Matthias > > From coleen.phillimore at oracle.com Tue Jul 9 14:06:14 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 Jul 2019 10:06:14 -0400 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> Message-ID: <2f9276e7-5187-7c30-f55e-13f5e44524da@oracle.com> Thanks, Erik! Coleen On 7/9/19 5:15 AM, Erik ?sterlund wrote: > Hi Coleen, > > I like the counter removal. This looks good. Thanks for digging into > this and fixing it! > > /Erik > > On 2019-07-08 23:19, coleen.phillimore at oracle.com wrote: >> >> Hi,? From offline discussions, I updated the code in >> Parse::do_exits() to make the method not compilable if the return >> types don't match.? Otherwise it would revert a change that Volker >> made to prevent infinite compilation loops.? It seems that the >> compiler code has been changed to no longer exercise this path >> (ShouldNotReachHere never reached), so keeping the conservative path >> seemed safest. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev >> >> I changed the comment Dean, it might need help rewording. >> >> Tested with tier1-8. >> >> Thanks, >> Coleen >> >> On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >>> >>> Dean,? 
Thank you for reviewing and for your help and discussion of >>> this change. >>> >>> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>>> For the most part, this looks good.? I only have a couple concerns: >>>> >>>> 1) The distinction in both validate_compile_task_dependencies >>>> functions between "dependencies failed" and "dependencies invalid" >>>> is even more fuzzy after this change.? I suggest filing an RFE to >>>> remove this distinction. >>> >>> Yes, in jvmciRuntime I had to carefully preserve this logic or some >>> tests failed.?? I'll file an RFE for you. >>>> >>>> 2) In Parse::do_exits(), we don't know that concurrent class >>>> loading didn't cause the problem.? We should be optimistic and >>>> allow the retry: >>>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>>> rather than more drastic >>>> ??? C->record_method_not_compilable >>>> This is actually what the code did in an earlier revision. >>> >>> Erik and I were trying to guess which was the right answer. It >>> seemed too lucky that you'd do concurrent class loading in this time >>> period, so we picked the more drastic answer, but I tested both.? So >>> I'll change it to the optimistic answer. >>> >>> Thanks! >>> Coleen >>>> >>>> dl >>>> >>>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Remove SystemDictionary::modification_counter optimization >>>>> >>>>> See bug for more details.? To avoid the assert in the bug report, >>>>> it's necessary to also increase the modification counter for class >>>>> unloading, which needs special code for concurrent class >>>>> unloading. The global counter is used to verify that >>>>> validate_dependencies() gets the same answer based on the subklass >>>>> hierarchy, but provides a quick exit in production mode.? Removing >>>>> it may allow more nmethods to be created that don't depend on the >>>>> classes that may be loaded while the Method is being compiled. 
>>>>> Performance testing was done on this with no change in >>>>> performance. Also investigated the breakpoint setting code which >>>>> incremented the modification counter. Dependent compilations are >>>>> invalidated using evol_method dependencies, so updating the system >>>>> dictionary modification counter isn't unnecessary. >>>>> >>>>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>>>> test runs with -Xcomp. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>>> >>>>> Thanks, >>>>> Coleen >>>> >>> >> > From dean.long at oracle.com Tue Jul 9 21:06:44 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 9 Jul 2019 14:06:44 -0700 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> Message-ID: <4db8e49f-36f4-eb8f-2e6b-34f9e532fbdf@oracle.com> The updated comment sounds good.? Now that you have removed the only place that was failing with retry_class_loading_during_parsing(), we should be able to remove that method and its uses.? That gets rid of the only way to "retry forever" vs the remaining and presumably safe "down-grade and retry just once more".? Or you can file an RFE to clean that up. dl On 7/8/19 2:19 PM, coleen.phillimore at oracle.com wrote: > > Hi,? From offline discussions, I updated the code in Parse::do_exits() > to make the method not compilable if the return types don't match.? > Otherwise it would revert a change that Volker made to prevent > infinite compilation loops.? It seems that the compiler code has been > changed to no longer exercise this path (ShouldNotReachHere never > reached), so keeping the conservative path seemed safest. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev > > I changed the comment Dean, it might need help rewording. > > Tested with tier1-8. > > Thanks, > Coleen > > On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >> >> Dean,? Thank you for reviewing and for your help and discussion of >> this change. >> >> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>> For the most part, this looks good.? I only have a couple concerns: >>> >>> 1) The distinction in both validate_compile_task_dependencies >>> functions between "dependencies failed" and "dependencies invalid" >>> is even more fuzzy after this change.? I suggest filing an RFE to >>> remove this distinction. >> >> Yes, in jvmciRuntime I had to carefully preserve this logic or some >> tests failed.?? I'll file an RFE for you. >>> >>> 2) In Parse::do_exits(), we don't know that concurrent class loading >>> didn't cause the problem.? We should be optimistic and allow the retry: >>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>> rather than more drastic >>> ??? C->record_method_not_compilable >>> This is actually what the code did in an earlier revision. >> >> Erik and I were trying to guess which was the right answer.? It >> seemed too lucky that you'd do concurrent class loading in this time >> period, so we picked the more drastic answer, but I tested both.? So >> I'll change it to the optimistic answer. >> >> Thanks! >> Coleen >>> >>> dl >>> >>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: Remove SystemDictionary::modification_counter optimization >>>> >>>> See bug for more details.? To avoid the assert in the bug report, >>>> it's necessary to also increase the modification counter for class >>>> unloading, which needs special code for concurrent class unloading. 
>>>> The global counter is used to verify that validate_dependencies() >>>> gets the same answer based on the subklass hierarchy, but provides >>>> a quick exit in production mode.? Removing it may allow more >>>> nmethods to be created that don't depend on the classes that may be >>>> loaded while the Method is being compiled. Performance testing was >>>> done on this with no change in performance. Also investigated the >>>> breakpoint setting code which incremented the modification counter. >>>> Dependent compilations are invalidated using evol_method >>>> dependencies, so updating the system dictionary modification >>>> counter isn't unnecessary. >>>> >>>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>>> test runs with -Xcomp. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>> >>>> Thanks, >>>> Coleen >>> >> > From coleen.phillimore at oracle.com Tue Jul 9 21:16:10 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 Jul 2019 17:16:10 -0400 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: <4db8e49f-36f4-eb8f-2e6b-34f9e532fbdf@oracle.com> References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> <4db8e49f-36f4-eb8f-2e6b-34f9e532fbdf@oracle.com> Message-ID: <3616e932-6b11-245e-14fa-94394716fa6d@oracle.com> On 7/9/19 5:06 PM, dean.long at oracle.com wrote: > The updated comment sounds good.? Now that you have removed the only > place that was failing with retry_class_loading_during_parsing(), we > should be able to remove that method and its uses.? That gets rid of > the only way to "retry forever" vs the remaining and presumably safe > "down-grade and retry just once more".? Or you can file an RFE to > clean that up. Thanks Dean.? 
I noticed that C2Compiler::retry_class_loading_during_parsing()); is now not used with my change but didn't want to clean it up with this change.? I'll file an RFE to clean it up (or find some other use for it in the compiler code).? What is the remaining "downgrade and retry just once more" option? Thanks for the help! Coleen > > dl > > On 7/8/19 2:19 PM, coleen.phillimore at oracle.com wrote: >> >> Hi,? From offline discussions, I updated the code in >> Parse::do_exits() to make the method not compilable if the return >> types don't match.? Otherwise it would revert a change that Volker >> made to prevent infinite compilation loops.? It seems that the >> compiler code has been changed to no longer exercise this path >> (ShouldNotReachHere never reached), so keeping the conservative path >> seemed safest. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev >> >> I changed the comment Dean, it might need help rewording. >> >> Tested with tier1-8. >> >> Thanks, >> Coleen >> >> On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >>> >>> Dean,? Thank you for reviewing and for your help and discussion of >>> this change. >>> >>> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>>> For the most part, this looks good.? I only have a couple concerns: >>>> >>>> 1) The distinction in both validate_compile_task_dependencies >>>> functions between "dependencies failed" and "dependencies invalid" >>>> is even more fuzzy after this change.? I suggest filing an RFE to >>>> remove this distinction. >>> >>> Yes, in jvmciRuntime I had to carefully preserve this logic or some >>> tests failed.?? I'll file an RFE for you. >>>> >>>> 2) In Parse::do_exits(), we don't know that concurrent class >>>> loading didn't cause the problem.? We should be optimistic and >>>> allow the retry: >>>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>>> rather than more drastic >>>> ??? 
C->record_method_not_compilable >>>> This is actually what the code did in an earlier revision. >>> >>> Erik and I were trying to guess which was the right answer. It >>> seemed too lucky that you'd do concurrent class loading in this time >>> period, so we picked the more drastic answer, but I tested both.? So >>> I'll change it to the optimistic answer. >>> >>> Thanks! >>> Coleen >>>> >>>> dl >>>> >>>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: Remove SystemDictionary::modification_counter optimization >>>>> >>>>> See bug for more details.? To avoid the assert in the bug report, >>>>> it's necessary to also increase the modification counter for class >>>>> unloading, which needs special code for concurrent class >>>>> unloading. The global counter is used to verify that >>>>> validate_dependencies() gets the same answer based on the subklass >>>>> hierarchy, but provides a quick exit in production mode.? Removing >>>>> it may allow more nmethods to be created that don't depend on the >>>>> classes that may be loaded while the Method is being compiled. >>>>> Performance testing was done on this with no change in >>>>> performance. Also investigated the breakpoint setting code which >>>>> incremented the modification counter. Dependent compilations are >>>>> invalidated using evol_method dependencies, so updating the system >>>>> dictionary modification counter isn't unnecessary. >>>>> >>>>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>>>> test runs with -Xcomp. 
>>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>>> >>>>> Thanks, >>>>> Coleen >>>> >>> >> > From dean.long at oracle.com Tue Jul 9 21:31:54 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 9 Jul 2019 14:31:54 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Message-ID: On 7/1/19 6:12 AM, Erik Österlund wrote: > For ZGC I moved OSR nmethod unlinking to before the unlinking (where > unlinking code belongs), instead of after the handshake (intended for > deleting things safely unlinked). > Strictly speaking, moving the OSR nmethod unlinking removes the racing > between make_not_entrant and make_unloaded, but I still want the > monotonicity guards to make this code more robust. I see where you added OSR nmethod unlinking, but not where you removed it, so it's not obvious it was a "move". Would it make sense for nmethod::unlink_from_method() to do the OSR unlinking, or to assert that it has already been done? The new bailout in the middle of nmethod::make_not_entrant_or_zombie() worries me a little, because the code up to that point has side-effects, and we could be bailing out in an unexpected state.
dl From dean.long at oracle.com Tue Jul 9 21:40:41 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 9 Jul 2019 14:40:41 -0700 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: <3616e932-6b11-245e-14fa-94394716fa6d@oracle.com> References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> <4db8e49f-36f4-eb8f-2e6b-34f9e532fbdf@oracle.com> <3616e932-6b11-245e-14fa-94394716fa6d@oracle.com> Message-ID: On 7/9/19 2:16 PM, coleen.phillimore at oracle.com wrote: > > > On 7/9/19 5:06 PM, dean.long at oracle.com wrote: >> The updated comment sounds good.? Now that you have removed the only >> place that was failing with retry_class_loading_during_parsing(), we >> should be able to remove that method and its uses.? That gets rid of >> the only way to "retry forever" vs the remaining and presumably safe >> "down-grade and retry just once more".? Or you can file an RFE to >> clean that up. > > Thanks Dean.? I noticed that > C2Compiler::retry_class_loading_during_parsing()); > > is now not used with my change but didn't want to clean it up with > this change.? I'll file an RFE to clean it up (or find some other use > for it in the compiler code).? What is the remaining "downgrade and > retry just once more" option? > The remaining are retry_no_subsuming_loads(), retry_no_escape_analysis(), and has_boxed_value() here: https://java.se.oracle.com/source/xref/jdk-jdk/open/src/hotspot/share/opto/c2compiler.cpp#112 Notice that they all set some kind of flag to disable the current failure, preventing infinite loops. dl > Thanks for the help! > Coleen > >> >> dl >> >> On 7/8/19 2:19 PM, coleen.phillimore at oracle.com wrote: >>> >>> Hi,? From offline discussions, I updated the code in >>> Parse::do_exits() to make the method not compilable if the return >>> types don't match.? 
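The pattern Dean points out — each retryable compile failure sets a flag that disables its own cause, so the recompile loop is bounded rather than infinite — can be sketched as follows. The flags, loop, and failure strings here are hypothetical illustrations, not the real C2 code:

```cpp
#include <cassert>
#include <string>

// Sketch of the "down-grade and retry just once more" pattern: a compile
// attempt that fails for a retryable reason flips the corresponding option
// off and retries; since each flag can only be cleared once, the loop
// cannot spin forever. All names are illustrative.
struct CompileOptions {
    bool subsuming_loads = true;
    bool escape_analysis = true;
};

// Hypothetical compile step: returns an empty string on success, or the
// retry reason on failure. This stub fails once on escape analysis.
std::string compile_once(const CompileOptions& opts) {
    if (opts.escape_analysis) return "retry_no_escape_analysis";
    return "";  // success
}

// Returns the number of attempts taken, or -1 on a non-retryable failure.
int compile_with_retries(CompileOptions opts) {
    int attempts = 0;
    while (true) {
        attempts++;
        std::string failure = compile_once(opts);
        if (failure.empty()) return attempts;
        if (failure == "retry_no_subsuming_loads" && opts.subsuming_loads) {
            opts.subsuming_loads = false;   // disable the cause, retry once
        } else if (failure == "retry_no_escape_analysis" && opts.escape_analysis) {
            opts.escape_analysis = false;   // disable the cause, retry once
        } else {
            return -1;  // non-retryable, or the cause was already disabled
        }
    }
}
```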
Otherwise it would revert a change that Volker >>> made to prevent infinite compilation loops.? It seems that the >>> compiler code has been changed to no longer exercise this path >>> (ShouldNotReachHere never reached), so keeping the conservative path >>> seemed safest. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev >>> >>> I changed the comment Dean, it might need help rewording. >>> >>> Tested with tier1-8. >>> >>> Thanks, >>> Coleen >>> >>> On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Dean,? Thank you for reviewing and for your help and discussion of >>>> this change. >>>> >>>> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>>>> For the most part, this looks good. I only have a couple concerns: >>>>> >>>>> 1) The distinction in both validate_compile_task_dependencies >>>>> functions between "dependencies failed" and "dependencies invalid" >>>>> is even more fuzzy after this change.? I suggest filing an RFE to >>>>> remove this distinction. >>>> >>>> Yes, in jvmciRuntime I had to carefully preserve this logic or some >>>> tests failed.?? I'll file an RFE for you. >>>>> >>>>> 2) In Parse::do_exits(), we don't know that concurrent class >>>>> loading didn't cause the problem.? We should be optimistic and >>>>> allow the retry: >>>>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>>>> rather than more drastic >>>>> ??? C->record_method_not_compilable >>>>> This is actually what the code did in an earlier revision. >>>> >>>> Erik and I were trying to guess which was the right answer. It >>>> seemed too lucky that you'd do concurrent class loading in this >>>> time period, so we picked the more drastic answer, but I tested >>>> both.? So I'll change it to the optimistic answer. >>>> >>>> Thanks! 
>>>> Coleen >>>>> >>>>> dl >>>>> >>>>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Remove SystemDictionary::modification_counter optimization >>>>>> >>>>>> See bug for more details.? To avoid the assert in the bug report, >>>>>> it's necessary to also increase the modification counter for >>>>>> class unloading, which needs special code for concurrent class >>>>>> unloading. The global counter is used to verify that >>>>>> validate_dependencies() gets the same answer based on the >>>>>> subklass hierarchy, but provides a quick exit in production >>>>>> mode.? Removing it may allow more nmethods to be created that >>>>>> don't depend on the classes that may be loaded while the Method >>>>>> is being compiled. Performance testing was done on this with no >>>>>> change in performance. Also investigated the breakpoint setting >>>>>> code which incremented the modification counter. Dependent >>>>>> compilations are invalidated using evol_method dependencies, so >>>>>> updating the system dictionary modification counter isn't >>>>>> unnecessary. >>>>>> >>>>>> Tested with hs-tier1-8 testing, and CTW, and local jvmti/jdi/jdwp >>>>>> test runs with -Xcomp. 
>>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>> >>> >> > From erik.osterlund at oracle.com Wed Jul 10 08:28:25 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 10 Jul 2019 10:28:25 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Message-ID: Hi Dean, On 2019-07-09 23:31, dean.long at oracle.com wrote: > On 7/1/19 6:12 AM, Erik Österlund wrote: >> For ZGC I moved OSR nmethod unlinking to before the unlinking (where >> unlinking code belongs), instead of after the handshake (intended for >> deleting things safely unlinked). >> Strictly speaking, moving the OSR nmethod unlinking removes the >> racing between make_not_entrant and make_unloaded, but I still want >> the monotonicity guards to make this code more robust. > > I see where you added OSR nmethod unlinking, but not where you removed > it, so it's not obvious it was a "move". Sorry, bad wording on my part. I added OSR nmethod unlinking before the global handshake is run. After the handshake, we call make_unloaded() on the same is_unloading() nmethods. That function "tries" to unlink the OSR nmethod, but will just not do it as it's already unlinked at that point. So in a way, I didn't remove the call to unlink the OSR nmethod there, it just won't do anything. I preferred structuring it that way instead of trying to optimize away the call to unlink the OSR nmethod when making it unloaded, but only for the concurrent case. It seemed to introduce more conditional magic than it was worth. So in practice, the unlinking of OSR nmethods has moved for concurrent unloading to before the handshake. > Would it make sense for nmethod::unlink_from_method() to do the OSR > unlinking, or to assert that it has already been done?
An earlier version of this patch tried to do that. It is indeed possible. But it requires changing lock ranks of the OSR nmethod lock to special - 1 and moving around a bunch of code as this function is also called both when making nmethods not_entrant, zombie, and unlinking them in that case. For the first two, we conditionally unlink the nmethod based on the current state (which is the old state), whereas when I move it, the current state is the new state. So I had to change things around a bit more to figure out the right condition when to unlink it that works for all 3 callers. In the end, since this is going to 13, I thought it's more important to minimize the risk as much as I can, and leave such refactorings to 14. > The new bailout in the middle of nmethod::make_not_entrant_or_zombie() > worries me a little, because the code up to that point has > side-effects, and we could be bailing out in an unexpected state. Correct. In an earlier version of this patch, I moved the transition to before the side effects. But a bunch of code is using the current nmethod state to determine what to do, and that current state changed from the old to the new state. In particular, we conditionally patch in the jump based on the current (old) state, and we conditionally increment decompile count based on the current (old) state. So I ended up having to rewrite more code than I wanted to for a patch going into 13, and convince myself that I had not implicitly messed something up. It felt safer to reason about the 3 side effects up until the transitioning point: 1) Patching in the jump into VEP. Any state more dead than the current transition, would still want that jump to be there. 2) Incrementing decompile count when making it not_entrant. Seems in order to do regardless, as we had an actual request to make the nmethod not entrant because it was bad somehow. 3) Marking it as seen on stack when making it not_entrant. 
This will only make can_convert_to_zombie start returning false, which is harmless in general. Also, as both transitions to zombie and not_entrant are performed under the Patching_lock, the only possible race is with make_unloaded. And those nmethods are is_unloading(), which also makes can_convert_to_zombie return false (in a not racy fashion). So it would essentially make no observable difference to any single call to can_convert_to_zombie(). In summary, #1 and #3 don't really observably change the state of the system, and #2 is completely harmless and probably wanted. Therefore I found that moving these things around and finding out where we use the current state(), as well as rewriting it, seemed like a slightly scarier change for 13 to me. So in general, there is some refactoring that could be done (and I have tried it) to make this nicer. But I want to minimize the risk for 13 as much as possible, and perform any risky refactorings in 14 instead. If your risk assessment is different and you would prefer moving the transition higher up (and flipping some conditions) instead, I am totally up for that too though, and I do see where you are coming from. BTW, I have tested this change through hs-tier1-7, and it looks good. Thanks a lot Dean for reviewing this code. 
/Erik > dl > From coleen.phillimore at oracle.com Wed Jul 10 12:24:28 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 10 Jul 2019 08:24:28 -0400 Subject: RFR (S) 8222446: assert(C->env()->system_dictionary_modification_counter_changed()) failed: Must invalidate if TypeFuncs differ In-Reply-To: References: <703b29a2-71a6-27d7-99e3-d54216332c33@oracle.com> <154ad551-d397-5abe-1b6a-7a3ddd129f3d@oracle.com> <4db8e49f-36f4-eb8f-2e6b-34f9e532fbdf@oracle.com> <3616e932-6b11-245e-14fa-94394716fa6d@oracle.com> Message-ID: <6451d01e-8e15-ce8d-cb34-6460735e13b4@oracle.com> On 7/9/19 5:40 PM, dean.long at oracle.com wrote: > On 7/9/19 2:16 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/9/19 5:06 PM, dean.long at oracle.com wrote: >>> The updated comment sounds good. Now that you have removed the only >>> place that was failing with retry_class_loading_during_parsing(), we >>> should be able to remove that method and its uses. That gets rid of >>> the only way to "retry forever" vs the remaining and presumably safe >>> "down-grade and retry just once more". Or you can file an RFE to >>> clean that up. >> >> Thanks Dean. I noticed that >> C2Compiler::retry_class_loading_during_parsing()); >> >> is now not used with my change but didn't want to clean it up with >> this change. I'll file an RFE to clean it up (or find some other use >> for it in the compiler code). What is the remaining "downgrade and >> retry just once more" option? >> > > The remaining are retry_no_subsuming_loads(), > retry_no_escape_analysis(), and has_boxed_value() here: > > https://java.se.oracle.com/source/xref/jdk-jdk/open/src/hotspot/share/opto/c2compiler.cpp#112 > > > Notice that they all set some kind of flag to disable the current > failure, preventing infinite loops. I see. Thanks for the code pointer. I'll add this to the RFE. thanks, Coleen > > dl > >> Thanks for the help!
>> Coleen >> >>> >>> dl >>> >>> On 7/8/19 2:19 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi,? From offline discussions, I updated the code in >>>> Parse::do_exits() to make the method not compilable if the return >>>> types don't match.? Otherwise it would revert a change that Volker >>>> made to prevent infinite compilation loops.? It seems that the >>>> compiler code has been changed to no longer exercise this path >>>> (ShouldNotReachHere never reached), so keeping the conservative >>>> path seemed safest. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.02/webrev >>>> >>>> I changed the comment Dean, it might need help rewording. >>>> >>>> Tested with tier1-8. >>>> >>>> Thanks, >>>> Coleen >>>> >>>> On 6/21/19 4:44 PM, coleen.phillimore at oracle.com wrote: >>>>> >>>>> Dean,? Thank you for reviewing and for your help and discussion of >>>>> this change. >>>>> >>>>> On 6/21/19 3:48 PM, dean.long at oracle.com wrote: >>>>>> For the most part, this looks good. I only have a couple concerns: >>>>>> >>>>>> 1) The distinction in both validate_compile_task_dependencies >>>>>> functions between "dependencies failed" and "dependencies >>>>>> invalid" is even more fuzzy after this change.? I suggest filing >>>>>> an RFE to remove this distinction. >>>>> >>>>> Yes, in jvmciRuntime I had to carefully preserve this logic or >>>>> some tests failed.?? I'll file an RFE for you. >>>>>> >>>>>> 2) In Parse::do_exits(), we don't know that concurrent class >>>>>> loading didn't cause the problem.? We should be optimistic and >>>>>> allow the retry: >>>>>> C->record_failure(C2Compiler::retry_class_loading_during_parsing()); >>>>>> rather than more drastic >>>>>> ??? C->record_method_not_compilable >>>>>> This is actually what the code did in an earlier revision. >>>>> >>>>> Erik and I were trying to guess which was the right answer. 
It >>>>> seemed too lucky that you'd do concurrent class loading in this >>>>> time period, so we picked the more drastic answer, but I tested >>>>> both. So I'll change it to the optimistic answer. >>>>> >>>>> Thanks! >>>>> Coleen >>>>>> >>>>>> dl >>>>>> >>>>>> On 6/20/19 10:28 AM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Remove SystemDictionary::modification_counter optimization >>>>>>> >>>>>>> See bug for more details. To avoid the assert in the bug >>>>>>> report, it's necessary to also increase the modification counter >>>>>>> for class unloading, which needs special code for concurrent >>>>>>> class unloading. The global counter is used to verify that >>>>>>> validate_dependencies() gets the same answer based on the >>>>>>> subklass hierarchy, but provides a quick exit in production >>>>>>> mode. Removing it may allow more nmethods to be created that >>>>>>> don't depend on the classes that may be loaded while the Method >>>>>>> is being compiled. Performance testing was done on this with no >>>>>>> change in performance. Also investigated the breakpoint setting >>>>>>> code which incremented the modification counter. Dependent >>>>>>> compilations are invalidated using evol_method dependencies, so >>>>>>> updating the system dictionary modification counter isn't >>>>>>> necessary. >>>>>>> >>>>>>> Tested with hs-tier1-8 testing, and CTW, and local >>>>>>> jvmti/jdi/jdwp test runs with -Xcomp.
>>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8222446.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8222446 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>> >>>>> >>>> >>> >> > From dean.long at oracle.com Thu Jul 11 04:42:08 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 10 Jul 2019 21:42:08 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> Message-ID: <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> On 7/10/19 1:28 AM, Erik Österlund wrote: > Hi Dean, > > On 2019-07-09 23:31, dean.long at oracle.com wrote: >> On 7/1/19 6:12 AM, Erik Österlund wrote: >>> For ZGC I moved OSR nmethod unlinking to before the unlinking (where >>> unlinking code belongs), instead of after the handshake (intended >>> for deleting things safely unlinked). >>> Strictly speaking, moving the OSR nmethod unlinking removes the >>> racing between make_not_entrant and make_unloaded, but I still want >>> the monotonicity guards to make this code more robust. >> >> I see where you added OSR nmethod unlinking, but not where you >> removed it, so it's not obvious it was a "move". > > Sorry, bad wording on my part. I added OSR nmethod unlinking before > the global handshake is run. After the handshake, we call > make_unloaded() on the same is_unloading() nmethods. That function > "tries" to unlink the OSR nmethod, but will just not do it as it's > already unlinked at that point. So in a way, I didn't remove the call > to unlink the OSR nmethod there, it just won't do anything. I > preferred structuring it that way instead of trying to optimize away > the call to unlink the OSR nmethod when making it unloaded, but only > for the concurrent case. It seemed to introduce more conditional magic > than it was worth. > So in practice, the unlinking of OSR nmethods has moved for concurrent > unloading to before the handshake.
> OK, in that case, could you add a little information to the "Invalidate the osr nmethod only once" comment so that in the future someone isn't tempted to remove the code as redundant? >> Would it make sense for nmethod::unlink_from_method() to do the OSR >> unlinking, or to assert that it has already been done? > > An earlier version of this patch tried to do that. It is indeed > possible. But it requires changing lock ranks of the OSR nmethod lock > to special - 1 and moving around a bunch of code as this function is > also called both when making nmethods not_entrant, zombie, and > unlinking them in that case. For the first two, we conditionally > unlink the nmethod based on the current state (which is the old > state), whereas when I move it, the current state is the new state. So > I had to change things around a bit more to figure out the right > condition when to unlink it that works for all 3 callers. In the end, > since this is going to 13, I thought it's more important to minimize > the risk as much as I can, and leave such refactorings to 14. > OK. >> The new bailout in the middle of >> nmethod::make_not_entrant_or_zombie() worries me a little, because >> the code up to that point has side-effects, and we could be bailing >> out in an unexpected state. > > Correct. In an earlier version of this patch, I moved the transition > to before the side effects. But a bunch of code is using the current > nmethod state to determine what to do, and that current state changed > from the old to the new state. In particular, we conditionally patch > in the jump based on the current (old) state, and we conditionally > increment decompile count based on the current (old) state. So I ended > up having to rewrite more code than I wanted to for a patch going into > 13, and convince myself that I had not implicitly messed something up. > It felt safer to reason about the 3 side effects up until the > transitioning point: > > 1) Patching in the jump into VEP. 
Any state more dead than the current > transition, would still want that jump to be there. > 2) Incrementing decompile count when making it not_entrant. Seems in > order to do regardless, as we had an actual request to make the > nmethod not entrant because it was bad somehow. > 3) Marking it as seen on stack when making it not_entrant. This will > only make can_convert_to_zombie start returning false, which is > harmless in general. Also, as both transitions to zombie and > not_entrant are performed under the Patching_lock, the only possible > race is with make_unloaded. And those nmethods are is_unloading(), > which also makes can_convert_to_zombie return false (in a not racy > fashion). So it would essentially make no observable difference to any > single call to can_convert_to_zombie(). > > In summary, #1 and #3 don't really observably change the state of the > system, and #2 is completely harmless and probably wanted. Therefore I > found that moving these things around and finding out where we use the > current state(), as well as rewriting it, seemed like a slightly > scarier change for 13 to me. > > So in general, there is some refactoring that could be done (and I > have tried it) to make this nicer. But I want to minimize the risk for > 13 as much as possible, and perform any risky refactorings in 14 instead. > If your risk assessment is different and you would prefer moving the > transition higher up (and flipping some conditions) instead, I am > totally up for that too though, and I do see where you are coming from. > > So if we fail, it means that we lost a race to a "deader" state, and > assuming this is the only path to the deader state, wouldn't that also > mean that #1, #2, and #3 would have already been done by the winning > thread? If so, that makes me feel better about bailing out in the > middle, but I'm still not 100% convinced, unless we can assert that 1-3 > already happened.
Do you have a prototype of what moving the transition higher up would look like? dl > BTW, I have tested this change through hs-tier1-7, and it looks good. > > Thanks a lot Dean for reviewing this code. > > /Erik > >> dl >> > From erik.osterlund at oracle.com Thu Jul 11 13:53:44 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 11 Jul 2019 09:53:44 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: Hi Dean, On 2019-07-11 00:42, dean.long at oracle.com wrote: > On 7/10/19 1:28 AM, Erik Österlund wrote: >> Hi Dean, >> >> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>> For ZGC I moved OSR nmethod unlinking to before the unlinking (where >>>> unlinking code belongs), instead of after the handshake (intended >>>> for deleting things safely unlinked). >>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>> racing between make_not_entrant and make_unloaded, but I still want >>>> the monotonicity guards to make this code more robust. >>> >>> I see where you added OSR nmethod unlinking, but not where you >>> removed it, so it's not obvious it was a "move". >> >> Sorry, bad wording on my part. I added OSR nmethod unlinking before >> the global handshake is run. After the handshake, we call >> make_unloaded() on the same is_unloading() nmethods. That function >> "tries" to unlink the OSR nmethod, but will just not do it as it's >> already unlinked at that point. So in a way, I didn't remove the call >> to unlink the OSR nmethod there, it just won't do anything. I >> preferred structuring it that way instead of trying to optimize away >> the call to unlink the OSR nmethod when making it unloaded, but only >> for the concurrent case.
It seemed to introduce more conditional magic >> than it was worth. >> So in practice, the unlinking of OSR nmethods has moved for concurrent >> unloading to before the handshake. >> > > OK, in that case, could you add a little information to the "Invalidate > the osr nmethod only once" comment so that in the future someone isn't > tempted to remove the code as redundant? Sure. >>> Would it make sense for nmethod::unlink_from_method() to do the OSR >>> unlinking, or to assert that it has already been done? >> >> An earlier version of this patch tried to do that. It is indeed >> possible. But it requires changing lock ranks of the OSR nmethod lock >> to special - 1 and moving around a bunch of code as this function is >> also called both when making nmethods not_entrant, zombie, and >> unlinking them in that case. For the first two, we conditionally >> unlink the nmethod based on the current state (which is the old >> state), whereas when I move it, the current state is the new state. So >> I had to change things around a bit more to figure out the right >> condition when to unlink it that works for all 3 callers. In the end, >> since this is going to 13, I thought it's more important to minimize >> the risk as much as I can, and leave such refactorings to 14. >> > > OK. > >>> The new bailout in the middle of >>> nmethod::make_not_entrant_or_zombie() worries me a little, because >>> the code up to that point has side-effects, and we could be bailing >>> out in an unexpected state. >> >> Correct. In an earlier version of this patch, I moved the transition >> to before the side effects. But a bunch of code is using the current >> nmethod state to determine what to do, and that current state changed >> from the old to the new state. In particular, we conditionally patch >> in the jump based on the current (old) state, and we conditionally >> increment decompile count based on the current (old) state. 
So I ended >> up having to rewrite more code than I wanted to for a patch going into >> 13, and convince myself that I had not implicitly messed something up. >> It felt safer to reason about the 3 side effects up until the >> transitioning point: >> >> 1) Patching in the jump into VEP. Any state more dead than the current >> transition, would still want that jump to be there. >> 2) Incrementing decompile count when making it not_entrant. Seems in >> order to do regardless, as we had an actual request to make the >> nmethod not entrant because it was bad somehow. >> 3) Marking it as seen on stack when making it not_entrant. This will >> only make can_convert_to_zombie start returning false, which is >> harmless in general. Also, as both transitions to zombie and >> not_entrant are performed under the Patching_lock, the only possible >> race is with make_unloaded. And those nmethods are is_unloading(), >> which also makes can_convert_to_zombie return false (in a not racy >> fashion). So it would essentially make no observable difference to any >> single call to can_convert_to_zombie(). >> >> In summary, #1 and #3 don't really observably change the state of the >> system, and #2 is completely harmless and probably wanted. Therefore I >> found that moving these things around and finding out where we use the >> current state(), as well as rewriting it, seemed like a slightly >> scarier change for 13 to me. >> >> So in general, there is some refactoring that could be done (and I >> have tried it) to make this nicer. But I want to minimize the risk for >> 13 as much as possible, and perform any risky refactorings in 14 instead. >> If your risk assessment is different and you would prefer moving the >> transition higher up (and flipping some conditions) instead, I am >> totally up for that too though, and I do see where you are coming from. 
>> > So if we fail, it means that we lost a race to a "deader" state, and > assuming this is the only path to the deader state, wouldn't that also > mean that #1, #2, and #3 would have already been done by the winning > thread? If so, that makes me feel better about bailing out in the > middle, but I'm still not 100% convinced, unless we can assert that 1-3 > already happened. Do you have a prototype of what moving the transition > higher up would look like? As a matter of fact I do. Here is a webrev: http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ I kind of like it. What do you think? Thanks, /Erik > dl > >> BTW, I have tested this change through hs-tier1-7, and it looks good. >> >> Thanks a lot Dean for reviewing this code. >> >> /Erik >> >>> dl >>> >> > From coleen.phillimore at oracle.com Thu Jul 11 14:46:31 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Jul 2019 10:46:31 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> Hi Erik, I had a look at this also and it seems reasonable, and I like that 'unloaded' is less dead than 'zombie' now. http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/src/hotspot/share/code/nmethod.cpp.frames.html 1230 guarantee(try_transition(unloaded), "Invalid nmethod transition to unloaded"); This line worries me. Can you explain why another thread could not have made this nmethod zombie already, before the handshake in ZGC and the call afterward to make_unloaded() for this nmethod? And add a comment here?
Thanks, Coleen On 7/11/19 9:53 AM, Erik ?sterlund wrote: > Hi Dean, > > On 2019-07-11 00:42, dean.long at oracle.com wrote: >> On 7/10/19 1:28 AM, Erik ?sterlund wrote: >>> Hi Dean, >>> >>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>> On 7/1/19 6:12 AM, Erik ?sterlund wrote: >>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>> (where unlinking code belongs), instead of after the handshake >>>>> (intended for deleting things safely unlinked). >>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>> racing between make_not_entrant and make_unloaded, but I still >>>>> want the monotonicity guards to make this code more robust. >>>> >>>> I see where you added OSR nmethod unlinking, but not where you >>>> removed it, so it's not obvious it was a "move". >>> >>> Sorry, bad wording on my part. I added OSR nmethod unlinking before >>> the global handshake is run. After the handshake, we call >>> make_unloaded() on the same is_unloading() nmethods. That function >>> "tries" to unlink the OSR nmethod, but will just not do it as it's >>> already unlinked at that point. So in a way, I didn't remove the >>> call to unlink the OSR nmethod there, it just won't do anything. I >>> preferred structuring it that way instead of trying to optimize away >>> the call to unlink the OSR nmethod when making it unloaded, but only >>> for the concurrent case. It seemed to introduce more conditional >>> magic than it was worth. >>> So in practice, the unlinking of OSR nmethods has moved for >>> concurrent unloading to before the handshake. >>> >> >> OK, in that case, could you add a little information to the >> "Invalidate the osr nmethod only once" comment so that in the future >> someone isn't tempted to remove the code as redundant? > > Sure. > >>>> Would it make sense for nmethod::unlink_from_method() to do the OSR >>>> unlinking, or to assert that it has already been done? >>> >>> An earlier version of this patch tried to do that. 
It is indeed >>> possible. But it requires changing lock ranks of the OSR nmethod >>> lock to special - 1 and moving around a bunch of code as this >>> function is also called both when making nmethods not_entrant, >>> zombie, and unlinking them in that case. For the first two, we >>> conditionally unlink the nmethod based on the current state (which >>> is the old state), whereas when I move it, the current state is the >>> new state. So I had to change things around a bit more to figure out >>> the right condition when to unlink it that works for all 3 callers. >>> In the end, since this is going to 13, I thought it's more important >>> to minimize the risk as much as I can, and leave such refactorings >>> to 14. >>> >> >> OK. >> >>>> The new bailout in the middle of >>>> nmethod::make_not_entrant_or_zombie() worries me a little, because >>>> the code up to that point has side-effects, and we could be bailing >>>> out in an unexpected state. >>> >>> Correct. In an earlier version of this patch, I moved the transition >>> to before the side effects. But a bunch of code is using the current >>> nmethod state to determine what to do, and that current state >>> changed from the old to the new state. In particular, we >>> conditionally patch in the jump based on the current (old) state, >>> and we conditionally increment decompile count based on the current >>> (old) state. So I ended up having to rewrite more code than I wanted >>> to for a patch going into 13, and convince myself that I had not >>> implicitly messed something up. It felt safer to reason about the 3 >>> side effects up until the transitioning point: >>> >>> 1) Patching in the jump into VEP. Any state more dead than the >>> current transition, would still want that jump to be there. >>> 2) Incrementing decompile count when making it not_entrant. Seems in >>> order to do regardless, as we had an actual request to make the >>> nmethod not entrant because it was bad somehow. 
>>> 3) Marking it as seen on stack when making it not_entrant. This will >>> only make can_convert_to_zombie start returning false, which is >>> harmless in general. Also, as both transitions to zombie and >>> not_entrant are performed under the Patching_lock, the only possible >>> race is with make_unloaded. And those nmethods are is_unloading(), >>> which also makes can_convert_to_zombie return false (in a not racy >>> fashion). So it would essentially make no observable difference to >>> any single call to can_convert_to_zombie(). >>> >>> In summary, #1 and #3 don't really observably change the state of >>> the system, and #2 is completely harmless and probably wanted. >>> Therefore I found that moving these things around and finding out >>> where we use the current state(), as well as rewriting it, seemed >>> like a slightly scarier change for 13 to me. >>> >>> So in general, there is some refactoring that could be done (and I >>> have tried it) to make this nicer. But I want to minimize the risk >>> for 13 as much as possible, and perform any risky refactorings in 14 >>> instead. >>> If your risk assessment is different and you would prefer moving the >>> transition higher up (and flipping some conditions) instead, I am >>> totally up for that too though, and I do see where you are coming from. >>> >> >> So if we fail, it means that we lost a race to a "deader" state, and >> assuming this is the only path to the deader state, wouldn't that >> also mean that #1, #2, and #3 would have already been done by the >> winning thread?? If so, that makes me feel better about bailing out >> in the middle, but I'm still not 100% convinced, unless we can assert >> that 1-3 already happened.? Do you have a prototype of what moving >> the transition higher up would look like? > > As a matter of fact I do. Here is a webrev: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ > > I kind of like it. What do you think? 
> > Thanks, > /Erik > >> dl >> >>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>> >>> Thanks a lot Dean for reviewing this code. >>> >>> /Erik >>> >>>> dl >>>> >>> >> From erik.osterlund at oracle.com Thu Jul 11 18:18:30 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Thu, 11 Jul 2019 20:18:30 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> Message-ID: <3D8AC43D-B56B-44AF-AD9F-A0F663A7DBDD@oracle.com> Hi Coleen, > On 11 Jul 2019, at 16:46, coleen.phillimore at oracle.com wrote: > > > Hi Erik, > I had a look at this also and it seems reasonable and I like that 'unloaded' is less dead than 'zombie' now. Glad you like it! > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/src/hotspot/share/code/nmethod.cpp.frames.html > > 1230 guarantee(try_transition(unloaded), "Invalid nmethod transition to unloaded"); > > > This line worries me. Can you explain why another thread could not have made this nmethod zombie already, before the handshake in ZGC and the call afterward to make_unloaded() for this nmethod? And add a comment here? Sure, will add a comment. Like this: "It is an important invariant that there exists no race between the sweeper and GC thread competing for making the same nmethod zombie and unloaded respectively. This is ensured by can_convert_to_zombie() returning false for any is_unloading() nmethod, informing the sweeper not to step on any GC toes." Does that sound comprehensible?
Thanks, /Erik > Thanks, > Coleen > >> On 7/11/19 9:53 AM, Erik ?sterlund wrote: >> Hi Dean, >> >>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>> On 7/10/19 1:28 AM, Erik ?sterlund wrote: >>>> Hi Dean, >>>> >>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>> On 7/1/19 6:12 AM, Erik ?sterlund wrote: >>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking (where unlinking code belongs), instead of after the handshake (intended for deleting things safely unlinked). >>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the racing between make_not_entrant and make_unloaded, but I still want the monotonicity guards to make this code more robust. >>>>> >>>>> I see where you added OSR nmethod unlinking, but not where you removed it, so it's not obvious it was a "move". >>>> >>>> Sorry, bad wording on my part. I added OSR nmethod unlinking before the global handshake is run. After the handshake, we call make_unloaded() on the same is_unloading() nmethods. That function "tries" to unlink the OSR nmethod, but will just not do it as it's already unlinked at that point. So in a way, I didn't remove the call to unlink the OSR nmethod there, it just won't do anything. I preferred structuring it that way instead of trying to optimize away the call to unlink the OSR nmethod when making it unloaded, but only for the concurrent case. It seemed to introduce more conditional magic than it was worth. >>>> So in practice, the unlinking of OSR nmethods has moved for concurrent unloading to before the handshake. >>>> >>> >>> OK, in that case, could you add a little information to the "Invalidate the osr nmethod only once" comment so that in the future someone isn't tempted to remove the code as redundant? >> >> Sure. >> >>>>> Would it make sense for nmethod::unlink_from_method() to do the OSR unlinking, or to assert that it has already been done? >>>> >>>> An earlier version of this patch tried to do that. It is indeed possible. 
But it requires changing lock ranks of the OSR nmethod lock to special - 1 and moving around a bunch of code as this function is also called both when making nmethods not_entrant, zombie, and unlinking them in that case. For the first two, we conditionally unlink the nmethod based on the current state (which is the old state), whereas when I move it, the current state is the new state. So I had to change things around a bit more to figure out the right condition when to unlink it that works for all 3 callers. In the end, since this is going to 13, I thought it's more important to minimize the risk as much as I can, and leave such refactorings to 14. >>>> >>> >>> OK. >>> >>>>> The new bailout in the middle of nmethod::make_not_entrant_or_zombie() worries me a little, because the code up to that point has side-effects, and we could be bailing out in an unexpected state. >>>> >>>> Correct. In an earlier version of this patch, I moved the transition to before the side effects. But a bunch of code is using the current nmethod state to determine what to do, and that current state changed from the old to the new state. In particular, we conditionally patch in the jump based on the current (old) state, and we conditionally increment decompile count based on the current (old) state. So I ended up having to rewrite more code than I wanted to for a patch going into 13, and convince myself that I had not implicitly messed something up. It felt safer to reason about the 3 side effects up until the transitioning point: >>>> >>>> 1) Patching in the jump into VEP. Any state more dead than the current transition, would still want that jump to be there. >>>> 2) Incrementing decompile count when making it not_entrant. Seems in order to do regardless, as we had an actual request to make the nmethod not entrant because it was bad somehow. >>>> 3) Marking it as seen on stack when making it not_entrant. 
This will only make can_convert_to_zombie start returning false, which is harmless in general. Also, as both transitions to zombie and not_entrant are performed under the Patching_lock, the only possible race is with make_unloaded. And those nmethods are is_unloading(), which also makes can_convert_to_zombie return false (in a not racy fashion). So it would essentially make no observable difference to any single call to can_convert_to_zombie(). >>>> >>>> In summary, #1 and #3 don't really observably change the state of the system, and #2 is completely harmless and probably wanted. Therefore I found that moving these things around and finding out where we use the current state(), as well as rewriting it, seemed like a slightly scarier change for 13 to me. >>>> >>>> So in general, there is some refactoring that could be done (and I have tried it) to make this nicer. But I want to minimize the risk for 13 as much as possible, and perform any risky refactorings in 14 instead. >>>> If your risk assessment is different and you would prefer moving the transition higher up (and flipping some conditions) instead, I am totally up for that too though, and I do see where you are coming from. >>>> >>> >>> So if we fail, it means that we lost a race to a "deader" state, and assuming this is the only path to the deader state, wouldn't that also mean that #1, #2, and #3 would have already been done by the winning thread? If so, that makes me feel better about bailing out in the middle, but I'm still not 100% convinced, unless we can assert that 1-3 already happened. Do you have a prototype of what moving the transition higher up would look like? >> >> As a matter of fact I do. Here is a webrev: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >> >> I kind of like it. What do you think? >> >> Thanks, >> /Erik >> >>> dl >>> >>>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>>> >>>> Thanks a lot Dean for reviewing this code. 
>>>> >>>> /Erik >>>> >>>>> dl >>>>> >>>> >>> > From coleen.phillimore at oracle.com Thu Jul 11 18:23:54 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Jul 2019 14:23:54 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <3D8AC43D-B56B-44AF-AD9F-A0F663A7DBDD@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> <3D8AC43D-B56B-44AF-AD9F-A0F663A7DBDD@oracle.com> Message-ID: On 7/11/19 2:18 PM, Erik Osterlund wrote: > Hi Coleen, > >> On 11 Jul 2019, at 16:46, coleen.phillimore at oracle.com wrote: >> >> >> Hi Erik, >> I had a look at this also and it seems reasonable and I like that 'unloaded' is less dead than 'zombie' now. > Glad you like it! > >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/src/hotspot/share/code/nmethod.cpp.frames.html >> >> 1230 guarantee(try_transition(unloaded), "Invalid nmethod transition to unloaded"); >> >> >> This line worries me. Can you explain why another thread could not have made this nmethod zombie already, before the handshake in zgc and the call afterward to make_unloaded() for this nmethod? And add a comment here? > Sure, will add a comment. Like this: > "It is an important invariant that there exists no race between the sweeper and GC thread competing for making the same nmethod zombie and unloaded respectively. This is ensured by can_convert_to_zombie() returning false for any is_unloading() nmethod, informing the sweeper not to step on any GC toes" Yes, I like the comment. Should it be an assert instead though? Thanks, Coleen > > Does that sound comprehensible?
> > Thanks, > /Erik > >> Thanks, >> Coleen >> >>> On 7/11/19 9:53 AM, Erik Österlund wrote: >>> Hi Dean, >>> >>>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>> Hi Dean, >>>>> >>>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking (where unlinking code belongs), instead of after the handshake (intended for deleting things safely unlinked). >>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the racing between make_not_entrant and make_unloaded, but I still want the monotonicity guards to make this code more robust. >>>>>> I see where you added OSR nmethod unlinking, but not where you removed it, so it's not obvious it was a "move". >>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking before the global handshake is run. After the handshake, we call make_unloaded() on the same is_unloading() nmethods. That function "tries" to unlink the OSR nmethod, but will just not do it as it's already unlinked at that point. So in a way, I didn't remove the call to unlink the OSR nmethod there, it just won't do anything. I preferred structuring it that way instead of trying to optimize away the call to unlink the OSR nmethod when making it unloaded, but only for the concurrent case. It seemed to introduce more conditional magic than it was worth. >>>>> So in practice, the unlinking of OSR nmethods has moved for concurrent unloading to before the handshake. >>>>> >>>> OK, in that case, could you add a little information to the "Invalidate the osr nmethod only once" comment so that in the future someone isn't tempted to remove the code as redundant? >>> Sure. >>> >>>>>> Would it make sense for nmethod::unlink_from_method() to do the OSR unlinking, or to assert that it has already been done? >>>>> An earlier version of this patch tried to do that. It is indeed possible.
But it requires changing lock ranks of the OSR nmethod lock to special - 1 and moving around a bunch of code as this function is also called both when making nmethods not_entrant, zombie, and unlinking them in that case. For the first two, we conditionally unlink the nmethod based on the current state (which is the old state), whereas when I move it, the current state is the new state. So I had to change things around a bit more to figure out the right condition when to unlink it that works for all 3 callers. In the end, since this is going to 13, I thought it's more important to minimize the risk as much as I can, and leave such refactorings to 14. >>>>> >>>> OK. >>>> >>>>>> The new bailout in the middle of nmethod::make_not_entrant_or_zombie() worries me a little, because the code up to that point has side-effects, and we could be bailing out in an unexpected state. >>>>> Correct. In an earlier version of this patch, I moved the transition to before the side effects. But a bunch of code is using the current nmethod state to determine what to do, and that current state changed from the old to the new state. In particular, we conditionally patch in the jump based on the current (old) state, and we conditionally increment decompile count based on the current (old) state. So I ended up having to rewrite more code than I wanted to for a patch going into 13, and convince myself that I had not implicitly messed something up. It felt safer to reason about the 3 side effects up until the transitioning point: >>>>> >>>>> 1) Patching in the jump into VEP. Any state more dead than the current transition, would still want that jump to be there. >>>>> 2) Incrementing decompile count when making it not_entrant. Seems in order to do regardless, as we had an actual request to make the nmethod not entrant because it was bad somehow. >>>>> 3) Marking it as seen on stack when making it not_entrant. 
This will only make can_convert_to_zombie start returning false, which is harmless in general. Also, as both transitions to zombie and not_entrant are performed under the Patching_lock, the only possible race is with make_unloaded. And those nmethods are is_unloading(), which also makes can_convert_to_zombie return false (in a not racy fashion). So it would essentially make no observable difference to any single call to can_convert_to_zombie(). >>>>> >>>>> In summary, #1 and #3 don't really observably change the state of the system, and #2 is completely harmless and probably wanted. Therefore I found that moving these things around and finding out where we use the current state(), as well as rewriting it, seemed like a slightly scarier change for 13 to me. >>>>> >>>>> So in general, there is some refactoring that could be done (and I have tried it) to make this nicer. But I want to minimize the risk for 13 as much as possible, and perform any risky refactorings in 14 instead. >>>>> If your risk assessment is different and you would prefer moving the transition higher up (and flipping some conditions) instead, I am totally up for that too though, and I do see where you are coming from. >>>>> >>>> So if we fail, it means that we lost a race to a "deader" state, and assuming this is the only path to the deader state, wouldn't that also mean that #1, #2, and #3 would have already been done by the winning thread? If so, that makes me feel better about bailing out in the middle, but I'm still not 100% convinced, unless we can assert that 1-3 already happened. Do you have a prototype of what moving the transition higher up would look like? >>> As a matter of fact I do. Here is a webrev: >>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>> >>> I kind of like it. What do you think? >>> >>> Thanks, >>> /Erik >>> >>>> dl >>>> >>>>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>>>> >>>>> Thanks a lot Dean for reviewing this code. 
>>>>> >>>>> /Erik >>>>> >>>>>> dl >>>>>> From erik.osterlund at oracle.com Thu Jul 11 19:02:45 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 11 Jul 2019 15:02:45 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> <3D8AC43D-B56B-44AF-AD9F-A0F663A7DBDD@oracle.com> Message-ID: <92ba7623-5c9c-8808-2e82-1151b356de58@oracle.com> Hi Coleen, On 2019-07-11 14:23, coleen.phillimore at oracle.com wrote: > > > On 7/11/19 2:18 PM, Erik Osterlund wrote: >> Hi Coleen, >> >>> On 11 Jul 2019, at 16:46, coleen.phillimore at oracle.com wrote: >>> >>> >>> Hi Erik, >>> I had a look at this also and it seems reasonable and I like that >>> 'unloaded' is less dead than 'zombie' now. >> Glad you like it! >> >>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/src/hotspot/share/code/nmethod.cpp.frames.html >>> >>> >>> 1230 guarantee(try_transition(unloaded), "Invalid nmethod transition >>> to unloaded"); >>> >>> >>> This line worries me. Can you explain why another thread could not >>> have made this nmethod zombie already, before the handshake in zgc >>> and the call afterward to make_unloaded() for this nmethod? And add >>> a comment here? >> Sure, will add a comment. Like this: >> "It is an important invariant that there exists no race between the >> sweeper and GC thread competing for making the same nmethod zombie and >> unloaded respectively. This is ensured by can_convert_to_zombie() >> returning false for any is_unloading() nmethod, informing the sweeper >> not to step on any GC toes" > > Yes, I like the comment. Should it be an assert instead though? Sure, why not!
The comment addition and assert instead of guarantee: http://cr.openjdk.java.net/~eosterlund/8224674/webrev.02/ Thanks, /Erik > Thanks, > Coleen >> >> Does that sound comprehensible? >> >> Thanks, >> /Erik >> >>> Thanks, >>> Coleen >>> >>>> On 7/11/19 9:53 AM, Erik Österlund wrote: >>>> Hi Dean, >>>> >>>>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>>> Hi Dean, >>>>>> >>>>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>>>> (where unlinking code belongs), instead of after the handshake >>>>>>>> (intended for deleting things safely unlinked). >>>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>>>>> racing between make_not_entrant and make_unloaded, but I still >>>>>>>> want the monotonicity guards to make this code more robust. >>>>>>> I see where you added OSR nmethod unlinking, but not where you >>>>>>> removed it, so it's not obvious it was a "move". >>>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking >>>>>> before the global handshake is run. After the handshake, we call >>>>>> make_unloaded() on the same is_unloading() nmethods. That function >>>>>> "tries" to unlink the OSR nmethod, but will just not do it as it's >>>>>> already unlinked at that point. So in a way, I didn't remove the >>>>>> call to unlink the OSR nmethod there, it just won't do anything. I >>>>>> preferred structuring it that way instead of trying to optimize >>>>>> away the call to unlink the OSR nmethod when making it unloaded, >>>>>> but only for the concurrent case. It seemed to introduce more >>>>>> conditional magic than it was worth. >>>>>> So in practice, the unlinking of OSR nmethods has moved for >>>>>> concurrent unloading to before the handshake.
>>>>>> >>>>> OK, in that case, could you add a little information to the >>>>> "Invalidate the osr nmethod only once" comment so that in the >>>>> future someone isn't tempted to remove the code as redundant? >>>> Sure. >>>> >>>>>>> Would it make sense for nmethod::unlink_from_method() to do the >>>>>>> OSR unlinking, or to assert that it has already been done? >>>>>> An earlier version of this patch tried to do that. It is indeed >>>>>> possible. But it requires changing lock ranks of the OSR nmethod >>>>>> lock to special - 1 and moving around a bunch of code as this >>>>>> function is also called both when making nmethods not_entrant, >>>>>> zombie, and unlinking them in that case. For the first two, we >>>>>> conditionally unlink the nmethod based on the current state (which >>>>>> is the old state), whereas when I move it, the current state is >>>>>> the new state. So I had to change things around a bit more to >>>>>> figure out the right condition when to unlink it that works for >>>>>> all 3 callers. In the end, since this is going to 13, I thought >>>>>> it's more important to minimize the risk as much as I can, and >>>>>> leave such refactorings to 14. >>>>>> >>>>> OK. >>>>> >>>>>>> The new bailout in the middle of >>>>>>> nmethod::make_not_entrant_or_zombie() worries me a little, >>>>>>> because the code up to that point has side-effects, and we could >>>>>>> be bailing out in an unexpected state. >>>>>> Correct. In an earlier version of this patch, I moved the >>>>>> transition to before the side effects. But a bunch of code is >>>>>> using the current nmethod state to determine what to do, and that >>>>>> current state changed from the old to the new state. In >>>>>> particular, we conditionally patch in the jump based on the >>>>>> current (old) state, and we conditionally increment decompile >>>>>> count based on the current (old) state. 
So I ended up having to >>>>>> rewrite more code than I wanted to for a patch going into 13, and >>>>>> convince myself that I had not implicitly messed something up. It >>>>>> felt safer to reason about the 3 side effects up until the >>>>>> transitioning point: >>>>>> >>>>>> 1) Patching in the jump into VEP. Any state more dead than the >>>>>> current transition, would still want that jump to be there. >>>>>> 2) Incrementing decompile count when making it not_entrant. Seems >>>>>> in order to do regardless, as we had an actual request to make the >>>>>> nmethod not entrant because it was bad somehow. >>>>>> 3) Marking it as seen on stack when making it not_entrant. This >>>>>> will only make can_convert_to_zombie start returning false, which >>>>>> is harmless in general. Also, as both transitions to zombie and >>>>>> not_entrant are performed under the Patching_lock, the only >>>>>> possible race is with make_unloaded. And those nmethods are >>>>>> is_unloading(), which also makes can_convert_to_zombie return >>>>>> false (in a not racy fashion). So it would essentially make no >>>>>> observable difference to any single call to can_convert_to_zombie(). >>>>>> >>>>>> In summary, #1 and #3 don't really observably change the state of >>>>>> the system, and #2 is completely harmless and probably wanted. >>>>>> Therefore I found that moving these things around and finding out >>>>>> where we use the current state(), as well as rewriting it, seemed >>>>>> like a slightly scarier change for 13 to me. >>>>>> >>>>>> So in general, there is some refactoring that could be done (and I >>>>>> have tried it) to make this nicer. But I want to minimize the risk >>>>>> for 13 as much as possible, and perform any risky refactorings in >>>>>> 14 instead. 
>>>>>> If your risk assessment is different and you would prefer moving >>>>>> the transition higher up (and flipping some conditions) instead, I >>>>>> am totally up for that too though, and I do see where you are >>>>>> coming from. >>>>>> >>>>> So if we fail, it means that we lost a race to a "deader" state, >>>>> and assuming this is the only path to the deader state, wouldn't >>>>> that also mean that #1, #2, and #3 would have already been done by >>>>> the winning thread? If so, that makes me feel better about bailing >>>>> out in the middle, but I'm still not 100% convinced, unless we can >>>>> assert that 1-3 already happened. Do you have a prototype of what >>>>> moving the transition higher up would look like? >>>> As a matter of fact I do. Here is a webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>>> >>>> I kind of like it. What do you think? >>>> >>>> Thanks, >>>> /Erik >>>> >>>>> dl >>>>> >>>>>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>>>>> >>>>>> Thanks a lot Dean for reviewing this code. >>>>>> >>>>>> /Erik >>>>>> >>>>>>> dl >>>>>>> > From dean.long at oracle.com Thu Jul 11 19:29:38 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 11 Jul 2019 12:29:38 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: On 7/11/19 6:53 AM, Erik Österlund wrote: > Hi Dean, > > On 2019-07-11 00:42, dean.long at oracle.com wrote: >> On 7/10/19 1:28 AM, Erik Österlund wrote: >>> Hi Dean, >>> >>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>> (where unlinking code belongs), instead of after the handshake >>>>> (intended for deleting things safely unlinked).
>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>> racing between make_not_entrant and make_unloaded, but I still >>>>> want the monotonicity guards to make this code more robust. >>>> >>>> I see where you added OSR nmethod unlinking, but not where you >>>> removed it, so it's not obvious it was a "move". >>> >>> Sorry, bad wording on my part. I added OSR nmethod unlinking before >>> the global handshake is run. After the handshake, we call >>> make_unloaded() on the same is_unloading() nmethods. That function >>> "tries" to unlink the OSR nmethod, but will just not do it as it's >>> already unlinked at that point. So in a way, I didn't remove the >>> call to unlink the OSR nmethod there, it just won't do anything. I >>> preferred structuring it that way instead of trying to optimize away >>> the call to unlink the OSR nmethod when making it unloaded, but only >>> for the concurrent case. It seemed to introduce more conditional >>> magic than it was worth. >>> So in practice, the unlinking of OSR nmethods has moved for >>> concurrent unloading to before the handshake. >>> >> >> OK, in that case, could you add a little information to the >> "Invalidate the osr nmethod only once" comment so that in the future >> someone isn't tempted to remove the code as redundant? > > Sure. > I meant the one in zNMethod.cpp :-) >>>> Would it make sense for nmethod::unlink_from_method() to do the OSR >>>> unlinking, or to assert that it has already been done? >>> >>> An earlier version of this patch tried to do that. It is indeed >>> possible. But it requires changing lock ranks of the OSR nmethod >>> lock to special - 1 and moving around a bunch of code as this >>> function is also called both when making nmethods not_entrant, >>> zombie, and unlinking them in that case. For the first two, we >>> conditionally unlink the nmethod based on the current state (which >>> is the old state), whereas when I move it, the current state is the >>> new state. 
So I had to change things around a bit more to figure out >>> the right condition when to unlink it that works for all 3 callers. >>> In the end, since this is going to 13, I thought it's more important >>> to minimize the risk as much as I can, and leave such refactorings >>> to 14. >>> >> >> OK. >> >>>> The new bailout in the middle of >>>> nmethod::make_not_entrant_or_zombie() worries me a little, because >>>> the code up to that point has side-effects, and we could be bailing >>>> out in an unexpected state. >>> >>> Correct. In an earlier version of this patch, I moved the transition >>> to before the side effects. But a bunch of code is using the current >>> nmethod state to determine what to do, and that current state >>> changed from the old to the new state. In particular, we >>> conditionally patch in the jump based on the current (old) state, >>> and we conditionally increment decompile count based on the current >>> (old) state. So I ended up having to rewrite more code than I wanted >>> to for a patch going into 13, and convince myself that I had not >>> implicitly messed something up. It felt safer to reason about the 3 >>> side effects up until the transitioning point: >>> >>> 1) Patching in the jump into VEP. Any state more dead than the >>> current transition, would still want that jump to be there. >>> 2) Incrementing decompile count when making it not_entrant. Seems in >>> order to do regardless, as we had an actual request to make the >>> nmethod not entrant because it was bad somehow. >>> 3) Marking it as seen on stack when making it not_entrant. This will >>> only make can_convert_to_zombie start returning false, which is >>> harmless in general. Also, as both transitions to zombie and >>> not_entrant are performed under the Patching_lock, the only possible >>> race is with make_unloaded. And those nmethods are is_unloading(), >>> which also makes can_convert_to_zombie return false (in a not racy >>> fashion). 
So it would essentially make no observable difference to >>> any single call to can_convert_to_zombie(). >>> >>> In summary, #1 and #3 don't really observably change the state of >>> the system, and #2 is completely harmless and probably wanted. >>> Therefore I found that moving these things around and finding out >>> where we use the current state(), as well as rewriting it, seemed >>> like a slightly scarier change for 13 to me. >>> >>> So in general, there is some refactoring that could be done (and I >>> have tried it) to make this nicer. But I want to minimize the risk >>> for 13 as much as possible, and perform any risky refactorings in 14 >>> instead. >>> If your risk assessment is different and you would prefer moving the >>> transition higher up (and flipping some conditions) instead, I am >>> totally up for that too though, and I do see where you are coming from. >>> >> >> So if we fail, it means that we lost a race to a "deader" state, and >> assuming this is the only path to the deader state, wouldn't that >> also mean that #1, #2, and #3 would have already been done by the >> winning thread? If so, that makes me feel better about bailing out >> in the middle, but I'm still not 100% convinced, unless we can assert >> that 1-3 already happened. Do you have a prototype of what moving >> the transition higher up would look like? > > As a matter of fact I do. Here is a webrev: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ > > I kind of like it. What do you think? > Now the code after the transition that says "Must happen before state change" worries me. Can you remind me again what kind of race can make the state transition fail here? Did you happen to draw a state diagram while learning this code? :-) dl > Thanks, > /Erik > >> dl >> >>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>> >>> Thanks a lot Dean for reviewing this code.
>>> >>> /Erik >>> >>>> dl >>>> >>> >> From coleen.phillimore at oracle.com Thu Jul 11 19:39:24 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 11 Jul 2019 15:39:24 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <92ba7623-5c9c-8808-2e82-1151b356de58@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <2fc92bdb-ff1b-ad48-0e5e-9983e619c3c7@oracle.com> <3D8AC43D-B56B-44AF-AD9F-A0F663A7DBDD@oracle.com> <92ba7623-5c9c-8808-2e82-1151b356de58@oracle.com> Message-ID: <929f68da-7365-57a7-e659-42d62adfebab@oracle.com> On 7/11/19 3:02 PM, Erik Österlund wrote: > Hi Coleen, > > On 2019-07-11 14:23, coleen.phillimore at oracle.com wrote: >> >> >> On 7/11/19 2:18 PM, Erik Osterlund wrote: >>> Hi Coleen, >>> >>>> On 11 Jul 2019, at 16:46, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> Hi Erik, >>>> I had a look at this also and it seems reasonable and I like that >>>> 'unloaded' is less dead than 'zombie' now. >>> Glad you like it! >>> >>>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00/src/hotspot/share/code/nmethod.cpp.frames.html >>>> >>>> >>>> 1230 guarantee(try_transition(unloaded), "Invalid nmethod transition >>>> to unloaded"); >>>> >>>> >>>> This line worries me. Can you explain why another thread could not >>>> have made this nmethod zombie already, before the handshake in zgc >>>> and the call afterward to make_unloaded() for this nmethod? And add >>>> a comment here? >>> Sure, will add a comment. Like this: >>> "It is an important invariant that there exists no race between the >>> sweeper and GC thread competing for making the same nmethod zombie >>> and unloaded respectively. This is ensured by >>> can_convert_to_zombie() returning false for any is_unloading() >>> nmethod, informing the sweeper not to step on any GC toes" >> >> Yes, I like the comment. Should it be an assert instead though?
> > Sure, why not! The comment addition and assert instead of guarantee: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.02/ Looks good! Coleen > > Thanks, > /Erik > >> Thanks, >> Coleen >>> >>> Does that sound comprehensible? >>> >>> Thanks, >>> /Erik >>> >>>> Thanks, >>>> Coleen >>>> >>>>> On 7/11/19 9:53 AM, Erik Österlund wrote: >>>>> Hi Dean, >>>>> >>>>>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>>>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>>>> Hi Dean, >>>>>>> >>>>>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>>>>> (where unlinking code belongs), instead of after the handshake >>>>>>>>> (intended for deleting things safely unlinked). >>>>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes >>>>>>>>> the racing between make_not_entrant and make_unloaded, but I >>>>>>>>> still want the monotonicity guards to make this code more robust. >>>>>>>> I see where you added OSR nmethod unlinking, but not where you >>>>>>>> removed it, so it's not obvious it was a "move". >>>>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking >>>>>>> before the global handshake is run. After the handshake, we call >>>>>>> make_unloaded() on the same is_unloading() nmethods. That >>>>>>> function "tries" to unlink the OSR nmethod, but will just not do >>>>>>> it as it's already unlinked at that point. So in a way, I didn't >>>>>>> remove the call to unlink the OSR nmethod there, it just won't >>>>>>> do anything. I preferred structuring it that way instead of >>>>>>> trying to optimize away the call to unlink the OSR nmethod when >>>>>>> making it unloaded, but only for the concurrent case. It seemed >>>>>>> to introduce more conditional magic than it was worth. >>>>>>> So in practice, the unlinking of OSR nmethods has moved for >>>>>>> concurrent unloading to before the handshake.
>>>>>>> >>>>>> OK, in that case, could you add a little information to the >>>>>> "Invalidate the osr nmethod only once" comment so that in the >>>>>> future someone isn't tempted to remove the code as redundant? >>>>> Sure. >>>>> >>>>>>>> Would it make sense for nmethod::unlink_from_method() to do the >>>>>>>> OSR unlinking, or to assert that it has already been done? >>>>>>> An earlier version of this patch tried to do that. It is indeed >>>>>>> possible. But it requires changing lock ranks of the OSR nmethod >>>>>>> lock to special - 1 and moving around a bunch of code as this >>>>>>> function is also called both when making nmethods not_entrant, >>>>>>> zombie, and unlinking them in that case. For the first two, we >>>>>>> conditionally unlink the nmethod based on the current state >>>>>>> (which is the old state), whereas when I move it, the current >>>>>>> state is the new state. So I had to change things around a bit >>>>>>> more to figure out the right condition when to unlink it that >>>>>>> works for all 3 callers. In the end, since this is going to 13, >>>>>>> I thought it's more important to minimize the risk as much as I >>>>>>> can, and leave such refactorings to 14. >>>>>>> >>>>>> OK. >>>>>> >>>>>>>> The new bailout in the middle of >>>>>>>> nmethod::make_not_entrant_or_zombie() worries me a little, >>>>>>>> because the code up to that point has side-effects, and we >>>>>>>> could be bailing out in an unexpected state. >>>>>>> Correct. In an earlier version of this patch, I moved the >>>>>>> transition to before the side effects. But a bunch of code is >>>>>>> using the current nmethod state to determine what to do, and >>>>>>> that current state changed from the old to the new state. In >>>>>>> particular, we conditionally patch in the jump based on the >>>>>>> current (old) state, and we conditionally increment decompile >>>>>>> count based on the current (old) state. 
So I ended up having to >>>>>>> rewrite more code than I wanted to for a patch going into 13, >>>>>>> and convince myself that I had not implicitly messed something >>>>>>> up. It felt safer to reason about the 3 side effects up until >>>>>>> the transitioning point: >>>>>>> >>>>>>> 1) Patching in the jump into VEP. Any state more dead than the >>>>>>> current transition, would still want that jump to be there. >>>>>>> 2) Incrementing decompile count when making it not_entrant. >>>>>>> Seems in order to do regardless, as we had an actual request to >>>>>>> make the nmethod not entrant because it was bad somehow. >>>>>>> 3) Marking it as seen on stack when making it not_entrant. This >>>>>>> will only make can_convert_to_zombie start returning false, >>>>>>> which is harmless in general. Also, as both transitions to >>>>>>> zombie and not_entrant are performed under the Patching_lock, >>>>>>> the only possible race is with make_unloaded. And those nmethods >>>>>>> are is_unloading(), which also makes can_convert_to_zombie >>>>>>> return false (in a not racy fashion). So it would essentially >>>>>>> make no observable difference to any single call to >>>>>>> can_convert_to_zombie(). >>>>>>> >>>>>>> In summary, #1 and #3 don't really observably change the state >>>>>>> of the system, and #2 is completely harmless and probably >>>>>>> wanted. Therefore I found that moving these things around and >>>>>>> finding out where we use the current state(), as well as >>>>>>> rewriting it, seemed like a slightly scarier change for 13 to me. >>>>>>> >>>>>>> So in general, there is some refactoring that could be done (and >>>>>>> I have tried it) to make this nicer. But I want to minimize the >>>>>>> risk for 13 as much as possible, and perform any risky >>>>>>> refactorings in 14 instead. 
>>>>>>> If your risk assessment is different and you would prefer moving >>>>>>> the transition higher up (and flipping some conditions) instead, >>>>>>> I am totally up for that too though, and I do see where you are >>>>>>> coming from. >>>>>>> >>>>>> So if we fail, it means that we lost a race to a "deader" state, >>>>>> and assuming this is the only path to the deader state, wouldn't >>>>>> that also mean that #1, #2, and #3 would have already been done >>>>>> by the winning thread?? If so, that makes me feel better about >>>>>> bailing out in the middle, but I'm still not 100% convinced, >>>>>> unless we can assert that 1-3 already happened.? Do you have a >>>>>> prototype of what moving the transition higher up would look like? >>>>> As a matter of fact I do. Here is a webrev: >>>>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>>>> >>>>> I kind of like it. What do you think? >>>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>>> dl >>>>>> >>>>>>> BTW, I have tested this change through hs-tier1-7, and it looks >>>>>>> good. >>>>>>> >>>>>>> Thanks a lot Dean for reviewing this code. 
>>>>>>> >>>>>>> /Erik >>>>>>> >>>>>>>> dl >>>>>>>> >> From erik.osterlund at oracle.com Thu Jul 11 20:13:06 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 11 Jul 2019 16:13:06 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: Hi Dean, On 2019-07-11 15:29, dean.long at oracle.com wrote: > On 7/11/19 6:53 AM, Erik ?sterlund wrote: >> Hi Dean, >> >> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>> On 7/10/19 1:28 AM, Erik ?sterlund wrote: >>>> Hi Dean, >>>> >>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>> On 7/1/19 6:12 AM, Erik ?sterlund wrote: >>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>> (where unlinking code belongs), instead of after the handshake >>>>>> (intended for deleting things safely unlinked). >>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>>> racing between make_not_entrant and make_unloaded, but I still >>>>>> want the monotonicity guards to make this code more robust. >>>>> >>>>> I see where you added OSR nmethod unlinking, but not where you >>>>> removed it, so it's not obvious it was a "move". >>>> >>>> Sorry, bad wording on my part. I added OSR nmethod unlinking before >>>> the global handshake is run. After the handshake, we call >>>> make_unloaded() on the same is_unloading() nmethods. That function >>>> "tries" to unlink the OSR nmethod, but will just not do it as it's >>>> already unlinked at that point. So in a way, I didn't remove the >>>> call to unlink the OSR nmethod there, it just won't do anything. I >>>> preferred structuring it that way instead of trying to optimize away >>>> the call to unlink the OSR nmethod when making it unloaded, but only >>>> for the concurrent case. It seemed to introduce more conditional >>>> magic than it was worth. 
>>>> So in practice, the unlinking of OSR nmethods has moved for >>>> concurrent unloading to before the handshake. >>>> >>> >>> OK, in that case, could you add a little information to the >>> "Invalidate the osr nmethod only once" comment so that in the future >>> someone isn't tempted to remove the code as redundant? >> >> Sure. >> > > I meant the one in zNMethod.cpp :-) Okay, will put another comment in there once we agree on a direction on the next point. > >>>>> Would it make sense for nmethod::unlink_from_method() to do the OSR >>>>> unlinking, or to assert that it has already been done? >>>> >>>> An earlier version of this patch tried to do that. It is indeed >>>> possible. But it requires changing lock ranks of the OSR nmethod >>>> lock to special - 1 and moving around a bunch of code as this >>>> function is also called both when making nmethods not_entrant, >>>> zombie, and unlinking them in that case. For the first two, we >>>> conditionally unlink the nmethod based on the current state (which >>>> is the old state), whereas when I move it, the current state is the >>>> new state. So I had to change things around a bit more to figure out >>>> the right condition when to unlink it that works for all 3 callers. >>>> In the end, since this is going to 13, I thought it's more important >>>> to minimize the risk as much as I can, and leave such refactorings >>>> to 14. >>>> >>> >>> OK. >>> >>>>> The new bailout in the middle of >>>>> nmethod::make_not_entrant_or_zombie() worries me a little, because >>>>> the code up to that point has side-effects, and we could be bailing >>>>> out in an unexpected state. >>>> >>>> Correct. In an earlier version of this patch, I moved the transition >>>> to before the side effects. But a bunch of code is using the current >>>> nmethod state to determine what to do, and that current state >>>> changed from the old to the new state. 
In particular, we >>>> conditionally patch in the jump based on the current (old) state, >>>> and we conditionally increment decompile count based on the current >>>> (old) state. So I ended up having to rewrite more code than I wanted >>>> to for a patch going into 13, and convince myself that I had not >>>> implicitly messed something up. It felt safer to reason about the 3 >>>> side effects up until the transitioning point: >>>> >>>> 1) Patching in the jump into VEP. Any state more dead than the >>>> current transition, would still want that jump to be there. >>>> 2) Incrementing decompile count when making it not_entrant. Seems in >>>> order to do regardless, as we had an actual request to make the >>>> nmethod not entrant because it was bad somehow. >>>> 3) Marking it as seen on stack when making it not_entrant. This will >>>> only make can_convert_to_zombie start returning false, which is >>>> harmless in general. Also, as both transitions to zombie and >>>> not_entrant are performed under the Patching_lock, the only possible >>>> race is with make_unloaded. And those nmethods are is_unloading(), >>>> which also makes can_convert_to_zombie return false (in a not racy >>>> fashion). So it would essentially make no observable difference to >>>> any single call to can_convert_to_zombie(). >>>> >>>> In summary, #1 and #3 don't really observably change the state of >>>> the system, and #2 is completely harmless and probably wanted. >>>> Therefore I found that moving these things around and finding out >>>> where we use the current state(), as well as rewriting it, seemed >>>> like a slightly scarier change for 13 to me. >>>> >>>> So in general, there is some refactoring that could be done (and I >>>> have tried it) to make this nicer. But I want to minimize the risk >>>> for 13 as much as possible, and perform any risky refactorings in 14 >>>> instead. 
>>>> If your risk assessment is different and you would prefer moving the >>>> transition higher up (and flipping some conditions) instead, I am >>>> totally up for that too though, and I do see where you are coming from. >>>> >>> >>> So if we fail, it means that we lost a race to a "deader" state, and >>> assuming this is the only path to the deader state, wouldn't that >>> also mean that #1, #2, and #3 would have already been done by the >>> winning thread? If so, that makes me feel better about bailing out >>> in the middle, but I'm still not 100% convinced, unless we can assert >>> that 1-3 already happened. Do you have a prototype of what moving >>> the transition higher up would look like? >> >> As a matter of fact I do. Here is a webrev: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >> >> I kind of like it. What do you think? >> > > Now the code after the transition that says "Must happen before state > change" worries me. Yes indeed. This is why I was hesitant to move the transition up. It moves past 3 things that implicitly depend on the current state. This one is extra scary. It actually introduces a race condition that could crash the VM (because can_convert_to_zombie() may observe an nmethod that just turned not_entrant, without being marked on stack). I think this shows (IMO) that trying to move the transition up has 3 problems, and this one is particularly hard to dodge. I think it really has to be before the transition. Would you agree now that keeping the transition where it was is less risky (as I did originally) and convincing ourselves that the 3 "side effects" are not really observable side effects in the system, as I reasoned about earlier? If not, I can try to move the mark-on-stack up above the transition. > Can you remind me again what kind of race can make > the state transition fail here? Did you happen to draw a state diagram > while learning this code? :-) Yes indeed. Would you like the long story or the short story? 
Here is the short story: the only known race is between one thread making an nmethod not_entrant and the GC thread making it unloaded. That make_not_entrant is the only transition that can fail. Previously I relied on there never existing any concurrent calls to make_not_entrant() and make_unloaded(). The OSR nmethod was caught as a special case (isn't it always...) where this could happen, violating monotonicity. But I think it feels safer to enforce the monotonicity of transitions in the actual code that performs the transitions, instead of relying on knowledge of the relationships between all state transitioning calls, implicitly ensuring monotonicity. Thanks, /Erik > dl > >> Thanks, >> /Erik >> >>> dl >>> >>>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>>> >>>> Thanks a lot Dean for reviewing this code. >>>> >>>> /Erik >>>> >>>>> dl >>>>> >>>> >>> > From Pengfei.Li at arm.com Fri Jul 12 03:27:55 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Fri, 12 Jul 2019 03:27:55 +0000 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal Message-ID: Hi, Please help review this small fix. JBS: https://bugs.openjdk.java.net/browse/JDK-8227512 Webrev: http://cr.openjdk.java.net/~pli/rfr/8227512/ JTReg javac tests * langtools/tools/javac/modules/InheritRuntimeEnvironmentTest.java * langtools/tools/javac/file/LimitedImage.java failed when Graal is used as JVMCI compiler. These cases test javac behavior with the condition that observable modules are limited. But Graal is unable to be found in the limited module scope. This fixes these two tests by adding "jdk.internal.vm.compiler" into the limited modules. 
-- Thanks, Pengfei From matthias.baesken at sap.com Fri Jul 12 07:48:32 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 12 Jul 2019 07:48:32 +0000 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] Message-ID: Hello , when looking into the recent xlc16 / xlclang warnings I came across those 3 : /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: 'this' pointer cannot be null in well-defined C++ code; comparison may be assumed to always evaluate to true [-Wtautological-undefined-compare] if( this != NULL ) { ^~~~ ~~~~ /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: 'this' pointer cannot be null in well-defined C++ code; comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] if( this == NULL ) return; /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' pointer cannot be null in well-defined C++ code; comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] if( this == NULL ) return os::strdup("{no set}"); Do you think the NULL-checks can be removed or is there still some value in doing them ? Best regards, Matthias From erik.osterlund at oracle.com Fri Jul 12 08:22:04 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Fri, 12 Jul 2019 10:22:04 +0200 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: Message-ID: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> Hi Matthias, Removing such NULL checks seems like a good idea in general due to the undefined behaviour. Worth mentioning though that there are some tricky ones, like in markOopDesc* where this == NULL means that the mark word has the "inflating" value. So we explicitly check if this == NULL and hope the compiler will not elide the check. Just gonna drop that one here and run for it. 
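To illustrate the well-defined alternative (toy code with made-up names, modeled on the set.cpp warning above -- not the actual code or any proposed webrev): because the compiler may assume 'this' is non-null and elide such checks, the safe variant takes the pointer as an ordinary argument, where a NULL test is fully defined:

```cpp
#include <cstdio>
#include <cstring>
#include <cstdlib>

// Toy version of the set.cpp case, not the actual HotSpot code.
struct ToySet {
  const char* str;

  // Undefined behaviour (may be elided by the compiler):
  //   char* setstr() { if (this == NULL) return strdup("{no set}"); ... }

  // Well defined: the pointer is an ordinary argument, so the NULL
  // check cannot be optimized away.
  static char* setstr(const ToySet* s) {
    if (s == NULL) return strdup("{no set}");
    return strdup(s->str);
  }
};
```

Callers then write ToySet::setstr(maybe_null) instead of maybe_null->setstr().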
Thanks, /Erik On 2019-07-12 09:48, Baesken, Matthias wrote: > Hello , when looking into the recent xlc16 / xlclang warnings I came across those 3 : > > /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: 'this' pointer cannot be null in well-defined C++ code; > comparison may be assumed to always evaluate to true [-Wtautological-undefined-compare] > if( this != NULL ) { > ^~~~ ~~~~ > > /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: 'this' pointer cannot be null in well-defined C++ code; > comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] > if( this == NULL ) return; > > /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' pointer cannot be null in well-defined C++ code; > comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] > if( this == NULL ) return os::strdup("{no set}"); > > > Do you think the NULL-checks can be removed or is there still some value in doing them ? > > Best regards, Matthias From matthias.baesken at sap.com Fri Jul 12 10:34:01 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 12 Jul 2019 10:34:01 +0000 Subject: RFR [XS] : 8227630: adjust format specifiers in loadlib_aix.cpp Message-ID: Hello, please review this very small fix for AIX . Currently we use %llu printf-format specifiers at 2 places in loadlib_aix.cpp where we output size_t variables . This leads to warnings with xlc16/xlclang : /nightly/jdk/src/hotspot/os/aix/loadlib_aix.cpp:210:48: warning: format specifies type 'unsigned long long' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat] trcVerbose("loadquery buffer size is %llu.", buflen); ~~~~ ^~~~~~ %zu We can use correct format specifiers (casting might be another option). 
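As a standalone illustration (generic C, not the AIX-specific trcVerbose code), the two options look like this:

```c
#include <stdio.h>
#include <stddef.h>

/* Generic illustration, not the actual loadlib_aix.cpp patch. Either
 * use the dedicated size_t specifier %zu, or keep the %llu format and
 * cast the argument so that the types match. */
static int format_zu(char *out, size_t cap, size_t buflen) {
  return snprintf(out, cap, "loadquery buffer size is %zu.", buflen);
}

static int format_cast(char *out, size_t cap, size_t buflen) {
  return snprintf(out, cap, "loadquery buffer size is %llu.",
                  (unsigned long long) buflen);
}
```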
Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8227630 http://cr.openjdk.java.net/~mbaesken/webrevs/8227630.0/ Thanks, Matthias From shade at redhat.com Fri Jul 12 10:48:30 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 12 Jul 2019 12:48:30 +0200 Subject: RFR [XS] : 8227630: adjust format specifiers in loadlib_aix.cpp In-Reply-To: References: Message-ID: <0f0c1019-c043-9770-24b8-28d89a33bf11@redhat.com> On 7/12/19 12:34 PM, Baesken, Matthias wrote: > http://cr.openjdk.java.net/~mbaesken/webrevs/8227630.0/ Looks fine and trivial. -- Thanks, -Aleksey From matthias.baesken at sap.com Fri Jul 12 11:09:08 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 12 Jul 2019 11:09:08 +0000 Subject: RFR: 8227631: Adjust AIX version check Message-ID: Hello, please review this small AIX related change . For some time, we do not support AIX 5.3 any more. See (where AIX 7.1 or 7.2 is the supported build platform since OpenJDK11) : https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms The currently used xlc 16.1 (XL C/C++ Compilers) even needs minimum AIX 7.1 to run , see http://www-01.ibm.com/support/docview.wss?uid=swg21326972 (and compiling for older releases on 7.1 / 7.2 would not work easily , at least not "out of the box" to my knowledge .) So we should adjust the minimum OS version check done in os_aix.cpp in os::Aix::initialize_os_info() . Additionally the change removes a couple of warnings [-Wwritable-strings category] . 
/nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4081:22: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] char *name_str = "unknown OS"; ^ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4089:18: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] name_str = "OS/400 (pase)"; ^ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4100:18: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] name_str = "AIX"; Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8227631 http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.0/ Thanks, Matthias From martin.doerr at sap.com Fri Jul 12 11:57:36 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 12 Jul 2019 11:57:36 +0000 Subject: RFR [XS] : 8227630: adjust format specifiers in loadlib_aix.cpp In-Reply-To: References: Message-ID: Hi Matthias, looks good to me. Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Freitag, 12. Juli 2019 12:34 > To: 'hotspot-dev at openjdk.java.net' > Subject: RFR [XS] : 8227630: adjust format specifiers in loadlib_aix.cpp > > Hello, please review this very small fix for AIX . > > Currently we use %llu printf-format specifiers at 2 places in loadlib_aix.cpp > where we output size_t variables . > This leads to warnings with xlc16/xlclang : > > /nightly/jdk/src/hotspot/os/aix/loadlib_aix.cpp:210:48: warning: format > specifies type 'unsigned long long' but the argument > has type 'size_t' (aka 'unsigned long') [-Wformat] > trcVerbose("loadquery buffer size is %llu.", buflen); > ~~~~ ^~~~~~ > %zu > > We can use correct format specifiers (casting might be another option). 
> > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8227630 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227630.0/ > > > Thanks, Matthias From harold.seigel at oracle.com Fri Jul 12 12:14:55 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 12 Jul 2019 08:14:55 -0400 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> References: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> Message-ID: The functions that compare 'this' to NULL could be changed from instance to static functions where 'this' is explicitly passed as a parameter. Then you could keep the equivalent NULL checks. Harold On 7/12/2019 4:22 AM, Erik Österlund wrote: > Hi Matthias, > > Removing such NULL checks seems like a good idea in general due to the > undefined behaviour. > Worth mentioning though that there are some tricky ones, like in > markOopDesc* where this == NULL > means that the mark word has the "inflating" value. So we explicitly > check if this == NULL and > hope the compiler will not elide the check. Just gonna drop that one > here and run for it. > > Thanks, > /Erik > > On 2019-07-12 09:48, Baesken, Matthias wrote: >> Hello , when looking into the recent xlc16 / xlclang warnings I >> came across those 3 : >> >> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: >> 'this' pointer cannot be null in well-defined C++ code; >> comparison may be assumed to always evaluate to true >> [-Wtautological-undefined-compare] >> if( this != NULL ) { >> ^~~~ ~~~~ >> >> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: >> 'this' pointer cannot be null in well-defined C++ code; >> comparison may be assumed to always evaluate to false >> [-Wtautological-undefined-compare] >> 
if( this == NULL ) return; >> >> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' >> pointer cannot be null in well-defined C++ code; >> comparison may be assumed to always evaluate to false >> [-Wtautological-undefined-compare] >> ?? if( this == NULL ) return os::strdup("{no set}"); >> >> >> Do you think the? NULL-checks can be removed or is there still some >> value in doing them ? >> >> Best regards, Matthias > From christoph.langer at sap.com Fri Jul 12 12:17:22 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Fri, 12 Jul 2019 12:17:22 +0000 Subject: RFR: 8227631: Adjust AIX version check In-Reply-To: References: Message-ID: Hi Matthias, looks good. This might even be something to push to JDK13 still (if you do it within the next few days). Best regards Christoph > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Freitag, 12. Juli 2019 13:09 > To: 'hotspot-dev at openjdk.java.net' ; > 'ppc-aix-port-dev at openjdk.java.net' > Subject: RFR: 8227631: Adjust AIX version check > > Hello, please review this small AIX related change . > > For some time, we do not support AIX 5.3 any more. > See (where AIX 7.1 or 7.2 is the supported build platform since OpenJDK11) : > > https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms > > The currently used xlc 16.1 (XL C/C++ Compilers) even needs minimum AIX > 7.1 to run , see > > http://www-01.ibm.com/support/docview.wss?uid=swg21326972 > > (and compiling for older releases on 7.1 / 7.2 would not work easily , at least > not "out of the box" to my knowledge .) > > So we should adjust the minimum OS version check done in os_aix.cpp in > os::Aix::initialize_os_info() . > > > Additionally the change removes a couple of warnings [-Wwritable-strings > category] . 
> > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4081:22: warning: ISO C++11 > does not allow conversion from string literal to 'char *' [-Wwritable-strings] > char *name_str = "unknown OS"; > ^ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4089:18: warning: ISO C++11 > does not allow conversion from string literal to 'char *' [-Wwritable-strings] > name_str = "OS/400 (pase)"; > ^ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4100:18: warning: ISO C++11 > does not allow conversion from string literal to 'char *' [-Wwritable-strings] > name_str = "AIX"; > > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8227631 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.0/ > > Thanks, Matthias From matthias.baesken at sap.com Fri Jul 12 12:30:35 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 12 Jul 2019 12:30:35 +0000 Subject: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] Message-ID: Hello Erik, thanks for the input . We still have a few places in the HS codebase where "this" is compared to NULL. When compiling with xlc16 / xlclang we get these warnings : warning: 'this' pointer cannot be null in well-defined C++ code; comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] so those places should be removed where possible. I adjusted 3 checks , please review ! Bug/webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/ https://bugs.openjdk.java.net/browse/JDK-8227633 Thanks , Matthias > -----Original Message----- > From: Erik ?sterlund > Sent: Freitag, 12. Juli 2019 10:22 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: this-pointer NULL-checks in hotspot codebase [-Wtautological- > undefined-compare] > > Hi Matthias, > > Removing such NULL checks seems like a good idea in general due to the > undefined behaviour. 
> Worth mentioning though that there are some tricky ones, like in > markOopDesc* where this == NULL > means that the mark word has the "inflating" value. So we explicitly > check if this == NULL and > hope the compiler will not elide the check. Just gonna drop that one > here and run for it. > > Thanks, > /Erik > > On 2019-07-12 09:48, Baesken, Matthias wrote: > > Hello , when looking into the recent xlc16 / xlclang warnings I came > across those 3 : > > > > /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: 'this' > pointer cannot be null in well-defined C++ code; > > comparison may be assumed to always evaluate to true [-Wtautological- > undefined-compare] > > if( this != NULL ) { > > ^~~~ ~~~~ > > > > /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: 'this' > pointer cannot be null in well-defined C++ code; > > comparison may be assumed to always evaluate to false [-Wtautological- > undefined-compare] > > if( this == NULL ) return; > > > > /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' pointer > cannot be null in well-defined C++ code; > > comparison may be assumed to always evaluate to false [-Wtautological- > undefined-compare] > > if( this == NULL ) return os::strdup("{no set}"); > > > > > > Do you think the NULL-checks can be removed or is there still some value > in doing them ? 
> > > > Best regards, Matthias From coleen.phillimore at oracle.com Fri Jul 12 12:48:45 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 12 Jul 2019 08:48:45 -0400 Subject: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: Message-ID: http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/src/hotspot/share/adlc/formssel.cpp.udiff.html + if (mnode) mnode->count_instr_names(names); We also try to avoid implicit checks against null for pointers so change this to: + if (mnode != NULL) mnode->count_instr_names(names); I didn't see that you added a check for NULL in the callers of print_opcodes or setstr.? Can those callers never pass NULL? We've done a few passes to clean up these this == NULL checks. Thank you for doing this! Coleen On 7/12/19 8:30 AM, Baesken, Matthias wrote: > Hello Erik, thanks for the input . > > We still have a few places in the HS codebase where "this" is compared to NULL. > When compiling with xlc16 / xlclang we get these warnings : > > warning: 'this' pointer cannot be null in well-defined C++ code; comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare] > > so those places should be removed where possible. > > > I adjusted 3 checks , please review ! > > > > Bug/webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/ > > https://bugs.openjdk.java.net/browse/JDK-8227633 > > Thanks , Matthias > > >> -----Original Message----- >> From: Erik ?sterlund >> Sent: Freitag, 12. Juli 2019 10:22 >> To: Baesken, Matthias ; 'hotspot- >> dev at openjdk.java.net' >> Subject: Re: this-pointer NULL-checks in hotspot codebase [-Wtautological- >> undefined-compare] >> >> Hi Matthias, >> >> Removing such NULL checks seems like a good idea in general due to the >> undefined behaviour. 
>> Worth mentioning though that there are some tricky ones, like in >> markOopDesc* where this == NULL >> means that the mark word has the "inflating" value. So we explicitly >> check if this == NULL and >> hope the compiler will not elide the check. Just gonna drop that one >> here and run for it. >> >> Thanks, >> /Erik >> >> On 2019-07-12 09:48, Baesken, Matthias wrote: >>> Hello , when looking into the recent xlc16 / xlclang warnings I came >> across those 3 : >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: 'this' >> pointer cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to true [-Wtautological- >> undefined-compare] >>> if( this != NULL ) { >>> ^~~~ ~~~~ >>> >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: 'this' >> pointer cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to false [-Wtautological- >> undefined-compare] >>> if( this == NULL ) return; >>> >>> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' pointer >> cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to false [-Wtautological- >> undefined-compare] >>> if( this == NULL ) return os::strdup("{no set}"); >>> >>> >>> Do you think the NULL-checks can be removed or is there still some value >> in doing them ? >>> Best regards, Matthias From matthias.baesken at sap.com Fri Jul 12 13:01:31 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 12 Jul 2019 13:01:31 +0000 Subject: RFR: 8227633: avoid comparing this pointers to NULL Message-ID: > > + if (mnode) mnode->count_instr_names(names); > > > We also try to avoid implicit checks against null for pointers so change > this to: > Hi Coleen, sure I can change this ; I just found a lot of places in formssel.cpp where if (ptr) { ... } is used . > > I didn't see that you added a check for NULL in the callers of > print_opcodes or setstr.? 
Can those callers never pass NULL? > It looked to me that the setstr is never really called and void Set::print() const { ... } where it is used is for debug printing - did I miss something ? Regarding print_opcodes , there probably the NULL checks at caller places should better be added . Regards, Matthias > ------------------------------ > > Message: 4 > Date: Fri, 12 Jul 2019 08:48:45 -0400 > From: coleen.phillimore at oracle.com > To: hotspot-dev at openjdk.java.net > Subject: Re: RFR: 8227633: avoid comparing this pointers to NULL - was > : RE: this-pointer NULL-checks in hotspot codebase > [-Wtautological-undefined-compare] > Message-ID: > Content-Type: text/plain; charset=utf-8; format=flowed > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/src/hotspot/share/adlc/formssel.cpp.udiff.html > > + if (mnode) mnode->count_instr_names(names); > > > We also try to avoid implicit checks against null for pointers so change > this to: > > + if (mnode != NULL) mnode->count_instr_names(names); > > I didn't see that you added a check for NULL in the callers of > print_opcodes or setstr. Can those callers never pass NULL? > > We've done a few passes to clean up these this == NULL checks. Thank you > for doing this! > > Coleen > > From erik.osterlund at oracle.com Fri Jul 12 14:46:43 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Fri, 12 Jul 2019 16:46:43 +0200 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> Message-ID: Hi Harold, It's worse than that though, unfortunately. You are not allowed to have "this" equal to NULL, whether you perform such explicit NULL comparisons or not. The implication is that as long as "inflating" is NULL, we kind of can't use any of the functions on markOop and hence must rewrite pretty much all uses of markOop to do something else. 
The same goes for things like Register, where rax == NULL. To be compliant, we would similarly have to rewrite all uses of Register. In other words, if we are to really hunt down uses of this == NULL and remove them, we will find ourselves with a mountain of work. Again, just gonna drop that here and run. /Erik On 2019-07-12 14:14, Harold Seigel wrote: > The functions that compare 'this' to NULL could be changed from > instance to static functions where 'this' is explicitly passed as a > parameter. Then you could keep the equivalent NULL checks. > > Harold > > On 7/12/2019 4:22 AM, Erik Österlund wrote: >> Hi Matthias, >> >> Removing such NULL checks seems like a good idea in general due to >> the undefined behaviour. >> Worth mentioning though that there are some tricky ones, like in >> markOopDesc* where this == NULL >> means that the mark word has the "inflating" value. So we explicitly >> check if this == NULL and >> hope the compiler will not elide the check. Just gonna drop that one >> here and run for it. >> >> Thanks, >> /Erik >> >> On 2019-07-12 09:48, Baesken, Matthias wrote: >>> Hello , when looking into the recent xlc16 / xlclang warnings I >>> came across those 3 : >>> >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: >>> 'this' pointer cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to true >>> [-Wtautological-undefined-compare] >>> if( this != NULL ) { >>> ^~~~ ~~~~ >>> >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: >>> 'this' pointer cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to false >>> [-Wtautological-undefined-compare] >>> if( this == NULL ) return; >>> >>> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' >>> pointer cannot be null in well-defined C++ code; >>> comparison may be assumed to always evaluate to false >>> [-Wtautological-undefined-compare] >>> 
if( this == NULL ) return os::strdup("{no set}"); >>> >>> >>> Do you think the NULL-checks can be removed or is there still some >>> value in doing them ? >>> >>> Best regards, Matthias >> From fweimer at redhat.com Fri Jul 12 15:36:32 2019 From: fweimer at redhat.com (Florian Weimer) Date: Fri, 12 Jul 2019 17:36:32 +0200 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: (Matthias Baesken's message of "Fri, 12 Jul 2019 07:48:32 +0000") References: Message-ID: <87blxzz2m7.fsf@oldenburg2.str.redhat.com> * Matthias Baesken: > Do you think the NULL-checks can be removed or is there still some > value in doing them ? I believe you need to build OpenJDK in a mode where the compiler assumes that the this pointer can be null: # These flags are required for GCC 6 builds as undefined behaviour in # OpenJDK code runs afoul of the more aggressive versions of these # optimisations. Notably, value range propagation now assumes that # the this pointer of C++ member functions is non-null. Thanks, Florian From sgehwolf at redhat.com Fri Jul 12 18:08:18 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 12 Jul 2019 20:08:18 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible Message-ID: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> Hi, There is an alternative container engine which is being used by Fedora and RHEL 8, called podman[1]. It's mostly compatible with docker. It looks like OpenJDK docker tests can be made podman compatible with a few little tweaks. One "interesting" one is to not assert "Successfully built" in the build output but only rely on the exit code, which seems to be OK for my testing. Interestingly the test would be skipped in that case. Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ Adjustments I've done: * Don't assert "Successfully built" in image build output[2]. 
* Add /usr/sbin to PATH as the podman binary relies on iptables for it to work which is in /usr/sbin on Fedora * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() to be equal to the previous value. I've found those counters to be slowly increasing, which made the tests unreliable. Testing: Running docker tests with docker as engine. Did the same with podman as engine via -Djdk.test.docker.command=podman on Linux x86_64. Both passed (non-trivially). Thoughts? Thanks, Severin [1] https://podman.io/ [2] Image builds with podman look like ("COMMIT" over "Successfully built"): STEP 1: FROM fedora:29 STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics STEP 5: COMMIT fedora-metrics-11 d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e From dean.long at oracle.com Fri Jul 12 21:50:15 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 12 Jul 2019 14:50:15 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: On 7/11/19 1:13 PM, Erik Österlund wrote: > Hi Dean, > > On 2019-07-11 15:29, dean.long at oracle.com wrote: >> On 7/11/19 6:53 AM, Erik Österlund wrote: >>> Hi Dean, >>> >>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>> Hi Dean, >>>>> >>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>>> (where unlinking code belongs), 
instead of after the handshake >>>>>>> (intended for deleting things safely unlinked). >>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>>>> racing between make_not_entrant and make_unloaded, but I still >>>>>>> want the monotonicity guards to make this code more robust. >>>>>> >>>>>> I see where you added OSR nmethod unlinking, but not where you >>>>>> removed it, so it's not obvious it was a "move". >>>>> >>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking >>>>> before the global handshake is run. After the handshake, we call >>>>> make_unloaded() on the same is_unloading() nmethods. That function >>>>> "tries" to unlink the OSR nmethod, but will just not do it as it's >>>>> already unlinked at that point. So in a way, I didn't remove the >>>>> call to unlink the OSR nmethod there, it just won't do anything. I >>>>> preferred structuring it that way instead of trying to optimize >>>>> away the call to unlink the OSR nmethod when making it unloaded, >>>>> but only for the concurrent case. It seemed to introduce more >>>>> conditional magic than it was worth. >>>>> So in practice, the unlinking of OSR nmethods has moved for >>>>> concurrent unloading to before the handshake. >>>>> >>>> >>>> OK, in that case, could you add a little information to the >>>> "Invalidate the osr nmethod only once" comment so that in the >>>> future someone isn't tempted to remove the code as redundant? >>> >>> Sure. >>> >> >> I meant the one in zNMethod.cpp :-) > > Okay, will put another comment in there once we agree on a direction > on the next point. > >> >>>>>> Would it make sense for nmethod::unlink_from_method() to do the >>>>>> OSR unlinking, or to assert that it has already been done? >>>>> >>>>> An earlier version of this patch tried to do that. It is indeed >>>>> possible. 
But it requires changing lock ranks of the OSR nmethod >>>>> lock to special - 1 and moving around a bunch of code as this >>>>> function is also called both when making nmethods not_entrant, >>>>> zombie, and unlinking them in that case. For the first two, we >>>>> conditionally unlink the nmethod based on the current state (which >>>>> is the old state), whereas when I move it, the current state is >>>>> the new state. So I had to change things around a bit more to >>>>> figure out the right condition when to unlink it that works for >>>>> all 3 callers. In the end, since this is going to 13, I thought >>>>> it's more important to minimize the risk as much as I can, and >>>>> leave such refactorings to 14. >>>>> >>>> >>>> OK. >>>> >>>>>> The new bailout in the middle of >>>>>> nmethod::make_not_entrant_or_zombie() worries me a little, >>>>>> because the code up to that point has side-effects, and we could >>>>>> be bailing out in an unexpected state. >>>>> >>>>> Correct. In an earlier version of this patch, I moved the >>>>> transition to before the side effects. But a bunch of code is >>>>> using the current nmethod state to determine what to do, and that >>>>> current state changed from the old to the new state. In >>>>> particular, we conditionally patch in the jump based on the >>>>> current (old) state, and we conditionally increment decompile >>>>> count based on the current (old) state. So I ended up having to >>>>> rewrite more code than I wanted to for a patch going into 13, and >>>>> convince myself that I had not implicitly messed something up. It >>>>> felt safer to reason about the 3 side effects up until the >>>>> transitioning point: >>>>> >>>>> 1) Patching in the jump into VEP. Any state more dead than the >>>>> current transition, would still want that jump to be there. >>>>> 2) Incrementing decompile count when making it not_entrant. 
Seems >>>>> in order to do regardless, as we had an actual request to make the >>>>> nmethod not entrant because it was bad somehow. >>>>> 3) Marking it as seen on stack when making it not_entrant. This >>>>> will only make can_convert_to_zombie start returning false, which >>>>> is harmless in general. Also, as both transitions to zombie and >>>>> not_entrant are performed under the Patching_lock, the only >>>>> possible race is with make_unloaded. And those nmethods are >>>>> is_unloading(), which also makes can_convert_to_zombie return >>>>> false (in a not racy fashion). So it would essentially make no >>>>> observable difference to any single call to can_convert_to_zombie(). >>>>> >>>>> In summary, #1 and #3 don't really observably change the state of >>>>> the system, and #2 is completely harmless and probably wanted. >>>>> Therefore I found that moving these things around and finding out >>>>> where we use the current state(), as well as rewriting it, seemed >>>>> like a slightly scarier change for 13 to me. >>>>> >>>>> So in general, there is some refactoring that could be done (and I >>>>> have tried it) to make this nicer. But I want to minimize the risk >>>>> for 13 as much as possible, and perform any risky refactorings in >>>>> 14 instead. >>>>> If your risk assessment is different and you would prefer moving >>>>> the transition higher up (and flipping some conditions) instead, I >>>>> am totally up for that too though, and I do see where you are >>>>> coming from. >>>>> >>>> >>>> So if we fail, it means that we lost a race to a "deader" state, >>>> and assuming this is the only path to the deader state, wouldn't >>>> that also mean that #1, #2, and #3 would have already been done by >>>> the winning thread? If so, that makes me feel better about bailing >>>> out in the middle, but I'm still not 100% convinced, unless we can >>>> assert that 1-3 already happened. Do you have a prototype of what >>>> moving the transition higher up would look like? 
>>> As a matter of fact I do. Here is a webrev: >>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>> >>> I kind of like it. What do you think? >>> >> >> Now the code after the transition that says "Must happen before state >> change" worries me. > > Yes indeed. This is why I was hesitant to move the transition up. It > moves past 3 things that implicitly depends on the current state. This > one is extra scary. It actually introduces a race condition that could > crash the VM (because can_convert_to_zombie() may observe an nmethod > that just turned not_entrant, without being marked on stack). > > I think this shows (IMO) that trying to move the transition up has 3 > problems, and this one is particularly hard to dodge. I think it > really has to be before the transition. > > Would you agree now that keeping the transition where it was is less > risky (as I did originally) Yes. > and convincing ourselves that the 3 "side effects" are not really > observable side effects in the system, as I reasoned about earlier? > yes, but I'm hoping we can do more than just reason, like adding asserts. More below... > If not, I can try to move the mark-on-stack up above the transition. >> Can you remind me again what kind of race can make the state >> transition fail here? Did you happen to draw a state diagram while >> learning this code? :-) > Yes indeed. Would you like the long story or the short story? Here is > the short story: the only known race is between one thread making an > nmethod not_entrant and the GC thread making it unloaded. That > make_not_entrant is the only transition that can fail. Previously I > relied on there never existing any concurrent calls to > make_not_entrant() and make_unloaded(). The OSR nmethod was caught as > a special case (isn't it always...) where this could happen, violating > monotonicity. 
But I think it feels safer to enforce the monotonicity > of transitions in the actual code that performs the transitions, > instead of relying on knowledge of the relationships between all state > transitioning calls, implicitly ensuring monotonicity. > Can we enforce in_use --> not_entrant --> unloaded --> zombie, and not allow jumps or skipped states? Then we can assert that cleanup from a less-dead state has already been done. So if make_not_entrant failed, it would assert that all the cleanup that would have been done by a successful make_not_entrant has already been done. dl > Thanks, > /Erik > >> dl >> >>> Thanks, >>> /Erik >>> >>>> dl >>>> >>>>> BTW, I have tested this change through hs-tier1-7, and it looks good. >>>>> >>>>> Thanks a lot Dean for reviewing this code. >>>>> >>>>> /Erik >>>>> >>>>>> dl >>>>>> >>>>> >>>> >> From mikhailo.seledtsov at oracle.com Fri Jul 12 22:19:49 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Fri, 12 Jul 2019 15:19:49 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> Message-ID: <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> Hi Severin, The change looks good to me. Thank you for adding support for Podman container technology. Testing: I ran both HotSpot and JDK container tests with your patch; tests executed on Oracle Linux 7.6 using default container engine (Docker): test/hotspot/jtreg/containers/ AND test/jdk/jdk/internal/platform/docker/ All PASS Thanks, Misha On 7/12/19 11:08 AM, Severin Gehwolf wrote: > Hi, > > There is an alternative container engine which is being used by Fedora > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > looks like OpenJDK docker tests can be made podman compatible with a > few little tweaks. One "interesting" one is to not assert "Successfully 
One "interesting" one is to not assert "Successfully > built" in the build output but only rely on the exit code, which seems > to be OK for my testing. Interestingly the test would be skipped in > that case. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > Adjustments I've done: > * Don't assert "Successfully built" in image build output[2]. > * Add /usr/sbin to PATH as the podman binary relies on iptables for it > to work which is in /usr/sbin on Fedora > * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() > to be equal to the previous value. I've found those counters to be > slowly increasing, which made the tests unreliable. > > Testing: > > Running docker tests with docker as engine. Did the same with podman as > engine via -Djdk.test.docker.command=podman on Linux x86_64. Both > passed (non-trivially). > > Thoughts? > > Thanks, > Severin > > [1] https://podman.io/ > [2] Image builds with podman look > like ("COMMIT" over "Successfully built"): > STEP 1: FROM fedora:29 > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all > --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > STEP 5: COMMIT fedora-metrics-11 > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > From Pengfei.Li at arm.com Mon Jul 15 01:38:33 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Mon, 15 Jul 2019 01:38:33 +0000 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal In-Reply-To: References: Message-ID: CC compiler-dev -- Thanks, Pengfei > Hi, > > Please help review this small fix. 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8227512 > Webrev: http://cr.openjdk.java.net/~pli/rfr/8227512/ > > JTReg javac tests > * langtools/tools/javac/modules/InheritRuntimeEnvironmentTest.java > * langtools/tools/javac/file/LimitedImage.java > failed when Graal is used as JVMCI compiler. > > These cases test javac behavior with the condition that observable modules > are limited. But Graal is unable to be found in the limited module scope. This > fixes these two tests by adding "jdk.internal.vm.compiler" into the limited > modules. > > -- > Thanks, > Pengfei From sgehwolf at redhat.com Mon Jul 15 08:04:17 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 15 Jul 2019 10:04:17 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> Message-ID: Hi Misha, On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: > Hi Severin, > > The change looks good to me. Thank you for adding support for Podman > container technology. > > Testing: I ran both HotSpot and JDK container tests with your patch; > tests executed on Oracle Linux 7.6 using default container engine (Docker): > > test/hotspot/jtreg/containers/ AND > test/jdk/jdk/internal/platform/docker/ > > All PASS Thanks for the review and check! Cheers, Severin > > Thanks, > > Misha > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > Hi, > > > > There is an alternative container engine which is being used by > > Fedora > > and RHEL 8, called podman[1]. It's mostly compatible with docker. > > It > > looks like OpenJDK docker tests can be made podman compatible with > > a > > few little tweaks. One "interesting" one is to not assert > > "Successfully > > built" in the build output but only rely on the exit code, which > > seems > > to be OK for my testing. 
Interestingly the test would be skipped in > > that case. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > webrev: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > Adjustments I've done: > > * Don't assert "Successfully built" in image build output[2]. > > * Add /usr/sbin to PATH as the podman binary relies on iptables > > for it > > to work which is in /usr/sbin on Fedora > > * Allow for Metrics.getCpuSystemUsage() and > > Metrics.getCpuUserUsage() > > to be equal to the previous value. I've found those counters to > > be > > slowly increasing, which made the tests unreliable. > > > > Testing: > > > > Running docker tests with docker as engine. Did the same with > > podman as > > engine via -Djdk.test.docker.command=podman on Linux x86_64. Both > > passed (non-trivially). > > > > Thoughts? > > > > Thanks, > > Severin > > > > [1] https://podman.io/ > > [2] Image builds with podman look > > like ("COMMIT" over "Successfully built"): > > STEP 1: FROM fedora:29 > > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean > > all > > --> Using cache > > 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add- > > modules java.base --add-exports > > java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > STEP 5: COMMIT fedora-metrics-11 > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > > From erik.osterlund at oracle.com Mon Jul 15 09:10:02 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 15 Jul 2019 11:10:02 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> Message-ID: 
<00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> Hi Dean, On 2019-07-12 23:50, dean.long at oracle.com wrote: > On 7/11/19 1:13 PM, Erik Österlund wrote: >> Hi Dean, >> >> On 2019-07-11 15:29, dean.long at oracle.com wrote: >>> On 7/11/19 6:53 AM, Erik Österlund wrote: >>>> Hi Dean, >>>> >>>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>>> Hi Dean, >>>>>> >>>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>>>> (where unlinking code belongs), instead of after the handshake >>>>>>>> (intended for deleting things safely unlinked). >>>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes the >>>>>>>> racing between make_not_entrant and make_unloaded, but I still >>>>>>>> want the monotonicity guards to make this code more robust. >>>>>>> >>>>>>> I see where you added OSR nmethod unlinking, but not where you >>>>>>> removed it, so it's not obvious it was a "move". >>>>>> >>>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking >>>>>> before the global handshake is run. After the handshake, we call >>>>>> make_unloaded() on the same is_unloading() nmethods. That >>>>>> function "tries" to unlink the OSR nmethod, but will just not do >>>>>> it as it's already unlinked at that point. So in a way, I didn't >>>>>> remove the call to unlink the OSR nmethod there, it just won't do >>>>>> anything. I preferred structuring it that way instead of trying >>>>>> to optimize away the call to unlink the OSR nmethod when making >>>>>> it unloaded, but only for the concurrent case. It seemed to >>>>>> introduce more conditional magic than it was worth. >>>>>> So in practice, the unlinking of OSR nmethods has moved for >>>>>> concurrent unloading to before the handshake. 
>>>>>> >>>>> >>>>> OK, in that case, could you add a little information to the >>>>> "Invalidate the osr nmethod only once" comment so that in the >>>>> future someone isn't tempted to remove the code as redundant? >>>> >>>> Sure. >>>> >>> >>> I meant the one in zNMethod.cpp :-) >> >> Okay, will put another comment in there once we agree on a direction >> on the next point. >> >>> >>>>>>> Would it make sense for nmethod::unlink_from_method() to do the >>>>>>> OSR unlinking, or to assert that it has already been done? >>>>>> >>>>>> An earlier version of this patch tried to do that. It is indeed >>>>>> possible. But it requires changing lock ranks of the OSR nmethod >>>>>> lock to special - 1 and moving around a bunch of code as this >>>>>> function is also called both when making nmethods not_entrant, >>>>>> zombie, and unlinking them in that case. For the first two, we >>>>>> conditionally unlink the nmethod based on the current state >>>>>> (which is the old state), whereas when I move it, the current >>>>>> state is the new state. So I had to change things around a bit >>>>>> more to figure out the right condition when to unlink it that >>>>>> works for all 3 callers. In the end, since this is going to 13, I >>>>>> thought it's more important to minimize the risk as much as I >>>>>> can, and leave such refactorings to 14. >>>>>> >>>>> >>>>> OK. >>>>> >>>>>>> The new bailout in the middle of >>>>>>> nmethod::make_not_entrant_or_zombie() worries me a little, >>>>>>> because the code up to that point has side-effects, and we could >>>>>>> be bailing out in an unexpected state. >>>>>> >>>>>> Correct. In an earlier version of this patch, I moved the >>>>>> transition to before the side effects. But a bunch of code is >>>>>> using the current nmethod state to determine what to do, and that >>>>>> current state changed from the old to the new state. 
In >>>>>> particular, we conditionally patch in the jump based on the >>>>>> current (old) state, and we conditionally increment decompile >>>>>> count based on the current (old) state. So I ended up having to >>>>>> rewrite more code than I wanted to for a patch going into 13, and >>>>>> convince myself that I had not implicitly messed something up. It >>>>>> felt safer to reason about the 3 side effects up until the >>>>>> transitioning point: >>>>>> >>>>>> 1) Patching in the jump into VEP. Any state more dead than the >>>>>> current transition, would still want that jump to be there. >>>>>> 2) Incrementing decompile count when making it not_entrant. Seems >>>>>> in order to do regardless, as we had an actual request to make >>>>>> the nmethod not entrant because it was bad somehow. >>>>>> 3) Marking it as seen on stack when making it not_entrant. This >>>>>> will only make can_convert_to_zombie start returning false, which >>>>>> is harmless in general. Also, as both transitions to zombie and >>>>>> not_entrant are performed under the Patching_lock, the only >>>>>> possible race is with make_unloaded. And those nmethods are >>>>>> is_unloading(), which also makes can_convert_to_zombie return >>>>>> false (in a not racy fashion). So it would essentially make no >>>>>> observable difference to any single call to can_convert_to_zombie(). >>>>>> >>>>>> In summary, #1 and #3 don't really observably change the state of >>>>>> the system, and #2 is completely harmless and probably wanted. >>>>>> Therefore I found that moving these things around and finding out >>>>>> where we use the current state(), as well as rewriting it, seemed >>>>>> like a slightly scarier change for 13 to me. >>>>>> >>>>>> So in general, there is some refactoring that could be done (and >>>>>> I have tried it) to make this nicer. But I want to minimize the >>>>>> risk for 13 as much as possible, and perform any risky >>>>>> refactorings in 14 instead. 
>>>>>> If your risk assessment is different and you would prefer moving >>>>>> the transition higher up (and flipping some conditions) instead, >>>>>> I am totally up for that too though, and I do see where you are >>>>>> coming from. >>>>>> >>>>> >>>>> So if we fail, it means that we lost a race to a "deader" state, >>>>> and assuming this is the only path to the deader state, wouldn't >>>>> that also mean that #1, #2, and #3 would have already been done by >>>>> the winning thread? If so, that makes me feel better about >>>>> bailing out in the middle, but I'm still not 100% convinced, >>>>> unless we can assert that 1-3 already happened. Do you have a >>>>> prototype of what moving the transition higher up would look like? >>>> >>>> As a matter of fact I do. Here is a webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>>> >>>> I kind of like it. What do you think? >>>> >>> >>> Now the code after the transition that says "Must happen before >>> state change" worries me. >> >> Yes indeed. This is why I was hesitant to move the transition up. It >> moves past 3 things that implicitly depends on the current state. >> This one is extra scary. It actually introduces a race condition that >> could crash the VM (because can_convert_to_zombie() may observe an >> nmethod that just turned not_entrant, without being marked on stack). >> >> I think this shows (IMO) that trying to move the transition up has 3 >> problems, and this one is particularly hard to dodge. I think it >> really has to be before the transition. >> >> Would you agree now that keeping the transition where it was is less >> risky (as I did originally) > > Yes. > >> and convincing ourselves that the 3 "side effects" are not really >> observable side effects in the system, as I reasoned about earlier? >> > > yes, but I'm hoping we can do more than just reason, like adding > asserts. More below... > >> If not, I can try to move the mark-on-stack up above the transition. 
>> >>> Can you remind me again what kind of race can make the state >>> transition fail here? Did you happen to draw a state diagram while >>> learning this code? :-) >> >> Yes indeed. Would you like the long story or the short story? Here is >> the short story: the only known race is between one thread making an >> nmethod not_entrant and the GC thread making it unloaded. That >> make_not_entrant is the only transition that can fail. Previously I >> relied on there never existing any concurrent calls to >> make_not_entrant() and make_unloaded(). The OSR nmethod was caught as >> a special case (isn't it always...) where this could happen, >> violating monotonicity. But I think it feels safer to enforce the >> monotonicity of transitions in the actual code that performs the >> transitions, instead of relying on knowledge of the relationships >> between all state transitioning calls, implicitly ensuring monotonicity. >> > > Can we enforce in_use --> not_entrant --> unloaded --> zombie, and not > allow jumps or skipped states? Then we can assert that cleanup from a > less-dead state has already been done. So if make_not_entrant failed, > it would assert that all the cleanup that would have been done by a > successful make_not_entrant has already been done. I'm afraid not. The state machine skips states by design. For example, the set of {not_installed, in_use, not_entrant} states are alive and {unloaded, zombie} are not alive. Any nmethod in an "alive" state may transition to the unloaded state due to an oop dying. Actually strictly speaking, only {in_use, not_entrant} may become unloaded, as nmethods are made in_use within the same thread_in_vm critical section that they finalized oops in the nmethod, and hence could not yet have died. Similarly, the "unloaded" state is reserved for unloading by the GC. And not all nmethods that become zombie were unloaded by the GC. 
I think changing so that all these transitions are taken for all nmethods, sounds like it will break invariants and be quite dangerous. Note though that what all dead (!is_alive()) states have in common is that they can never be called or be on-stack; by the time an nmethod enters a dead state (unloaded or zombie), its inline caches and all other stale pointers to the nmethod have been cleaned out, and either a safepoint or global thread-local handshake with cross-modifying fences has finished, without finding activation records on-stack. That is the unwritten definition of being !is_alive() (e.g. unloaded or zombie). Therefore, if a transition to not_entrant fails due to entering a more dead state (unloaded or zombie), then that implies the following: 1) The jump at VEP is no longer needed because the jump is no longer reachable code, as another thread had enough knowledge to determine it was dead (all references to it have been unlinked, followed by a handshake/safepoint with cross-modifying fencing and stack scanning). So whether another transition performed this step or not is unimportant. Note that for example make_unloaded() does not patch in a jump at VEP, despite transitioning nmethods directly from in_use to unloaded, for this exact reason. By the time the nmethod is killed, that jump better be dead code already. It's only needed for the not_entrant state, where the nmethod may still alive but we want to stop calls into it. 2) The mark_as_seen_on_stack() prevents the sweeper from transitioning not_entrant() nmethods to zombie until it's no longer seen on stack, so it doesn't accidentally kill not_entrant nmethods. But if the transition failed, it's already dead, and the only path that looks at that value, is not taken (looking for not_entrant nmethods that can be made zombie). Again, it is totally fine that another thread killing the nmethod for a different reason did not perform this step. 
3) The inc_decompile_count() is still valid, as the caller had a valid reason to deopt the nmethod, regardless of whether there were multiple reasons for discarding the nmethod or not. So in summary, if a make_not_entrant attempt fails due to a make_unloaded (or hypothetically make_zombie even though that race is impossible) attempt, then the presence or lack of presence of the VEP jump and the mark-on-stack value no longer matter, as they are properties that only matter to is_alive() nmethods. And inc_decompile_count is fine to do as well as there was a valid deopt reason for the make_not_entrant() caller. Would it feel better if I wrote this reasoning down in comments in make_not_entrant_or_zombie? Thanks, /Erik > dl > >> Thanks, >> /Erik >> >>> dl >>> >>>> Thanks, >>>> /Erik >>>> >>>>> dl >>>>> >>>>>> BTW, I have tested this change through hs-tier1-7, and it looks >>>>>> good. >>>>>> >>>>>> Thanks a lot Dean for reviewing this code. >>>>>> >>>>>> /Erik >>>>>> >>>>>>> dl >>>>>>> >>>>>> >>>>> >>> > From martin.doerr at sap.com Mon Jul 15 13:06:32 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 15 Jul 2019 13:06:32 +0000 Subject: dbg feature: PrintMallocStatistics still wanted? Message-ID: Hi, I recently noticed that the implementation for PrintMallocStatistics slows down the VM in fastdbg builds even if the feature is not active: https://bugs.openjdk.java.net/browse/JDK-8227597 My current proposal just improves the performance impact: http://cr.openjdk.java.net/~mdoerr/8227597_DBG_Inline_inc_bytes_allocated/webrev.01/ But now, the question has come up, if PrintMallocStatistics is still needed since we have NMT. Note that PrintMallocStatistics is only available in dbg builds. Does anybody still want to use it? Would anybody vote for removing this feature? 
Best regards, Martin From maurizio.cimadamore at oracle.com Mon Jul 15 15:25:43 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 15 Jul 2019 16:25:43 +0100 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal In-Reply-To: References: Message-ID: <808dfddf-1a5f-be51-4078-fccfac3f19f8@oracle.com> Looks good! Thanks Maurizio On 15/07/2019 02:38, Pengfei Li (Arm Technology China) wrote: > CC compiler-dev > > -- > Thanks, > Pengfei > >> Hi, >> >> Please help review this small fix. >> JBS: https://bugs.openjdk.java.net/browse/JDK-8227512 >> Webrev: http://cr.openjdk.java.net/~pli/rfr/8227512/ >> >> JTReg javac tests >> * langtools/tools/javac/modules/InheritRuntimeEnvironmentTest.java >> * langtools/tools/javac/file/LimitedImage.java >> failed when Graal is used as JVMCI compiler. >> >> These cases test javac behavior with the condition that observable modules >> are limited. But Graal is unable to be found in the limited module scope. This >> fixes these two tests by adding "jdk.internal.vm.compiler" into the limited >> modules. >> >> -- >> Thanks, >> Pengfei From coleen.phillimore at oracle.com Mon Jul 15 16:37:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 15 Jul 2019 12:37:12 -0400 Subject: dbg feature: PrintMallocStatistics still wanted? In-Reply-To: References: Message-ID: I didn't realize this was still in the sources.? I think you should remove it. Coleen On 7/15/19 9:06 AM, Doerr, Martin wrote: > Hi, > > I recently noticed that the implementation for PrintMallocStatistics slows down the VM in fastdbg builds even if the feature is not active: > https://bugs.openjdk.java.net/browse/JDK-8227597 > > My current proposal just improves the performance impact: > http://cr.openjdk.java.net/~mdoerr/8227597_DBG_Inline_inc_bytes_allocated/webrev.01/ > > But now, the question has come up, if PrintMallocStatistics is still needed since we have NMT. 
Note that PrintMallocStatistics is only available in dbg builds. > Does anybody still want to use it? > Would anybody vote for removing this feature? > > Best regards, > Martin > From martin.doerr at sap.com Mon Jul 15 19:48:49 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 15 Jul 2019 19:48:49 +0000 Subject: dbg feature: PrintMallocStatistics still wanted? In-Reply-To: References: Message-ID: Hi Coleen, thanks for your feedback. I've created JDK-8227692: "Remove develop feature PrintMallocStatistics" and I'll send an RFR, soon. Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of > coleen.phillimore at oracle.com > Sent: Montag, 15. Juli 2019 18:37 > To: hotspot-dev at openjdk.java.net > Subject: Re: dbg feature: PrintMallocStatistics still wanted? > > > I didn't realize this was still in the sources.? I think you should > remove it. > Coleen > > On 7/15/19 9:06 AM, Doerr, Martin wrote: > > Hi, > > > > I recently noticed that the implementation for PrintMallocStatistics slows > down the VM in fastdbg builds even if the feature is not active: > > https://bugs.openjdk.java.net/browse/JDK-8227597 > > > > My current proposal just improves the performance impact: > > > http://cr.openjdk.java.net/~mdoerr/8227597_DBG_Inline_inc_bytes_allocat > ed/webrev.01/ > > > > But now, the question has come up, if PrintMallocStatistics is still needed > since we have NMT. Note that PrintMallocStatistics is only available in dbg > builds. > > Does anybody still want to use it? > > Would anybody vote for removing this feature? 
> > > > Best regards, > > Martin > > From kim.barrett at oracle.com Mon Jul 15 19:51:22 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 15 Jul 2019 15:51:22 -0400 Subject: 8227652: SetupOperatorNewDeleteCheck should discuss deleting destructors Message-ID: <40590A26-1A32-4B3F-B1D8-55A56090C5F4@oracle.com> Please review this explanatory comment being added to the description of the check for using global operator new/delete in Hotspot code. The described situation is somewhat obscure, and encountering it for the first time (or again after a long time, as happened to me recently) can be quite puzzling. CR: https://bugs.openjdk.java.net/browse/JDK-8227652 Webrev: http://cr.openjdk.java.net/~kbarrett/8227652/open.00/ From kim.barrett at oracle.com Tue Jul 16 01:18:43 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 15 Jul 2019 21:18:43 -0400 Subject: RFR: 8227653: Add VM Global OopStorage Message-ID: Please review this change which adds a VMGlobal OopStorage object. It is initially being used instead of the conditional JVMCI global storage object, which is being removed. Looking for reviewers from all of gc, runtime, and compiler. To keep things simple for now, this new storage object is (optionally) included in the processing done by SystemDictionary::oops_do. Most existing storage processors use that mechanism. For most processors, that's consistent with how JNI global handles are processed. ZGC uses a different approach, and provides enough infrastructure that it was easy to process this new storage object in a way that is consistent with ZGC's handling of JNI globals. This change does not attempt to address the problems around changing the set of OopStorage instance described by JDK-8227054. This change was a useful bit of preparation for the work I'm doing on JDK-8227054, so I split it out as a separate change. 
This change also includes a minimal update of Shenandoah, using the processing of the new storage object by SystemDictionary::oops_do. It looks like Shenandoah is conceptually similar to ZGC in it's handling of JNI globals, and should be able to handle this new storage object similarly, but I'm leaving that to the Shenandoah developers. You might want to wait for JDK-8227054 though. Note that neither ZGC nor Shenandoah processed the former conditional JVMCI global storage. CR: https://bugs.openjdk.java.net/browse/JDK-8227653 Webrev: http://cr.openjdk.java.net/~kbarrett/8227653/open.00/ Testing: mach5 tier1-5 From matthias.baesken at sap.com Tue Jul 16 07:51:42 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 16 Jul 2019 07:51:42 +0000 Subject: RFR: 8227631: Adjust AIX version check In-Reply-To: References: Message-ID: > would print the OS, too, as it did before. Hi Goetz I do not think that we currently run in OpenJDK on OS/400 , so printing the OS does not have much value currently ( it is always AIX) . > Didn't the warning stem from the > assert(false, name_str); I think clang does not like the usage of string literals for non-constant strings [-Wwritable-strings] . Best regards, Matthias > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Dienstag, 16. Juli 2019 09:30 > To: Baesken, Matthias ; Doerr, Martin > > Subject: RE: RFR: 8227631: Adjust AIX version check > > Hi Matthias, > > limiting the version is a good idea. > > As this is only a trace and an assertion (debug build), > I don't think it is necessary to push it to 13, but > pushing it there is fine too. > > I would appreciate if the > trcVerbose("We run on %s %s", name_str, ver_str); > would print the OS, too, as it did before. > As the code for OS/400 is in there, also the tracing > should be complete. > > Didn't the warning stem from the > assert(false, name_str); > which correctly could be > assert(false, "%s", name_str); > ? 
(Your version for the assert is fine, too.) > > Best regards, > Goetz. > > > -----Original Message----- > > From: Langer, Christoph > > Sent: Freitag, 12. Juli 2019 14:17 > > To: Baesken, Matthias ; 'hotspot- > > dev at openjdk.java.net' ; 'ppc-aix-port- > > dev at openjdk.java.net' > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > Hi Matthias, > > > > looks good. This might even be something to push to JDK13 still (if you do it > > within the next few days). > > > > Best regards > > Christoph > > > > > > > -----Original Message----- > > > From: hotspot-dev On Behalf > Of > > > Baesken, Matthias > > > Sent: Freitag, 12. Juli 2019 13:09 > > > To: 'hotspot-dev at openjdk.java.net' ; > > > 'ppc-aix-port-dev at openjdk.java.net' dev at openjdk.java.net> > > > Subject: RFR: 8227631: Adjust AIX version check > > > > > > Hello, please review this small AIX related change . > > > > > > For some time, we do not support AIX 5.3 any more. > > > See (where AIX 7.1 or 7.2 is the supported build platform since > OpenJDK11) : > > > > > > https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms > > > > > > The currently used xlc 16.1 (XL C/C++ Compilers) even needs minimum > AIX > > > 7.1 to run , see > > > > > > http://www-01.ibm.com/support/docview.wss?uid=swg21326972 > > > > > > (and compiling for older releases on 7.1 / 7.2 would not work easily , at > least > > > not "out of the box" to my knowledge .) > > > > > > So we should adjust the minimum OS version check done in os_aix.cpp in > > > os::Aix::initialize_os_info() . > > > > > > > > > Additionally the change removes a couple of warnings [-Wwritable- > strings > > > category] . 
> > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4081:22: warning: ISO C++11 > > > does not allow conversion from string literal to 'char *' [-Wwritable- > strings] > > > char *name_str = "unknown OS"; > > > ^ > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4089:18: warning: ISO C++11 > > > does not allow conversion from string literal to 'char *' [-Wwritable- > strings] > > > name_str = "OS/400 (pase)"; > > > ^ > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4100:18: warning: ISO C++11 > > > does not allow conversion from string literal to 'char *' [-Wwritable- > strings] > > > name_str = "AIX"; > > > > > > > > > > > > Bug/webrev : > > > > > > https://bugs.openjdk.java.net/browse/JDK-8227631 > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.0/ > > > > > > Thanks, Matthias From thomas.schatzl at oracle.com Tue Jul 16 09:25:32 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 16 Jul 2019 11:25:32 +0200 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: References: Message-ID: Hi, On Mon, 2019-07-15 at 21:18 -0400, Kim Barrett wrote: > Please review this change which adds a VMGlobal OopStorage > object. It is initially being used instead of the conditional JVMCI > global storage object, which is being removed. > > Looking for reviewers from all of gc, runtime, and compiler. > > To keep things simple for now, this new storage object is > (optionally) included in the processing done by > SystemDictionary::oops_do. Most existing storage processors use that > mechanism. For most processors, that's consistent with how JNI > global handles are processed. ZGC uses a different approach, and > provides enough infrastructure that it was easy to process this new > storage object in a way that is consistent with ZGC's handling of JNI > globals. > > This change does not attempt to address the problems around changing > the set of OopStorage instance described by JDK-8227054. 
This change > was a useful bit of preparation for the work I'm doing on JDK- > 8227054, so I split it out as a separate change. > > This change also includes a minimal update of Shenandoah, using the > processing of the new storage object by SystemDictionary::oops_do. > It looks like Shenandoah is conceptually similar to ZGC in it's > handling of JNI globals, and should be able to handle this new > storage object similarly, but I'm leaving that to the Shenandoah > developers. You might want to wait for JDK-8227054 though. > > Note that neither ZGC nor Shenandoah processed the former conditional > JVMCI global storage. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8227653 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227653/open.00/ looks good to me. Thomas From christoph.goettschkes at microdoc.com Tue Jul 16 10:05:59 2019 From: christoph.goettschkes at microdoc.com (christoph.goettschkes at microdoc.com) Date: Tue, 16 Jul 2019 12:05:59 +0200 Subject: [PATCH] Use of an unitialized register in 32-bit ARM template interpreter Message-ID: Hello, while working with the OpenJDK 11 on a 32-bit ARMv7-A platform, I noticed something weird in the template interpreter, regarding the template for the bytecode instruction ldc2_w. The type check for the operand is only done correctly if the ABI is a hard-float one. For soft-float, the check is done wrong, using an uninitialized register Rtemp. 
Please see the following diff:

diff -r 327d5994b2fb src/hotspot/cpu/arm/templateTable_arm.cpp
--- a/src/hotspot/cpu/arm/templateTable_arm.cpp Tue Mar 12 11:13:39 2019 -0400
+++ b/src/hotspot/cpu/arm/templateTable_arm.cpp Tue Jul 16 11:22:14 2019 +0200
@@ -515,36 +515,37 @@
 void TemplateTable::ldc2_w() {
   transition(vtos, vtos);
   const Register Rtags = R2_tmp;
   const Register Rindex = R3_tmp;
   const Register Rcpool = R4_tmp;
   const Register Rbase = R5_tmp;

   __ get_unsigned_2_byte_index_at_bcp(Rindex, 1);

   __ get_cpool_and_tags(Rcpool, Rtags);
   const int base_offset = ConstantPool::header_size() * wordSize;
   const int tags_offset = Array<u1>::base_offset_in_bytes();

   __ add(Rbase, Rcpool, AsmOperand(Rindex, lsl, LogBytesPerWord));

+  // get type from tags
+  __ add(Rtemp, Rtags, tags_offset);
+  __ ldrb(Rtemp, Address(Rtemp, Rindex));
+
   Label Condy, exit;
 #ifdef __ABI_HARD__
   Label Long;
-  // get type from tags
-  __ add(Rtemp, Rtags, tags_offset);
-  __ ldrb(Rtemp, Address(Rtemp, Rindex));
   __ cmp(Rtemp, JVM_CONSTANT_Double);
   __ b(Long, ne);
   __ ldr_double(D0_tos, Address(Rbase, base_offset));
   __ push(dtos);
   __ b(exit);
   __ bind(Long);
 #endif

   __ cmp(Rtemp, JVM_CONSTANT_Long);
   __ b(Condy, ne);
 #ifdef AARCH64
   __ ldr(R0_tos, Address(Rbase, base_offset));
 #else
   __ ldr(R0_tos_lo, Address(Rbase, base_offset + 0 * wordSize));

If the check for the type of the operand is done correctly, the call to InterpreterRuntime::resolve_ldc should never happen. Currently, for 32-bit soft-float arm, InterpreterRuntime::resolve_ldc is called if the operand for ldc2_w is of type long. Also, I find it weird that the "condy_helper" code is generated for the ldc2_w bytecode instruction on 32-bit hard-float arm (and also on x86). Aren't the only two valid types for ldc2_w long and double? 
-- Christoph From sgehwolf at redhat.com Tue Jul 16 12:36:05 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 16 Jul 2019 14:36:05 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> Message-ID: Hi, I believe I still need a *R*eviewer for this. Any takers? Thanks, Severin On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: > Hi Severin, > > The change looks good to me. Thank you for adding support for Podman > container technology. > > Testing: I ran both HotSpot and JDK container tests with your patch; > tests executed on Oracle Linux 7.6 using default container engine (Docker): > > test/hotspot/jtreg/containers/ AND > test/jdk/jdk/internal/platform/docker/ > > All PASS > > > Thanks, > > Misha > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > Hi, > > > > There is an alternative container engine which is being used by Fedora > > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > > looks like OpenJDK docker tests can be made podman compatible with a > > few little tweaks. One "interesting" one is to not assert "Successfully > > built" in the build output but only rely on the exit code, which seems > > to be OK for my testing. Interestingly the test would be skipped in > > that case. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > Adjustments I've done: > > * Don't assert "Successfully built" in image build output[2]. > > * Add /usr/sbin to PATH as the podman binary relies on iptables for it > > to work which is in /usr/sbin on Fedora > > * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() > > to be equal to the previous value. 
I've found those counters to be > > slowly increasing, which made the tests unreliable. > > > > Testing: > > > > Running docker tests with docker as engine. Did the same with podman as > > engine via -Djdk.test.docker.command=podman on Linux x86_64. Both > > passed (non-trivially). > > > > Thoughts? > > > > Thanks, > > Severin > > > > [1] https://podman.io/ > > [2] Image builds with podman look > > like ("COMMIT" over "Successfully built"): > > STEP 1: FROM fedora:29 > > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all > > --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > STEP 5: COMMIT fedora-metrics-11 > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > > From goetz.lindenmaier at sap.com Tue Jul 16 13:00:03 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 16 Jul 2019 13:00:03 +0000 Subject: RFR: 8227631: Adjust AIX version check In-Reply-To: References: Message-ID: Hi Mathias, ... lost the list, sorry. The change looks good! Best regards, Goetz. > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Dienstag, 16. Juli 2019 12:59 > To: Baesken, Matthias > Subject: RE: RFR: 8227631: Adjust AIX version check > > Hi Matthias, > > looks fine, thanks a lot! > > Best regards, > Goetz. > > > -----Original Message----- > > From: Baesken, Matthias > > Sent: Dienstag, 16. 
Juli 2019 12:18 > > To: Lindenmaier, Goetz > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > Hi Goetz , yes I adjusted this to "const char *" and do not see the warning > any > > more , new webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.1/ > > > > > > Best regards, Matthias > > > > > > > -----Original Message----- > > > From: Lindenmaier, Goetz > > > Sent: Dienstag, 16. Juli 2019 11:16 > > > To: Baesken, Matthias > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > I think > > > const char *name_str > > > should do the job. > > > > > > Best regards, > > > Goetz. > > > > > > > -----Original Message----- > > > > From: Baesken, Matthias > > > > Sent: Dienstag, 16. Juli 2019 09:52 > > > > To: Lindenmaier, Goetz ; Doerr, Martin > > > > > > > > Cc: 'hotspot-dev at openjdk.java.net' > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > would print the OS, too, as it did before. > > > > > > > > Hi Goetz I do not think that we currently run in OpenJDK on OS/400 , > > > > so printing the OS does not have much value currently ( it is always AIX) . > > > > > > > > > Didn't the warning stem from the > > > > > assert(false, name_str); > > > > > > > > I think clang does not like the usage of string literals for non-constant > > > strings > > > > [-Wwritable-strings] . > > > > > > > > Best regards, Matthias > > > > > > > > > > > > > -----Original Message----- > > > > > From: Lindenmaier, Goetz > > > > > Sent: Dienstag, 16. Juli 2019 09:30 > > > > > To: Baesken, Matthias ; Doerr, Martin > > > > > > > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > > Hi Matthias, > > > > > > > > > > limiting the version is a good idea. > > > > > > > > > > As this is only a trace and an assertion (debug build), > > > > > I don't think it is necessary to push it to 13, but > > > > > pushing it there is fine too. 
> > > > > > > > > > I would appreciate if the > > > > > trcVerbose("We run on %s %s", name_str, ver_str); > > > > > would print the OS, too, as it did before. > > > > > As the code for OS/400 is in there, also the tracing > > > > > should be complete. > > > > > > > > > > Didn't the warning stem from the > > > > > assert(false, name_str); > > > > > which correctly could be > > > > > assert(false, "%s", name_str); > > > > > ? (Your version for the assert is fine, too.) > > > > > > > > > > Best regards, > > > > > Goetz. > > > > > > > > > > > -----Original Message----- > > > > > > From: Langer, Christoph > > > > > > Sent: Freitag, 12. Juli 2019 14:17 > > > > > > To: Baesken, Matthias ; 'hotspot- > > > > > > dev at openjdk.java.net' ; 'ppc-aix- > > > port- > > > > > > dev at openjdk.java.net' > > > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > > Hi Matthias, > > > > > > > > > > > > looks good. This might even be something to push to JDK13 still (if you > > > do it > > > > > > within the next few days). > > > > > > > > > > > > Best regards > > > > > > Christoph > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: hotspot-dev On > > > Behalf > > > > > Of > > > > > > > Baesken, Matthias > > > > > > > Sent: Freitag, 12. Juli 2019 13:09 > > > > > > > To: 'hotspot-dev at openjdk.java.net' > > dev at openjdk.java.net>; > > > > > > > 'ppc-aix-port-dev at openjdk.java.net' > > > > dev at openjdk.java.net> > > > > > > > Subject: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > > > > Hello, please review this small AIX related change . > > > > > > > > > > > > > > For some time, we do not support AIX 5.3 any more. 
> > > > > > > See (where AIX 7.1 or 7.2 is the supported build platform since > > > > > OpenJDK11) : > > > > > > > > > > > > > > > > > https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms > > > > > > > > > > > > > > The currently used xlc 16.1 (XL C/C++ Compilers) even needs > > > minimum > > > > > AIX > > > > > > > 7.1 to run , see > > > > > > > > > > > > > > http://www-01.ibm.com/support/docview.wss?uid=swg21326972 > > > > > > > > > > > > > > (and compiling for older releases on 7.1 / 7.2 would not work easily , > > > at > > > > > least > > > > > > > not "out of the box" to my knowledge .) > > > > > > > > > > > > > > So we should adjust the minimum OS version check done in > > > os_aix.cpp in > > > > > > > os::Aix::initialize_os_info() . > > > > > > > > > > > > > > > > > > > > > Additionally the change removes a couple of warnings [-Wwritable- > > > > > strings > > > > > > > category] . > > > > > > > > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4081:22: warning: ISO > > > C++11 > > > > > > > does not allow conversion from string literal to 'char *' [-Wwritable- > > > > > strings] > > > > > > > char *name_str = "unknown OS"; > > > > > > > ^ > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4089:18: warning: ISO > > > C++11 > > > > > > > does not allow conversion from string literal to 'char *' [-Wwritable- > > > > > strings] > > > > > > > name_str = "OS/400 (pase)"; > > > > > > > ^ > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4100:18: warning: ISO > > > C++11 > > > > > > > does not allow conversion from string literal to 'char *' [-Wwritable- > > > > > strings] > > > > > > > name_str = "AIX"; > > > > > > > > > > > > > > > > > > > > > > > > > > > > Bug/webrev : > > > > > > > > > > > > > > https://bugs.openjdk.java.net/browse/JDK-8227631 > > > > > > > > > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.0/ > > > > > > > > > > > > > > Thanks, Matthias From matthias.baesken at sap.com Tue Jul 16 13:01:21 
2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 16 Jul 2019 13:01:21 +0000 Subject: RFR: 8227633: avoid comparing this pointers to NULL Message-ID: Hello Coleen , I adjusted the check in formssel.cpp to if (mnode != NULL) , > > I didn't see that you added a check for NULL in the callers of > > print_opcodes and added NULL checks to the _inst._opcode->print_opcode calls in src/hotspot/share/adlc/output_c.cpp . Regarding Set::setstr() in src/hotspot/share/libadt/set.cpp , This is used in print() and this can be called "conveniently in the debugger" (see set.hpp ). So I think it is okay to remove the check . please see the new webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.2/ Thanks, Matthias > > > > > + if (mnode) mnode->count_instr_names(names); > > > > > > We also try to avoid implicit checks against null for pointers so change > > this to: > > > > Hi Coleen, sure I can change this ; I just found a lot of places in formssel.cpp > where if (ptr) { ... } is used . > > > > > I didn't see that you added a check for NULL in the callers of > > print_opcodes or setstr.? Can those callers never pass NULL? > > > > It looked to me that the setstr is never really called and void Set::print() > const { ... } where it is used is used for debug printing - did I miss something > ? > > Regarding print_opcodes , there probably the NULL checks at caller palces > should better be added . > > Regards, Matthias > From christoph.langer at sap.com Tue Jul 16 14:11:26 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 16 Jul 2019 14:11:26 +0000 Subject: RFR: 8227631: Adjust AIX version check In-Reply-To: References: Message-ID: +1 > -----Original Message----- > From: hotspot-dev On Behalf Of > Lindenmaier, Goetz > Sent: Dienstag, 16. Juli 2019 15:00 > To: Baesken, Matthias ; hotspot- > dev at openjdk.java.net > Subject: RE: RFR: 8227631: Adjust AIX version check > > Hi Mathias, > > ... lost the list, sorry. > > The change looks good! 
> > Best regards, > Goetz. > > > -----Original Message----- > > From: Lindenmaier, Goetz > > Sent: Dienstag, 16. Juli 2019 12:59 > > To: Baesken, Matthias > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > Hi Matthias, > > > > looks fine, thanks a lot! > > > > Best regards, > > Goetz. > > > > > -----Original Message----- > > > From: Baesken, Matthias > > > Sent: Dienstag, 16. Juli 2019 12:18 > > > To: Lindenmaier, Goetz > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > Hi Goetz , yes I adjusted this to "const char *" and do not see the > warning > > any > > > more , new webrev : > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.1/ > > > > > > > > > Best regards, Matthias > > > > > > > > > > -----Original Message----- > > > > From: Lindenmaier, Goetz > > > > Sent: Dienstag, 16. Juli 2019 11:16 > > > > To: Baesken, Matthias > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > I think > > > > const char *name_str > > > > should do the job. > > > > > > > > Best regards, > > > > Goetz. > > > > > > > > > -----Original Message----- > > > > > From: Baesken, Matthias > > > > > Sent: Dienstag, 16. Juli 2019 09:52 > > > > > To: Lindenmaier, Goetz ; Doerr, Martin > > > > > > > > > > Cc: 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > would print the OS, too, as it did before. > > > > > > > > > > Hi Goetz I do not think that we currently run in OpenJDK on OS/400 , > > > > > so printing the OS does not have much value currently ( it is always > AIX) . > > > > > > > > > > > Didn't the warning stem from the > > > > > > assert(false, name_str); > > > > > > > > > > I think clang does not like the usage of string literals for non- > constant > > > > strings > > > > > [-Wwritable-strings] . 
> > > > > > > > > > Best regards, Matthias > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > From: Lindenmaier, Goetz > > > > > > Sent: Dienstag, 16. Juli 2019 09:30 > > > > > > To: Baesken, Matthias ; Doerr, > Martin > > > > > > > > > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > > Hi Matthias, > > > > > > > > > > > > limiting the version is a good idea. > > > > > > > > > > > > As this is only a trace and an assertion (debug build), > > > > > > I don't think it is necessary to push it to 13, but > > > > > > pushing it there is fine too. > > > > > > > > > > > > I would appreciate if the > > > > > > trcVerbose("We run on %s %s", name_str, ver_str); > > > > > > would print the OS, too, as it did before. > > > > > > As the code for OS/400 is in there, also the tracing > > > > > > should be complete. > > > > > > > > > > > > Didn't the warning stem from the > > > > > > assert(false, name_str); > > > > > > which correctly could be > > > > > > assert(false, "%s", name_str); > > > > > > ? (Your version for the assert is fine, too.) > > > > > > > > > > > > Best regards, > > > > > > Goetz. > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Langer, Christoph > > > > > > > Sent: Freitag, 12. Juli 2019 14:17 > > > > > > > To: Baesken, Matthias ; 'hotspot- > > > > > > > dev at openjdk.java.net' ; 'ppc- > aix- > > > > port- > > > > > > > dev at openjdk.java.net' > > > > > > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > > > > Hi Matthias, > > > > > > > > > > > > > > looks good. This might even be something to push to JDK13 still (if > you > > > > do it > > > > > > > within the next few days). > > > > > > > > > > > > > > Best regards > > > > > > > Christoph > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > > > From: hotspot-dev > On > > > > Behalf > > > > > > Of > > > > > > > > Baesken, Matthias > > > > > > > > Sent: Freitag, 12. 
Juli 2019 13:09 > > > > > > > > To: 'hotspot-dev at openjdk.java.net' > > > dev at openjdk.java.net>; > > > > > > > > 'ppc-aix-port-dev at openjdk.java.net' > > > > > dev at openjdk.java.net> > > > > > > > > Subject: RFR: 8227631: Adjust AIX version check > > > > > > > > > > > > > > > > Hello, please review this small AIX related change . > > > > > > > > > > > > > > > > For some time, we do not support AIX 5.3 any more. > > > > > > > > See (where AIX 7.1 or 7.2 is the supported build platform since > > > > > > OpenJDK11) : > > > > > > > > > > > > > > > > > > > > > https://wiki.openjdk.java.net/display/Build/Supported+Build+Platforms > > > > > > > > > > > > > > > > The currently used xlc 16.1 (XL C/C++ Compilers) even needs > > > > minimum > > > > > > AIX > > > > > > > > 7.1 to run , see > > > > > > > > > > > > > > > > http://www- > 01.ibm.com/support/docview.wss?uid=swg21326972 > > > > > > > > > > > > > > > > (and compiling for older releases on 7.1 / 7.2 would not work > easily , > > > > at > > > > > > least > > > > > > > > not "out of the box" to my knowledge .) > > > > > > > > > > > > > > > > So we should adjust the minimum OS version check done in > > > > os_aix.cpp in > > > > > > > > os::Aix::initialize_os_info() . > > > > > > > > > > > > > > > > > > > > > > > > Additionally the change removes a couple of warnings [- > Wwritable- > > > > > > strings > > > > > > > > category] . 
> > > > > > > > > > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4081:22: warning: ISO > > > > C++11 > > > > > > > > does not allow conversion from string literal to 'char *' [- Wwritable- > > > > > > strings] > > > > > > > > char *name_str = "unknown OS"; > > > > > > > > ^ > > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4089:18: warning: ISO > > > > C++11 > > > > > > > > does not allow conversion from string literal to 'char *' [- Wwritable- > > > > > > strings] > > > > > > > > name_str = "OS/400 (pase)"; > > > > > > > > ^ > > > > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:4100:18: warning: ISO > > > > C++11 > > > > > > > > does not allow conversion from string literal to 'char *' [- Wwritable- > > > > > > strings] > > > > > > > > name_str = "AIX"; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Bug/webrev : > > > > > > > > > > > > > > > > https://bugs.openjdk.java.net/browse/JDK-8227631 > > > > > > > > > > > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227631.0/ > > > > > > > > > > > > > > > > Thanks, Matthias From matthias.baesken at sap.com Tue Jul 16 14:59:20 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 16 Jul 2019 14:59:20 +0000 Subject: RFR: 8227631: Adjust AIX version check In-Reply-To: References: Message-ID: Thanks for the reviews ! Best regards, Matthias > Subject: RE: RFR: 8227631: Adjust AIX version check > > +1 > > > -----Original Message----- > > From: hotspot-dev On Behalf > Of > > Lindenmaier, Goetz > > Sent: Dienstag, 16. Juli 2019 15:00 > > To: Baesken, Matthias ; hotspot- > > dev at openjdk.java.net > > Subject: RE: RFR: 8227631: Adjust AIX version check > > > > Hi Matthias, > > > > ... lost the list, sorry. > > > > The change looks good! > > > > Best regards, > > Goetz. 
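The -Wwritable-strings warnings quoted above come from a rule that is mechanical to satisfy: a string literal has type const char[N] in ISO C++11, so the pointer it initializes must be const char *. A minimal compilable sketch of the pattern being fixed (the variable names are taken from the warnings; the surrounding function is hypothetical, not the actual os_aix.cpp code):

```cpp
#include <cassert>
#include <cstring>

// Sketch of the -Wwritable-strings fix: declaring the pointer as
// 'const char*' instead of 'char*' makes the literal assignments legal.
// The names mirror the warnings above; the branching is illustrative only.
static const char* os_name_str(bool is_os400) {
  const char* name_str = "unknown OS";  // was: char *name_str = "unknown OS";
  if (is_os400) {
    name_str = "OS/400 (pase)";
  } else {
    name_str = "AIX";
  }
  return name_str;
}
```

With the const-qualified declaration, Clang/XCode 10.2 no longer emits the warning, and no call site needs to change as long as nothing writes through the pointer.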
> > From coleen.phillimore at oracle.com Tue Jul 16 15:30:44 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 16 Jul 2019 11:30:44 -0400 Subject: RFR: 8227633: avoid comparing this pointers to NULL In-Reply-To: References: Message-ID: <5ace8298-e942-09ab-43ce-874937c160ba@oracle.com> This looks good to me. I don't know this compiler code very well, so please wait for a second reviewer. Thanks, Coleen On 7/16/19 9:01 AM, Baesken, Matthias wrote: > Hello Coleen , > > I adjusted the check in formssel.cpp to if (mnode != NULL) , > >>> I didn't see that you added a check for NULL in the callers of >>> print_opcodes > and added NULL checks to the _inst._opcode->print_opcode calls in src/hotspot/share/adlc/output_c.cpp . > > Regarding Set::setstr() in src/hotspot/share/libadt/set.cpp , > This is used in print() and this can be called "conveniently in the debugger" (see set.hpp ). > So I think it is okay to remove the check . > > > please see the new webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.2/ > > > Thanks, Matthias > > > >>> + if (mnode) mnode->count_instr_names(names); >>> >>> >>> We also try to avoid implicit checks against null for pointers so change >>> this to: >>> >> Hi Coleen, sure I can change this ; I just found a lot of places in formssel.cpp >> where if (ptr) { ... } is used . >> >>> I didn't see that you added a check for NULL in the callers of >>> print_opcodes or setstr. Can those callers never pass NULL? >>> >> It looked to me that the setstr is never really called and void Set::print() >> const { ... } where it is used is used for debug printing - did I miss something >> ? >> >> Regarding print_opcodes , there probably the NULL checks at caller places >> should better be added . 
>> >> Regards, Matthias >> From lois.foltan at oracle.com Tue Jul 16 15:33:33 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Tue, 16 Jul 2019 11:33:33 -0400 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: References: Message-ID: On 7/15/2019 9:18 PM, Kim Barrett wrote: > Please review this change which adds a VMGlobal OopStorage object. It > is initially being used instead of the conditional JVMCI global storage > object, which is being removed. > > Looking for reviewers from all of gc, runtime, and compiler. > > To keep things simple for now, this new storage object is (optionally) > included in the processing done by SystemDictionary::oops_do. Most > existing storage processors use that mechanism. For most processors, > that's consistent with how JNI global handles are processed. ZGC uses > a different approach, and provides enough infrastructure that it was > easy to process this new storage object in a way that is consistent > with ZGC's handling of JNI globals. > > This change does not attempt to address the problems around changing > the set of OopStorage instance described by JDK-8227054. This change > was a useful bit of preparation for the work I'm doing on JDK-8227054, > so I split it out as a separate change. > > This change also includes a minimal update of Shenandoah, using the > processing of the new storage object by SystemDictionary::oops_do. > It looks like Shenandoah is conceptually similar to ZGC in it's > handling of JNI globals, and should be able to handle this new storage > object similarly, but I'm leaving that to the Shenandoah developers. > You might want to wait for JDK-8227054 though. > > Note that neither ZGC nor Shenandoah processed the former conditional > JVMCI global storage. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8227653 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227653/open.00/ > > Testing: > mach5 tier1-5 > Hi Kim, Looks good to me as well. 
Thanks, Lois From vladimir.kozlov at oracle.com Tue Jul 16 15:52:01 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 16 Jul 2019 08:52:01 -0700 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: References: Message-ID: Here goes my work for JVMCI oops handling ;) Kim, after this change the only use for JVMCI::_object_handles is JVMCI::is_global_handle() which is only used in assert() in deleteGlobalHandle() in jvmciCompilerToVM.cpp. Do we really need it there? May be we should remove this use too. Thanks, Vladimir On 7/15/19 6:18 PM, Kim Barrett wrote: > Please review this change which adds a VMGlobal OopStorage object. It > is initially being used instead of the conditional JVMCI global storage > object, which is being removed. > > Looking for reviewers from all of gc, runtime, and compiler. > > To keep things simple for now, this new storage object is (optionally) > included in the processing done by SystemDictionary::oops_do. Most > existing storage processors use that mechanism. For most processors, > that's consistent with how JNI global handles are processed. ZGC uses > a different approach, and provides enough infrastructure that it was > easy to process this new storage object in a way that is consistent > with ZGC's handling of JNI globals. > > This change does not attempt to address the problems around changing > the set of OopStorage instance described by JDK-8227054. This change > was a useful bit of preparation for the work I'm doing on JDK-8227054, > so I split it out as a separate change. > > This change also includes a minimal update of Shenandoah, using the > processing of the new storage object by SystemDictionary::oops_do. > It looks like Shenandoah is conceptually similar to ZGC in it's > handling of JNI globals, and should be able to handle this new storage > object similarly, but I'm leaving that to the Shenandoah developers. > You might want to wait for JDK-8227054 though. 
> > Note that neither ZGC nor Shenandoah processed the former conditional > JVMCI global storage. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8227653 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227653/open.00/ > > Testing: > mach5 tier1-5 > From dean.long at oracle.com Tue Jul 16 17:51:08 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 16 Jul 2019 10:51:08 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> Message-ID: <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> On 7/15/19 2:10 AM, Erik Österlund wrote: > Hi Dean, > > On 2019-07-12 23:50, dean.long at oracle.com wrote: >> On 7/11/19 1:13 PM, Erik Österlund wrote: >>> Hi Dean, >>> >>> On 2019-07-11 15:29, dean.long at oracle.com wrote: >>>> On 7/11/19 6:53 AM, Erik Österlund wrote: >>>>> Hi Dean, >>>>> >>>>> On 2019-07-11 00:42, dean.long at oracle.com wrote: >>>>>> On 7/10/19 1:28 AM, Erik Österlund wrote: >>>>>>> Hi Dean, >>>>>>> >>>>>>> On 2019-07-09 23:31, dean.long at oracle.com wrote: >>>>>>>> On 7/1/19 6:12 AM, Erik Österlund wrote: >>>>>>>>> For ZGC I moved OSR nmethod unlinking to before the unlinking >>>>>>>>> (where unlinking code belongs), instead of after the handshake >>>>>>>>> (intended for deleting things safely unlinked). >>>>>>>>> Strictly speaking, moving the OSR nmethod unlinking removes >>>>>>>>> the racing between make_not_entrant and make_unloaded, but I >>>>>>>>> still want the monotonicity guards to make this code more robust. >>>>>>>> >>>>>>>> I see where you added OSR nmethod unlinking, but not where you >>>>>>>> removed it, so it's not obvious it was a "move". >>>>>>> >>>>>>> Sorry, bad wording on my part. I added OSR nmethod unlinking >>>>>>> before the global handshake is run. 
After the handshake, we call >>>>>>> make_unloaded() on the same is_unloading() nmethods. That >>>>>>> function "tries" to unlink the OSR nmethod, but will just not do >>>>>>> it as it's already unlinked at that point. So in a way, I didn't >>>>>>> remove the call to unlink the OSR nmethod there, it just won't >>>>>>> do anything. I preferred structuring it that way instead of >>>>>>> trying to optimize away the call to unlink the OSR nmethod when >>>>>>> making it unloaded, but only for the concurrent case. It seemed >>>>>>> to introduce more conditional magic than it was worth. >>>>>>> So in practice, the unlinking of OSR nmethods has moved for >>>>>>> concurrent unloading to before the handshake. >>>>>>> >>>>>> >>>>>> OK, in that case, could you add a little information to the >>>>>> "Invalidate the osr nmethod only once" comment so that in the >>>>>> future someone isn't tempted to remove the code as redundant? >>>>> >>>>> Sure. >>>>> >>>> >>>> I meant the one in zNMethod.cpp :-) >>> >>> Okay, will put another comment in there once we agree on a direction >>> on the next point. >>> >>>> >>>>>>>> Would it make sense for nmethod::unlink_from_method() to do the >>>>>>>> OSR unlinking, or to assert that it has already been done? >>>>>>> >>>>>>> An earlier version of this patch tried to do that. It is indeed >>>>>>> possible. But it requires changing lock ranks of the OSR nmethod >>>>>>> lock to special - 1 and moving around a bunch of code as this >>>>>>> function is also called both when making nmethods not_entrant, >>>>>>> zombie, and unlinking them in that case. For the first two, we >>>>>>> conditionally unlink the nmethod based on the current state >>>>>>> (which is the old state), whereas when I move it, the current >>>>>>> state is the new state. So I had to change things around a bit >>>>>>> more to figure out the right condition when to unlink it that >>>>>>> works for all 3 callers. 
In the end, since this is going to 13, >>>>>>> I thought it's more important to minimize the risk as much as I >>>>>>> can, and leave such refactorings to 14. >>>>>>> >>>>>> >>>>>> OK. >>>>>> >>>>>>>> The new bailout in the middle of >>>>>>>> nmethod::make_not_entrant_or_zombie() worries me a little, >>>>>>>> because the code up to that point has side-effects, and we >>>>>>>> could be bailing out in an unexpected state. >>>>>>> >>>>>>> Correct. In an earlier version of this patch, I moved the >>>>>>> transition to before the side effects. But a bunch of code is >>>>>>> using the current nmethod state to determine what to do, and >>>>>>> that current state changed from the old to the new state. In >>>>>>> particular, we conditionally patch in the jump based on the >>>>>>> current (old) state, and we conditionally increment decompile >>>>>>> count based on the current (old) state. So I ended up having to >>>>>>> rewrite more code than I wanted to for a patch going into 13, >>>>>>> and convince myself that I had not implicitly messed something >>>>>>> up. It felt safer to reason about the 3 side effects up until >>>>>>> the transitioning point: >>>>>>> >>>>>>> 1) Patching in the jump into VEP. Any state more dead than the >>>>>>> current transition, would still want that jump to be there. >>>>>>> 2) Incrementing decompile count when making it not_entrant. >>>>>>> Seems in order to do regardless, as we had an actual request to >>>>>>> make the nmethod not entrant because it was bad somehow. >>>>>>> 3) Marking it as seen on stack when making it not_entrant. This >>>>>>> will only make can_convert_to_zombie start returning false, >>>>>>> which is harmless in general. Also, as both transitions to >>>>>>> zombie and not_entrant are performed under the Patching_lock, >>>>>>> the only possible race is with make_unloaded. And those nmethods >>>>>>> are is_unloading(), which also makes can_convert_to_zombie >>>>>>> return false (in a not racy fashion). 
So it would essentially >>>>>>> make no observable difference to any single call to >>>>>>> can_convert_to_zombie(). >>>>>>> >>>>>>> In summary, #1 and #3 don't really observably change the state >>>>>>> of the system, and #2 is completely harmless and probably >>>>>>> wanted. Therefore I found that moving these things around and >>>>>>> finding out where we use the current state(), as well as >>>>>>> rewriting it, seemed like a slightly scarier change for 13 to me. >>>>>>> >>>>>>> So in general, there is some refactoring that could be done (and >>>>>>> I have tried it) to make this nicer. But I want to minimize the >>>>>>> risk for 13 as much as possible, and perform any risky >>>>>>> refactorings in 14 instead. >>>>>>> If your risk assessment is different and you would prefer moving >>>>>>> the transition higher up (and flipping some conditions) instead, >>>>>>> I am totally up for that too though, and I do see where you are >>>>>>> coming from. >>>>>>> >>>>>> >>>>>> So if we fail, it means that we lost a race to a "deader" state, >>>>>> and assuming this is the only path to the deader state, wouldn't >>>>>> that also mean that #1, #2, and #3 would have already been done >>>>>> by the winning thread? If so, that makes me feel better about >>>>>> bailing out in the middle, but I'm still not 100% convinced, >>>>>> unless we can assert that 1-3 already happened. Do you have a >>>>>> prototype of what moving the transition higher up would look like? >>>>> >>>>> As a matter of fact I do. Here is a webrev: >>>>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.01/ >>>>> >>>>> I kind of like it. What do you think? >>>>> >>>> >>>> Now the code after the transition that says "Must happen before >>>> state change" worries me. >>> >>> Yes indeed. This is why I was hesitant to move the transition up. It >>> moves past 3 things that implicitly depend on the current state. >>> This one is extra scary. 
It actually introduces a race condition >>> that could crash the VM (because can_convert_to_zombie() may observe >>> an nmethod that just turned not_entrant, without being marked on >>> stack). >>> >>> I think this shows (IMO) that trying to move the transition up has 3 >>> problems, and this one is particularly hard to dodge. I think it >>> really has to be before the transition. >>> >>> Would you agree now that keeping the transition where it was is less >>> risky (as I did originally) >> >> Yes. >> >>> and convincing ourselves that the 3 "side effects" are not really >>> observable side effects in the system, as I reasoned about earlier? >>> >> >> yes, but I'm hoping we can do more than just reason, like adding >> asserts.? More below... >> >>> If not, I can try to move the mark-on-stack up above the transition. >>> >>>> Can you remind me again what kind of race can make the state >>>> transition fail here?? Did you happen to draw a state diagram while >>>> learning this code? :-) >>> >>> Yes indeed. Would you like the long story or the short story? Here >>> is the short story: the only known race is between one thread making >>> an nmethod not_entrant and the GC thread making it unloaded. That >>> make_not_entrant is the only transition that can fail. Previously I >>> relied on there never existing any concurrent calls to >>> make_not_entrant() and make_unloaded(). The OSR nmethod was caught >>> as a special case (isn't it always...) where this could happen, >>> violating monotonicity. But I think it feels safer to enforce the >>> monotonicity of transitions in the actual code that performs the >>> transitions, instead of relying on knowledge of the relationships >>> between all state transitioning calls, implicitly ensuring >>> monotonicity. >>> >> >> Can we enforce in_use --> not_entrant --> unloaded --> zombie, and >> not allow jumps or skipped states?? Then we can assert that cleanup >> from a less-dead state has already been done.? 
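The idea discussed here, enforcing monotonicity in the actual code that performs the transitions rather than relying on knowledge of all callers, can be sketched as a compare-and-swap loop that only ever moves the state toward a "deader" value and reports failure when another thread got there first. This is an illustrative model only; the enum values and function below are hypothetical and do not reproduce HotSpot's real nmethod states, state values, or locking:

```cpp
#include <atomic>
#include <cassert>

// Illustrative ordering: a larger value means a "deader" state. These are
// NOT HotSpot's actual nmethod state constants; they only model the idea.
enum State { in_use = 0, not_entrant = 1, unloaded = 2, zombie = 3 };

// Attempt a monotonic transition. Returns false (the caller bails out) when
// another thread already reached an equal-or-deader state, so the state can
// never move "backwards" even under racing make_* calls.
inline bool try_transition(std::atomic<int>& state, State new_state) {
  int cur = state.load();
  while (cur < new_state) {
    if (state.compare_exchange_weak(cur, new_state)) {
      return true;  // we performed the step toward the deader state
    }
    // CAS failure reloaded 'cur'; re-check the monotonicity guard and retry
  }
  return false;  // lost the race to an equal-or-deader state
}
```

In this model a racing make_not_entrant and make_unloaded can interleave in any order, yet the observed state sequence is always monotonically non-decreasing, which is the robustness property the guard is meant to provide.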
So if make_not_entrant >> failed, it would assert that all the cleanup that would have been >> done by a successful make_not_entrant has already been done. > > I'm afraid not. The state machine skips states by design. For example, > the set of {not_installed, in_use, not_entrant} states are alive and > {unloaded, zombie} are not alive. Any nmethod in an "alive" state may > transition to the unloaded state due to an oop dying. Actually > strictly speaking, only {in_use, not_entrant} may become unloaded, as > nmethods are made in_use within the same thread_in_vm critical section > that they finalized oops in the nmethod, and hence could not yet have > died. Similarly, the "unloaded" state is reserved for unloading by the > GC. And not all nmethods that become zombie were unloaded by the GC. I > think changing so that all these transitions are taken for all > nmethods, sounds like it will break invariants and be quite dangerous. > > Note though that what all dead (!is_alive()) states have in common is > that they can never be called or be on-stack; by the time an nmethod > enters a dead state (unloaded or zombie), its inline caches and all > other stale pointers to the nmethod have been cleaned out, and either > a safepoint or global thread-local handshake with cross-modifying > fences has finished, without finding activation records on-stack. That > is the unwritten definition of being !is_alive() (e.g. unloaded or > zombie). Therefore, if a transition to not_entrant fails due to > entering a more dead state (unloaded or zombie), then that implies the > following: > 1) The jump at VEP is no longer needed because the jump is no longer > reachable code, as another thread had enough knowledge to determine it > was dead (all references to it have been unlinked, followed by a > handshake/safepoint with cross-modifying fencing and stack scanning). > So whether another transition performed this step or not is > unimportant. 
Note that for example make_unloaded() does not patch in a > jump at VEP, despite transitioning nmethods directly from in_use to > unloaded, for this exact reason. By the time the nmethod is killed, > that jump better be dead code already. It's only needed for the > not_entrant state, where the nmethod may still alive but we want to > stop calls into it. > 2) The mark_as_seen_on_stack() prevents the sweeper from transitioning > not_entrant() nmethods to zombie until it's no longer seen on stack, > so it doesn't accidentally kill not_entrant nmethods. But if the > transition failed, it's already dead, and the only path that looks at > that value, is not taken (looking for not_entrant nmethods that can be > made zombie). Again, it is totally fine that another thread killing > the nmethod for a different reason did not perform this step. > 3) The inc_decompile_count() is still valid, as the caller had a valid > reason to deopt the nmethod, regardless of whether there were multiple > reasons for discarding the nmethod or not. > > So in summary, if a make_not_entrant attempt fails due to a > make_unloaded (or hypothetically make_zombie even though that race is > impossible) attempt, then the presence or lack of presence of the VEP > jump and the mark-on-stack value no longer matter, as they are > properties that only matter to is_alive() nmethods. And > inc_decompile_count is fine to do as well as there was a valid deopt > reason for the make_not_entrant() caller. > > Would it feel better if I wrote this reasoning down in comments in > make_not_entrant_or_zombie? > Yes, I think any additional clarity in this area would be helpful. Back to the make_not_entrant / make_unloaded race.? If make_not_entrant bails out half-way through because make_unloaded won the race, doesn't that mean that make_unloaded needs to have already done all the work that make_not_entrant is not doing? unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, unregister_nmethod, etc. 
dl > Thanks, > /Erik > >> dl >> >>> Thanks, >>> /Erik >>> >>>> dl >>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>>> dl >>>>>> >>>>>>> BTW, I have tested this change through hs-tier1-7, and it looks >>>>>>> good. >>>>>>> >>>>>>> Thanks a lot Dean for reviewing this code. >>>>>>> >>>>>>> /Erik >>>>>>> >>>>>>>> dl >>>>>>>> >>>>>>> >>>>>> >>>> >> > From igor.ignatyev at oracle.com Tue Jul 16 18:49:08 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 16 Jul 2019 11:49:08 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> Message-ID: <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> Hi Severin, I don't think that tests (or test libraries for that matter) should be responsible for setting correct PATH value, it should be a part of host configuration procedure (tests can/should check that all required bins are available though). in other words, I'd prefer if you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and TestJFREvents. the rest looks good to me. Thanks, -- Igor > On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: > > Hi, > > I believe I still need a *R*eviewer for this. Any takers? > > Thanks, > Severin > > On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >> Hi Severin, >> >> The change looks good to me. Thank you for adding support for Podman >> container technology. >> >> Testing: I ran both HotSpot and JDK container tests with your patch; >> tests executed on Oracle Linux 7.6 using default container engine (Docker): >> >> test/hotspot/jtreg/containers/ AND >> test/jdk/jdk/internal/platform/docker/ >> >> All PASS >> >> >> Thanks, >> >> Misha >> >> >> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>> Hi, >>> >>> There is an alternative container engine which is being used by Fedora >>> and RHEL 8, called podman[1]. It's mostly compatible with docker. 
It >>> looks like OpenJDK docker tests can be made podman compatible with a >>> few little tweaks. One "interesting" one is to not assert "Successfully >>> built" in the build output but only rely on the exit code, which seems >>> to be OK for my testing. Interestingly the test would be skipped in >>> that case. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>> >>> Adjustments I've done: >>> * Don't assert "Successfully built" in image build output[2]. >>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>> to work which is in /usr/sbin on Fedora >>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>> to be equal to the previous value. I've found those counters to be >>> slowly increasing, which made the tests unreliable. >>> >>> Testing: >>> >>> Running docker tests with docker as engine. Did the same with podman as >>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>> passed (non-trivially). >>> >>> Thoughts? 
>>> >>> Thanks, >>> Severin >>> >>> [1] https://podman.io/ >>> [2] Image builds with podman look >>> like ("COMMIT" over "Successfully built"): >>> STEP 1: FROM fedora:29 >>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>> STEP 5: COMMIT fedora-metrics-11 >>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>> > From mikhailo.seledtsov at oracle.com Tue Jul 16 20:23:32 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Tue, 16 Jul 2019 13:23:32 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> Message-ID: <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> Hi Igor, ?? In both cases the environment variable is set for the Docker/Podman container process, not the host system. This will not affect the host system in any way. The docker process has its own namespace for environment variables. Does this alleviate your concerns? Thank you, Misha On 7/16/19 11:49 AM, Igor Ignatyev wrote: > Hi Severin, > > I don't think that tests (or test libraries for that matter) should be responsible for setting correct PATH value, it should be a part of host configuration procedure (tests can/should check that all required bins are available though). in other words, I'd prefer if you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and TestJFREvents. the rest looks good to me. 
> > Thanks, > -- Igor > >> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >> >> Hi, >> >> I believe I still need a *R*eviewer for this. Any takers? >> >> Thanks, >> Severin >> >> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>> Hi Severin, >>> >>> The change looks good to me. Thank you for adding support for Podman >>> container technology. >>> >>> Testing: I ran both HotSpot and JDK container tests with your patch; >>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>> >>> test/hotspot/jtreg/containers/ AND >>> test/jdk/jdk/internal/platform/docker/ >>> >>> All PASS >>> >>> >>> Thanks, >>> >>> Misha >>> >>> >>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>> Hi, >>>> >>>> There is an alternative container engine which is being used by Fedora >>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>> looks like OpenJDK docker tests can be made podman compatible with a >>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>> built" in the build output but only rely on the exit code, which seems >>>> to be OK for my testing. Interestingly the test would be skipped in >>>> that case. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>> >>>> Adjustments I've done: >>>> * Don't assert "Successfully built" in image build output[2]. >>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>> to work which is in /usr/sbin on Fedora >>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>> to be equal to the previous value. I've found those counters to be >>>> slowly increasing, which made the tests unreliable. >>>> >>>> Testing: >>>> >>>> Running docker tests with docker as engine. Did the same with podman as >>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>> passed (non-trivially). >>>> >>>> Thoughts? 
>>>> >>>> Thanks, >>>> Severin >>>> >>>> [1] https://podman.io/ >>>> [2] Image builds with podman look >>>> like ("COMMIT" over "Successfully built"): >>>> STEP 1: FROM fedora:29 >>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>> STEP 5: COMMIT fedora-metrics-11 >>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>>> From igor.ignatyev at oracle.com Tue Jul 16 20:32:43 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 16 Jul 2019 13:32:43 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> Message-ID: <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> Hi Misha, I understand that it doesn't alter the host system. my concern is that we move problem of host-configuration into tests. let's say tomorrow a new container engine will require something from /usr/local/sbin, or /usr/local/Cellar/docker/bin on another OS, or, god forbid, C:\Program Files(x86)\podman\bin. it has nothing to do w/ tests, it's a question of configuring a host, as I said, should be handled at another level by "scripts" run (once) prior test execution. -- Igor > On Jul 16, 2019, at 1:23 PM, mikhailo.seledtsov at oracle.com wrote: > > Hi Igor, > > In both cases the environment variable is set for the Docker/Podman container process, not the host system. 
This will not affect the host system in any way. The docker process has its own namespace for environment variables. Does this alleviate your concerns? > > > Thank you, > > Misha > > On 7/16/19 11:49 AM, Igor Ignatyev wrote: >> Hi Severin, >> >> I don't think that tests (or test libraries for that matter) should be responsible for setting correct PATH value, it should be a part of host configuration procedure (tests can/should check that all required bins are available though). in other words, I'd prefer if you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and TestJFREvents. the rest looks good to me. >> >> Thanks, >> -- Igor >> >>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >>> >>> Hi, >>> >>> I believe I still need a *R*eviewer for this. Any takers? >>> >>> Thanks, >>> Severin >>> >>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>>> Hi Severin, >>>> >>>> The change looks good to me. Thank you for adding support for Podman >>>> container technology. >>>> >>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>>> >>>> test/hotspot/jtreg/containers/ AND >>>> test/jdk/jdk/internal/platform/docker/ >>>> >>>> All PASS >>>> >>>> >>>> Thanks, >>>> >>>> Misha >>>> >>>> >>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>> Hi, >>>>> >>>>> There is an alternative container engine which is being used by Fedora >>>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>>> looks like OpenJDK docker tests can be made podman compatible with a >>>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>>> built" in the build output but only rely on the exit code, which seems >>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>> that case. 
>>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>> >>>>> Adjustments I've done: >>>>> * Don't assert "Successfully built" in image build output[2]. >>>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>>> to work which is in /usr/sbin on Fedora >>>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>>> to be equal to the previous value. I've found those counters to be >>>>> slowly increasing, which made the tests unreliable. >>>>> >>>>> Testing: >>>>> >>>>> Running docker tests with docker as engine. Did the same with podman as >>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>> passed (non-trivially). >>>>> >>>>> Thoughts? >>>>> >>>>> Thanks, >>>>> Severin >>>>> >>>>> [1] https://podman.io/ >>>>> [2] Image builds with podman look >>>>> like ("COMMIT" over "Successfully built"): >>>>> STEP 1: FROM fedora:29 >>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>> STEP 5: COMMIT fedora-metrics-11 >>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>>>> From sgehwolf at redhat.com Tue Jul 16 21:01:00 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 16 Jul 2019 23:01:00 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> 
<47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> Message-ID: <73012440f0b8ea7c603ba1685717efc4734dd994.camel@redhat.com> Hi Igor, I understand the concern and I guess I could remove it and locally install a wrapper in /bin or /usr/bin for podman which adds /usr/sbin to the path. On the other hand... This seems to be an issue of code being run through jtreg. Looking at the jtr files I see this: ----------rerun:(21/1545)*---------- cd /disk/openjdk/upstream-sources/openjdk-head/JTwork/scratch && \\ DISPLAY=:0 \\ HOME=/home/sgehwolf \\ LANG=en_US.UTF-8 \\ PATH=/bin:/usr/bin \\ XMODIFIERS=@im=ibus \\ [...] So jtreg reduces the host's PATH to /bin:/usr/bin, which is insufficient for the podman case. The tag-spec docs[1] for jtreg mention for "shell" tests that it sets the PATH to the above settings. This affects Java tests as well it seems. ProcessBuilder outside jtreg has a sensible PATH as set up by the system, FWIW. So while my system is properly set up, jtreg interferes and renders this necessary. Any suggestions as to how to convince jtreg to use the host's PATH setting? Thanks, Severin [1] http://openjdk.java.net/jtreg/tag-spec.html On Tue, 2019-07-16 at 13:32 -0700, Igor Ignatyev wrote: > Hi Misha, > > I understand that it doesn't alter the host system. my concern is > that we move problem of host-configuration into tests. let's say > tomorrow a new container engine will require something from > /usr/local/sbin, or /usr/local/Cellar/docker/bin on another OS, or, > god forbid, C:\Program Files(x86)\podman\bin. it has nothing to do w/ > tests, it's a question of configuring a host, as I said, should be > handled at another level by "scripts" run (once) prior test > execution. 
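The per-process PATH behavior being discussed here — a child process gets its own copy of the environment, so widening PATH for the container command never touches the host — can be sketched in plain shell. The values below mirror the jtr excerpt (jtreg's trimmed /bin:/usr/bin default plus the /usr/sbin the patch appends); none of this is the actual DockerTestUtils code:

```shell
# The parent shell keeps whatever PATH it started with:
parent_path="$PATH"

# Start a child with an explicitly widened PATH, the way the test
# library would extend it for the podman invocation only:
child_path=$(env PATH="/bin:/usr/bin:/usr/sbin" sh -c 'echo "$PATH"')

echo "child  PATH: $child_path"
echo "parent PATH: $parent_path"   # unchanged by the child's override
```

The same holds for ProcessBuilder's environment() map: mutating it configures the child about to be spawned, not the JVM that spawns it.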
> > -- Igor > > > On Jul 16, 2019, at 1:23 PM, mikhailo.seledtsov at oracle.com wrote: > > > > Hi Igor, > > > > In both cases the environment variable is set for the > > Docker/Podman container process, not the host system. This will not > > affect the host system in any way. The docker process has its own > > namespace for environment variables. Does this alleviate your > > concerns? > > > > > > Thank you, > > > > Misha > > > > On 7/16/19 11:49 AM, Igor Ignatyev wrote: > > > Hi Severin, > > > > > > I don't think that tests (or test libraries for that matter) > > > should be responsible for setting correct PATH value, it should > > > be a part of host configuration procedure (tests can/should check > > > that all required bins are available though). in other words, I'd > > > prefer if you remove 'env.put("PATH", ...)' lines from both > > > DockerTestUtils and TestJFREvents. the rest looks good to me. > > > > > > Thanks, > > > -- Igor > > > > > > > On Jul 16, 2019, at 5:36 AM, Severin Gehwolf < > > > > sgehwolf at redhat.com> wrote: > > > > > > > > Hi, > > > > > > > > I believe I still need a *R*eviewer for this. Any takers? > > > > > > > > Thanks, > > > > Severin > > > > > > > > On Fri, 2019-07-12 at 15:19 -0700, > > > > mikhailo.seledtsov at oracle.com wrote: > > > > > Hi Severin, > > > > > > > > > > The change looks good to me. Thank you for adding support > > > > > for Podman > > > > > container technology. 
> > > > > > > > > > Testing: I ran both HotSpot and JDK container tests with your > > > > > patch; > > > > > tests executed on Oracle Linux 7.6 using default container > > > > > engine (Docker): > > > > > > > > > > test/hotspot/jtreg/containers/ AND > > > > > test/jdk/jdk/internal/platform/docker/ > > > > > > > > > > All PASS > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Misha > > > > > > > > > > > > > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > > > > > Hi, > > > > > > > > > > > > There is an alternative container engine which is being > > > > > > used by Fedora > > > > > > and RHEL 8, called podman[1]. It's mostly compatible with > > > > > > docker. It > > > > > > looks like OpenJDK docker tests can be made podman > > > > > > compatible with a > > > > > > few little tweaks. One "interesting" one is to not assert > > > > > > "Successfully > > > > > > built" in the build output but only rely on the exit code, > > > > > > which seems > > > > > > to be OK for my testing. Interestingly the test would be > > > > > > skipped in > > > > > > that case. > > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > > > > > webrev: > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > > > > > > > > > Adjustments I've done: > > > > > > * Don't assert "Successfully built" in image build > > > > > > output[2]. > > > > > > * Add /usr/sbin to PATH as the podman binary relies on > > > > > > iptables for it > > > > > > to work which is in /usr/sbin on Fedora > > > > > > * Allow for Metrics.getCpuSystemUsage() and > > > > > > Metrics.getCpuUserUsage() > > > > > > to be equal to the previous value. I've found those > > > > > > counters to be > > > > > > slowly increasing, which made the tests unreliable. > > > > > > > > > > > > Testing: > > > > > > > > > > > > Running docker tests with docker as engine. 
Did the same > > > > > > with podman as > > > > > > engine via -Djdk.test.docker.command=podman on Linux > > > > > > x86_64. Both > > > > > > passed (non-trivially). > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > Thanks, > > > > > > Severin > > > > > > > > > > > > [1] https://podman.io/ > > > > > > [2] Image builds with podman look > > > > > > like ("COMMIT" over "Successfully built"): > > > > > > STEP 1: FROM fedora:29 > > > > > > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf > > > > > > clean all > > > > > > --> Using cache > > > > > > 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690 > > > > > > afd9d > > > > > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > > > > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bd > > > > > > b49a8 > > > > > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt > > > > > > --add-modules java.base --add-exports > > > > > > java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > > > > > STEP 5: COMMIT fedora-metrics-11 > > > > > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e > > > > > > 8cd7e > > > > > > From jonathan.gibbons at oracle.com Tue Jul 16 21:11:13 2019 From: jonathan.gibbons at oracle.com (Jonathan Gibbons) Date: Tue, 16 Jul 2019 14:11:13 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <73012440f0b8ea7c603ba1685717efc4734dd994.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> <73012440f0b8ea7c603ba1685717efc4734dd994.camel@redhat.com> Message-ID: <6d0d25c5-9ca0-608e-93eb-f83093114b76@oracle.com> Severin, This might be a reasonable update to jtreg, to allow /usr/sbin on the path on Unix-like systems.? 
The intent of jtreg is to protect tests from random crufty stuff on the PATH, and /usr/sbin is not in that category. I've created CODETOOLS-7902505: Consider allowing /usr/sbin on $PATH https://bugs.openjdk.java.net/browse/CODETOOLS-7902505 The short-term workaround is to use the jtreg command-line option -e:PATH which should override the default setting for PATH and pass through whatever you have set for $PATH. -- Jon On 07/16/2019 02:01 PM, Severin Gehwolf wrote: > Hi Igor, > > I understand the concern and I guess I could remove it and locally > install a wrapper in /bin or /usr/bin for podman which adds /usr/sbin > to the path. On the other hand... > > This seems to be an issue of code being run through jtreg. Looking at > the jtr files I see this: > > ----------rerun:(21/1545)*---------- > cd /disk/openjdk/upstream-sources/openjdk-head/JTwork/scratch && \\ > DISPLAY=:0 \\ > HOME=/home/sgehwolf \\ > LANG=en_US.UTF-8 \\ > PATH=/bin:/usr/bin \\ > XMODIFIERS=@im=ibus \\ > [...] > > So jtreg reduces the host's PATH to /bin:/usr/bin, which is > insufficient for the podman case. The tag-spec docs[1] for jtreg > mention for "shell" tests that it sets the PATH to the above settings. > This affects Java tests as well it seems. > > ProcessBuilder outside jtreg has a sensible PATH as set up by the > system, FWIW. > > So while my system is properly set up, jtreg interferes and renders > this necessary. Any suggestions as to how to convince jtreg to use the > host's PATH setting? > > Thanks, > Severin > > [1] http://openjdk.java.net/jtreg/tag-spec.html > > On Tue, 2019-07-16 at 13:32 -0700, Igor Ignatyev wrote: >> Hi Misha, >> >> I understand that it doesn't alter the host system. my concern is >> that we move problem of host-configuration into tests. let's say >> tomorrow a new container engine will require something from >> /usr/local/sbin, or /usr/local/Cellar/docker/bin on another OS, or, >> god forbid, C:\Program Files(x86)\podman\bin. 
it has nothing to do w/ >> tests, it's a question of configuring a host, as I said, should be >> handled at another level by "scripts" run (once) prior test >> execution. >> >> -- Igor >> >>> On Jul 16, 2019, at 1:23 PM, mikhailo.seledtsov at oracle.com wrote: >>> >>> Hi Igor, >>> >>> In both cases the environment variable is set for the >>> Docker/Podman container process, not the host system. This will not >>> affect the host system in any way. The docker process has its own >>> namespace for environment variables. Does this alleviate your >>> concerns? >>> >>> >>> Thank you, >>> >>> Misha >>> >>> On 7/16/19 11:49 AM, Igor Ignatyev wrote: >>>> Hi Severin, >>>> >>>> I don't think that tests (or test libraries for that matter) >>>> should be responsible for setting correct PATH value, it should >>>> be a part of host configuration procedure (tests can/should check >>>> that all required bins are available though). in other words, I'd >>>> prefer if you remove 'env.put("PATH", ...)' lines from both >>>> DockerTestUtils and TestJFREvents. the rest looks good to me. >>>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf < >>>>> sgehwolf at redhat.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I believe I still need a *R*eviewer for this. Any takers? >>>>> >>>>> Thanks, >>>>> Severin >>>>> >>>>> On Fri, 2019-07-12 at 15:19 -0700, >>>>> mikhailo.seledtsov at oracle.com wrote: >>>>>> Hi Severin, >>>>>> >>>>>> The change looks good to me. Thank you for adding support >>>>>> for Podman >>>>>> container technology. 
>>>>>> >>>>>> Testing: I ran both HotSpot and JDK container tests with your >>>>>> patch; >>>>>> tests executed on Oracle Linux 7.6 using default container >>>>>> engine (Docker): >>>>>> >>>>>> test/hotspot/jtreg/containers/ AND >>>>>> test/jdk/jdk/internal/platform/docker/ >>>>>> >>>>>> All PASS >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Misha >>>>>> >>>>>> >>>>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>>>> Hi, >>>>>>> >>>>>>> There is an alternative container engine which is being >>>>>>> used by Fedora >>>>>>> and RHEL 8, called podman[1]. It's mostly compatible with >>>>>>> docker. It >>>>>>> looks like OpenJDK docker tests can be made podman >>>>>>> compatible with a >>>>>>> few little tweaks. One "interesting" one is to not assert >>>>>>> "Successfully >>>>>>> built" in the build output but only rely on the exit code, >>>>>>> which seems >>>>>>> to be OK for my testing. Interestingly the test would be >>>>>>> skipped in >>>>>>> that case. >>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>>>> >>>>>>> Adjustments I've done: >>>>>>> * Don't assert "Successfully built" in image build >>>>>>> output[2]. >>>>>>> * Add /usr/sbin to PATH as the podman binary relies on >>>>>>> iptables for it >>>>>>> to work which is in /usr/sbin on Fedora >>>>>>> * Allow for Metrics.getCpuSystemUsage() and >>>>>>> Metrics.getCpuUserUsage() >>>>>>> to be equal to the previous value. I've found those >>>>>>> counters to be >>>>>>> slowly increasing, which made the tests unreliable. >>>>>>> >>>>>>> Testing: >>>>>>> >>>>>>> Running docker tests with docker as engine. Did the same >>>>>>> with podman as >>>>>>> engine via -Djdk.test.docker.command=podman on Linux >>>>>>> x86_64. Both >>>>>>> passed (non-trivially). >>>>>>> >>>>>>> Thoughts? 
>>>>>>> >>>>>>> Thanks, >>>>>>> Severin >>>>>>> >>>>>>> [1] https://podman.io/ >>>>>>> [2] Image builds with podman look >>>>>>> like ("COMMIT" over "Successfully built"): >>>>>>> STEP 1: FROM fedora:29 >>>>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf >>>>>>> clean all >>>>>>> --> Using cache >>>>>>> 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690 >>>>>>> afd9d >>>>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bd >>>>>>> b49a8 >>>>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt >>>>>>> --add-modules java.base --add-exports >>>>>>> java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>>>> STEP 5: COMMIT fedora-metrics-11 >>>>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e >>>>>>> 8cd7e >>>>>>> From sgehwolf at redhat.com Tue Jul 16 21:21:33 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 16 Jul 2019 23:21:33 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <6d0d25c5-9ca0-608e-93eb-f83093114b76@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> <73012440f0b8ea7c603ba1685717efc4734dd994.camel@redhat.com> <6d0d25c5-9ca0-608e-93eb-f83093114b76@oracle.com> Message-ID: <0678d929475cde04c508da9c15ce2b9fcc36a738.camel@redhat.com> On Tue, 2019-07-16 at 14:11 -0700, Jonathan Gibbons wrote: > Severin, > > This might be a reasonable update to jtreg, to allow /usr/sbin on the path > on Unix-like systems. The intent of jtreg is to protect tests from random > crufty stuff on the PATH, and /usr/sbin is not in that category. 
> > I've created > CODETOOLS-7902505: Consider allowing /usr/sbin on $PATH > https://bugs.openjdk.java.net/browse/CODETOOLS-7902505 > > The short-term workaround is to use the jtreg command-line option > > -e:PATH > > which should override the default settign for PATH and pass through > whatever you have set for $PATH. Thanks, Jon! Cheers, Severin > -- Jon > > On 07/16/2019 02:01 PM, Severin Gehwolf wrote: > > Hi Igor, > > > > I understand the concern and I guess I could remove it and locally > > install a wrapper in /bin or /usr/bin for podman which adds > > /usr/sbin > > to the path. On the other hand... > > > > This seems to be an issue of code being run through jtreg. Looking > > at > > the jtr files I see this: > > > > ----------rerun:(21/1545)*---------- > > cd /disk/openjdk/upstream-sources/openjdk-head/JTwork/scratch && \\ > > DISPLAY=:0 \\ > > HOME=/home/sgehwolf \\ > > LANG=en_US.UTF-8 \\ > > PATH=/bin:/usr/bin \\ > > XMODIFIERS=@im=ibus \\ > > [...] > > > > So jtreg reduces the host's PATH to /bin:/usr/bin, which is > > insufficient for the podman case. The tag-spec docs[1] for jtreg > > mention for "shell" tests that it sets the PATH to the above > > settings. > > This affects Java tests as well it seems. > > > > ProcessBuilder outside jtreg has a sensible PATH as set up by the > > system, FWIW. > > > > So while my system is properly set up, jtreg interferes and renders > > this necessary. Any suggestions as to how to convince jtreg to use > > the > > host's PATH setting? > > > > Thanks, > > Severin > > > > [1] http://openjdk.java.net/jtreg/tag-spec.html > > > > On Tue, 2019-07-16 at 13:32 -0700, Igor Ignatyev wrote: > > > Hi Misha, > > > > > > I understand that it doesn't alter the host system. my concern is > > > that we move problem of host-configuration into tests. 
let's say > > > tomorrow a new container engine will require something from > > > /usr/local/sbin, or /usr/local/Cellar/docker/bin on another OS, > > > or, > > > god forbid, C:\Program Files(x86)\podman\bin. it has nothing to > > > do w/ > > > tests, it's a question of configuring a host, as I said, should > > > be > > > handled at another level by "scripts" run (once) prior test > > > execution. > > > > > > -- Igor > > > > > > > On Jul 16, 2019, at 1:23 PM, mikhailo.seledtsov at oracle.com > > > > wrote: > > > > > > > > Hi Igor, > > > > > > > > In both cases the environment variable is set for the > > > > Docker/Podman container process, not the host system. This will > > > > not > > > > affect the host system in any way. The docker process has its > > > > own > > > > namespace for environment variables. Does this alleviate your > > > > concerns? > > > > > > > > > > > > Thank you, > > > > > > > > Misha > > > > > > > > On 7/16/19 11:49 AM, Igor Ignatyev wrote: > > > > > Hi Severin, > > > > > > > > > > I don't think that tests (or test libraries for that matter) > > > > > should be responsible for setting correct PATH value, it > > > > > should > > > > > be a part of host configuration procedure (tests can/should > > > > > check > > > > > that all required bins are available though). in other words, > > > > > I'd > > > > > prefer if you remove 'env.put("PATH", ...)' lines from both > > > > > DockerTestUtils and TestJFREvents. the rest looks good to me. > > > > > > > > > > Thanks, > > > > > -- Igor > > > > > > > > > > > On Jul 16, 2019, at 5:36 AM, Severin Gehwolf < > > > > > > sgehwolf at redhat.com> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > I believe I still need a *R*eviewer for this. Any takers? > > > > > > > > > > > > Thanks, > > > > > > Severin > > > > > > > > > > > > On Fri, 2019-07-12 at 15:19 -0700, > > > > > > mikhailo.seledtsov at oracle.com wrote: > > > > > > > Hi Severin, > > > > > > > > > > > > > > The change looks good to me. 
Thank you for adding > > > > > > > support > > > > > > > for Podman > > > > > > > container technology. > > > > > > > > > > > > > > Testing: I ran both HotSpot and JDK container tests with > > > > > > > your > > > > > > > patch; > > > > > > > tests executed on Oracle Linux 7.6 using default > > > > > > > container > > > > > > > engine (Docker): > > > > > > > > > > > > > > test/hotspot/jtreg/containers/ AND > > > > > > > test/jdk/jdk/internal/platform/docker/ > > > > > > > > > > > > > > All PASS > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Misha > > > > > > > > > > > > > > > > > > > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > There is an alternative container engine which is being > > > > > > > > used by Fedora > > > > > > > > and RHEL 8, called podman[1]. It's mostly compatible > > > > > > > > with > > > > > > > > docker. It > > > > > > > > looks like OpenJDK docker tests can be made podman > > > > > > > > compatible with a > > > > > > > > few little tweaks. One "interesting" one is to not > > > > > > > > assert > > > > > > > > "Successfully > > > > > > > > built" in the build output but only rely on the exit > > > > > > > > code, > > > > > > > > which seems > > > > > > > > to be OK for my testing. Interestingly the test would > > > > > > > > be > > > > > > > > skipped in > > > > > > > > that case. > > > > > > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > > > > > > > webrev: > > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > > > > > > > > > > > > > Adjustments I've done: > > > > > > > > * Don't assert "Successfully built" in image build > > > > > > > > output[2]. 
> > > > > > > > * Add /usr/sbin to PATH as the podman binary relies > > > > > > > > on > > > > > > > > iptables for it > > > > > > > > to work which is in /usr/sbin on Fedora > > > > > > > > * Allow for Metrics.getCpuSystemUsage() and > > > > > > > > Metrics.getCpuUserUsage() > > > > > > > > to be equal to the previous value. I've found those > > > > > > > > counters to be > > > > > > > > slowly increasing, which made the tests unreliable. > > > > > > > > > > > > > > > > Testing: > > > > > > > > > > > > > > > > Running docker tests with docker as engine. Did the > > > > > > > > same > > > > > > > > with podman as > > > > > > > > engine via -Djdk.test.docker.command=podman on Linux > > > > > > > > x86_64. Both > > > > > > > > passed (non-trivially). > > > > > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Severin > > > > > > > > > > > > > > > > [1] https://podman.io/ > > > > > > > > [2] Image builds with podman look > > > > > > > > like ("COMMIT" over "Successfully built"): > > > > > > > > STEP 1: FROM fedora:29 > > > > > > > > STEP 2: RUN dnf install -y java-11-openjdk-devel > > > > > > > > && dnf > > > > > > > > clean all > > > > > > > > --> Using cache > > > > > > > > 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182 > > > > > > > > a690 > > > > > > > > afd9d > > > > > > > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > > > > > > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574d > > > > > > > > c3bd > > > > > > > > b49a8 > > > > > > > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp > > > > > > > > /opt > > > > > > > > --add-modules java.base --add-exports > > > > > > > > java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > > > > > > > STEP 5: COMMIT fedora-metrics-11 > > > > > > > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91 > > > > > > > > 926e > > > > > > > > 8cd7e > > > > > > > > From mikhailo.seledtsov at oracle.com Tue Jul 16 21:29:01 2019 From: mikhailo.seledtsov 
at oracle.com (mikhailo.seledtsov at oracle.com) Date: Tue, 16 Jul 2019 14:29:01 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <99a03fcd-dd33-52ea-8f43-29c8aa2bcf78@oracle.com> <026F52EE-CD26-4E5F-B8D6-306C5DF358B8@oracle.com> Message-ID: <19774fe1-a49f-1fcf-cbbf-d3f55210d12d@oracle.com> On 7/16/19 1:32 PM, Igor Ignatyev wrote: > Hi Misha, > > I understand that it doesn't alter the host system. my concern is that we move problem of host-configuration into tests. let's say tomorrow a new container engine will require something from /usr/local/sbin, or /usr/local/Cellar/docker/bin on another OS, or, god forbid, C:\Program Files(x86)\podman\bin. it has nothing to do w/ tests, it's a question of configuring a host, as I said, should be handled at another level by "scripts" run (once) prior test execution. OK, it makes sense now. Thank you, Misha > > -- Igor > >> On Jul 16, 2019, at 1:23 PM, mikhailo.seledtsov at oracle.com wrote: >> >> Hi Igor, >> >> In both cases the environment variable is set for the Docker/Podman container process, not the host system. This will not affect the host system in any way. The docker process has its own namespace for environment variables. Does this alleviate your concerns? >> >> >> Thank you, >> >> Misha >> >> On 7/16/19 11:49 AM, Igor Ignatyev wrote: >>> Hi Severin, >>> >>> I don't think that tests (or test libraries for that matter) should be responsible for setting correct PATH value, it should be a part of host configuration procedure (tests can/should check that all required bins are available though). in other words, I'd prefer if you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and TestJFREvents. the rest looks good to me. 
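Igor's alternative — have tests verify up front that the binaries they need are resolvable, instead of patching PATH themselves — could look like the following shell sketch (the helper name and the fail-early policy are illustrative, not the actual test-library code):

```shell
# Check required binaries against the PATH jtreg hands us, failing
# (or skipping the test) early instead of silently fixing PATH.
require_bin() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "required binary '$1' not found on PATH ($PATH)" >&2
        return 1
    }
}

require_bin sh && echo "sh is available"
# A podman run would check the engine and its iptables dependency
# (found in /usr/sbin on Fedora), e.g.:
#   require_bin podman && require_bin iptables
```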
>>> >>> Thanks, >>> -- Igor >>> >>>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >>>> >>>> Hi, >>>> >>>> I believe I still need a *R*eviewer for this. Any takers? >>>> >>>> Thanks, >>>> Severin >>>> >>>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>> Hi Severin, >>>>> >>>>> The change looks good to me. Thank you for adding support for Podman >>>>> container technology. >>>>> >>>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>>>> >>>>> test/hotspot/jtreg/containers/ AND >>>>> test/jdk/jdk/internal/platform/docker/ >>>>> >>>>> All PASS >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Misha >>>>> >>>>> >>>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>>> Hi, >>>>>> >>>>>> There is an alternative container engine which is being used by Fedora >>>>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>>>> looks like OpenJDK docker tests can be made podman compatible with a >>>>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>>>> built" in the build output but only rely on the exit code, which seems >>>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>>> that case. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>>> >>>>>> Adjustments I've done: >>>>>> * Don't assert "Successfully built" in image build output[2]. >>>>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>>>> to work which is in /usr/sbin on Fedora >>>>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>>>> to be equal to the previous value. I've found those counters to be >>>>>> slowly increasing, which made the tests unreliable. >>>>>> >>>>>> Testing: >>>>>> >>>>>> Running docker tests with docker as engine. 
Did the same with podman as >>>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>>> passed (non-trivially). >>>>>> >>>>>> Thoughts? >>>>>> >>>>>> Thanks, >>>>>> Severin >>>>>> >>>>>> [1] https://podman.io/ >>>>>> [2] Image builds with podman look >>>>>> like ("COMMIT" over "Successfully built"): >>>>>> STEP 1: FROM fedora:29 >>>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>>> STEP 5: COMMIT fedora-metrics-11 >>>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>>>>> From igor.ignatyev at oracle.com Tue Jul 16 21:35:08 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 16 Jul 2019 14:35:08 -0700 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> Message-ID: <2F2CE24E-9DDB-489D-9CC6-3296C0149B9A@oracle.com> can I get a review for this patch? http://cr.openjdk.java.net/~iignatyev//8226910/webrev.01/index.html Thanks, -- Igor > On Jul 6, 2019, at 11:50 AM, Igor Ignatyev wrote: > > Hi David, > >> On Jul 6, 2019, at 1:58 AM, David Holmes wrote: >> >> Hi Igor, >> >> On 6/07/2019 1:09 pm, Igor Ignatyev wrote: >>> ping? 
>>> -- Igor >>>> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev wrote: >>>> >>>> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>> 25 lines changed: 18 ins; 3 del; 4 mod; >>>> >>>> Hi all, >>>> >>>> could you please review this small patch which adds JTREG_RUN_PROBLEM_LISTS options to run-test framework? when JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem lists as values of -match: instead of -exclude, which effectively means it will run only problem listed tests. >> >> doc/testing.md >> >> + Set to `true` of `false`. >> >> typo: s/of/or/ > fixed .md, regenerated .html. >> >> Build changes seem okay - I can't attest to the operation of the flag. > > here is how I verified that it does that it supposed to: > > $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" TEST=open/test/hotspot/jtreg/:hotspot_all > lists 53 tests, the same command w/o RUN_PROBLEM_LISTS (or w/ RUN_PROBLEM_LISTS=false) lists 6698 tests. > > $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt > lists 81 tests, the same command w/o RUN_PROBLEM_LISTS lists 6670 tests. > >> >>>> doc/building.html got changed when I ran update-build-docs, I can exclude it from the patch, but it seems it will keep changing every time we run update-build-docs, so I decided to at least bring it up. >> >> Weird it seems to have removed line-breaks in that paragraph. What platform did you build on? > I built on macos. now when I wrote that, I remember pandoc used to produce different results on macos. so I've rerun it on linux on the source w/o my change, and doc/building.html still got changed in the exact same way. 
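For reference, the verification commands quoted above can be rerun as below; the closing quote missing from the second command in the mail is restored here, and the reported test counts (53 and 81 vs. 6698 and 6670) were specific to Igor's tree at the time:

```shell
# List (-l), rather than run, the tests jtreg selects when problem
# lists are applied as -match instead of -exclude:
make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" \
    TEST=open/test/hotspot/jtreg/:hotspot_all

# Same, with an extra problem list merged in:
make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt"
```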
> >> David >> ----- >> >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226910 >>>> webrev: http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>> >>>> Thanks, >>>> -- Igor From dean.long at oracle.com Wed Jul 17 07:17:33 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 17 Jul 2019 00:17:33 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> Message-ID: On 7/16/19 10:51 AM, dean.long at oracle.com wrote: > Back to the make_not_entrant / make_unloaded race.? If > make_not_entrant bails out half-way through because make_unloaded won > the race, doesn't that mean that make_unloaded needs to have already > done all the work that make_not_entrant is not doing? > unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, > unregister_nmethod, etc. What I'm thinking is, what happens if instead of this: 1365 // Change state 1366 if (!try_transition(state)) { 1367 return false; 1368 } we do this: 1365 // Maybe change state 1366 if (!try_transition(state)) { 1367 // fall through 1368 } dl From vladimir.x.ivanov at oracle.com Wed Jul 17 10:26:31 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 17 Jul 2019 13:26:31 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: Revised fix: http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ It turned out the problem is not specific to i2c2i: fast class initialization barriers on nmethod entry trigger the assert as well. 
JNI upcalls (CallStaticMethod) don't have class initialization checks, so it's possible to initiate a JNI upcall from a non-initializing thread and JVM should let it complete. It leads to a busy loop (asserts in debug) between nmethod entry barrier & SharedRuntime::handle_wrong_method until holder class is initialized (possibly infinite if it blocks class initialization). Proposed fix is to keep using c2i, but jump over class initialization barrier right to the argument shuffling logic on verified entry when coming from SharedRuntime::handle_wrong_method. Improved regression test reliably reproduces the problem. Testing: regression test, hs-precheckin-comp, tier1-6 Best regards, Vladimir Ivanov On 04/07/2019 18:02, Erik Österlund wrote: > Hi, > > The i2c adapter sets a thread-local "callee_target" Method*, which is > caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c > call is "bad" (e.g. not_entrant). This error handler forwards execution > to the callee c2i entry. If the SharedRuntime::handle_wrong_method > method is called again due to the i2c2i call being still bad, then we > will crash the VM in the following guarantee in > SharedRuntime::handle_wrong_method: > > Method* callee = thread->callee_target(); > guarantee(callee != NULL && callee->is_method(), "bad handshake"); > > Unfortunately, the c2i entry can indeed fail again if it, e.g., hits the > new class initialization entry barrier of the c2i adapter. > The solution is to simply not clear the thread-local "callee_target" > after handling the first failure, as we can't really know there won't be > another one. There is no reason to clear this value as nobody else reads > it than the SharedRuntime::handle_wrong_method handler (and we really do > want it to be able to read the value as many times as it takes until the call goes through).
I found some confused clearing > of this callee_target > in JavaThread::oops_do(), with a comment saying this is a methodOop that > we need to clear to make GC happy or something. Seems like old traces of > perm gen. So I deleted that too. > > I caught this in ZGC where the timing window for hitting this issue > seems to be wider due to concurrent code cache unloading. But it is > equally problematic for all GCs. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8227260 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ > > Thanks, > /Erik From erik.osterlund at oracle.com Wed Jul 17 12:25:56 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 17 Jul 2019 14:25:56 +0200 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: Hi Vladimir, Looks good. Thanks for fixing. /Erik On 2019-07-17 12:26, Vladimir Ivanov wrote: > Revised fix: > http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ > > It turned out the problem is not specific to i2c2i: fast class > initialization barriers on nmethod entry trigger the assert as well. > > JNI upcalls (CallStaticMethod) don't have class initialization > checks, so it's possible to initiate a JNI upcall from a > non-initializing thread and JVM should let it complete. > > It leads to a busy loop (asserts in debug) between nmethod entry barrier > & SharedRuntime::handle_wrong_method until holder class is initialized > (possibly infinite if it blocks class initialization). > > Proposed fix is to keep using c2i, but jump over class initialization > barrier right to the argument shuffling logic on verified entry when > coming from SharedRuntime::handle_wrong_method. > > Improved regression test reliably reproduces the problem.
> > Testing: regression test, hs-precheckin-comp, tier1-6 > > Best regards, > Vladimir Ivanov > > On 04/07/2019 18:02, Erik ?sterlund wrote: >> Hi, >> >> The i2c adapter sets a thread-local "callee_target" Method*, which is >> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c >> call is "bad" (e.g. not_entrant). This error handler forwards >> execution to the callee c2i entry. If the >> SharedRuntime::handle_wrong_method method is called again due to the >> i2c2i call being still bad, then we will crash the VM in the following >> guarantee in SharedRuntime::handle_wrong_method: >> >> Method* callee = thread->callee_target(); >> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >> >> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >> the new class initialization entry barrier of the c2i adapter. >> The solution is to simply not clear the thread-local "callee_target" >> after handling the first failure, as we can't really know there won't >> be another one. There is no reason to clear this value as nobody else >> reads it than the SharedRuntime::handle_wrong_method handler (and we >> really do want it to be able to read the value as many times as it >> takes until the call goes through). I found some confused clearing of >> this callee_target in JavaThread::oops_do(), with a comment saying >> this is a methodOop that we need to clear to make GC happy or >> something. Seems like old traces of perm gen. So I deleted that too. >> >> I caught this in ZGC where the timing window for hitting this issue >> seems to be wider due to concurrent code cache unloading. But it is >> equally problematic for all GCs. 
>> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8227260 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >> >> Thanks, >> /Erik From sgehwolf at redhat.com Wed Jul 17 12:44:10 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 17 Jul 2019 14:44:10 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> Message-ID: <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> Hi Igor, Misha, On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: > Hi Severin, > > I don't think that tests (or test libraries for that matter) should > be responsible for setting correct PATH value, it should be a part of > host configuration procedure (tests can/should check that all > required bins are available though). in other words, I'd prefer if > you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and > TestJFREvents. the rest looks good to me. Updated webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ No more additions to PATH are being done. I've discovered that VMProps.java which defines "docker.required", used the "docker" binary even for podman test runs. This ended up not running most of the tests even with -Djdk.test.docker.command=podman specified. I've fixed that by moving DOCKER_COMMAND to Platform.java so that it can be used in both places. Testing: Container tests with docker daemon running on Linux x86_64, container tests without docker daemon running (podman is daemon-less) via the podman binary on Linux x86_64 (with -e:PATH). All pass. More thoughts? Thanks, Severin > Thanks, > -- Igor > > > On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: > > > > Hi, > > > > I believe I still need a *R*eviewer for this. Any takers? 
> > > > Thanks, > > Severin > > > > On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: > > > Hi Severin, > > > > > > The change looks good to me. Thank you for adding support for Podman > > > container technology. > > > > > > Testing: I ran both HotSpot and JDK container tests with your patch; > > > tests executed on Oracle Linux 7.6 using default container engine (Docker): > > > > > > test/hotspot/jtreg/containers/ AND > > > test/jdk/jdk/internal/platform/docker/ > > > > > > All PASS > > > > > > > > > Thanks, > > > > > > Misha > > > > > > > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > > > Hi, > > > > > > > > There is an alternative container engine which is being used by Fedora > > > > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > > > > looks like OpenJDK docker tests can be made podman compatible with a > > > > few little tweaks. One "interesting" one is to not assert "Successfully > > > > built" in the build output but only rely on the exit code, which seems > > > > to be OK for my testing. Interestingly the test would be skipped in > > > > that case. > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > > > > > Adjustments I've done: > > > > * Don't assert "Successfully built" in image build output[2]. > > > > * Add /usr/sbin to PATH as the podman binary relies on iptables for it > > > > to work which is in /usr/sbin on Fedora > > > > * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() > > > > to be equal to the previous value. I've found those counters to be > > > > slowly increasing, which made the tests unreliable. > > > > > > > > Testing: > > > > > > > > Running docker tests with docker as engine. Did the same with podman as > > > > engine via -Djdk.test.docker.command=podman on Linux x86_64. Both > > > > passed (non-trivially). > > > > > > > > Thoughts? 
> > > > > > > > Thanks, > > > > Severin > > > > > > > > [1] https://podman.io/ > > > > [2] Image builds with podman look > > > > like ("COMMIT" over "Successfully built"): > > > > STEP 1: FROM fedora:29 > > > > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all > > > > --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d > > > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 > > > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > > > STEP 5: COMMIT fedora-metrics-11 > > > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > > > > From vladimir.x.ivanov at oracle.com Wed Jul 17 13:06:39 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 17 Jul 2019 16:06:39 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> Message-ID: <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Thanks, Erik. Also, since I touch platform-specific code, I'd like Martin and Dmitrij (implementors of support for s390, ppc, and aarch64) to take a look at the patch as well. Best regards, Vladimir Ivanov On 17/07/2019 15:25, Erik ?sterlund wrote: > Hi Vladimir, > > Looks good. Thanks for fixing. > > /Erik > > On 2019-07-17 12:26, Vladimir Ivanov wrote: >> Revised fix: >> ?? http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >> >> It turned out the problem is not specific to i2c2i: fast class >> initialization barriers on nmethod entry trigger the assert as well. >> >> JNI upcalls (CallStaticMethod) don't have class initialization >> checks, so it's possible to initiate a JNI upcall from a >> non-initializing thread and JVM should let it complete. 
>> >> It leads to a busy loop (asserts in debug) between nmethod entry >> barrier & SharedRuntime::handle_wrong_method until holder class is >> initialized (possibly infinite if it blocks class initialization). >> >> Proposed fix is to keep using c2i, but jump over class initialization >> barrier right to the argument shuffling logic on verified entry when >> coming from SharedRuntime::handle_wrong_method. >> >> Improved regression test reliably reproduces the problem. >> >> Testing: regression test, hs-precheckin-comp, tier1-6 >> >> Best regards, >> Vladimir Ivanov >> >> On 04/07/2019 18:02, Erik ?sterlund wrote: >>> Hi, >>> >>> The i2c adapter sets a thread-local "callee_target" Method*, which is >>> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c >>> call is "bad" (e.g. not_entrant). This error handler forwards >>> execution to the callee c2i entry. If the >>> SharedRuntime::handle_wrong_method method is called again due to the >>> i2c2i call being still bad, then we will crash the VM in the >>> following guarantee in SharedRuntime::handle_wrong_method: >>> >>> Method* callee = thread->callee_target(); >>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>> >>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >>> the new class initialization entry barrier of the c2i adapter. >>> The solution is to simply not clear the thread-local "callee_target" >>> after handling the first failure, as we can't really know there won't >>> be another one. There is no reason to clear this value as nobody else >>> reads it than the SharedRuntime::handle_wrong_method handler (and we >>> really do want it to be able to read the value as many times as it >>> takes until the call goes through). I found some confused clearing of >>> this callee_target in JavaThread::oops_do(), with a comment saying >>> this is a methodOop that we need to clear to make GC happy or >>> something. Seems like old traces of perm gen. 
So I deleted that too. >>> >>> I caught this in ZGC where the timing window for hitting this issue >>> seems to be wider due to concurrent code cache unloading. But it is >>> equally problematic for all GCs. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>> >>> Thanks, >>> /Erik From christoph.langer at sap.com Wed Jul 17 14:10:23 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 17 Jul 2019 14:10:23 +0000 Subject: 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? Message-ID: Hi, as we're running into this issue in our nightly test environment, I would be very interested to bring this thing to jdk13. As per RDP rules (https://openjdk.java.net/jeps/3) , we are currently transitioning from RDP1 to RDP2. But in both phases, it is allowed to push test fixes. So, would you say this is a test fix and can be pushed while still adhering to the rules? I'd say yes, but I'd like to get some confirmation (or rejection if I'm wrong...) That would be the change to push: http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f Thanks Christoph > -----Original Message----- > From: hotspot-dev On Behalf Of > Thomas St?fe > Sent: Montag, 1. Juli 2019 21:19 > To: Coleen Phillmore > Cc: HotSpot Open Source Developers > Subject: Re: RFR(xs): 8225200: > runtime/memory/RunUnitTestsConcurrently.java has a memory leak > > Thanks Coleen! > > On Mon, Jul 1, 2019, 21:14 wrote: > > > +1 > > Thank you for taking care of this! > > Coleen > > > > On 7/1/19 3:07 PM, Thomas St?fe wrote: > > > Thanks Stefan! 
> > > > > > On Mon, Jul 1, 2019, 21:06 Stefan Karlsson > > > wrote: > > > > > >> On 2019-07-01 20:56, Thomas St?fe wrote: > > >>> Hi all, > > >>> > > >>> may I please have reviews and opinions about the following patch: > > >>> > > >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > > >>> cr: > > >>> > > >> > > http://cr.openjdk.java.net/~stuefe/webrevs/8227041- > rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html > > >>> There is a memory leak in test_virtual_space_list_large_chunk(), called > > >> as > > >>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test > > >> metaspace > > >>> allocation is tested by rapidly allocating and subsequently leaking a > > >>> metachunk of ~512K. This is done by a number of threads in a tight > loop > > >> for > > >>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM > > >> killed. > > >>> This test seems to be often excluded, which makes sense, since this > > leak > > >>> makes its memory usage difficult to predict. > > >>> > > >>> It is also earmarked by Oracle for gtest-ification, see 8213269. > > >>> > > >>> This leak is not easy to fix, among other things because it is not > > clear > > >>> what it is it wants to test. Meanwhile, time moved on and we have > quite > > >>> nice gtests to test metaspace allocation (see e.g. > > >>> test_metaspace_allocation.cpp) and I rather would run those gtests > > >>> concurrently. Which could be a future RFE. > > >>> > > >>> So I just removed this metaspace related test from > > >> WB_RunMemoryUnitTests() > > >>> altogether, since to me it does nothing useful. Once you remove the > > >> leaking > > >>> allocation, not much is left. > > >>> > > >>> Without this part RunUnitTestsConcurrently test runs smoothly > through > > its > > >>> other parts, and in that form it is still useful. > > >>> > > >>> What do you think? > > >> I think this makes sense and it looks good to me. 
> > >> > > >> Thanks, > > >> StefanK > > >> > > >>> Cheers, Thomas > > >> > > > > From shade at redhat.com Wed Jul 17 14:17:56 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 17 Jul 2019 16:17:56 +0200 Subject: 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? In-Reply-To: References: Message-ID: <599def4d-6c9c-98b3-582a-781a0bb51fee@redhat.com> On 7/17/19 4:10 PM, Langer, Christoph wrote: > as we're running into this issue in our nightly test environment, I would be very interested to > bring this thing to jdk13. As per RDP rules (https://openjdk.java.net/jeps/3) , we are currently > transitioning from RDP1 to RDP2. But in both phases, it is allowed to push test fixes. So, would > you say this is a test fix and can be pushed while still adhering to the rules? I'd say yes, but > I'd like to get some confirmation (or rejection if I'm wrong...) > > That would be the change to push: http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f This looks a simple test bug. "Depending on how fast the machine is, this will usually eat up 10-20GB, often causing the process being OOM killed." Ooof. I believe it is needed in jdk13. -- Thanks, -Aleksey From daniel.daugherty at oracle.com Wed Jul 17 14:23:12 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 17 Jul 2019 10:23:12 -0400 Subject: 8225200: runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? In-Reply-To: References: Message-ID: <94d06881-c472-a41e-31b0-791bdf8c8962@oracle.com> The subject has the following bug ID: 8225200 ??? JDK-8225200 assert(vs.actual_committed_size() >= commit_size) failed and a synopsis from a different bug: ??? JDK-8227041 runtime/memory/RunUnitTestsConcurrently.java has a memory leak Please clarify which you would like to backport... 
Dan On 7/17/19 10:10 AM, Langer, Christoph wrote: > Hi, > > as we're running into this issue in our nightly test environment, I would be very interested to bring this thing to jdk13. As per RDP rules (https://openjdk.java.net/jeps/3) , we are currently transitioning from RDP1 to RDP2. But in both phases, it is allowed to push test fixes. So, would you say this is a test fix and can be pushed while still adhering to the rules? I'd say yes, but I'd like to get some confirmation (or rejection if I'm wrong...) > > That would be the change to push: http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f > > Thanks > Christoph > > >> -----Original Message----- >> From: hotspot-dev On Behalf Of >> Thomas St?fe >> Sent: Montag, 1. Juli 2019 21:19 >> To: Coleen Phillmore >> Cc: HotSpot Open Source Developers >> Subject: Re: RFR(xs): 8225200: >> runtime/memory/RunUnitTestsConcurrently.java has a memory leak >> >> Thanks Coleen! >> >> On Mon, Jul 1, 2019, 21:14 wrote: >> >>> +1 >>> Thank you for taking care of this! >>> Coleen >>> >>> On 7/1/19 3:07 PM, Thomas St?fe wrote: >>>> Thanks Stefan! >>>> >>>> On Mon, Jul 1, 2019, 21:06 Stefan Karlsson >>>> wrote: >>>> >>>>> On 2019-07-01 20:56, Thomas St?fe wrote: >>>>>> Hi all, >>>>>> >>>>>> may I please have reviews and opinions about the following patch: >>>>>> >>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 >>>>>> cr: >>>>>> >>> http://cr.openjdk.java.net/~stuefe/webrevs/8227041- >> rununittestsconcurrently-has-a-mem-leak/webrev.00/webrev/index.html >>>>>> There is a memory leak in test_virtual_space_list_large_chunk(), called >>>>> as >>>>>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test >>>>> metaspace >>>>>> allocation is tested by rapidly allocating and subsequently leaking a >>>>>> metachunk of ~512K. This is done by a number of threads in a tight >> loop >>>>> for >>>>>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM >>>>> killed. 
>>>>>> This test seems to be often excluded, which makes sense, since this >>> leak >>>>>> makes its memory usage difficult to predict. >>>>>> >>>>>> It is also earmarked by Oracle for gtest-ification, see 8213269. >>>>>> >>>>>> This leak is not easy to fix, among other things because it is not >>> clear >>>>>> what it is it wants to test. Meanwhile, time moved on and we have >> quite >>>>>> nice gtests to test metaspace allocation (see e.g. >>>>>> test_metaspace_allocation.cpp) and I rather would run those gtests >>>>>> concurrently. Which could be a future RFE. >>>>>> >>>>>> So I just removed this metaspace related test from >>>>> WB_RunMemoryUnitTests() >>>>>> altogether, since to me it does nothing useful. Once you remove the >>>>> leaking >>>>>> allocation, not much is left. >>>>>> >>>>>> Without this part RunUnitTestsConcurrently test runs smoothly >> through >>> its >>>>>> other parts, and in that form it is still useful. >>>>>> >>>>>> What do you think? >>>>> I think this makes sense and it looks good to me. >>>>> >>>>> Thanks, >>>>> StefanK >>>>> >>>>>> Cheers, Thomas >>> From christoph.langer at sap.com Wed Jul 17 14:31:57 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 17 Jul 2019 14:31:57 +0000 Subject: 8227041 (was 8225200): runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? Message-ID: Hi Dan, sorry, I just replied to the original thread which has the 8225200 bug id in its subject line for whatever reason. But the bug to backport is JDK-8227041 of course. However, 8225200 has no associated patch anyway as Thomas thinks that it is resolved by 8227041, too. Best regards Christoph > -----Original Message----- > From: Daniel D. Daugherty > Sent: Mittwoch, 17. Juli 2019 16:23 > To: Langer, Christoph ; Thomas St?fe > ; Coleen Phillmore > ; Stefan Karlsson > > Cc: HotSpot Open Source Developers > Subject: Re: 8225200: runtime/memory/RunUnitTestsConcurrently.java has a > memory leak - push to jdk13? 
> > The subject has the following bug ID: 8225200 > > ??? JDK-8225200 assert(vs.actual_committed_size() >= commit_size) failed > > and a synopsis from a different bug: > > ??? JDK-8227041 runtime/memory/RunUnitTestsConcurrently.java has a > memory leak > > Please clarify which you would like to backport... > > Dan > > > > On 7/17/19 10:10 AM, Langer, Christoph wrote: > > Hi, > > > > as we're running into this issue in our nightly test environment, I would be > very interested to bring this thing to jdk13. As per RDP rules > (https://openjdk.java.net/jeps/3) , we are currently transitioning from RDP1 > to RDP2. But in both phases, it is allowed to push test fixes. So, would you say > this is a test fix and can be pushed while still adhering to the rules? I'd say > yes, but I'd like to get some confirmation (or rejection if I'm wrong...) > > > > That would be the change to push: > http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f > > > > Thanks > > Christoph > > > > > >> -----Original Message----- > >> From: hotspot-dev On Behalf > Of > >> Thomas St?fe > >> Sent: Montag, 1. Juli 2019 21:19 > >> To: Coleen Phillmore > >> Cc: HotSpot Open Source Developers > >> Subject: Re: RFR(xs): 8225200: > >> runtime/memory/RunUnitTestsConcurrently.java has a memory leak > >> > >> Thanks Coleen! > >> > >> On Mon, Jul 1, 2019, 21:14 wrote: > >> > >>> +1 > >>> Thank you for taking care of this! > >>> Coleen > >>> > >>> On 7/1/19 3:07 PM, Thomas St?fe wrote: > >>>> Thanks Stefan! 
> >>>> > >>>> On Mon, Jul 1, 2019, 21:06 Stefan Karlsson > > >>>> wrote: > >>>> > >>>>> On 2019-07-01 20:56, Thomas St?fe wrote: > >>>>>> Hi all, > >>>>>> > >>>>>> may I please have reviews and opinions about the following patch: > >>>>>> > >>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > >>>>>> cr: > >>>>>> > >>> http://cr.openjdk.java.net/~stuefe/webrevs/8227041- > >> rununittestsconcurrently-has-a-mem- > leak/webrev.00/webrev/index.html > >>>>>> There is a memory leak in test_virtual_space_list_large_chunk(), > called > >>>>> as > >>>>>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test > >>>>> metaspace > >>>>>> allocation is tested by rapidly allocating and subsequently leaking a > >>>>>> metachunk of ~512K. This is done by a number of threads in a tight > >> loop > >>>>> for > >>>>>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM > >>>>> killed. > >>>>>> This test seems to be often excluded, which makes sense, since this > >>> leak > >>>>>> makes its memory usage difficult to predict. > >>>>>> > >>>>>> It is also earmarked by Oracle for gtest-ification, see 8213269. > >>>>>> > >>>>>> This leak is not easy to fix, among other things because it is not > >>> clear > >>>>>> what it is it wants to test. Meanwhile, time moved on and we have > >> quite > >>>>>> nice gtests to test metaspace allocation (see e.g. > >>>>>> test_metaspace_allocation.cpp) and I rather would run those gtests > >>>>>> concurrently. Which could be a future RFE. > >>>>>> > >>>>>> So I just removed this metaspace related test from > >>>>> WB_RunMemoryUnitTests() > >>>>>> altogether, since to me it does nothing useful. Once you remove the > >>>>> leaking > >>>>>> allocation, not much is left. > >>>>>> > >>>>>> Without this part RunUnitTestsConcurrently test runs smoothly > >> through > >>> its > >>>>>> other parts, and in that form it is still useful. > >>>>>> > >>>>>> What do you think? 
> >>>>> I think this makes sense and it looks good to me. > >>>>> > >>>>> Thanks, > >>>>> StefanK > >>>>> > >>>>>> Cheers, Thomas > >>> From matthias.baesken at sap.com Wed Jul 17 15:06:42 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 17 Jul 2019 15:06:42 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp Message-ID: Hello, there are a couple of non-matching format specifiers in os_aix.cpp. I adjust them with my change. Please review! Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8227869 http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ Thanks, Matthias From martin.doerr at sap.com Wed Jul 17 15:10:51 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 17 Jul 2019 15:10:51 +0000 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Message-ID: Hi Vladimir, thank you for taking care of these platforms. PPC64 and s390 parts look good and the test passes. sharedRuntime.cpp: Is caller_frame.is_entry_frame() precise enough? Can it not also be true when calling normally from VM? If so, don't we need the clinit check in this case? Best regards, Martin > -----Original Message----- > From: Vladimir Ivanov > Sent: Mittwoch, 17. Juli 2019 15:07 > To: Doerr, Martin ; hotspot- > dev at openjdk.java.net; Dmitrij Pochepko > Subject: Re: RFR[13]: 8227260: Can't deal with > SharedRuntime::handle_wrong_method triggering more than once for > interpreter calls > > Thanks, Erik. > > Also, since I touch platform-specific code, I'd like Martin and Dmitrij > (implementors of support for s390, ppc, and aarch64) to take a look at > the patch as well. > > Best regards, > Vladimir Ivanov > > On 17/07/2019 15:25, Erik Österlund wrote: > > Hi Vladimir, > > > > Looks good.
Thanks for fixing. > > > > /Erik > > > > On 2019-07-17 12:26, Vladimir Ivanov wrote: > >> Revised fix: > >> ?? http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ > >> > >> It turned out the problem is not specific to i2c2i: fast class > >> initialization barriers on nmethod entry trigger the assert as well. > >> > >> JNI upcalls (CallStaticMethod) don't have class initialization > >> checks, so it's possible to initiate a JNI upcall from a > >> non-initializing thread and JVM should let it complete. > >> > >> It leads to a busy loop (asserts in debug) between nmethod entry > >> barrier & SharedRuntime::handle_wrong_method until holder class is > >> initialized (possibly infinite if it blocks class initialization). > >> > >> Proposed fix is to keep using c2i, but jump over class initialization > >> barrier right to the argument shuffling logic on verified entry when > >> coming from SharedRuntime::handle_wrong_method. > >> > >> Improved regression test reliably reproduces the problem. > >> > >> Testing: regression test, hs-precheckin-comp, tier1-6 > >> > >> Best regards, > >> Vladimir Ivanov > >> > >> On 04/07/2019 18:02, Erik ?sterlund wrote: > >>> Hi, > >>> > >>> The i2c adapter sets a thread-local "callee_target" Method*, which is > >>> caught (and cleared) by SharedRuntime::handle_wrong_method if the > i2c > >>> call is "bad" (e.g. not_entrant). This error handler forwards > >>> execution to the callee c2i entry. If the > >>> SharedRuntime::handle_wrong_method method is called again due to > the > >>> i2c2i call being still bad, then we will crash the VM in the > >>> following guarantee in SharedRuntime::handle_wrong_method: > >>> > >>> Method* callee = thread->callee_target(); > >>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); > >>> > >>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits > >>> the new class initialization entry barrier of the c2i adapter. 
> >>> The solution is to simply not clear the thread-local "callee_target" > >>> after handling the first failure, as we can't really know there won't > >>> be another one. There is no reason to clear this value as nobody else > >>> reads it than the SharedRuntime::handle_wrong_method handler (and > we > >>> really do want it to be able to read the value as many times as it > >>> takes until the call goes through). I found some confused clearing of > >>> this callee_target in JavaThread::oops_do(), with a comment saying > >>> this is a methodOop that we need to clear to make GC happy or > >>> something. Seems like old traces of perm gen. So I deleted that too. > >>> > >>> I caught this in ZGC where the timing window for hitting this issue > >>> seems to be wider due to concurrent code cache unloading. But it is > >>> equally problematic for all GCs. > >>> > >>> Bug: > >>> https://bugs.openjdk.java.net/browse/JDK-8227260 > >>> > >>> Webrev: > >>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ > >>> > >>> Thanks, > >>> /Erik From daniel.daugherty at oracle.com Wed Jul 17 15:22:05 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 17 Jul 2019 11:22:05 -0400 Subject: 8227041 (was 8225200): runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? In-Reply-To: References: Message-ID: <19eaf56e-89e9-c516-1242-0881ef3c219e@oracle.com> Thanks for the clarification. As a test fix, I also believe that JDK-8227041 can be pushed to jdk13. Dan On 7/17/19 10:31 AM, Langer, Christoph wrote: > Hi Dan, > > sorry, I just replied to the original thread which has the 8225200 bug id in its subject line for whatever reason. But the bug to backport is JDK-8227041 of course. > > However, 8225200 has no associated patch anyway as Thomas thinks that it is resolved by 8227041, too. > > Best regards > Christoph > >> -----Original Message----- >> From: Daniel D. Daugherty >> Sent: Mittwoch, 17. 
Juli 2019 16:23 >> To: Langer, Christoph ; Thomas Stüfe >> ; Coleen Phillimore >> ; Stefan Karlsson >> >> Cc: HotSpot Open Source Developers >> Subject: Re: 8225200: runtime/memory/RunUnitTestsConcurrently.java has a >> memory leak - push to jdk13? >> >> The subject has the following bug ID: 8225200 >> >> JDK-8225200 assert(vs.actual_committed_size() >= commit_size) failed >> >> and a synopsis from a different bug: >> >> JDK-8227041 runtime/memory/RunUnitTestsConcurrently.java has a >> memory leak >> >> Please clarify which you would like to backport... >> >> Dan >> >> >> >> On 7/17/19 10:10 AM, Langer, Christoph wrote: >>> Hi, >>> >>> as we're running into this issue in our nightly test environment, I would be >> very interested to bring this thing to jdk13. As per RDP rules >> (https://openjdk.java.net/jeps/3) , we are currently transitioning from RDP1 >> to RDP2. But in both phases, it is allowed to push test fixes. So, would you say >> this is a test fix and can be pushed while still adhering to the rules? I'd say >> yes, but I'd like to get some confirmation (or rejection if I'm wrong...) >>> That would be the change to push: >> http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f >>> Thanks >>> Christoph >>> >>> >>>> -----Original Message----- >>>> From: hotspot-dev On Behalf >> Of >>>> Thomas Stüfe >>>> Sent: Montag, 1. Juli 2019 21:19 >>>> To: Coleen Phillimore >>>> Cc: HotSpot Open Source Developers >>>> Subject: Re: RFR(xs): 8225200: >>>> runtime/memory/RunUnitTestsConcurrently.java has a memory leak >>>> >>>> Thanks Coleen! >>>> >>>> On Mon, Jul 1, 2019, 21:14 wrote: >>>>> +1 >>>>> Thank you for taking care of this! >>>>> Coleen >>>>> >>>>> On 7/1/19 3:07 PM, Thomas Stüfe wrote: >>>>>> Thanks Stefan! 
>>>>>> >>>>>> On Mon, Jul 1, 2019, 21:06 Stefan Karlsson >> >>>>>> wrote: >>>>>> >>>>>>> On 2019-07-01 20:56, Thomas St?fe wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> may I please have reviews and opinions about the following patch: >>>>>>>> >>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 >>>>>>>> cr: >>>>>>>> >>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8227041- >>>> rununittestsconcurrently-has-a-mem- >> leak/webrev.00/webrev/index.html >>>>>>>> There is a memory leak in test_virtual_space_list_large_chunk(), >> called >>>>>>> as >>>>>>>> part of the whitebox tests WB_RunMemoryUnitTests(). In this test >>>>>>> metaspace >>>>>>>> allocation is tested by rapidly allocating and subsequently leaking a >>>>>>>> metachunk of ~512K. This is done by a number of threads in a tight >>>> loop >>>>>>> for >>>>>>>> 15 seconds, which usually makes for 10-20GB rss. Test is usually OOM >>>>>>> killed. >>>>>>>> This test seems to be often excluded, which makes sense, since this >>>>> leak >>>>>>>> makes its memory usage difficult to predict. >>>>>>>> >>>>>>>> It is also earmarked by Oracle for gtest-ification, see 8213269. >>>>>>>> >>>>>>>> This leak is not easy to fix, among other things because it is not >>>>> clear >>>>>>>> what it is it wants to test. Meanwhile, time moved on and we have >>>> quite >>>>>>>> nice gtests to test metaspace allocation (see e.g. >>>>>>>> test_metaspace_allocation.cpp) and I rather would run those gtests >>>>>>>> concurrently. Which could be a future RFE. >>>>>>>> >>>>>>>> So I just removed this metaspace related test from >>>>>>> WB_RunMemoryUnitTests() >>>>>>>> altogether, since to me it does nothing useful. Once you remove the >>>>>>> leaking >>>>>>>> allocation, not much is left. >>>>>>>> >>>>>>>> Without this part RunUnitTestsConcurrently test runs smoothly >>>> through >>>>> its >>>>>>>> other parts, and in that form it is still useful. >>>>>>>> >>>>>>>> What do you think? 
>>>>>>> I think this makes sense and it looks good to me. >>>>>>> >>>>>>> Thanks, >>>>>>> StefanK >>>>>>> >>>>>>>> Cheers, Thomas From vladimir.x.ivanov at oracle.com Wed Jul 17 15:32:00 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 17 Jul 2019 18:32:00 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Message-ID: > thank you for taking care of these platforms. PPC64 and s390 parts look good and the test passes. Thanks, Martin. > sharedRuntime.cpp: > Is caller_frame.is_entry_frame() precise enough? > Can it not also be true when calling normally from VM? > If so, don't we need the clinit check in this case? If you have upcalls from JVM code in mind, then there's already a barrier on caller side: JavaCalls::call_static() calls into LinkResolver::resolve_static_call() which has initialization barrier. So, there's no need to repeat the check. Best regards, Vladimir Ivanov >> -----Original Message----- >> From: Vladimir Ivanov >> Sent: Mittwoch, 17. Juli 2019 15:07 >> To: Doerr, Martin ; hotspot- >> dev at openjdk.java.net; Dmitrij Pochepko >> Subject: Re: RFR[13]: 8227260: Can't deal with >> SharedRuntime::handle_wrong_method triggering more than once for >> interpreter calls >> >> Thanks, Erik. >> >> Also, since I touch platform-specific code, I'd like Martin and Dmitrij >> (implementors of support for s390, ppc, and aarch64) to take a look at >> the patch as well. >> >> Best regards, >> Vladimir Ivanov >> >> On 17/07/2019 15:25, Erik ?sterlund wrote: >>> Hi Vladimir, >>> >>> Looks good. Thanks for fixing. >>> >>> /Erik >>> >>> On 2019-07-17 12:26, Vladimir Ivanov wrote: >>>> Revised fix: >>>> ?? 
http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >>>> >>>> It turned out the problem is not specific to i2c2i: fast class >>>> initialization barriers on nmethod entry trigger the assert as well. >>>> >>>> JNI upcalls (CallStaticMethod) don't have class initialization >>>> checks, so it's possible to initiate a JNI upcall from a >>>> non-initializing thread and JVM should let it complete. >>>> >>>> It leads to a busy loop (asserts in debug) between nmethod entry >>>> barrier & SharedRuntime::handle_wrong_method until holder class is >>>> initialized (possibly infinite if it blocks class initialization). >>>> >>>> Proposed fix is to keep using c2i, but jump over class initialization >>>> barrier right to the argument shuffling logic on verified entry when >>>> coming from SharedRuntime::handle_wrong_method. >>>> >>>> Improved regression test reliably reproduces the problem. >>>> >>>> Testing: regression test, hs-precheckin-comp, tier1-6 >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> On 04/07/2019 18:02, Erik ?sterlund wrote: >>>>> Hi, >>>>> >>>>> The i2c adapter sets a thread-local "callee_target" Method*, which is >>>>> caught (and cleared) by SharedRuntime::handle_wrong_method if the >> i2c >>>>> call is "bad" (e.g. not_entrant). This error handler forwards >>>>> execution to the callee c2i entry. If the >>>>> SharedRuntime::handle_wrong_method method is called again due to >> the >>>>> i2c2i call being still bad, then we will crash the VM in the >>>>> following guarantee in SharedRuntime::handle_wrong_method: >>>>> >>>>> Method* callee = thread->callee_target(); >>>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>>>> >>>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >>>>> the new class initialization entry barrier of the c2i adapter. 
>>>>> The solution is to simply not clear the thread-local "callee_target" >>>>> after handling the first failure, as we can't really know there won't >>>>> be another one. There is no reason to clear this value as nobody else >>>>> reads it than the SharedRuntime::handle_wrong_method handler (and >> we >>>>> really do want it to be able to read the value as many times as it >>>>> takes until the call goes through). I found some confused clearing of >>>>> this callee_target in JavaThread::oops_do(), with a comment saying >>>>> this is a methodOop that we need to clear to make GC happy or >>>>> something. Seems like old traces of perm gen. So I deleted that too. >>>>> >>>>> I caught this in ZGC where the timing window for hitting this issue >>>>> seems to be wider due to concurrent code cache unloading. But it is >>>>> equally problematic for all GCs. >>>>> >>>>> Bug: >>>>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>>>> >>>>> Thanks, >>>>> /Erik From martin.doerr at sap.com Wed Jul 17 15:39:42 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 17 Jul 2019 15:39:42 +0000 Subject: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: Message-ID: Hi Matthias, looks good to me. Please make sure that this change got built on all platforms we have. The adlc is used during build so if it has passed on all platforms, it should be ok. Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of > coleen.phillimore at oracle.com > Sent: Freitag, 12. 
Juli 2019 14:49 > To: hotspot-dev at openjdk.java.net > Subject: Re: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: > this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined- > compare] > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/src/hotspot/sha > re/adlc/formssel.cpp.udiff.html > > + if (mnode) mnode->count_instr_names(names); > > > We also try to avoid implicit checks against null for pointers so change > this to: > > + if (mnode != NULL) mnode->count_instr_names(names); > > I didn't see that you added a check for NULL in the callers of > print_opcodes or setstr.? Can those callers never pass NULL? > > We've done a few passes to clean up these this == NULL checks. Thank you > for doing this! > > Coleen > > > On 7/12/19 8:30 AM, Baesken, Matthias wrote: > > Hello Erik, thanks for the input . > > > > We still have a few places in the HS codebase where "this" is compared to > NULL. > > When compiling with xlc16 / xlclang we get these warnings : > > > > warning: 'this' pointer cannot be null in well-defined C++ code; > comparison may be assumed to always evaluate to false [-Wtautological- > undefined-compare] > > > > so those places should be removed where possible. > > > > > > I adjusted 3 checks , please review ! > > > > > > > > Bug/webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/ > > > > https://bugs.openjdk.java.net/browse/JDK-8227633 > > > > Thanks , Matthias > > > > > >> -----Original Message----- > >> From: Erik ?sterlund > >> Sent: Freitag, 12. Juli 2019 10:22 > >> To: Baesken, Matthias ; 'hotspot- > >> dev at openjdk.java.net' > >> Subject: Re: this-pointer NULL-checks in hotspot codebase [- > Wtautological- > >> undefined-compare] > >> > >> Hi Matthias, > >> > >> Removing such NULL checks seems like a good idea in general due to the > >> undefined behaviour. 
> >> Worth mentioning though that there are some tricky ones, like in > >> markOopDesc* where this == NULL > >> means that the mark word has the "inflating" value. So we explicitly > >> check if this == NULL and > >> hope the compiler will not elide the check. Just gonna drop that one > >> here and run for it. > >> > >> Thanks, > >> /Erik > >> > >> On 2019-07-12 09:48, Baesken, Matthias wrote: > >>> Hello , when looking into the recent xlc16 / xlclang warnings I came > >> across those 3 : > >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: 'this' > >> pointer cannot be null in well-defined C++ code; > >>> comparison may be assumed to always evaluate to true [-Wtautological- > >> undefined-compare] > >>> if( this != NULL ) { > >>> ^~~~ ~~~~ > >>> > >>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: 'this' > >> pointer cannot be null in well-defined C++ code; > >>> comparison may be assumed to always evaluate to false [-Wtautological- > >> undefined-compare] > >>> if( this == NULL ) return; > >>> > >>> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' > pointer > >> cannot be null in well-defined C++ code; > >>> comparison may be assumed to always evaluate to false [-Wtautological- > >> undefined-compare] > >>> if( this == NULL ) return os::strdup("{no set}"); > >>> > >>> > >>> Do you think the NULL-checks can be removed or is there still some > value > >> in doing them ? > >>> Best regards, Matthias From martin.doerr at sap.com Wed Jul 17 15:53:19 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 17 Jul 2019 15:53:19 +0000 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Message-ID: Hi Vladimir, thanks for explaining. So I think it's correct. 
Best regards, Martin > -----Original Message----- > From: Vladimir Ivanov > Sent: Mittwoch, 17. Juli 2019 17:32 > To: Doerr, Martin ; hotspot-dev at openjdk.java.net > Subject: Re: RFR[13]: 8227260: Can't deal with > SharedRuntime::handle_wrong_method triggering more than once for > interpreter calls > > > > > thank you for taking care of these platforms. PPC64 and s390 parts look > good and the test passes. > > Thanks, Martin. > > > sharedRuntime.cpp: > > Is caller_frame.is_entry_frame() precise enough? > > Can it not also be true when calling normally from VM? > > If so, don't we need the clinit check in this case? > > If you have upcalls from JVM code in mind, then there's already a > barrier on caller side: JavaCalls::call_static() calls into > LinkResolver::resolve_static_call() which has initialization barrier. > So, there's no need to repeat the check. > > Best regards, > Vladimir Ivanov > > >> -----Original Message----- > >> From: Vladimir Ivanov > >> Sent: Mittwoch, 17. Juli 2019 15:07 > >> To: Doerr, Martin ; hotspot- > >> dev at openjdk.java.net; Dmitrij Pochepko sw.com> > >> Subject: Re: RFR[13]: 8227260: Can't deal with > >> SharedRuntime::handle_wrong_method triggering more than once for > >> interpreter calls > >> > >> Thanks, Erik. > >> > >> Also, since I touch platform-specific code, I'd like Martin and Dmitrij > >> (implementors of support for s390, ppc, and aarch64) to take a look at > >> the patch as well. > >> > >> Best regards, > >> Vladimir Ivanov > >> > >> On 17/07/2019 15:25, Erik ?sterlund wrote: > >>> Hi Vladimir, > >>> > >>> Looks good. Thanks for fixing. > >>> > >>> /Erik > >>> > >>> On 2019-07-17 12:26, Vladimir Ivanov wrote: > >>>> Revised fix: > >>>> ?? http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ > >>>> > >>>> It turned out the problem is not specific to i2c2i: fast class > >>>> initialization barriers on nmethod entry trigger the assert as well. 
> >>>> > >>>> JNI upcalls (CallStaticMethod) don't have class initialization > >>>> checks, so it's possible to initiate a JNI upcall from a > >>>> non-initializing thread and JVM should let it complete. > >>>> > >>>> It leads to a busy loop (asserts in debug) between nmethod entry > >>>> barrier & SharedRuntime::handle_wrong_method until holder class is > >>>> initialized (possibly infinite if it blocks class initialization). > >>>> > >>>> Proposed fix is to keep using c2i, but jump over class initialization > >>>> barrier right to the argument shuffling logic on verified entry when > >>>> coming from SharedRuntime::handle_wrong_method. > >>>> > >>>> Improved regression test reliably reproduces the problem. > >>>> > >>>> Testing: regression test, hs-precheckin-comp, tier1-6 > >>>> > >>>> Best regards, > >>>> Vladimir Ivanov > >>>> > >>>> On 04/07/2019 18:02, Erik ?sterlund wrote: > >>>>> Hi, > >>>>> > >>>>> The i2c adapter sets a thread-local "callee_target" Method*, which is > >>>>> caught (and cleared) by SharedRuntime::handle_wrong_method if > the > >> i2c > >>>>> call is "bad" (e.g. not_entrant). This error handler forwards > >>>>> execution to the callee c2i entry. If the > >>>>> SharedRuntime::handle_wrong_method method is called again due > to > >> the > >>>>> i2c2i call being still bad, then we will crash the VM in the > >>>>> following guarantee in SharedRuntime::handle_wrong_method: > >>>>> > >>>>> Method* callee = thread->callee_target(); > >>>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); > >>>>> > >>>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits > >>>>> the new class initialization entry barrier of the c2i adapter. > >>>>> The solution is to simply not clear the thread-local "callee_target" > >>>>> after handling the first failure, as we can't really know there won't > >>>>> be another one. 
There is no reason to clear this value as nobody else > >>>>> reads it than the SharedRuntime::handle_wrong_method handler > (and > >> we > >>>>> really do want it to be able to read the value as many times as it > >>>>> takes until the call goes through). I found some confused clearing of > >>>>> this callee_target in JavaThread::oops_do(), with a comment saying > >>>>> this is a methodOop that we need to clear to make GC happy or > >>>>> something. Seems like old traces of perm gen. So I deleted that too. > >>>>> > >>>>> I caught this in ZGC where the timing window for hitting this issue > >>>>> seems to be wider due to concurrent code cache unloading. But it is > >>>>> equally problematic for all GCs. > >>>>> > >>>>> Bug: > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227260 > >>>>> > >>>>> Webrev: > >>>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ > >>>>> > >>>>> Thanks, > >>>>> /Erik From adam.farley at uk.ibm.com Wed Jul 17 16:05:09 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 17 Jul 2019 17:05:09 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN Message-ID: Hey All, Reviewers and sponsors requested to inspect the following. I've re-written the code change, as discussed with David Holmes in emails last week, and now the webrev changes do this: - Cause the VM to shut down with a relevant error message if one or more of the sun.boot.library.path paths is too long for the system. - Apply similar error-producing code to the (legacy?) code in linker_md.c. - Allow the numerical parameter for split_path to indicate anything we plan to add to the path once split, allowing for more accurate path length detection. - Add an optional parameter to the os::split_path function that specifies where the paths came from, for a better error message. 
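For readers following along, the split-and-check idea in the list above can be sketched roughly as follows. This is only an illustration of the approach, not the actual os::split_path code from the webrev: `split_and_check_paths`, `kMaxPathLen`, and the `source` argument are stand-in names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative stand-in for JVM_MAXPATHLEN; the real limit is platform-defined.
static const std::size_t kMaxPathLen = 4096;

// Split a search path on 'sep', failing early if any element -- plus the
// number of characters the caller intends to append later (e.g. a library
// file name) -- would exceed the maximum path length. The 'source' argument
// names where the path came from so the error message can say so, in the
// spirit of the optional parameter described above.
std::vector<std::string> split_and_check_paths(const std::string& path,
                                               char sep,
                                               std::size_t chars_to_append,
                                               const std::string& source) {
  std::vector<std::string> elems;
  std::size_t start = 0;
  while (start <= path.size()) {
    std::size_t end = path.find(sep, start);
    if (end == std::string::npos) {
      end = path.size();
    }
    std::string elem = path.substr(start, end - start);
    if (elem.size() + chars_to_append > kMaxPathLen) {
      throw std::runtime_error("path element in " + source +
                               " is longer than the maximum path length");
    }
    if (!elem.empty()) {
      elems.push_back(elem);
    }
    start = end + 1;
  }
  return elems;
}
```

With this shape, a too-long element in sun.boot.library.path is reported together with its origin instead of being silently truncated or overflowing a fixed buffer later.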
Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 New Webrev: http://cr.openjdk.java.net/~afarley/8227021.1/webrev/ Best Regards Adam Farley IBM Runtimes Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From christoph.langer at sap.com Wed Jul 17 16:45:29 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 17 Jul 2019 16:45:29 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: Message-ID: Hi Matthias, thanks for this tedious cleanup. Looks good to me. Best regards Christoph > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Mittwoch, 17. Juli 2019 17:07 > To: 'hotspot-dev at openjdk.java.net' ; > 'ppc-aix-port-dev at openjdk.java.net' > Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > Hello, there are a couple of non matching format specifiers in os_aix.cpp . > I adjust them with my change . > > Please review ! > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8227869 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > > Thanks, Matthias From christoph.langer at sap.com Wed Jul 17 16:46:57 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 17 Jul 2019 16:46:57 +0000 Subject: 8227041 (was 8225200): runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push to jdk13? In-Reply-To: <19eaf56e-89e9-c516-1242-0881ef3c219e@oracle.com> References: <19eaf56e-89e9-c516-1242-0881ef3c219e@oracle.com> Message-ID: Thanks, Dan and Aleksey. So I'll push it tomorrow unless somebody objects until then. Cheers Christoph > -----Original Message----- > From: Daniel D. Daugherty > Sent: Mittwoch, 17. 
Juli 2019 17:22 > To: Langer, Christoph ; Thomas St?fe > ; Coleen Phillmore > ; Stefan Karlsson > > Cc: HotSpot Open Source Developers > Subject: Re: 8227041 (was 8225200): > runtime/memory/RunUnitTestsConcurrently.java has a memory leak - push > to jdk13? > > Thanks for the clarification. > > As a test fix, I also believe that JDK-8227041 can be pushed to jdk13. > > Dan > > > On 7/17/19 10:31 AM, Langer, Christoph wrote: > > Hi Dan, > > > > sorry, I just replied to the original thread which has the 8225200 bug id in its > subject line for whatever reason. But the bug to backport is JDK-8227041 of > course. > > > > However, 8225200 has no associated patch anyway as Thomas thinks that it > is resolved by 8227041, too. > > > > Best regards > > Christoph > > > >> -----Original Message----- > >> From: Daniel D. Daugherty > >> Sent: Mittwoch, 17. Juli 2019 16:23 > >> To: Langer, Christoph ; Thomas St?fe > >> ; Coleen Phillmore > >> ; Stefan Karlsson > >> > >> Cc: HotSpot Open Source Developers > >> Subject: Re: 8225200: runtime/memory/RunUnitTestsConcurrently.java > has a > >> memory leak - push to jdk13? > >> > >> The subject has the following bug ID: 8225200 > >> > >> ??? JDK-8225200 assert(vs.actual_committed_size() >= commit_size) failed > >> > >> and a synopsis from a different bug: > >> > >> ??? JDK-8227041 runtime/memory/RunUnitTestsConcurrently.java has a > >> memory leak > >> > >> Please clarify which you would like to backport... > >> > >> Dan > >> > >> > >> > >> On 7/17/19 10:10 AM, Langer, Christoph wrote: > >>> Hi, > >>> > >>> as we're running into this issue in our nightly test environment, I would > be > >> very interested to bring this thing to jdk13. As per RDP rules > >> (https://openjdk.java.net/jeps/3) , we are currently transitioning from > RDP1 > >> to RDP2. But in both phases, it is allowed to push test fixes. So, would you > say > >> this is a test fix and can be pushed while still adhering to the rules? 
I'd say > >> yes, but I'd like to get some confirmation (or rejection if I'm wrong...) > >>> That would be the change to push: > >> http://hg.openjdk.java.net/jdk/jdk/rev/8a153a932d0f > >>> Thanks > >>> Christoph > >>> > >>> > >>>> -----Original Message----- > >>>> From: hotspot-dev On > Behalf > >> Of > >>>> Thomas St?fe > >>>> Sent: Montag, 1. Juli 2019 21:19 > >>>> To: Coleen Phillmore > >>>> Cc: HotSpot Open Source Developers dev at openjdk.java.net> > >>>> Subject: Re: RFR(xs): 8225200: > >>>> runtime/memory/RunUnitTestsConcurrently.java has a memory leak > >>>> > >>>> Thanks Coleen! > >>>> > >>>> On Mon, Jul 1, 2019, 21:14 wrote: > >>>> > >>>>> +1 > >>>>> Thank you for taking care of this! > >>>>> Coleen > >>>>> > >>>>> On 7/1/19 3:07 PM, Thomas St?fe wrote: > >>>>>> Thanks Stefan! > >>>>>> > >>>>>> On Mon, Jul 1, 2019, 21:06 Stefan Karlsson > >> > >>>>>> wrote: > >>>>>> > >>>>>>> On 2019-07-01 20:56, Thomas St?fe wrote: > >>>>>>>> Hi all, > >>>>>>>> > >>>>>>>> may I please have reviews and opinions about the following > patch: > >>>>>>>> > >>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8227041 > >>>>>>>> cr: > >>>>>>>> > >>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8227041- > >>>> rununittestsconcurrently-has-a-mem- > >> leak/webrev.00/webrev/index.html > >>>>>>>> There is a memory leak in test_virtual_space_list_large_chunk(), > >> called > >>>>>>> as > >>>>>>>> part of the whitebox tests WB_RunMemoryUnitTests(). In this > test > >>>>>>> metaspace > >>>>>>>> allocation is tested by rapidly allocating and subsequently leaking a > >>>>>>>> metachunk of ~512K. This is done by a number of threads in a > tight > >>>> loop > >>>>>>> for > >>>>>>>> 15 seconds, which usually makes for 10-20GB rss. Test is usually > OOM > >>>>>>> killed. > >>>>>>>> This test seems to be often excluded, which makes sense, since > this > >>>>> leak > >>>>>>>> makes its memory usage difficult to predict. 
> >>>>>>>> > >>>>>>>> It is also earmarked by Oracle for gtest-ification, see 8213269. > >>>>>>>> > >>>>>>>> This leak is not easy to fix, among other things because it is not > >>>>> clear > >>>>>>>> what it is it wants to test. Meanwhile, time moved on and we > have > >>>> quite > >>>>>>>> nice gtests to test metaspace allocation (see e.g. > >>>>>>>> test_metaspace_allocation.cpp) and I rather would run those > gtests > >>>>>>>> concurrently. Which could be a future RFE. > >>>>>>>> > >>>>>>>> So I just removed this metaspace related test from > >>>>>>> WB_RunMemoryUnitTests() > >>>>>>>> altogether, since to me it does nothing useful. Once you remove > the > >>>>>>> leaking > >>>>>>>> allocation, not much is left. > >>>>>>>> > >>>>>>>> Without this part RunUnitTestsConcurrently test runs smoothly > >>>> through > >>>>> its > >>>>>>>> other parts, and in that form it is still useful. > >>>>>>>> > >>>>>>>> What do you think? > >>>>>>> I think this makes sense and it looks good to me. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> StefanK > >>>>>>> > >>>>>>>> Cheers, Thomas From dmitrij.pochepko at bell-sw.com Wed Jul 17 16:57:37 2019 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Wed, 17 Jul 2019 19:57:37 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Message-ID: Hi, looks fine. I also ran updated ClassInitBarrier test on aarch64 fastdebug build Thanks, Dmitrij On 17/07/2019 4:06 PM, Vladimir Ivanov wrote: > Thanks, Erik. > > Also, since I touch platform-specific code, I'd like Martin and > Dmitrij (implementors of support for s390, ppc, and aarch64) to take a > look at the patch as well. > > Best regards, > Vladimir Ivanov > > On 17/07/2019 15:25, Erik ?sterlund wrote: >> Hi Vladimir, >> >> Looks good. 
Thanks for fixing. >> >> /Erik >> >> On 2019-07-17 12:26, Vladimir Ivanov wrote: >>> Revised fix: >>> ?? http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >>> >>> It turned out the problem is not specific to i2c2i: fast class >>> initialization barriers on nmethod entry trigger the assert as well. >>> >>> JNI upcalls (CallStaticMethod) don't have class initialization >>> checks, so it's possible to initiate a JNI upcall from a >>> non-initializing thread and JVM should let it complete. >>> >>> It leads to a busy loop (asserts in debug) between nmethod entry >>> barrier & SharedRuntime::handle_wrong_method until holder class is >>> initialized (possibly infinite if it blocks class initialization). >>> >>> Proposed fix is to keep using c2i, but jump over class >>> initialization barrier right to the argument shuffling logic on >>> verified entry when coming from SharedRuntime::handle_wrong_method. >>> >>> Improved regression test reliably reproduces the problem. >>> >>> Testing: regression test, hs-precheckin-comp, tier1-6 >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 04/07/2019 18:02, Erik ?sterlund wrote: >>>> Hi, >>>> >>>> The i2c adapter sets a thread-local "callee_target" Method*, which >>>> is caught (and cleared) by SharedRuntime::handle_wrong_method if >>>> the i2c call is "bad" (e.g. not_entrant). This error handler >>>> forwards execution to the callee c2i entry. If the >>>> SharedRuntime::handle_wrong_method method is called again due to >>>> the i2c2i call being still bad, then we will crash the VM in the >>>> following guarantee in SharedRuntime::handle_wrong_method: >>>> >>>> Method* callee = thread->callee_target(); >>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>>> >>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., >>>> hits the new class initialization entry barrier of the c2i adapter. 
>>>> The solution is to simply not clear the thread-local >>>> "callee_target" after handling the first failure, as we can't >>>> really know there won't be another one. There is no reason to clear >>>> this value as nobody else reads it than the >>>> SharedRuntime::handle_wrong_method handler (and we really do want >>>> it to be able to read the value as many times as it takes until the >>>> call goes through). I found some confused clearing of this >>>> callee_target in JavaThread::oops_do(), with a comment saying this >>>> is a methodOop that we need to clear to make GC happy or something. >>>> Seems like old traces of perm gen. So I deleted that too. >>>> >>>> I caught this in ZGC where the timing window for hitting this issue >>>> seems to be wider due to concurrent code cache unloading. But it is >>>> equally problematic for all GCs. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>>> >>>> Thanks, >>>> /Erik From erik.osterlund at oracle.com Wed Jul 17 16:59:55 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 17 Jul 2019 18:59:55 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> Message-ID: <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> Hi Dean, You are correct that the winner of the race will already have made sure the operations you listed have run. So by not returning you would essentially just continue performing a bunch of unnecessary operations that won't do anything. I would prefer though to stay as close to my original fix as possible as that one has gone through extensive testing, and I would like to push this before the cut-off for P3 bugs to 13 tomorrow. 
Here is my latest attempt: http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ Here is an incremental webrev to the original proposition: http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ As you can see I have only changed the guarantee to assert as requested by Coleen, and added a bunch of commentary to explain why e.g. a failing transition does not need to worry about the 3 side effects before the failure. Hope my comments make sense. I hope you think this is okay. If there is more clarification or cleanups you would like to see, I am more than happy to file such RFEs for 14. Thanks, /Erik On 2019-07-17 09:17, dean.long at oracle.com wrote: > On 7/16/19 10:51 AM, dean.long at oracle.com wrote: >> Back to the make_not_entrant / make_unloaded race. If >> make_not_entrant bails out half-way through because make_unloaded won >> the race, doesn't that mean that make_unloaded needs to have already >> done all the work that make_not_entrant is not doing? >> unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, >> unregister_nmethod, etc. 
>
> What I'm thinking is, what happens if instead of this:
>
> 1365   // Change state
> 1366   if (!try_transition(state)) {
> 1367     return false;
> 1368   }
>
> we do this:
>
> 1365   // Maybe change state
> 1366   if (!try_transition(state)) {
> 1367     // fall through
> 1368   }
>
> dl
>
From mikhailo.seledtsov at oracle.com Wed Jul 17 17:22:10 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 17 Jul 2019 10:22:10 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> Message-ID: Hi Severin, On 7/17/19 5:44 AM, Severin Gehwolf wrote: > Hi Igor, Misha, > > On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: >> Hi Severin, >> >> I don't think that tests (or test libraries for that matter) should >> be responsible for setting correct PATH value, it should be a part of >> host configuration procedure (tests can/should check that all >> required bins are available though). in other words, I'd prefer if >> you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and >> TestJFREvents. the rest looks good to me. > Updated webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ > > No more additions to PATH are being done. > > I've discovered that VMProps.java which defines "docker.required", used > the "docker" binary even for podman test runs. This ended up not > running most of the tests even with -Djdk.test.docker.command=podman > specified. Good catch. 
(of course, the alternative would be to import jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not sure if there are any potential problems doing it this way) > Testing: Container tests with docker daemon running on Linux x86_64, > container tests without docker daemon running (podman is daemon-less) > via the podman binary on Linux x86_64 (with -e:PATH). All pass. Sounds good. Overall looks good to me. One minor nit: DockerTestUtils.java does not need "import java.util.Map;" (no need to post updated webrev for this change) Thank you, Misha > > More thoughts? > > Thanks, > Severin > >> Thanks, >> -- Igor >> >>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >>> >>> Hi, >>> >>> I believe I still need a *R*eviewer for this. Any takers? >>> >>> Thanks, >>> Severin >>> >>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>>> Hi Severin, >>>> >>>> The change looks good to me. Thank you for adding support for Podman >>>> container technology. >>>> >>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>>> >>>> test/hotspot/jtreg/containers/ AND >>>> test/jdk/jdk/internal/platform/docker/ >>>> >>>> All PASS >>>> >>>> >>>> Thanks, >>>> >>>> Misha >>>> >>>> >>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>> Hi, >>>>> >>>>> There is an alternative container engine which is being used by Fedora >>>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>>> looks like OpenJDK docker tests can be made podman compatible with a >>>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>>> built" in the build output but only rely on the exit code, which seems >>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>> that case. 
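The "rely on the exit code" adjustment above can be sketched as follows (hypothetical names; the point is that podman prints "COMMIT ..." where docker prints "Successfully built", so only the exit code is portable across engines):

```java
// Hypothetical sketch: judge an image build by its exit code rather than by
// engine-specific success strings, so both docker and podman output passes.
class ImageBuildCheck {
    static boolean buildSucceeded(int exitCode, String buildOutput) {
        // buildOutput is kept for diagnostics only; asserting on
        // "Successfully built" would wrongly reject podman builds.
        return exitCode == 0;
    }
}
```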
>>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>> >>>>> Adjustments I've done: >>>>> * Don't assert "Successfully built" in image build output[2]. >>>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>>> to work which is in /usr/sbin on Fedora >>>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>>> to be equal to the previous value. I've found those counters to be >>>>> slowly increasing, which made the tests unreliable. >>>>> >>>>> Testing: >>>>> >>>>> Running docker tests with docker as engine. Did the same with podman as >>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>> passed (non-trivially). >>>>> >>>>> Thoughts? >>>>> >>>>> Thanks, >>>>> Severin >>>>> >>>>> [1] https://podman.io/ >>>>> [2] Image builds with podman look >>>>> like ("COMMIT" over "Successfully built"): >>>>> STEP 1: FROM fedora:29 >>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>> STEP 5: COMMIT fedora-metrics-11 >>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>>>> From sgehwolf at redhat.com Wed Jul 17 17:34:18 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 17 Jul 2019 19:34:18 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> 
<243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> Message-ID: <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> Hi Misha, On Wed, 2019-07-17 at 10:22 -0700, mikhailo.seledtsov at oracle.com wrote: > Hi Severin, > > On 7/17/19 5:44 AM, Severin Gehwolf wrote: > > Hi Igor, Misha, > > > > On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: > > > Hi Severin, > > > > > > I don't think that tests (or test libraries for that matter) should > > > be responsible for setting correct PATH value, it should be a part of > > > host configuration procedure (tests can/should check that all > > > required bins are available though). in other words, I'd prefer if > > > you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and > > > TestJFREvents. the rest looks good to me. > > Updated webrev: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ > > > > No more additions to PATH are being done. > > > > I've discovered that VMProps.java which defines "docker.required", used > > the "docker" binary even for podman test runs. This ended up not > > running most of the tests even with -Djdk.test.docker.command=podman > > specified. > Good catch. > > I've fixed that by moving DOCKER_COMMAND to Platform.java so > > that it can be used in both places. > > Sounds good to me. > > (of course, the alternative would be to import > jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not > sure if there are any potential problems doing it this way) I've tried that but for some reason this was a problem and VMProps failed to compile. I don't know exactly how those jtreg extensions work and went with the Platform approach. Hope that's OK. > > Testing: Container tests with docker daemon running on Linux x86_64, > > container tests without docker daemon running (podman is daemon-less) > > via the podman binary on Linux x86_64 (with -e:PATH). All pass. > > Sounds good. > > > Overall looks good to me. Thanks for the review! 
> One minor nit: DockerTestUtils.java does not need "import > java.util.Map;" (no need to post updated webrev for this change) OK, good catch. Fixed locally. Thanks, Severin > > Thank you, > > Misha > > > More thoughts? > > > > Thanks, > > Severin > > > > > Thanks, > > > -- Igor > > > > > > > On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: > > > > > > > > Hi, > > > > > > > > I believe I still need a *R*eviewer for this. Any takers? > > > > > > > > Thanks, > > > > Severin > > > > > > > > On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: > > > > > Hi Severin, > > > > > > > > > > The change looks good to me. Thank you for adding support for Podman > > > > > container technology. > > > > > > > > > > Testing: I ran both HotSpot and JDK container tests with your patch; > > > > > tests executed on Oracle Linux 7.6 using default container engine (Docker): > > > > > > > > > > test/hotspot/jtreg/containers/ AND > > > > > test/jdk/jdk/internal/platform/docker/ > > > > > > > > > > All PASS > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Misha > > > > > > > > > > > > > > > On 7/12/19 11:08 AM, Severin Gehwolf wrote: > > > > > > Hi, > > > > > > > > > > > > There is an alternative container engine which is being used by Fedora > > > > > > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > > > > > > looks like OpenJDK docker tests can be made podman compatible with a > > > > > > few little tweaks. One "interesting" one is to not assert "Successfully > > > > > > built" in the build output but only rely on the exit code, which seems > > > > > > to be OK for my testing. Interestingly the test would be skipped in > > > > > > that case. > > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > > > > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > > > > > > > > > > Adjustments I've done: > > > > > > * Don't assert "Successfully built" in image build output[2]. 
> > > > > > * Add /usr/sbin to PATH as the podman binary relies on iptables for it > > > > > > to work which is in /usr/sbin on Fedora > > > > > > * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() > > > > > > to be equal to the previous value. I've found those counters to be > > > > > > slowly increasing, which made the tests unreliable. > > > > > > > > > > > > Testing: > > > > > > > > > > > > Running docker tests with docker as engine. Did the same with podman as > > > > > > engine via -Djdk.test.docker.command=podman on Linux x86_64. Both > > > > > > passed (non-trivially). > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > Thanks, > > > > > > Severin > > > > > > > > > > > > [1] https://podman.io/ > > > > > > [2] Image builds with podman look > > > > > > like ("COMMIT" over "Successfully built"): > > > > > > STEP 1: FROM fedora:29 > > > > > > STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all > > > > > > --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d > > > > > > STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ > > > > > > 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 > > > > > > STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics > > > > > > STEP 5: COMMIT fedora-metrics-11 > > > > > > d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > > > > > > From mikhailo.seledtsov at oracle.com Wed Jul 17 18:15:12 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 17 Jul 2019 11:15:12 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> 
<243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> Message-ID: <8f0b52df-51fc-a5d1-7d3a-ee795d6b6c18@oracle.com> On 7/17/19 10:34 AM, Severin Gehwolf wrote: > Hi Misha, > > On Wed, 2019-07-17 at 10:22 -0700, mikhailo.seledtsov at oracle.com wrote: >> Hi Severin, >> >> On 7/17/19 5:44 AM, Severin Gehwolf wrote: >>> Hi Igor, Misha, >>> >>> On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: >>>> Hi Severin, >>>> >>>> I don't think that tests (or test libraries for that matter) should >>>> be responsible for setting correct PATH value, it should be a part of >>>> host configuration procedure (tests can/should check that all >>>> required bins are available though). in other words, I'd prefer if >>>> you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and >>>> TestJFREvents. the rest looks good to me. >>> Updated webrev: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ >>> >>> No more additions to PATH are being done. >>> >>> I've discovered that VMProps.java which defines "docker.required", used >>> the "docker" binary even for podman test runs. This ended up not >>> running most of the tests even with -Djdk.test.docker.command=podman >>> specified. >> Good catch. >>> I've fixed that by moving DOCKER_COMMAND to Platform.java so >>> that it can be used in both places. >> Sounds good to me. >> >> (of course, the alternative would be to import >> jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not >> sure if there are any potential problems doing it this way) > I've tried that but for some reason this was a problem and VMProps > failed to compile. I don't know exactly how those jtreg extensions work > and went with the Platform approach. Hope that's OK. Thank you for the details. That's OK by me. 
Thank you, Misha > >>> Testing: Container tests with docker daemon running on Linux x86_64, >>> container tests without docker daemon running (podman is daemon-less) >>> via the podman binary on Linux x86_64 (with -e:PATH). All pass. >> Sounds good. >> >> >> Overall looks good to me. > Thanks for the review! > >> One minor nit: DockerTestUtils.java does not need "import >> java.util.Map;" (no need to post updated webrev for this change) > OK, good catch. Fixed locally. > > Thanks, > Severin > >> Thank you, >> >> Misha >> >>> More thoughts? >>> >>> Thanks, >>> Severin >>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >>>>> >>>>> Hi, >>>>> >>>>> I believe I still need a *R*eviewer for this. Any takers? >>>>> >>>>> Thanks, >>>>> Severin >>>>> >>>>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>>> Hi Severin, >>>>>> >>>>>> The change looks good to me. Thank you for adding support for Podman >>>>>> container technology. >>>>>> >>>>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>>>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>>>>> >>>>>> test/hotspot/jtreg/containers/ AND >>>>>> test/jdk/jdk/internal/platform/docker/ >>>>>> >>>>>> All PASS >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Misha >>>>>> >>>>>> >>>>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>>>> Hi, >>>>>>> >>>>>>> There is an alternative container engine which is being used by Fedora >>>>>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>>>>> looks like OpenJDK docker tests can be made podman compatible with a >>>>>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>>>>> built" in the build output but only rely on the exit code, which seems >>>>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>>>> that case. 
>>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>>>> >>>>>>> Adjustments I've done: >>>>>>> * Don't assert "Successfully built" in image build output[2]. >>>>>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>>>>> to work which is in /usr/sbin on Fedora >>>>>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>>>>> to be equal to the previous value. I've found those counters to be >>>>>>> slowly increasing, which made the tests unreliable. >>>>>>> >>>>>>> Testing: >>>>>>> >>>>>>> Running docker tests with docker as engine. Did the same with podman as >>>>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>>>> passed (non-trivially). >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> Thanks, >>>>>>> Severin >>>>>>> >>>>>>> [1] https://podman.io/ >>>>>>> [2] Image builds with podman look >>>>>>> like ("COMMIT" over "Successfully built"): >>>>>>> STEP 1: FROM fedora:29 >>>>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>>>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>>>> STEP 5: COMMIT fedora-metrics-11 >>>>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e >>>>>>> From igor.ignatyev at oracle.com Wed Jul 17 18:37:58 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 17 Jul 2019 11:37:58 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> 
<5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> Message-ID: Hi Severin, the updated webrev looks good to me, please see a couple comments below. Cheers, -- Igor > On Jul 17, 2019, at 10:34 AM, Severin Gehwolf wrote: > > Hi Misha, > > On Wed, 2019-07-17 at 10:22 -0700, mikhailo.seledtsov at oracle.com wrote: >> Hi Severin, >> >> On 7/17/19 5:44 AM, Severin Gehwolf wrote: >>> Hi Igor, Misha, >>> >>> On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: >>>> Hi Severin, >>>> >>>> I don't think that tests (or test libraries for that matter) should >>>> be responsible for setting correct PATH value, it should be a part of >>>> host configuration procedure (tests can/should check that all >>>> required bins are available though). in other words, I'd prefer if >>>> you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and >>>> TestJFREvents. the rest looks good to me. >>> Updated webrev: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ >>> >>> No more additions to PATH are being done. >>> >>> I've discovered that VMProps.java which defines "docker.required", used >>> the "docker" binary even for podman test runs. This ended up not >>> running most of the tests even with -Djdk.test.docker.command=podman >>> specified. >> Good catch. should we rename docker.support and DOCKER_COMMAND to something more abstract? >>> I've fixed that by moving DOCKER_COMMAND to Platform.java so >>> that it can be used in both places. >> >> Sounds good to me. >> >> (of course, the alternative would be to import >> jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not >> sure if there are any potential problems doing it this way) > > I've tried that but for some reason this was a problem and VMProps > failed to compile. 
I don't know exactly how those jtreg extensions work > and went with the Platform approach. Hope that's OK. all files needed for VMProps (or other @requires expression class) have to be listed in requires.extraPropDefns or requires.extraPropDefns.bootlibs property in TEST.ROOT file in all the test suites which use these extensions. we are trying to be very cautious in what is used by VMProps (directly and indirectly) so these lists won't grow and we won't require any modules other than java.base, given DockerTestUtils has dependencies on a number of other library classes, the Platform approach is much better from that point of view. > >>> Testing: Container tests with docker daemon running on Linux x86_64, >>> container tests without docker daemon running (podman is daemon-less) >>> via the podman binary on Linux x86_64 (with -e:PATH). All pass. >> >> Sounds good. >> >> >> Overall looks good to me. > > Thanks for the review! > >> One minor nit: DockerTestUtils.java does not need "import >> java.util.Map;" (no need to post updated webrev for this change) > > OK, good catch. Fixed locally. > > Thanks, > Severin > >> >> Thank you, >> >> Misha >> >>> More thoughts? >>> >>> Thanks, >>> Severin >>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf wrote: >>>>> >>>>> Hi, >>>>> >>>>> I believe I still need a *R*eviewer for this. Any takers? >>>>> >>>>> Thanks, >>>>> Severin >>>>> >>>>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>>> Hi Severin, >>>>>> >>>>>> The change looks good to me. Thank you for adding support for Podman >>>>>> container technology. 
>>>>>> >>>>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>>>> tests executed on Oracle Linux 7.6 using default container engine (Docker): >>>>>> >>>>>> test/hotspot/jtreg/containers/ AND >>>>>> test/jdk/jdk/internal/platform/docker/ >>>>>> >>>>>> All PASS >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Misha >>>>>> >>>>>> >>>>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>>>> Hi, >>>>>>> >>>>>>> There is an alternative container engine which is being used by Fedora >>>>>>> and RHEL 8, called podman[1]. It's mostly compatible with docker. It >>>>>>> looks like OpenJDK docker tests can be made podman compatible with a >>>>>>> few little tweaks. One "interesting" one is to not assert "Successfully >>>>>>> built" in the build output but only rely on the exit code, which seems >>>>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>>>> that case. >>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>>>> >>>>>>> Adjustments I've done: >>>>>>> * Don't assert "Successfully built" in image build output[2]. >>>>>>> * Add /usr/sbin to PATH as the podman binary relies on iptables for it >>>>>>> to work which is in /usr/sbin on Fedora >>>>>>> * Allow for Metrics.getCpuSystemUsage() and Metrics.getCpuUserUsage() >>>>>>> to be equal to the previous value. I've found those counters to be >>>>>>> slowly increasing, which made the tests unreliable. >>>>>>> >>>>>>> Testing: >>>>>>> >>>>>>> Running docker tests with docker as engine. Did the same with podman as >>>>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>>>> passed (non-trivially). >>>>>>> >>>>>>> Thoughts? 
>>>>>>> >>>>>>> Thanks, >>>>>>> Severin >>>>>>> >>>>>>> [1] https://podman.io/ >>>>>>> [2] Image builds with podman look >>>>>>> like ("COMMIT" over "Successfully built"): >>>>>>> STEP 1: FROM fedora:29 >>>>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf clean all >>>>>>> --> Using cache 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt --add-modules java.base --add-exports java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>>>> STEP 5: COMMIT fedora-metrics-11 >>>>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e From dean.long at oracle.com Wed Jul 17 19:24:14 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 17 Jul 2019 12:24:14 -0700 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> Message-ID: <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> On 7/17/19 9:59 AM, Erik Österlund wrote: > Hi Dean, > > You are correct that the winner of the race will already have made > sure the operations you listed have run. So by not returning you would > essentially just continue performing a bunch of unnecessary operations > that won't do anything. > > I would prefer though to stay as close to my original fix as possible > as that one has gone through extensive testing, and I would like to > push this before the cut-off for P3 bugs to 13 tomorrow. 
> > Here is my latest attempt: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ > > Here is an incremental webrev to the original proposition: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ > > As you can see I have only changed the guarantee to assert as > requested by Coleen, and added a bunch of commentary to explain why > e.g. a failing transition does not need to worry about the 3 side > effects before the failure. Hope my comments make sense. > > I hope you think this is okay. If there is more clarification or > cleanups you would like to see, I am more than happy to file such RFEs > for 14. > I'm OK with this, but yes, please file an RFE for cleanup in 14. It's not obvious to me that make_not_entrant and make_unloaded are doing the same operations. Some refactoring here should help a lot. dl > Thanks, > /Erik > > On 2019-07-17 09:17, dean.long at oracle.com wrote: >> On 7/16/19 10:51 AM, dean.long at oracle.com wrote: >>> Back to the make_not_entrant / make_unloaded race. If >>> make_not_entrant bails out half-way through because make_unloaded won >>> the race, doesn't that mean that make_unloaded needs to have already >>> done all the work that make_not_entrant is not doing? >>> unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, >>> unregister_nmethod, etc. >> >> What I'm thinking is, what happens if instead of this: >> >> 1365 // Change state >> 1366 if (!try_transition(state)) { >> 1367 return false; >> 1368 } >> we do this: 1365 
// Maybe change state >> 1366 if (!try_transition(state)) { >> 1367 // fall through 1368 } >> >> dl >> From erik.osterlund at oracle.com Wed Jul 17 19:32:01 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 17 Jul 2019 21:32:01 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> Message-ID: <41F29B1D-68EF-450D-8F63-A09CA9270915@oracle.com> Hi Dean, Thanks for the review. I will file an enhancement to refactor the code. I think what we need to make this more intuitive is to have a new unlinking function. More on that in 14... Thanks, /Erik > On 17 Jul 2019, at 21:24, dean.long at oracle.com wrote: > >> On 7/17/19 9:59 AM, Erik Österlund wrote: >> Hi Dean, >> >> You are correct that the winner of the race will already have made sure the operations you listed have run. So by not returning you would essentially just continue performing a bunch of unnecessary operations that won't do anything. >> >> I would prefer though to stay as close to my original fix as possible as that one has gone through extensive testing, and I would like to push this before the cut-off for P3 bugs to 13 tomorrow. >> >> Here is my latest attempt: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ >> >> Here is an incremental webrev to the original proposition: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ >> >> As you can see I have only changed the guarantee to assert as requested by Coleen, and added a bunch of commentary to explain why e.g. a failing transition does not need to worry about the 3 side effects before the failure. Hope my comments make sense. 
>> >> I hope you think this is okay. If there is more clarification or cleanups you would like to see, I am more than happy to file such RFEs for 14. >> > > I'm OK with this, but yes, please file an RFE for cleanup in 14. It's not obvious to me that make_not_entrant and make_unloaded are doing the same operations. Some refactoring here should help a lot. > > dl > >> Thanks, >> /Erik >> >>> On 2019-07-17 09:17, dean.long at oracle.com wrote: >>>> On 7/16/19 10:51 AM, dean.long at oracle.com wrote: >>>> Back to the make_not_entrant / make_unloaded race. If make_not_entrant bails out half-way through because make_unloaded won the race, doesn't that mean that make_unloaded needs to have already done all the work that make_not_entrant is not doing? unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, unregister_nmethod, etc. >>> >>> What I'm thinking is, what happens if instead of this: >>> >>> 1365 // Change state >>> 1366 if (!try_transition(state)) { >>> 1367 return false; >>> 1368 } >>> we do this: 1365 // Maybe change state >>> 1366 if (!try_transition(state)) { >>> 1367 // fall through 1368 } >>> >>> dl >>> > From coleen.phillimore at oracle.com Wed Jul 17 20:19:16 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 17 Jul 2019 16:19:16 -0400 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> Message-ID: <66e91dff-2d63-c7ce-a801-4b49dcb979de@oracle.com> On 7/17/19 3:24 PM, dean.long at oracle.com wrote: > On 7/17/19 9:59 AM, Erik Österlund wrote: >> Hi Dean, >> >> You are correct that the winner of the race will already have made >> 
sure the operations you listed have run. So by not returning you >> would essentially just continue performing a bunch of unnecessary >> operations that won't do anything. >> >> I would prefer though to stay as close to my original fix as possible >> as that one has gone through extensive testing, and I would like to >> push this before the cut-off for P3 bugs to 13 tomorrow. >> >> Here is my latest attempt: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ >> >> Here is an incremental webrev to the original proposition: >> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ >> >> As you can see I have only changed the guarantee to assert as >> requested by Coleen, and added a bunch of commentary to explain why >> e.g. a failing transition does not need to worry about the 3 side >> effects before the failure. Hope my comments make sense. >> >> I hope you think this is okay. If there is more clarification or >> cleanups you would like to see, I am more than happy to file such >> RFEs for 14. >> > > I'm OK with this, but yes, please file an RFE for cleanup in 14. It's > not obvious to me that make_not_entrant and make_unloaded are doing > the same operations. Some refactoring here should help a lot. I agree. I had to reread the code to see it (and ask Erik!) and it still looks different. The change looks good for 13 though, and the comments are helpful. Thanks, Coleen > > dl > >> Thanks, >> /Erik >> >> On 2019-07-17 09:17, dean.long at oracle.com wrote: >>> On 7/16/19 10:51 AM, dean.long at oracle.com wrote: >>>> Back to the make_not_entrant / make_unloaded race. If >>>> make_not_entrant bails out half-way through because make_unloaded >>>> won the race, doesn't that mean that make_unloaded needs to have >>>> already done all the work that make_not_entrant is not doing? >>>> unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, >>>> unregister_nmethod, etc. 
>>> >>> What I'm thinking is, what happens if instead of this: >>> >>> 1365 // Change state >>> 1366 if (!try_transition(state)) { >>> 1367 return false; >>> 1368 } >>> we do this: 1365 // Maybe change state >>> 1366 if (!try_transition(state)) { >>> 1367 // fall through 1368 } >>> >>> dl >>> > From erik.osterlund at oracle.com Wed Jul 17 20:22:35 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 17 Jul 2019 22:22:35 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <66e91dff-2d63-c7ce-a801-4b49dcb979de@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> <554ae6d4-cfe4-f1b1-7ab6-cb8e9f2fa337@oracle.com> <66e91dff-2d63-c7ce-a801-4b49dcb979de@oracle.com> Message-ID: <6606F22B-EB5E-4EC4-A68C-F42D651A8F06@oracle.com> Hi Coleen, Thanks for the review. /Erik > On 17 Jul 2019, at 22:19, coleen.phillimore at oracle.com wrote: > > > >> On 7/17/19 3:24 PM, dean.long at oracle.com wrote: >>> On 7/17/19 9:59 AM, Erik Österlund wrote: >>> Hi Dean, >>> >>> You are correct that the winner of the race will already have made sure the operations you listed have run. So by not returning you would essentially just continue performing a bunch of unnecessary operations that won't do anything. >>> >>> I would prefer though to stay as close to my original fix as possible as that one has gone through extensive testing, and I would like to push this before the cut-off for P3 bugs to 13 tomorrow. 
>>> >>> Here is my latest attempt: >>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ >>> >>> Here is an incremental webrev to the original proposition: >>> http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ >>> >>> As you can see I have only changed the guarantee to assert as requested by Coleen, and added a bunch of commentary to explain why e.g. a failing transition does not need to worry about the 3 side effects before the failure. Hope my comments make sense. >>> >>> I hope you think this is okay. If there is more clarification or cleanups you would like to see, I am more than happy to file such RFEs for 14. >>> >> >> I'm OK with this, but yes, please file an RFE for cleanup in 14. It's not obvious to me that make_not_entrant and make_unloaded are doing the same operations. Some refactoring here should help a lot. > > I agree. I had to reread the code to see it (and ask Erik!) and it still looks different. The change looks good for 13 though, and the comments are helpful. > > Thanks, > Coleen > >> >> dl >> >>> Thanks, >>> /Erik >>> >>>> On 2019-07-17 09:17, dean.long at oracle.com wrote: >>>>> On 7/16/19 10:51 AM, dean.long at oracle.com wrote: >>>>> Back to the make_not_entrant / make_unloaded race. If make_not_entrant bails out half-way through because make_unloaded won the race, doesn't that mean that make_unloaded needs to have already done all the work that make_not_entrant is not doing? unlink_from_method, invalidate_nmethod_mirror, remove_osr_nmethod, unregister_nmethod, etc. 
>>>> >>>> What I'm thinking is, what happens if instead of this: >>>> >>>> 1365 // Change state >>>> 1366 if (!try_transition(state)) { >>>> 1367 return false; >>>> 1368 } >>>> we do this: 1365 // Maybe change state >>>> 1366 if (!try_transition(state)) { >>>> 1367 // fall through 1368 } >>>> >>>> dl >>>> >> > From vladimir.x.ivanov at oracle.com Wed Jul 17 21:35:08 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 18 Jul 2019 00:35:08 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> Message-ID: <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> Thanks, Martin and Dmitrij for reviews. ... >> If you have upcalls from JVM code in mind, then there's already a >> barrier on caller side: JavaCalls::call_static() calls into >> LinkResolver::resolve_static_call() which has initialization barrier. >> So, there's no need to repeat the check. As an afterthought, I decided to update the comment in SharedRuntime::handle_wrong_method() to clarify the difference in behavior between upcalls coming from JVM & JNI. Best regards, Vladimir Ivanov >>>> -----Original Message----- >>>> From: Vladimir Ivanov >>>> Sent: Mittwoch, 17. Juli 2019 15:07 >>>> To: Doerr, Martin ; hotspot- >>>> dev at openjdk.java.net; Dmitrij Pochepko > sw.com> >>>> Subject: Re: RFR[13]: 8227260: Can't deal with >>>> SharedRuntime::handle_wrong_method triggering more than once for >>>> interpreter calls >>>> >>>> Thanks, Erik. >>>> >>>> Also, since I touch platform-specific code, I'd like Martin and Dmitrij >>>> (implementors of support for s390, ppc, and aarch64) to take a look at >>>> the patch as well. >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> On 17/07/2019 15:25, Erik Österlund wrote: >>>>> Hi Vladimir, >>>>> >>>>> Looks good. Thanks for fixing.
>>>>> >>>>> /Erik >>>>> >>>>> On 2019-07-17 12:26, Vladimir Ivanov wrote: >>>>>> Revised fix: >>>>>> http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >>>>>> >>>>>> It turned out the problem is not specific to i2c2i: fast class >>>>>> initialization barriers on nmethod entry trigger the assert as well. >>>>>> >>>>>> JNI upcalls (CallStaticMethod) don't have class initialization >>>>>> checks, so it's possible to initiate a JNI upcall from a >>>>>> non-initializing thread and JVM should let it complete. >>>>>> >>>>>> It leads to a busy loop (asserts in debug) between nmethod entry >>>>>> barrier & SharedRuntime::handle_wrong_method until holder class is >>>>>> initialized (possibly infinite if it blocks class initialization). >>>>>> >>>>>> Proposed fix is to keep using c2i, but jump over class initialization >>>>>> barrier right to the argument shuffling logic on verified entry when >>>>>> coming from SharedRuntime::handle_wrong_method. >>>>>> >>>>>> Improved regression test reliably reproduces the problem. >>>>>> >>>>>> Testing: regression test, hs-precheckin-comp, tier1-6 >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> On 04/07/2019 18:02, Erik Österlund wrote: >>>>>>> Hi, >>>>>>> >>>>>>> The i2c adapter sets a thread-local "callee_target" Method*, which is >>>>>>> caught (and cleared) by SharedRuntime::handle_wrong_method if >> the >>>> i2c >>>>>>> call is "bad" (e.g. not_entrant). This error handler forwards >>>>>>> execution to the callee c2i entry.
If the >>>>>>> SharedRuntime::handle_wrong_method method is called again due >> to >>>> the >>>>>>> i2c2i call being still bad, then we will crash the VM in the >>>>>>> following guarantee in SharedRuntime::handle_wrong_method: >>>>>>> >>>>>>> Method* callee = thread->callee_target(); >>>>>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>>>>>> >>>>>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits >>>>>>> the new class initialization entry barrier of the c2i adapter. >>>>>>> The solution is to simply not clear the thread-local "callee_target" >>>>>>> after handling the first failure, as we can't really know there won't >>>>>>> be another one. There is no reason to clear this value as nobody else >>>>>>> reads it than the SharedRuntime::handle_wrong_method handler >> (and >>>> we >>>>>>> really do want it to be able to read the value as many times as it >>>>>>> takes until the call goes through). I found some confused clearing of >>>>>>> this callee_target in JavaThread::oops_do(), with a comment saying >>>>>>> this is a methodOop that we need to clear to make GC happy or >>>>>>> something. Seems like old traces of perm gen. So I deleted that too. >>>>>>> >>>>>>> I caught this in ZGC where the timing window for hitting this issue >>>>>>> seems to be wider due to concurrent code cache unloading. But it is >>>>>>> equally problematic for all GCs. 
>>>>>>> >>>>>>> Bug: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>>>>>> >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>>>>>> >>>>>>> Thanks, >>>>>>> /Erik From kim.barrett at oracle.com Wed Jul 17 22:48:13 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 17 Jul 2019 18:48:13 -0400 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: References: Message-ID: <16F0945B-E74B-472A-ADCF-5363FAAC9461@oracle.com> > On Jul 16, 2019, at 11:52 AM, Vladimir Kozlov wrote: > > Here goes my work for JVMCI oops handling ;) Yeah, sorry about that. I think I went on vacation in the middle of the review of those changes. > Kim, after this change the only use for JVMCI::_object_handles is JVMCI::is_global_handle() which is only used in assert() in deleteGlobalHandle() in jvmciCompilerToVM.cpp. Do we really need it there? May be we should remove this use too. I hadn't looked through the uses carefully, assuming replacement of one OopStorage with another wouldn't uncover any problems. Unfortunately, it turns out there's a pre-existing bug lurking. JVMCI::_object_handles is also used by JVMCI::make_global, not surprisingly. What did surprise me is the lack of a corresponding destroy function. And then I looked at deleteGlobalHandles() in jvmciCompilerToVM.cpp, and it seems to have not been updated from the old JNIHandleBlock implementation when these global handles were changed to use OopStorage. So instead of calling OopStorage::release, deleteGlobalHandles just leaks the handles. I'm posting the fix for this as an incremental change on my JDK-8227653 change, but I'm wondering if it ought to be split out into a separate bug fix for JDK 13. Webrevs: full: http://cr.openjdk.java.net/~kbarrett/8227653/open.01/ incr: http://cr.openjdk.java.net/~kbarrett/8227653/open.01.inc/ Testing: mach5 hs-tier3-5 (in progress), which do some graal testing. 
From vladimir.kozlov at oracle.com Wed Jul 17 23:36:39 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 17 Jul 2019 16:36:39 -0700 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: <16F0945B-E74B-472A-ADCF-5363FAAC9461@oracle.com> References: <16F0945B-E74B-472A-ADCF-5363FAAC9461@oracle.com> Message-ID: <6dbf15d1-e340-1541-3704-a05c376be7b1@oracle.com> Thank you, Kim Good. Please file bug for JDK 13 and assign it to me. I will port your JVMCI fix. Thanks, Vladimir On 7/17/19 3:48 PM, Kim Barrett wrote: >> On Jul 16, 2019, at 11:52 AM, Vladimir Kozlov wrote: >> >> Here goes my work for JVMCI oops handling ;) > > Yeah, sorry about that. I think I went on vacation in the middle of > the review of those changes. > >> Kim, after this change the only use for JVMCI::_object_handles is JVMCI::is_global_handle() which is only used in assert() in deleteGlobalHandle() in jvmciCompilerToVM.cpp. Do we really need it there? May be we should remove this use too. > > I hadn't looked through the uses carefully, assuming replacement of > one OopStorage with another wouldn't uncover any problems. > > Unfortunately, it turns out there's a pre-existing bug lurking. > > JVMCI::_object_handles is also used by JVMCI::make_global, not > surprisingly. What did surprise me is the lack of a corresponding > destroy function. And then I looked at deleteGlobalHandles() in > jvmciCompilerToVM.cpp, and it seems to have not been updated from the > old JNIHandleBlock implementation when these global handles were > changed to use OopStorage. So instead of calling OopStorage::release, > deleteGlobalHandles just leaks the handles. > > I'm posting the fix for this as an incremental change on my > JDK-8227653 change, but I'm wondering if it ought to be split out into > a separate bug fix for JDK 13. 
> > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8227653/open.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8227653/open.01.inc/ > > Testing: > mach5 hs-tier3-5 (in progress), which do some graal testing. > From mikhailo.seledtsov at oracle.com Thu Jul 18 01:38:11 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 17 Jul 2019 18:38:11 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> Message-ID: <4899d2fa-bfd1-153a-7d8b-ade73cba0289@oracle.com> On 7/17/19 11:37 AM, Igor Ignatyev wrote: > > Hi Severin, > > the updated webrev looks good to me, please see a couple comments below. > > Cheers, > -- Igor > >> On Jul 17, 2019, at 10:34 AM, Severin Gehwolf > > wrote: >> >> Hi Misha, >> >> On Wed, 2019-07-17 at 10:22 -0700,mikhailo.seledtsov at oracle.com >> wrote: >>> Hi Severin, >>> >>> On 7/17/19 5:44 AM, Severin Gehwolf wrote: >>>> Hi Igor, Misha, >>>> >>>> On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: >>>>> Hi Severin, >>>>> >>>>> I don't think that tests (or test libraries for that matter) should >>>>> be responsible for setting correct PATH value, it should be a part of >>>>> host configuration procedure (tests can/should check that all >>>>> required bins are available though). in other words, I'd prefer if >>>>> you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and >>>>> TestJFREvents. the rest looks good to me. >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ >>>> >>>> No more additions to PATH are being done. 
>>>> I've discovered that VMProps.java which defines "docker.required", used >>>> the "docker" binary even for podman test runs. This ended up not >>>> running most of the tests even with -Djdk.test.docker.command=podman >>>> specified. >>> Good catch. > should we rename docker.support and DOCKER_COMMAND to something more > abstract? Now that more container technologies are coming online we could consider more generic names for these properties/variables. Here are some thoughts: - container.support (CONTAINER_COMMAND) - may be too generic - linux.container.support (LINUX_CONTAINER_COMMAND) - more narrow - even more narrow/specific: oci.container.support (OCI_CONTAINER_COMMAND) OCI in this case is "Open Container Initiative", (Linux Foundation project to design open standards for Linux Container technology) I believe both Docker and Podman are OCI compliant. However, I would recommend to do this work as part of a new RFE. If we agree, I will create an RFE, and we can continue discussion in the context of that RFE. Thank you, Misha > >>>> I've fixed that by moving DOCKER_COMMAND to Platform.java so >>>> that it can be used in both places. >>> >>> Sounds good to me. >>> >>> (of course, the alternative would be to import >>> jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not >>> sure if there are any potential problems doing it this way) >> >> I've tried that but for some reason this was a problem and VMProps >> failed to compile. I don't know exactly how those jtreg extensions work >> and went with the Platform approach. Hope that's OK. > > all files needed for VMProps (or other @requires expression class) > have to be listed in requires.extraPropDefns or > requires.extraPropDefns.bootlibs property in TEST.ROOT file in all the > test suites which use these extensions.
we are trying to be very > cautious in what is used by VMProps (directly and indirectly) so these > lists won't grow and we won't require any modules other than > java.base, given DockerTestUtils has dependencies on a number of other > library classes, the Platform approach is much better from that point > of view. > >> >>>> Testing: Container tests with docker daemon running on Linux x86_64, >>>> container tests without docker daemon running (podman is daemon-less) >>>> via the podman binary on Linux x86_64 (with -e:PATH). All pass. >>> >>> Sounds good. >>> >>> >>> Overall looks good to me. >> >> Thanks for the review! >> >>> One minor nit: DockerTestUtils.java does not need "import >>> java.util.Map;" (no need to post updated webrev for this change) >> >> OK, good catch. Fixed locally. >> >> Thanks, >> Severin >> >>> >>> Thank you, >>> >>> Misha >>> >>>> More thoughts? >>>> >>>> Thanks, >>>> Severin >>>> >>>>> Thanks, >>>>> -- Igor >>>>> >>>>>> On Jul 16, 2019, at 5:36 AM, Severin Gehwolf >>>>> > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I believe I still need a *R*eviewer for this. Any takers? >>>>>> >>>>>> Thanks, >>>>>> Severin >>>>>> >>>>>> On Fri, 2019-07-12 at 15:19 -0700, mikhailo.seledtsov at oracle.com >>>>>> wrote: >>>>>>> Hi Severin, >>>>>>> >>>>>>> The change looks good to me. Thank you for adding support for >>>>>>> Podman >>>>>>> container technology. >>>>>>> >>>>>>> Testing: I ran both HotSpot and JDK container tests with your patch; >>>>>>> tests executed on Oracle Linux 7.6 using default container >>>>>>> engine (Docker): >>>>>>> >>>>>>> test/hotspot/jtreg/containers/ AND >>>>>>> test/jdk/jdk/internal/platform/docker/ >>>>>>> >>>>>>> All PASS >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Misha >>>>>>> >>>>>>> >>>>>>> On 7/12/19 11:08 AM, Severin Gehwolf wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> There is an alternative container engine which is being used by >>>>>>>> Fedora >>>>>>>> and RHEL 8, called podman[1].
It's mostly compatible with >>>>>>>> docker. It >>>>>>>> looks like OpenJDK docker tests can be made podman compatible >>>>>>>> with a >>>>>>>> few little tweaks. One "interesting" one is to not assert >>>>>>>> "Successfully >>>>>>>> built" in the build output but only rely on the exit code, >>>>>>>> which seems >>>>>>>> to be OK for my testing. Interestingly the test would be skipped in >>>>>>>> that case. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ >>>>>>>> >>>>>>>> Adjustments I've done: >>>>>>>> * Don't assert "Successfully built" in image build output[2]. >>>>>>>> * Add /usr/sbin to PATH as the podman binary relies on >>>>>>>> iptables for it >>>>>>>> to work which is in /usr/sbin on Fedora >>>>>>>> * Allow for Metrics.getCpuSystemUsage() and >>>>>>>> Metrics.getCpuUserUsage() >>>>>>>> to be equal to the previous value. I've found those counters >>>>>>>> to be >>>>>>>> slowly increasing, which made the tests unreliable. >>>>>>>> >>>>>>>> Testing: >>>>>>>> >>>>>>>> Running docker tests with docker as engine. Did the same with >>>>>>>> podman as >>>>>>>> engine via -Djdk.test.docker.command=podman on Linux x86_64. Both >>>>>>>> passed (non-trivially). >>>>>>>> >>>>>>>> Thoughts?
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Severin >>>>>>>> >>>>>>>> [1] https://podman.io/ >>>>>>>> [2] Image builds with podman look >>>>>>>> like ("COMMIT" over "Successfully built"): >>>>>>>> STEP 1: FROM fedora:29 >>>>>>>> STEP 2: RUN dnf install -y java-11-openjdk-devel && dnf >>>>>>>> clean all >>>>>>>> --> Using cache >>>>>>>> 96f8b1a0dfe7dba581a64fc67a27002ddf52e032af55f9ddc765182a690afd9d >>>>>>>> STEP 3: COPY TestMetrics.class TestMetrics.java /opt/ >>>>>>>> 269042160f7a4e6a06789cd19640ea658a8f941bc53de0fd40a574dc3bdb49a8 >>>>>>>> STEP 4: CMD /usr/lib/jvm/java-11-openjdk/bin/java -cp /opt >>>>>>>> --add-modules java.base --add-exports >>>>>>>> java.base/jdk.internal.platform=ALL-UNNAMED TestMetrics >>>>>>>> STEP 5: COMMIT fedora-metrics-11 >>>>>>>> d749088d6ce4510f212820ad4eca55a9b05e5c5c245f2372b6cfe91926e8cd7e > From igor.ignatyev at oracle.com Thu Jul 18 01:45:10 2019 From: igor.ignatyev at oracle.com (Igor Ignatev) Date: Wed, 17 Jul 2019 18:45:10 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <4899d2fa-bfd1-153a-7d8b-ade73cba0289@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> <4899d2fa-bfd1-153a-7d8b-ade73cba0289@oracle.com> Message-ID: <39875294-AD67-4255-9E52-792A31A4F233@oracle.com> We definitely should do it as a separate RFE, I meant to write it in my email, but was interrupted by a fire drill, and forgot about it when returned. -- Igor > On Jul 17, 2019, at 6:38 PM, mikhailo.seledtsov at oracle.com wrote: > > However, I would recommend to do this work as part of a new RFE. If we agree, I will create an RFE, and we can continue discussion in the context of that RFE.
> From mikhailo.seledtsov at oracle.com Thu Jul 18 01:58:51 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Wed, 17 Jul 2019 18:58:51 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <39875294-AD67-4255-9E52-792A31A4F233@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> <4899d2fa-bfd1-153a-7d8b-ade73cba0289@oracle.com> <39875294-AD67-4255-9E52-792A31A4F233@oracle.com> Message-ID: <636b3321-f04d-b537-6d24-c4b3a17c37f0@oracle.com> Sounds good, Thank you, Misha On 7/17/19 6:45 PM, Igor Ignatev wrote: > We definitely should do it as a separate RFE, I meant to write it in > my email, but was interrupted by a fire drill, and forgot about it > when returned. > > -- Igor > > On Jul 17, 2019, at 6:38 PM, mikhailo.seledtsov at oracle.com > wrote: > >> However, I would recommend to do this work as part of a new RFE. If >> we agree, I will create an RFE, and we can continue discussion in the >> context of that RFE. >> From david.holmes at oracle.com Thu Jul 18 02:28:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 12:28:55 +1000 Subject: 8227652: SetupOperatorNewDeleteCheck should discuss deleting destructors In-Reply-To: <40590A26-1A32-4B3F-B1D8-55A56090C5F4@oracle.com> References: <40590A26-1A32-4B3F-B1D8-55A56090C5F4@oracle.com> Message-ID: <08ef9d8e-f74d-83cb-4a9a-ac04364c2b0f@oracle.com> Looks fine and trivial to me. Thanks, David On 16/07/2019 5:51 am, Kim Barrett wrote: > Please review this explanatory comment being added to the description > of the check for using global operator new/delete in Hotspot code.
> The described situation is somewhat obscure, and encountering it for > the first time (or again after a long time, as happened to me recently) > can be quite puzzling. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8227652 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227652/open.00/ > From david.holmes at oracle.com Thu Jul 18 04:43:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 14:43:55 +1000 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: <2F2CE24E-9DDB-489D-9CC6-3296C0149B9A@oracle.com> References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> <2F2CE24E-9DDB-489D-9CC6-3296C0149B9A@oracle.com> Message-ID: Hi Igor, This seems fine to me. Thanks, David On 17/07/2019 7:35 am, Igor Ignatyev wrote: > can I get a review for this patch? > http://cr.openjdk.java.net/~iignatyev//8226910/webrev.01/index.html > > Thanks, > -- Igor > >> On Jul 6, 2019, at 11:50 AM, Igor Ignatyev > > wrote: >> >> Hi David, >> >>> On Jul 6, 2019, at 1:58 AM, David Holmes >> > wrote: >>> >>> Hi Igor, >>> >>> On 6/07/2019 1:09 pm, Igor Ignatyev wrote: >>>> ping? >>>> -- Igor >>>>> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev >>>>> > wrote: >>>>> >>>>> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>>> 25 lines changed: 18 ins; 3 del; 4 mod; >>>>> >>>>> Hi all, >>>>> >>>>> could you please review this small patch which adds >>>>> JTREG_RUN_PROBLEM_LISTS options to run-test framework? when >>>>> JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem >>>>> lists as values of -match: instead of -exclude, which effectively >>>>> means it will run only problem listed tests. >>> >>> doc/testing.md >>> >>> + Set to `true` of `false`. >>> >>> typo: s/of/or/ >> fixed .md, regenerated .html. >>> >>> Build changes seem okay - I can't attest to the operation of the flag. 
>> >> here is how I verified that it does that it supposed to: >> >> $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" >> TEST=open/test/hotspot/jtreg/:hotspot_all >> lists 53 tests, the same command w/o RUN_PROBLEM_LISTS (or w/ >> RUN_PROBLEM_LISTS=false) lists 6698 tests. >> >> $ make test >> "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt >> lists 81 tests, the same command w/o RUN_PROBLEM_LISTS lists 6670 tests. >> >>> >>>>> doc/building.html got changed when I ran update-build-docs, I can >>>>> exclude it from the patch, but it seems it will keep changing every >>>>> time we run update-build-docs, so I decided to at least bring it up. >>> >>> Weird it seems to have removed line-breaks in that paragraph. What >>> platform did you build on? >> I built on macos. now when I wrote that, I remember pandoc used to >> produce different results on macos. so I've rerun it on linux on the >> source w/o my change, and doc/building.html still got changed in the >> exact same way. 
>> >>> David >>> ----- >>> >>>>> >>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8226910 >>>>> webrev:http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>> >>>>> Thanks, >>>>> -- Igor > From tobias.hartmann at oracle.com Thu Jul 18 05:17:04 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 18 Jul 2019 07:17:04 +0200 Subject: RFR[13]: 8224674: NMethod state machine is not monotonic In-Reply-To: <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> References: <625f018c-4eb1-09bb-e2b3-0a41ba65db19@oracle.com> <4380063e-f08a-5c0d-5f90-aac4e0fdb570@oracle.com> <00d16c64-dc06-f0fa-6bd3-2d3fbc3a857c@oracle.com> <34ee8d9e-f668-2f3f-07f7-3c959c843e7f@oracle.com> <6729068e-06b6-e7e6-b675-4c305ab18196@oracle.com> Message-ID: Hi Erik, On 17.07.19 18:59, Erik Österlund wrote: > Here is my latest attempt: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.03/ > > Here is an incremental webrev to the original proposition: > http://cr.openjdk.java.net/~eosterlund/8224674/webrev.00_03/ Still looks good to me.
> > Best regards, > Tobias From dean.long at oracle.com Thu Jul 18 06:26:14 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 17 Jul 2019 23:26:14 -0700 Subject: RFR: 8227633: avoid comparing this pointers to NULL In-Reply-To: <5ace8298-e942-09ab-43ce-874937c160ba@oracle.com> References: <5ace8298-e942-09ab-43ce-874937c160ba@oracle.com> Message-ID: <5f0e20ab-b9cf-b882-d0dc-64f4d908d024@oracle.com> The adlc changes look OK. dl On 7/16/19 8:30 AM, coleen.phillimore at oracle.com wrote: > > This looks good to me. I don't know this compiler code very well, so > please wait for a second reviewer. > Thanks, > Coleen > > On 7/16/19 9:01 AM, Baesken, Matthias wrote: >> Hello Coleen , >> >> I adjusted the check in formssel.cpp to if (mnode != NULL) , >> >>>> I didn't see that you added a check for NULL in the callers of >>>> print_opcodes >> and added NULL checks to the _inst._opcode->print_opcode calls >> in src/hotspot/share/adlc/output_c.cpp . >> >> Regarding Set::setstr() in src/hotspot/share/libadt/set.cpp , >> This is used in print() and this can be called "conveniently >> in the debugger" (see set.hpp ). >> So I think it is okay to remove the check . >> >> >> please see the new webrev : >> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.2/ >> >> >> Thanks, Matthias >> >> >> >>>> + if (mnode) mnode->count_instr_names(names); >>>> >>>> >>>> We also try to avoid implicit checks against null for pointers so >>>> change >>>> this to: >>>> >>> Hi Coleen, sure I can change this ; I just found a lot of places >>> in formssel.cpp >>> where if (ptr) { ... } is used . >>> >>>> I didn't see that you added a check for NULL in the callers of >>>> print_opcodes or setstr. Can those callers never pass NULL? >>>> >>> It looked to me that the setstr is never really called and void >>> Set::print() >>> const { ... } where it is used is used for debug printing - did I >>> miss something >>> ?
>>> >>> Regarding print_opcodes, there probably the NULL checks at >>> caller places >>> should better be added . >>> >>> Regards, Matthias >>> > From Alan.Bateman at oracle.com Thu Jul 18 06:27:10 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 18 Jul 2019 07:27:10 +0100 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal In-Reply-To: References: Message-ID: On 12/07/2019 04:27, Pengfei Li (Arm Technology China) wrote: > Hi, > > Please help review this small fix. > JBS: https://bugs.openjdk.java.net/browse/JDK-8227512 > Webrev: http://cr.openjdk.java.net/~pli/rfr/8227512/ > > JTReg javac tests > * langtools/tools/javac/modules/InheritRuntimeEnvironmentTest.java > * langtools/tools/javac/file/LimitedImage.java > failed when Graal is used as JVMCI compiler. > > These cases test javac behavior with the condition that observable modules are limited. But Graal is unable to be found in the limited module scope. This fixes these two tests by adding "jdk.internal.vm.compiler" into the limited modules. > I see this has been pushed but it looks like it is missing `@modules jdk.internal.vm.compiler` as the test now requires this module to be in the run-time image under test.
As the test is not interesting when testing with the > Graal compiler then maybe an alternative is to add > `@requires !vm.graal.enabled` so that the test is not selected when > exercising Graal - we've done this in a few other tests that run with `--limit-modules`. Thanks for reply. I've used this alternative approach before when I tried to clean up other false failures in Graal jtreg (see http://hg.openjdk.java.net/jdk/jdk/rev/206afa6372ae). This time I choose to add the missing module because I thought the javac test would be interesting when Graal is used since javac is also written in Java. This change is already pushed, but it's fine to me if you would like to submit another patch to disable these two cases with Graal. -- Thanks, Pengfei From matthias.baesken at sap.com Thu Jul 18 07:00:41 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 07:00:41 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: Message-ID: Thanks ! May I get a second review please ? Best regards, Matthias > -----Original Message----- > From: Langer, Christoph > Sent: Mittwoch, 17. Juli 2019 18:45 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' ; 'ppc-aix-port- > dev at openjdk.java.net' > Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > Hi Matthias, > > thanks for this tedious cleanup. Looks good to me. > > Best regards > Christoph > > > -----Original Message----- > > From: hotspot-dev On Behalf > Of > > Baesken, Matthias > > Sent: Mittwoch, 17. Juli 2019 17:07 > > To: 'hotspot-dev at openjdk.java.net' ; > > 'ppc-aix-port-dev at openjdk.java.net' dev at openjdk.java.net> > > Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > > Hello, there are a couple of non matching format specifiers in os_aix.cpp . > > I adjust them with my change . > > > > Please review !
> > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8227869 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > > > > Thanks, Matthias From david.holmes at oracle.com Thu Jul 18 07:08:17 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 17:08:17 +1000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: Message-ID: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> Hi Matthias, On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > Thanks ! May I get a second review please ? @@ -1888,12 +1887,12 @@ if (!contains_range(p, s)) { trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " "range of [" PTR_FORMAT " - " PTR_FORMAT "].", - p, p + s, addr, addr + size); + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); pointers should be used with PTR_FORMAT. p2i(p) should be used with INTPTR_FORMAT. So the above looks like it was already correct and now is not correct. Using p2i with UINTX_FORMAT also looks dubious to me. Cheers, David ----- > Best regards, Matthias > > > >> -----Original Message----- >> From: Langer, Christoph >> Sent: Mittwoch, 17. Juli 2019 18:45 >> To: Baesken, Matthias ; 'hotspot- >> dev at openjdk.java.net' ; 'ppc-aix-port- >> dev at openjdk.java.net' >> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >> >> Hi Matthias, >> >> thanks for this tedious cleanup. Looks good to me. >> >> Best regards >> Christoph >> >>> -----Original Message----- >>> From: hotspot-dev On Behalf >> Of >>> Baesken, Matthias >>> Sent: Mittwoch, 17. Juli 2019 17:07 >>> To: 'hotspot-dev at openjdk.java.net' ; >>> 'ppc-aix-port-dev at openjdk.java.net' > dev at openjdk.java.net> >>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>> >>> Hello, there are a couple of non matching format specifiers in os_aix.cpp . >>> I adjust them with my change . >>> >>> Please review ! 
>>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8227869 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ >>> >>> Thanks, Matthias From matthias.baesken at sap.com Thu Jul 18 07:40:20 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 07:40:20 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> Message-ID: > pointers should be used with PTR_FORMAT. p2i(p) should be used with > INTPTR_FORMAT. So the above looks like it was already correct and now is > not correct. Hi David, I noticed p2i is used together with PTR_FORMAT at dozens locations in the HS code , did I miss something ? In os_aix.cpp we currently get these warnings , seems PTR_FORMAT is unsigned long , that?s why we see these warnings : /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] p, p + s, addr, addr + size); ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' fprintf(stderr, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] p, p + s, addr, addr + size); ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' fprintf(stderr, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] p, p + s, addr, addr + size); ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' fprintf(stderr, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ 
/nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] p, p + s, addr, addr + size); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' fprintf(stderr, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' fprintf(stderr, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 18. Juli 2019 09:08 > To: Baesken, Matthias ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > Hi Matthias, > > On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > > Thanks ! May I get a second review please ? > > @@ -1888,12 +1887,12 @@ > if (!contains_range(p, s)) { > trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " > "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > - p, p + s, addr, addr + size); > + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > > pointers should be used with PTR_FORMAT. p2i(p) should be used with > INTPTR_FORMAT. So the above looks like it was already correct and now is > not correct. 
Using p2i with UINTX_FORMAT also looks dubious to me. > > Cheers, > David > ----- > > > Best regards, Matthias > > > > > > > >> -----Original Message----- > >> From: Langer, Christoph > >> Sent: Mittwoch, 17. Juli 2019 18:45 > >> To: Baesken, Matthias ; 'hotspot- > >> dev at openjdk.java.net' ; 'ppc-aix-port- > >> dev at openjdk.java.net' > >> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >> > >> Hi Matthias, > >> > >> thanks for this tedious cleanup. Looks good to me. > >> > >> Best regards > >> Christoph > >> > >>> -----Original Message----- > >>> From: hotspot-dev On > Behalf > >> Of > >>> Baesken, Matthias > >>> Sent: Mittwoch, 17. Juli 2019 17:07 > >>> To: 'hotspot-dev at openjdk.java.net' ; > >>> 'ppc-aix-port-dev at openjdk.java.net' >> dev at openjdk.java.net> > >>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>> > >>> Hello, there are a couple of non matching format specifiers in os_aix.cpp > . > >>> I adjust them with my change . > >>> > >>> Please review ! > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8227869 > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > >>> > >>> Thanks, Matthias From david.holmes at oracle.com Thu Jul 18 08:04:45 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 18:04:45 +1000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> Message-ID: <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: >> pointers should be used with PTR_FORMAT. p2i(p) should be used with >> INTPTR_FORMAT. So the above looks like it was already correct and now is >> not correct. > > Hi David, I noticed p2i is used together with PTR_FORMAT at dozens locations in the HS code , did I miss something ? Okay our usage is a bit of a historical mess. 
:( > In os_aix.cpp we currently get these warnings , seems PTR_FORMAT is unsigned long , that's why we see these warnings : Defining PTR_FORMAT as an integral format is just broken - but dates back forever because %p wasn't portable. If this fixes things on AIX then that's fine. For new code I'd recommend use of INTPTR_FORMAT and p2i to print pointers. Thanks, David > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > p, p + s, addr, addr + size); > ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' > fprintf(stderr, fmt, ##__VA_ARGS__); \ > ^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > p, p + s, addr, addr + size); > ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' > fprintf(stderr, fmt, ##__VA_ARGS__); \ > ^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > p, p + s, addr, addr + size); > ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' > fprintf(stderr, fmt, ##__VA_ARGS__); \ > ^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > p, p + s, addr, addr + size); > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' > fprintf(stderr, fmt, ##__VA_ARGS__); \ > ^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format specifies type 'unsigned long' but the argument has type 'char *' 
[-Wformat] > " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from macro 'trcVerbose' > fprintf(stderr, fmt, ##__VA_ARGS__); \ > ^~~~~~~~~~~ > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Best regards, Matthias > > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 18. Juli 2019 09:08 >> To: Baesken, Matthias ; Langer, Christoph >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >> >> Hi Matthias, >> >> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: >>> Thanks ! May I get a second review please ? >> >> @@ -1888,12 +1887,12 @@ >> if (!contains_range(p, s)) { >> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " >> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", >> - p, p + s, addr, addr + size); >> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); >> >> pointers should be used with PTR_FORMAT. p2i(p) should be used with >> INTPTR_FORMAT. So the above looks like it was already correct and now is >> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. >> >> Cheers, >> David >> ----- >> >>> Best regards, Matthias >>> >>> >>> >>>> -----Original Message----- >>>> From: Langer, Christoph >>>> Sent: Mittwoch, 17. 
Juli 2019 18:45 >>>> To: Baesken, Matthias ; 'hotspot- >>>> dev at openjdk.java.net' ; 'ppc-aix-port- >>>> dev at openjdk.java.net' >>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>> >>>> Hi Matthias, >>>> >>>> thanks for this tedious cleanup. Looks good to me. >>>> >>>> Best regards >>>> Christoph >>>> >>>>> -----Original Message----- >>>>> From: hotspot-dev On >> Behalf >>>> Of >>>>> Baesken, Matthias >>>>> Sent: Mittwoch, 17. Juli 2019 17:07 >>>>> To: 'hotspot-dev at openjdk.java.net' ; >>>>> 'ppc-aix-port-dev at openjdk.java.net' >>> dev at openjdk.java.net> >>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>> >>>>> Hello, there are a couple of non matching format specifiers in os_aix.cpp >> . >>>>> I adjust them with my change . >>>>> >>>>> Please review ! >>>>> >>>>> Bug/webrev : >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 >>>>> >>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ >>>>> >>>>> Thanks, Matthias From matthias.baesken at sap.com Thu Jul 18 08:25:35 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 08:25:35 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> Message-ID: Hi David, do you see an issue using p2i with char* pointers , should I add a cast or some other conversion ? (afaik it is usually used without other casts/conversions in the codebase) jdk/src/hotspot/share/utilities/globalDefinitions.hpp : 1055 // Convert pointer to intptr_t, for use in printing pointers. 1056 inline intptr_t p2i(const void * p) { 1057 return (intptr_t) p; 1058 } > If this fixes things on AIX then that's fine. Yes it does . But I have to agree with you it feels a bit shaky ... 
Regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 18. Juli 2019 10:05 > To: Baesken, Matthias ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > On 18/07/2019 5:40 pm, Baesken, Matthias wrote: > >> pointers should be used with PTR_FORMAT. p2i(p) should be used with > >> INTPTR_FORMAT. So the above looks like it was already correct and now > is > >> not correct. > > > > Hi David, I noticed p2i is used together with PTR_FORMAT at dozens > locations in the HS code , did I miss something ? > > Okay our usage is a bit of a historical mess. :( > > > In os_aix.cpp we currently get these warnings , seems PTR_FORMAT is > unsigned long , that's why we see these warnings : > > Defining PTR_FORMAT as an integral format is just broken - but dates > back forever because %p wasn't portable. > > If this fixes things on AIX then that's fine. For new code I'd recommend > use of INTPTR_FORMAT and p2i to print pointers.
> > Thanks, > David > > > > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > p, p + s, addr, addr + size); > > ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from > macro 'trcVerbose' > > fprintf(stderr, fmt, ##__VA_ARGS__); \ > > ^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > p, p + s, addr, addr + size); > > ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from > macro 'trcVerbose' > > fprintf(stderr, fmt, ##__VA_ARGS__); \ > > ^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > p, p + s, addr, addr + size); > > ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from > macro 'trcVerbose' > > fprintf(stderr, fmt, ##__VA_ARGS__); \ > > ^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > p, p + s, addr, addr + size); > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from > macro 'trcVerbose' > > fprintf(stderr, fmt, ##__VA_ARGS__); \ > > ^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ > ~~~~~~~~~~~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from > macro 'trcVerbose' > > 
fprintf(stderr, fmt, ##__VA_ARGS__); \ > > ^~~~~~~~~~~ > > /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format > specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] > > " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > ~~~~~~~~~~~~~~~~~~~~ > > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 18. Juli 2019 09:08 > >> To: Baesken, Matthias ; Langer, Christoph > >> ; 'hotspot-dev at openjdk.java.net' >> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >> port-dev at openjdk.java.net> > >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >> > >> Hi Matthias, > >> > >> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > >>> Thanks ! May I get a second review please ? > >> > >> @@ -1888,12 +1887,12 @@ > >> if (!contains_range(p, s)) { > >> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " > >> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > >> - p, p + s, addr, addr + size); > >> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > >> > >> pointers should be used with PTR_FORMAT. p2i(p) should be used with > >> INTPTR_FORMAT. So the above looks like it was already correct and now > is > >> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. > >> > >> Cheers, > >> David > >> ----- > >> > >>> Best regards, Matthias > >>> > >>> > >>> > >>>> -----Original Message----- > >>>> From: Langer, Christoph > >>>> Sent: Mittwoch, 17. Juli 2019 18:45 > >>>> To: Baesken, Matthias ; 'hotspot- > >>>> dev at openjdk.java.net' ; 'ppc-aix- > port- > >>>> dev at openjdk.java.net' > >>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>> > >>>> Hi Matthias, > >>>> > >>>> thanks for this tedious cleanup. Looks good to me. 
> >>>> > >>>> Best regards > >>>> Christoph > >>>> > >>>>> -----Original Message----- > >>>>> From: hotspot-dev On > >> Behalf > >>>> Of > >>>>> Baesken, Matthias > >>>>> Sent: Mittwoch, 17. Juli 2019 17:07 > >>>>> To: 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; > >>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>> dev at openjdk.java.net> > >>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>>> > >>>>> Hello, there are a couple of non matching format specifiers in > os_aix.cpp > >> . > >>>>> I adjust them with my change . > >>>>> > >>>>> Please review ! > >>>>> > >>>>> Bug/webrev : > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 > >>>>> > >>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > >>>>> > >>>>> Thanks, Matthias From christoph.langer at sap.com Thu Jul 18 08:57:48 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 18 Jul 2019 08:57:48 +0000 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal In-Reply-To: References: Message-ID: Hi, we observe this issue on some of our platforms (ppc64, ppc64le) where graal/jdk.internal.vm.compiler is not available. So a good fix would either be to have `@requires !vm.graal.enabled` or, if jtreg supports it, we'd need two sets of @modules directives and VM Options (--limit-modules) to cover both cases, with or without aot. Does anybody know if this is possible? Thanks Christoph > -----Original Message----- > From: hotspot-dev On Behalf Of > Pengfei Li (Arm Technology China) > Sent: Donnerstag, 18. Juli 2019 08:52 > To: Alan Bateman > Cc: nd ; compiler-dev at openjdk.java.net; hotspot- > dev at openjdk.java.net > Subject: RE: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with > Graal > > Hi Alan, > > > I see this has been pushed but it looks like it is missing `@modules > > jdk.internal.vm.compiler` as the test now requires this module to be in the > > run-time image under test. 
As the test is not interesting when testing with > the > Graal compiler then maybe an alternative is to add > `@requires !vm.graal.enabled` so that the test is not selected when > exercising Graal - we've done this in a few other tests that run with `--limit- > modules`. > > Thanks for the reply. I've used this alternative approach before when I tried to > clean up other false failures in Graal jtreg (see > http://hg.openjdk.java.net/jdk/jdk/rev/206afa6372ae). This time I chose to > add the missing module because I thought the javac test would be > interesting when Graal is used since javac is also written in Java. This change > is already pushed, but it's fine to me if you would like to submit another > patch to disable these two cases with Graal. > > -- > Thanks, > Pengfei From martin.doerr at sap.com Thu Jul 18 09:08:25 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 18 Jul 2019 09:08:25 +0000 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> Message-ID: Hi Vladimir, > As an afterthought, I decided to update the comment in > SharedRuntime::handle_wrong_method() to clarify the difference in > behavior between upcalls coming from JVM & JNI. This sounds helpful. Thanks. 
Best regards, Martin From david.holmes at oracle.com Thu Jul 18 10:03:58 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 20:03:58 +1000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> Message-ID: <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> On 18/07/2019 6:25 pm, Baesken, Matthias wrote: > Hi David, do you see an issue using p2i with char* pointers , should I add a cast or some other conversion ? > (afaik it is usually used without other casts/conversions in the codebase) > > jdk/src/hotspot/share/utilities/globalDefinitions.hpp : > > 1055 // Convert pointer to intptr_t, for use in printing pointers. > 1056 inline intptr_t p2i(const void * p) { > 1057 return (intptr_t) p; > 1058 } p2i is what you should always use when printing a pointer to convert it to an integral type. But it should really be used with INTPTR_FORMAT. It will work with PTR_FORMAT due to other integral conversions. >> If this fixes things on AIX then that's fine. > > Yes it does . > But I have to agree with you it feels a bit shaky ... Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness IMHO. :) Cheers, David > > Regards, Matthias > > > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 18. Juli 2019 10:05 >> To: Baesken, Matthias ; Langer, Christoph >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >> >> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with >>>> INTPTR_FORMAT. So the above looks like it was already correct and now >> is >>>> not correct. 
>>> >>> Hi David, I noticed p2i is used together with PTR_FORMAT at dozens >> locations in the HS code , did I miss something ? >> >> Okay our usage is a bit of a historical mess. :( >> >>> In os_aix.cpp we currently get these warnings , seems PTR_FORMAT is >> unsigned long , that?s why we see these warnings : >> >> Defining PTR_FORMAT as an integral format it just broken - but dates >> back forever because %p wasn't portable. >> >> If this fixes things on AIX then that's fine. For new code I'd recommend >> use of INTPTR_FORMAT and p2i to print pointers. >> >> Thanks, >> David >> >>> >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> p, p + s, addr, addr + size); >>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from >> macro 'trcVerbose' >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>> ^~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> p, p + s, addr, addr + size); >>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from >> macro 'trcVerbose' >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>> ^~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> p, p + s, addr, addr + size); >>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from >> macro 'trcVerbose' >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>> ^~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> p, p + s, addr, addr + size); >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >>> 
/nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from >> macro 'trcVerbose' >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>> ^~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); >>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ >> ~~~~~~~~~~~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded from >> macro 'trcVerbose' >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>> ^~~~~~~~~~~ >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format >> specifies type 'unsigned long' but the argument has type 'char *' [-Wformat] >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); >>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >> ~~~~~~~~~~~~~~~~~~~~ >>> >>> Best regards, Matthias >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Donnerstag, 18. Juli 2019 09:08 >>>> To: Baesken, Matthias ; Langer, Christoph >>>> ; 'hotspot-dev at openjdk.java.net' >>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >>> port-dev at openjdk.java.net> >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>> >>>> Hi Matthias, >>>> >>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: >>>>> Thanks ! May I get a second review please ? >>>> >>>> @@ -1888,12 +1887,12 @@ >>>> if (!contains_range(p, s)) { >>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " >>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", >>>> - p, p + s, addr, addr + size); >>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); >>>> >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with >>>> INTPTR_FORMAT. So the above looks like it was already correct and now >> is >>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. 
>>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>> Best regards, Matthias >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Langer, Christoph >>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 >>>>>> To: Baesken, Matthias ; 'hotspot- >>>>>> dev at openjdk.java.net' ; 'ppc-aix- >> port- >>>>>> dev at openjdk.java.net' >>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>> >>>>>> Hi Matthias, >>>>>> >>>>>> thanks for this tedious cleanup. Looks good to me. >>>>>> >>>>>> Best regards >>>>>> Christoph >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: hotspot-dev On >>>> Behalf >>>>>> Of >>>>>>> Baesken, Matthias >>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 >>>>>>> To: 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; >>>>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>>> dev at openjdk.java.net> >>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>>> >>>>>>> Hello, there are a couple of non matching format specifiers in >> os_aix.cpp >>>> . >>>>>>> I adjust them with my change . >>>>>>> >>>>>>> Please review ! 
>>>>>>> >>>>>>> Bug/webrev : >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ >>>>>>> >>>>>>> Thanks, Matthias From martin.doerr at sap.com Thu Jul 18 10:15:25 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 18 Jul 2019 10:15:25 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> Message-ID: Hi David, there's no difference between INTPTR_FORMAT and PTR_FORMAT: #ifdef _LP64 #define INTPTR_FORMAT "0x%016" PRIxPTR #define PTR_FORMAT "0x%016" PRIxPTR #else // !_LP64 #define INTPTR_FORMAT "0x%08" PRIxPTR #define PTR_FORMAT "0x%08" PRIxPTR #endif // _LP64 I guess this was different in the past. I don't know why we still have both. Best regards, Martin > -----Original Message----- > From: ppc-aix-port-dev On > Behalf Of David Holmes > Sent: Donnerstag, 18. Juli 2019 12:04 > To: Baesken, Matthias ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > On 18/07/2019 6:25 pm, Baesken, Matthias wrote: > > Hi David, do you see an issue using p2i with char* pointers , should I add > a cast or some other conversion ? > > (afaik it is usually used without other casts/conversions in the codebase) > > > > jdk/src/hotspot/share/utilities/globalDefinitions.hpp : > > > > 1055 // Convert pointer to intptr_t, for use in printing pointers. > > 1056 inline intptr_t p2i(const void * p) { > > 1057 return (intptr_t) p; > > 1058 } > > p2i is what you should always use when printing a pointer to convert it > to an integral type. But it should really be used with INTPTR_FORMAT. 
It > will work with PTR_FORMAT due to other integral conversions. > > >> If this fixes things on AIX then that's fine. > > > > Yes it does . > > But I have to agree with you it feels a bit shaky ... > > Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness > IMHO. :) > > Cheers, > David > > > > > Regards, Matthias > > > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 18. Juli 2019 10:05 > >> To: Baesken, Matthias ; Langer, Christoph > >> ; 'hotspot-dev at openjdk.java.net' >> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >> port-dev at openjdk.java.net> > >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >> > >> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with > >>>> INTPTR_FORMAT. So the above looks like it was already correct and > now > >> is > >>>> not correct. > >>> > >>> Hi David, I noticed p2i is used together with PTR_FORMAT at > dozens > >> locations in the HS code , did I miss something ? > >> > >> Okay our usage is a bit of a historical mess. :( > >> > >>> In os_aix.cpp we currently get these warnings , seems PTR_FORMAT > is > >> unsigned long , that?s why we see these warnings : > >> > >> Defining PTR_FORMAT as an integral format it just broken - but dates > >> back forever because %p wasn't portable. > >> > >> If this fixes things on AIX then that's fine. For new code I'd recommend > >> use of INTPTR_FORMAT and p2i to print pointers. 
> >> > >> Thanks, > >> David > >> > >>> > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> p, p + s, addr, addr + size); > >>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > from > >> macro 'trcVerbose' > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>> ^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> p, p + s, addr, addr + size); > >>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > from > >> macro 'trcVerbose' > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>> ^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> p, p + s, addr, addr + size); > >>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > from > >> macro 'trcVerbose' > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>> ^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> p, p + s, addr, addr + size); > >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > from > >> macro 'trcVerbose' > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>> ^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > >>> > >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ > >> 
~~~~~~~~~~~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > from > >> macro 'trcVerbose' > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>> ^~~~~~~~~~~ > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format > >> specifies type 'unsigned long' but the argument has type 'char *' [- > Wformat] > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > >>> > >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > >> ~~~~~~~~~~~~~~~~~~~~ > >>> > >>> Best regards, Matthias > >>> > >>> > >>>> -----Original Message----- > >>>> From: David Holmes > >>>> Sent: Donnerstag, 18. Juli 2019 09:08 > >>>> To: Baesken, Matthias ; Langer, > Christoph > >>>> ; 'hotspot-dev at openjdk.java.net' > >>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' aix- > >>>> port-dev at openjdk.java.net> > >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>> > >>>> Hi Matthias, > >>>> > >>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > >>>>> Thanks ! May I get a second review please ? > >>>> > >>>> @@ -1888,12 +1887,12 @@ > >>>> if (!contains_range(p, s)) { > >>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " > >>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > >>>> - p, p + s, addr, addr + size); > >>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > >>>> > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with > >>>> INTPTR_FORMAT. So the above looks like it was already correct and > now > >> is > >>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. > >>>> > >>>> Cheers, > >>>> David > >>>> ----- > >>>> > >>>>> Best regards, Matthias > >>>>> > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: Langer, Christoph > >>>>>> Sent: Mittwoch, 17. 
Juli 2019 18:45 > >>>>>> To: Baesken, Matthias ; 'hotspot- > >>>>>> dev at openjdk.java.net' ; 'ppc-aix- > >> port- > >>>>>> dev at openjdk.java.net' > >>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>>>> > >>>>>> Hi Matthias, > >>>>>> > >>>>>> thanks for this tedious cleanup. Looks good to me. > >>>>>> > >>>>>> Best regards > >>>>>> Christoph > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: hotspot-dev On > >>>> Behalf > >>>>>> Of > >>>>>>> Baesken, Matthias > >>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 > >>>>>>> To: 'hotspot-dev at openjdk.java.net' >> dev at openjdk.java.net>; > >>>>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>>>> dev at openjdk.java.net> > >>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>>>>> > >>>>>>> Hello, there are a couple of non matching format specifiers in > >> os_aix.cpp > >>>> . > >>>>>>> I adjust them with my change . > >>>>>>> > >>>>>>> Please review ! > >>>>>>> > >>>>>>> Bug/webrev : > >>>>>>> > >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 > >>>>>>> > >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > >>>>>>> > >>>>>>> Thanks, Matthias From matthias.baesken at sap.com Thu Jul 18 10:31:41 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 10:31:41 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> Message-ID: Hi Martin, thanks for your input ! So I think PTR_FORMAT and p2i is okay . Do you have other concerns about 8227869 ? may I add you as a reviewer ? Best regards, Matthias > -----Original Message----- > From: Doerr, Martin > Sent: Donnerstag, 18. 
Juli 2019 12:15 > To: David Holmes ; Baesken, Matthias > ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > Hi David, > > there's no difference between INTPTR_FORMAT and PTR_FORMAT: > > #ifdef _LP64 > #define INTPTR_FORMAT "0x%016" PRIxPTR > #define PTR_FORMAT "0x%016" PRIxPTR > #else // !_LP64 > #define INTPTR_FORMAT "0x%08" PRIxPTR > #define PTR_FORMAT "0x%08" PRIxPTR > #endif // _LP64 > > I guess this was different in the past. I don't know why we still have both. > > Best regards, > Martin > > > > -----Original Message----- > > From: ppc-aix-port-dev > On > > Behalf Of David Holmes > > Sent: Donnerstag, 18. Juli 2019 12:04 > > To: Baesken, Matthias ; Langer, Christoph > > ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> > > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > > On 18/07/2019 6:25 pm, Baesken, Matthias wrote: > > > Hi David, do you see an issue using p2i with char* pointers , should I > add > > a cast or some other conversion ? > > > (afaik it is usually used without other casts/conversions in the codebase) > > > > > > jdk/src/hotspot/share/utilities/globalDefinitions.hpp : > > > > > > 1055 // Convert pointer to intptr_t, for use in printing pointers. > > > 1056 inline intptr_t p2i(const void * p) { > > > 1057 return (intptr_t) p; > > > 1058 } > > > > p2i is what you should always use when printing a pointer to convert it > > to an integral type. But it should really be used with INTPTR_FORMAT. It > > will work with PTR_FORMAT due to other integral conversions. > > > > >> If this fixes things on AIX then that's fine. > > > > > > Yes it does . > > > But I have to agree with you it feels a bit shaky ... 
> > > > Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness > > IMHO. :) > > > > Cheers, > > David > > > > > > > > Regards, Matthias > > > > > > > > > > > >> -----Original Message----- > > >> From: David Holmes > > >> Sent: Donnerstag, 18. Juli 2019 10:05 > > >> To: Baesken, Matthias ; Langer, > Christoph > > >> ; 'hotspot-dev at openjdk.java.net' > > >> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' aix- > > >> port-dev at openjdk.java.net> > > >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > >> > > >> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: > > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > with > > >>>> INTPTR_FORMAT. So the above looks like it was already correct and > > now > > >> is > > >>>> not correct. > > >>> > > >>> Hi David, I noticed p2i is used together with PTR_FORMAT at > > dozens > > >> locations in the HS code , did I miss something ? > > >> > > >> Okay our usage is a bit of a historical mess. :( > > >> > > >>> In os_aix.cpp we currently get these warnings , seems > PTR_FORMAT > > is > > >> unsigned long , that?s why we see these warnings : > > >> > > >> Defining PTR_FORMAT as an integral format it just broken - but dates > > >> back forever because %p wasn't portable. > > >> > > >> If this fixes things on AIX then that's fine. For new code I'd recommend > > >> use of INTPTR_FORMAT and p2i to print pointers. 
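Martin's listing of the macro definitions earlier in the thread shows why both spellings currently behave the same: the two macros expand to identical string literals. A stand-alone check makes that concrete; these are local copies of the quoted _LP64 branch, not HotSpot source:

```cpp
#include <cinttypes>
#include <cstring>

// Local copies of the _LP64 definitions quoted in the thread. Both macros
// are built from the same string literals, which is why p2i() happens to
// work with PTR_FORMAT just as it does with INTPTR_FORMAT.
#define LOCAL_INTPTR_FORMAT "0x%016" PRIxPTR
#define LOCAL_PTR_FORMAT    "0x%016" PRIxPTR

inline bool formats_identical() {
  return strcmp(LOCAL_INTPTR_FORMAT, LOCAL_PTR_FORMAT) == 0;
}
```

Adjacent string literals are concatenated at compile time, so the comparison sees the fully expanded format strings.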
> > >> > > >> Thanks, > > >> David > > >> > > >>> > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> p, p + s, addr, addr + size); > > >>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > from > > >> macro 'trcVerbose' > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > >>> ^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> p, p + s, addr, addr + size); > > >>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > from > > >> macro 'trcVerbose' > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > >>> ^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> p, p + s, addr, addr + size); > > >>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > from > > >> macro 'trcVerbose' > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > >>> ^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> p, p + s, addr, addr + size); > > >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > from > > >> macro 'trcVerbose' > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > >>> ^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) 
pagesize); > > >>> > > >> > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ > > >> ~~~~~~~~~~~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > from > > >> macro 'trcVerbose' > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > >>> ^~~~~~~~~~~ > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > Wformat] > > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); > > >>> > > >> > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > > >> ~~~~~~~~~~~~~~~~~~~~ > > >>> > > >>> Best regards, Matthias > > >>> > > >>> > > >>>> -----Original Message----- > > >>>> From: David Holmes > > >>>> Sent: Donnerstag, 18. Juli 2019 09:08 > > >>>> To: Baesken, Matthias ; Langer, > > Christoph > > >>>> ; 'hotspot-dev at openjdk.java.net' > > > >>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > aix- > > >>>> port-dev at openjdk.java.net> > > >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > >>>> > > >>>> Hi Matthias, > > >>>> > > >>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > > >>>>> Thanks ! May I get a second review please ? > > >>>> > > >>>> @@ -1888,12 +1887,12 @@ > > >>>> if (!contains_range(p, s)) { > > >>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " > > >>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > > >>>> - p, p + s, addr, addr + size); > > >>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > > >>>> > > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > with > > >>>> INTPTR_FORMAT. So the above looks like it was already correct and > > now > > >> is > > >>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. 
> > >>>> > > >>>> Cheers, > > >>>> David > > >>>> ----- > > >>>> > > >>>>> Best regards, Matthias > > >>>>> > > >>>>> > > >>>>> > > >>>>>> -----Original Message----- > > >>>>>> From: Langer, Christoph > > >>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 > > >>>>>> To: Baesken, Matthias ; 'hotspot- > > >>>>>> dev at openjdk.java.net' ; 'ppc- > aix- > > >> port- > > >>>>>> dev at openjdk.java.net' > > >>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in > os_aix.cpp > > >>>>>> > > >>>>>> Hi Matthias, > > >>>>>> > > >>>>>> thanks for this tedious cleanup. Looks good to me. > > >>>>>> > > >>>>>> Best regards > > >>>>>> Christoph > > >>>>>> > > >>>>>>> -----Original Message----- > > >>>>>>> From: hotspot-dev > On > > >>>> Behalf > > >>>>>> Of > > >>>>>>> Baesken, Matthias > > >>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 > > >>>>>>> To: 'hotspot-dev at openjdk.java.net' > >> dev at openjdk.java.net>; > > >>>>>>> 'ppc-aix-port-dev at openjdk.java.net' > >>>>>> dev at openjdk.java.net> > > >>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > >>>>>>> > > >>>>>>> Hello, there are a couple of non matching format specifiers in > > >> os_aix.cpp > > >>>> . > > >>>>>>> I adjust them with my change . > > >>>>>>> > > >>>>>>> Please review ! > > >>>>>>> > > >>>>>>> Bug/webrev : > > >>>>>>> > > >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 > > >>>>>>> > > >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > > >>>>>>> > > >>>>>>> Thanks, Matthias From martin.doerr at sap.com Thu Jul 18 10:38:34 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 18 Jul 2019 10:38:34 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> Message-ID: Hi Matthias, You can add me as reviewer. Looks good to me. Only indentation could be improved. 
But I don't need to see another webrev for that. Best regards, Martin > -----Original Message----- > From: Baesken, Matthias > Sent: Donnerstag, 18. Juli 2019 12:32 > To: Doerr, Martin ; David Holmes > ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > Hi Martin, thanks for your input ! > > So I think PTR_FORMAT and p2i is okay . > > Do you have other concerns about 8227869 ? may I ad you as a reviewer ? > > Best regards, Matthias > > > > -----Original Message----- > > From: Doerr, Martin > > Sent: Donnerstag, 18. Juli 2019 12:15 > > To: David Holmes ; Baesken, Matthias > > ; Langer, Christoph > > ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> > > Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > > Hi David, > > > > there's no difference between INTPTR_FORMAT and PTR_FORMAT: > > > > #ifdef _LP64 > > #define INTPTR_FORMAT "0x%016" PRIxPTR > > #define PTR_FORMAT "0x%016" PRIxPTR > > #else // !_LP64 > > #define INTPTR_FORMAT "0x%08" PRIxPTR > > #define PTR_FORMAT "0x%08" PRIxPTR > > #endif // _LP64 > > > > I guess this was different in the past. I don't know why we still have both. > > > > Best regards, > > Martin > > > > > > > -----Original Message----- > > > From: ppc-aix-port-dev > > On > > > Behalf Of David Holmes > > > Sent: Donnerstag, 18. 
Juli 2019 12:04 > > > To: Baesken, Matthias ; Langer, Christoph > > > ; 'hotspot-dev at openjdk.java.net' > > > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > > port-dev at openjdk.java.net> > > > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > > > > On 18/07/2019 6:25 pm, Baesken, Matthias wrote: > > > > Hi David, do you see an issue using p2i with char* pointers , should I > > add > > > a cast or some other conversion ? > > > > (afaik it is usually used without other casts/conversions in the > codebase) > > > > > > > > jdk/src/hotspot/share/utilities/globalDefinitions.hpp : > > > > > > > > 1055 // Convert pointer to intptr_t, for use in printing pointers. > > > > 1056 inline intptr_t p2i(const void * p) { > > > > 1057 return (intptr_t) p; > > > > 1058 } > > > > > > p2i is what you should always use when printing a pointer to convert it > > > to an integral type. But it should really be used with INTPTR_FORMAT. It > > > will work with PTR_FORMAT due to other integral conversions. > > > > > > >> If this fixes things on AIX then that's fine. > > > > > > > > Yes it does . > > > > But I have to agree with you it feels a bit shaky ... > > > > > > Changing PTR_FORMAT to INTPTR_FORMAT would remove that > shakiness > > > IMHO. :) > > > > > > Cheers, > > > David > > > > > > > > > > > Regards, Matthias > > > > > > > > > > > > > > > >> -----Original Message----- > > > >> From: David Holmes > > > >> Sent: Donnerstag, 18. Juli 2019 10:05 > > > >> To: Baesken, Matthias ; Langer, > > Christoph > > > >> ; 'hotspot-dev at openjdk.java.net' > > > > >> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > aix- > > > >> port-dev at openjdk.java.net> > > > >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > >> > > > >> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: > > > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > > with > > > >>>> INTPTR_FORMAT. 
So the above looks like it was already correct and > > > now > > > >> is > > > >>>> not correct. > > > >>> > > > >>> Hi David, I noticed p2i is used together with PTR_FORMAT at > > > dozens > > > >> locations in the HS code , did I miss something ? > > > >> > > > >> Okay our usage is a bit of a historical mess. :( > > > >> > > > >>> In os_aix.cpp we currently get these warnings , seems > > PTR_FORMAT > > > is > > > >> unsigned long , that?s why we see these warnings : > > > >> > > > >> Defining PTR_FORMAT as an integral format it just broken - but dates > > > >> back forever because %p wasn't portable. > > > >> > > > >> If this fixes things on AIX then that's fine. For new code I'd > recommend > > > >> use of INTPTR_FORMAT and p2i to print pointers. > > > >> > > > >> Thanks, > > > >> David > > > >> > > > >>> > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> p, p + s, addr, addr + size); > > > >>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > > from > > > >> macro 'trcVerbose' > > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > >>> ^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> p, p + s, addr, addr + size); > > > >>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > > from > > > >> macro 'trcVerbose' > > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > >>> ^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> p, p + s, addr, addr + size); > > > >>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > > > >>> 
/nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > > from > > > >> macro 'trcVerbose' > > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > >>> ^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> p, p + s, addr, addr + size); > > > >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > > from > > > >> macro 'trcVerbose' > > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > >>> ^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) > pagesize); > > > >>> > > > >> > > > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ > > > >> ~~~~~~~~~~~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > > > from > > > >> macro 'trcVerbose' > > > >>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > >>> ^~~~~~~~~~~ > > > >>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format > > > >> specifies type 'unsigned long' but the argument has type 'char *' [- > > > Wformat] > > > >>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) > pagesize); > > > >>> > > > >> > > > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > > > >> ~~~~~~~~~~~~~~~~~~~~ > > > >>> > > > >>> Best regards, Matthias > > > >>> > > > >>> > > > >>>> -----Original Message----- > > > >>>> From: David Holmes > > > >>>> Sent: Donnerstag, 18. 
Juli 2019 09:08 > > > >>>> To: Baesken, Matthias ; Langer, > > > Christoph > > > >>>> ; 'hotspot-dev at openjdk.java.net' > > > > > >>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > > > aix- > > > >>>> port-dev at openjdk.java.net> > > > >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > >>>> > > > >>>> Hi Matthias, > > > >>>> > > > >>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > > > >>>>> Thanks ! May I get a second review please ? > > > >>>> > > > >>>> @@ -1888,12 +1887,12 @@ > > > >>>> if (!contains_range(p, s)) { > > > >>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub > " > > > >>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > > > >>>> - p, p + s, addr, addr + size); > > > >>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > > > >>>> > > > >>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > > with > > > >>>> INTPTR_FORMAT. So the above looks like it was already correct and > > > now > > > >> is > > > >>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to > me. > > > >>>> > > > >>>> Cheers, > > > >>>> David > > > >>>> ----- > > > >>>> > > > >>>>> Best regards, Matthias > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>>> -----Original Message----- > > > >>>>>> From: Langer, Christoph > > > >>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 > > > >>>>>> To: Baesken, Matthias ; 'hotspot- > > > >>>>>> dev at openjdk.java.net' ; 'ppc- > > aix- > > > >> port- > > > >>>>>> dev at openjdk.java.net' > > > >>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in > > os_aix.cpp > > > >>>>>> > > > >>>>>> Hi Matthias, > > > >>>>>> > > > >>>>>> thanks for this tedious cleanup. Looks good to me. > > > >>>>>> > > > >>>>>> Best regards > > > >>>>>> Christoph > > > >>>>>> > > > >>>>>>> -----Original Message----- > > > >>>>>>> From: hotspot-dev > > On > > > >>>> Behalf > > > >>>>>> Of > > > >>>>>>> Baesken, Matthias > > > >>>>>>> Sent: Mittwoch, 17. 
Juli 2019 17:07 > > > >>>>>>> To: 'hotspot-dev at openjdk.java.net' > > >> dev at openjdk.java.net>; > > > >>>>>>> 'ppc-aix-port-dev at openjdk.java.net' > > >>>>>> dev at openjdk.java.net> > > > >>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > > >>>>>>> > > > >>>>>>> Hello, there are a couple of non matching format specifiers in > > > >> os_aix.cpp > > > >>>> . > > > >>>>>>> I adjust them with my change . > > > >>>>>>> > > > >>>>>>> Please review ! > > > >>>>>>> > > > >>>>>>> Bug/webrev : > > > >>>>>>> > > > >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 > > > >>>>>>> > > > >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > > > >>>>>>> > > > >>>>>>> Thanks, Matthias From matthias.baesken at sap.com Thu Jul 18 10:39:44 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 10:39:44 +0000 Subject: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: Message-ID: Hi Martin, thanks for the review ! Best regards, Matthias > -----Original Message----- > From: Doerr, Martin > Sent: Mittwoch, 17. Juli 2019 17:40 > To: coleen.phillimore at oracle.com; hotspot-dev at openjdk.java.net; > Baesken, Matthias > Subject: RE: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: > this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined- > compare] > > Hi Matthias, > > looks good to me. > > Please make sure that this change got built on all platforms we have. > The adlc is used during build so if it has passed on all platforms, it should be > ok. > > Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-dev On Behalf > Of > > coleen.phillimore at oracle.com > > Sent: Freitag, 12. 
Juli 2019 14:49 > To: hotspot-dev at openjdk.java.net > Subject: Re: RFR: 8227633: avoid comparing this pointers to NULL - was : RE: > this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined- > compare] > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8227633.0/src/hotspot/sha > re/adlc/formssel.cpp.udiff.html > > + if (mnode) mnode->count_instr_names(names); > > > We also try to avoid implicit checks against null for pointers so change > this to: > > + if (mnode != NULL) mnode->count_instr_names(names); > > I didn't see that you added a check for NULL in the callers of > print_opcodes or setstr. Can those callers never pass NULL? > > We've done a few passes to clean up these this == NULL checks. Thank you > for doing this! > > Coleen > From david.holmes at oracle.com Thu Jul 18 10:52:18 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 20:52:18 +1000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> Message-ID: <367d65ad-6df7-65f6-ef6c-d153e6977b9a@oracle.com> On 18/07/2019 8:15 pm, Doerr, Martin wrote: > Hi David, > > there's no difference between INTPTR_FORMAT and PTR_FORMAT: > > #ifdef _LP64 > #define INTPTR_FORMAT "0x%016" PRIxPTR > #define PTR_FORMAT "0x%016" PRIxPTR > #else // !_LP64 > #define INTPTR_FORMAT "0x%08" PRIxPTR > #define PTR_FORMAT "0x%08" PRIxPTR > #endif // _LP64 > > I guess this was different in the past. I don't know why we still have both. Sorry about that - was confused by the reported error message. David > Best regards, > Martin > > >> -----Original Message----- >> From: ppc-aix-port-dev On >> Behalf Of David Holmes >> Sent: Donnerstag, 18. 
Juli 2019 12:04 >> To: Baesken, Matthias ; Langer, Christoph >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >> >> On 18/07/2019 6:25 pm, Baesken, Matthias wrote: >>> Hi David, do you see an issue using p2i with char* pointers , should I add >> a cast or some other conversion ? >>> (afaik it is usually used without other casts/conversions in the codebase) >>> >>> jdk/src/hotspot/share/utilities/globalDefinitions.hpp : >>> >>> 1055 // Convert pointer to intptr_t, for use in printing pointers. >>> 1056 inline intptr_t p2i(const void * p) { >>> 1057 return (intptr_t) p; >>> 1058 } >> >> p2i is what you should always use when printing a pointer to convert it >> to an integral type. But it should really be used with INTPTR_FORMAT. It >> will work with PTR_FORMAT due to other integral conversions. >> >>>> If this fixes things on AIX then that's fine. >>> >>> Yes it does . >>> But I have to agree with you it feels a bit shaky ... >> >> Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness >> IMHO. :) >> >> Cheers, >> David >> >>> >>> Regards, Matthias >>> >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Donnerstag, 18. Juli 2019 10:05 >>>> To: Baesken, Matthias ; Langer, Christoph >>>> ; 'hotspot-dev at openjdk.java.net' >>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >>> port-dev at openjdk.java.net> >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>> >>>> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: >>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with >>>>>> INTPTR_FORMAT. So the above looks like it was already correct and >> now >>>> is >>>>>> not correct. >>>>> >>>>> Hi David, I noticed p2i is used together with PTR_FORMAT at >> dozens >>>> locations in the HS code , did I miss something ? 
>>>> >>>> Okay our usage is a bit of a historical mess. :( >>>> >>>>> In os_aix.cpp we currently get these warnings , seems PTR_FORMAT >> is >>>> unsigned long , that?s why we see these warnings : >>>> >>>> Defining PTR_FORMAT as an integral format it just broken - but dates >>>> back forever because %p wasn't portable. >>>> >>>> If this fixes things on AIX then that's fine. For new code I'd recommend >>>> use of INTPTR_FORMAT and p2i to print pointers. >>>> >>>> Thanks, >>>> David >>>> >>>>> >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> p, p + s, addr, addr + size); >>>>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >> from >>>> macro 'trcVerbose' >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>> ^~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> p, p + s, addr, addr + size); >>>>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >> from >>>> macro 'trcVerbose' >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>> ^~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> p, p + s, addr, addr + size); >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >> from >>>> macro 'trcVerbose' >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>> ^~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> p, p + s, addr, addr + size); >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >>>>> 
/nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >> from >>>> macro 'trcVerbose' >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>> ^~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); >>>>> >>>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ >>>> ~~~~~~~~~~~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >> from >>>> macro 'trcVerbose' >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>> ^~~~~~~~~~~ >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format >>>> specifies type 'unsigned long' but the argument has type 'char *' [- >> Wformat] >>>>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) pagesize); >>>>> >>>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >>>> ~~~~~~~~~~~~~~~~~~~~ >>>>> >>>>> Best regards, Matthias >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Donnerstag, 18. Juli 2019 09:08 >>>>>> To: Baesken, Matthias ; Langer, >> Christoph >>>>>> ; 'hotspot-dev at openjdk.java.net' >> >>>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > aix- >>>>>> port-dev at openjdk.java.net> >>>>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>> >>>>>> Hi Matthias, >>>>>> >>>>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: >>>>>>> Thanks ! May I get a second review please ? >>>>>> >>>>>> @@ -1888,12 +1887,12 @@ >>>>>> if (!contains_range(p, s)) { >>>>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " >>>>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", >>>>>> - p, p + s, addr, addr + size); >>>>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); >>>>>> >>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used with >>>>>> INTPTR_FORMAT. 
So the above looks like it was already correct and >> now >>>> is >>>>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to me. >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Best regards, Matthias >>>>>>> >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Langer, Christoph >>>>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 >>>>>>>> To: Baesken, Matthias ; 'hotspot- >>>>>>>> dev at openjdk.java.net' ; 'ppc-aix- >>>> port- >>>>>>>> dev at openjdk.java.net' >>>>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>>>> >>>>>>>> Hi Matthias, >>>>>>>> >>>>>>>> thanks for this tedious cleanup. Looks good to me. >>>>>>>> >>>>>>>> Best regards >>>>>>>> Christoph >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: hotspot-dev On >>>>>> Behalf >>>>>>>> Of >>>>>>>>> Baesken, Matthias >>>>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 >>>>>>>>> To: 'hotspot-dev at openjdk.java.net' >>> dev at openjdk.java.net>; >>>>>>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>>>>> dev at openjdk.java.net> >>>>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>>>>> >>>>>>>>> Hello, there are a couple of non matching format specifiers in >>>> os_aix.cpp >>>>>> . >>>>>>>>> I adjust them with my change . >>>>>>>>> >>>>>>>>> Please review ! 
>>>>>>>>> >>>>>>>>> Bug/webrev : >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ >>>>>>>>> >>>>>>>>> Thanks, Matthias From david.holmes at oracle.com Thu Jul 18 10:54:52 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 20:54:52 +1000 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> Message-ID: <776901dc-8927-24d1-21ed-4049fd4860fb@oracle.com> Hi Vladimir, I'm not intimately familiar with the code details but I get the gist of the fix and the avoidance of the barrier for the JNI call to restore the existing behaviour. So looks good in that sense. Thanks, David On 18/07/2019 7:35 am, Vladimir Ivanov wrote: > Thanks, Martin and Dmitrij for reviews. > > ... >>> If you have upcalls from JVM code in mind, then there's already a >>> barrier on caller side: JavaCalls::call_static() calls into >>> LinkResolver::resolve_static_call() which has initialization barrier. >>> So, there's no need to repeat the check. > > As an afterthought, I decided to update the comment in > SharedRuntime::handle_wrong_method() to clarify the difference in > behavior between upcalls coming from JVM & JNI. > > Best regards, > Vladimir Ivanov > >>>>> -----Original Message----- >>>>> From: Vladimir Ivanov >>>>> Sent: Mittwoch, 17. Juli 2019 15:07 >>>>> To: Doerr, Martin ; hotspot- >>>>> dev at openjdk.java.net; Dmitrij Pochepko >> sw.com> >>>>> Subject: Re: RFR[13]: 8227260: Can't deal with >>>>> SharedRuntime::handle_wrong_method triggering more than once for >>>>> interpreter calls >>>>> >>>>> Thanks, Erik. 
>>>>> >>>>> Also, since I touch platform-specific code, I'd like Martin and >>>>> Dmitrij >>>>> (implementors of support for s390, ppc, and aarch64) to take a look at >>>>> the patch as well. >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> On 17/07/2019 15:25, Erik Österlund wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> Looks good. Thanks for fixing. >>>>>> >>>>>> /Erik >>>>>> >>>>>> On 2019-07-17 12:26, Vladimir Ivanov wrote: >>>>>>> Revised fix: >>>>>>> http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >>>>>>> >>>>>>> It turned out the problem is not specific to i2c2i: fast class >>>>>>> initialization barriers on nmethod entry trigger the assert as well. >>>>>>> >>>>>>> JNI upcalls (CallStaticMethod) don't have class initialization >>>>>>> checks, so it's possible to initiate a JNI upcall from a >>>>>>> non-initializing thread and JVM should let it complete. >>>>>>> >>>>>>> It leads to a busy loop (asserts in debug) between nmethod entry >>>>>>> barrier & SharedRuntime::handle_wrong_method until holder class is >>>>>>> initialized (possibly infinite if it blocks class initialization). >>>>>>> >>>>>>> Proposed fix is to keep using c2i, but jump over class >>>>>>> initialization >>>>>>> barrier right to the argument shuffling logic on verified entry when >>>>>>> coming from SharedRuntime::handle_wrong_method. >>>>>>> >>>>>>> Improved regression test reliably reproduces the problem. >>>>>>> >>>>>>> Testing: regression test, hs-precheckin-comp, tier1-6 >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> On 04/07/2019 18:02, Erik Österlund wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> The i2c adapter sets a thread-local "callee_target" Method*, >>>>>>>> which is >>>>>>>> caught (and cleared) by SharedRuntime::handle_wrong_method if >>> the >>>>> i2c >>>>>>>> call is "bad" (e.g. not_entrant). This error handler forwards >>>>>>>> execution to the callee c2i entry.
If the >>>>>>>> SharedRuntime::handle_wrong_method method is called again due >>> to >>>>> the >>>>>>>> i2c2i call being still bad, then we will crash the VM in the >>>>>>>> following guarantee in SharedRuntime::handle_wrong_method: >>>>>>>> >>>>>>>> Method* callee = thread->callee_target(); >>>>>>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>>>>>>> >>>>>>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., >>>>>>>> hits >>>>>>>> the new class initialization entry barrier of the c2i adapter. >>>>>>>> The solution is to simply not clear the thread-local >>>>>>>> "callee_target" >>>>>>>> after handling the first failure, as we can't really know there >>>>>>>> won't >>>>>>>> be another one. There is no reason to clear this value as nobody >>>>>>>> else >>>>>>>> reads it than the SharedRuntime::handle_wrong_method handler >>> (and >>>>> we >>>>>>>> really do want it to be able to read the value as many times as it >>>>>>>> takes until the call goes through). I found some confused >>>>>>>> clearing of >>>>>>>> this callee_target in JavaThread::oops_do(), with a comment saying >>>>>>>> this is a methodOop that we need to clear to make GC happy or >>>>>>>> something. Seems like old traces of perm gen. So I deleted that >>>>>>>> too. >>>>>>>> >>>>>>>> I caught this in ZGC where the timing window for hitting this issue >>>>>>>> seems to be wider due to concurrent code cache unloading. But it is >>>>>>>> equally problematic for all GCs. 
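For illustration only, here is a hypothetical, self-contained sketch (not HotSpot code; all names are invented) of the retry behaviour Erik describes: the handler keeps re-reading the thread-local callee target until the call finally goes through, so it must not be cleared after the first failure.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a JavaThread holding the thread-local callee target.
struct Thread {
  const void* callee_target = nullptr;  // stands in for the Method*
};

// Simulated dispatch that fails `failures` times (e.g. a c2i entry
// repeatedly hitting a class-initialization barrier) before going through.
bool try_dispatch(int& failures) {
  if (failures > 0) { --failures; return false; }
  return true;
}

// Returns how many times the handler had to retry.
int handle_wrong_method_sketch(Thread& t, int failures) {
  int retries = 0;
  while (!try_dispatch(failures)) {
    // With the old code, callee_target was cleared after the first
    // failure, so this check (the "bad handshake" guarantee) would fire
    // on the second one.
    assert(t.callee_target != nullptr);
    ++retries;
    // Intentionally NOT cleared here: nobody else reads it, and the
    // next failure needs to find it again.
  }
  return retries;
}
```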
>>>>>>>> >>>>>>>> Bug: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>>>>>>> >>>>>>>> Thanks, >>>>>>>> /Erik From patrick at os.amperecomputing.com Thu Jul 18 11:06:06 2019 From: patrick at os.amperecomputing.com (Patrick Zhang OS) Date: Thu, 18 Jul 2019 11:06:06 +0000 Subject: The default choice in setup_large_page_type() if set -XX:+UseLargePages only Message-ID: I found a weird "issue" when setting up an env with -XX:+UseLargePages only. I learned later that at least one of UseHugeTLBFS/UseSHM/UseTransparentHugePages must also be enabled, but in the beginning everything worked well without any warnings. The default choice is UseHugeTLBFS behind this. However, when I added -XX:-UseTransparentHugePages, the function got completely disabled and setup_large_page_type() returned false. Is this expected behavior, or should a warning be shown on the console? If -XX:+UseLargePages is allowed to be specified alone, perhaps silently disabling the default UseHugeTLBFS choice could be made less misleading? Thanks for any comments. 
Here is the related source code: bool os::Linux::setup_large_page_type(size_t page_size) https://hg.openjdk.java.net/jdk/jdk/file/065142ace8e9/src/hotspot/os/linux/os_linux.cpp#l3764 java -Xmx512m -XX:+PrintFlagsFinal -version (default values) bool UseHugeTLBFS = false {product} {default} bool UseLargePages = false {pd product} {default} bool UseSHM = false {product} {default} bool UseTransparentHugePages = false {product} {default} java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -version bool UseHugeTLBFS = true {product} {command line} bool UseLargePages = true {pd product} {command line} bool UseSHM = false {product} {default} bool UseTransparentHugePages = false {product} {default} java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:-UseTransparentHugePages -version bool UseHugeTLBFS = false {product} {command line} bool UseLargePages = false {pd product} {command line} bool UseSHM = false {product} {default} bool UseTransparentHugePages = false {product} {default} java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:-UseSHM -version bool UseHugeTLBFS = false {product} {command line} bool UseLargePages = false {pd product} {command line} bool UseSHM = false {product} {default} bool UseTransparentHugePages = false {product} {default} Regards Patrick From vladimir.x.ivanov at oracle.com Thu Jul 18 11:13:43 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 18 Jul 2019 14:13:43 +0300 Subject: RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls In-Reply-To: <776901dc-8927-24d1-21ed-4049fd4860fb@oracle.com> References: <8d183958-197c-600d-edda-22121a8eb677@oracle.com> <8063c7c3-432d-6318-4525-5f0d9a9e8524@oracle.com> <470316cf-850d-7160-250a-ad6669b2ca9e@oracle.com> <776901dc-8927-24d1-21ed-4049fd4860fb@oracle.com> Message-ID: <6f691f9c-d785-36be-99a0-3ebd82a668fc@oracle.com> Thanks 
for review, David. Best regards, Vladimir Ivanov On 18/07/2019 13:54, David Holmes wrote: > Hi Vladimir, > > I'm not intimately familiar with the code details but I get the gist of > the fix and the avoidance of the barrier for the JNI call to restore the > existing behaviour. So looks good in that sense. > > Thanks, > David > > On 18/07/2019 7:35 am, Vladimir Ivanov wrote: >> Thanks, Martin and Dmitrij for reviews. >> >> ... >>>> If you have upcalls from JVM code in mind, then there's already a >>>> barrier on caller side: JavaCalls::call_static() calls into >>>> LinkResolver::resolve_static_call() which has initialization barrier. >>>> So, there's no need to repeat the check. >> >> As an afterthought, I decided to update the comment in >> SharedRuntime::handle_wrong_method() to clarify the difference in >> behavior between upcalls coming from JVM & JNI. >> >> Best regards, >> Vladimir Ivanov >> >>>>>> -----Original Message----- >>>>>> From: Vladimir Ivanov >>>>>> Sent: Mittwoch, 17. Juli 2019 15:07 >>>>>> To: Doerr, Martin ; hotspot- >>>>>> dev at openjdk.java.net; Dmitrij Pochepko >>> sw.com> >>>>>> Subject: Re: RFR[13]: 8227260: Can't deal with >>>>>> SharedRuntime::handle_wrong_method triggering more than once for >>>>>> interpreter calls >>>>>> >>>>>> Thanks, Erik. >>>>>> >>>>>> Also, since I touch platform-specific code, I'd like Martin and >>>>>> Dmitrij >>>>>> (implementors of support for s390, ppc, and aarch64) to take a >>>>>> look at >>>>>> the patch as well. >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> On 17/07/2019 15:25, Erik ?sterlund wrote: >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> Looks good. Thanks for fixing. >>>>>>> >>>>>>> /Erik >>>>>>> >>>>>>> On 2019-07-17 12:26, Vladimir Ivanov wrote: >>>>>>>> Revised fix: >>>>>>>> ? ?? 
http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/ >>>>>>>> >>>>>>>> It turned out the problem is not specific to i2c2i: fast class >>>>>>>> initialization barriers on nmethod entry trigger the assert as >>>>>>>> well. >>>>>>>> >>>>>>>> JNI upcalls (CallStaticMethod) don't have class >>>>>>>> initialization >>>>>>>> checks, so it's possible to initiate a JNI upcall from a >>>>>>>> non-initializing thread and JVM should let it complete. >>>>>>>> >>>>>>>> It leads to a busy loop (asserts in debug) between nmethod entry >>>>>>>> barrier & SharedRuntime::handle_wrong_method until holder class is >>>>>>>> initialized (possibly infinite if it blocks class initialization). >>>>>>>> >>>>>>>> Proposed fix is to keep using c2i, but jump over class >>>>>>>> initialization >>>>>>>> barrier right to the argument shuffling logic on verified entry >>>>>>>> when >>>>>>>> coming from SharedRuntime::handle_wrong_method. >>>>>>>> >>>>>>>> Improved regression test reliably reproduces the problem. >>>>>>>> >>>>>>>> Testing: regression test, hs-precheckin-comp, tier1-6 >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Vladimir Ivanov >>>>>>>> >>>>>>>> On 04/07/2019 18:02, Erik ?sterlund wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> The i2c adapter sets a thread-local "callee_target" Method*, >>>>>>>>> which is >>>>>>>>> caught (and cleared) by SharedRuntime::handle_wrong_method if >>>> the >>>>>> i2c >>>>>>>>> call is "bad" (e.g. not_entrant). This error handler forwards >>>>>>>>> execution to the callee c2i entry. 
If the >>>>>>>>> SharedRuntime::handle_wrong_method method is called again due >>>> to >>>>>> the >>>>>>>>> i2c2i call being still bad, then we will crash the VM in the >>>>>>>>> following guarantee in SharedRuntime::handle_wrong_method: >>>>>>>>> >>>>>>>>> Method* callee = thread->callee_target(); >>>>>>>>> guarantee(callee != NULL && callee->is_method(), "bad handshake"); >>>>>>>>> >>>>>>>>> Unfortunately, the c2i entry can indeed fail again if it, e.g., >>>>>>>>> hits >>>>>>>>> the new class initialization entry barrier of the c2i adapter. >>>>>>>>> The solution is to simply not clear the thread-local >>>>>>>>> "callee_target" >>>>>>>>> after handling the first failure, as we can't really know there >>>>>>>>> won't >>>>>>>>> be another one. There is no reason to clear this value as >>>>>>>>> nobody else >>>>>>>>> reads it than the SharedRuntime::handle_wrong_method handler >>>> (and >>>>>> we >>>>>>>>> really do want it to be able to read the value as many times as it >>>>>>>>> takes until the call goes through). I found some confused >>>>>>>>> clearing of >>>>>>>>> this callee_target in JavaThread::oops_do(), with a comment saying >>>>>>>>> this is a methodOop that we need to clear to make GC happy or >>>>>>>>> something. Seems like old traces of perm gen. So I deleted that >>>>>>>>> too. >>>>>>>>> >>>>>>>>> I caught this in ZGC where the timing window for hitting this >>>>>>>>> issue >>>>>>>>> seems to be wider due to concurrent code cache unloading. But >>>>>>>>> it is >>>>>>>>> equally problematic for all GCs. 
>>>>>>>>> >>>>>>>>> Bug: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227260 >>>>>>>>> >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/ >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Erik From matthias.baesken at sap.com Thu Jul 18 11:50:15 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 18 Jul 2019 11:50:15 +0000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: <367d65ad-6df7-65f6-ef6c-d153e6977b9a@oracle.com> References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> <367d65ad-6df7-65f6-ef6c-d153e6977b9a@oracle.com> Message-ID: Hi David may I add you as a reviewer too ? Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 18. Juli 2019 12:52 > To: Doerr, Martin ; Baesken, Matthias > ; Langer, Christoph > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' port-dev at openjdk.java.net> > Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > > On 18/07/2019 8:15 pm, Doerr, Martin wrote: > > Hi David, > > > > there's no difference between INTPTR_FORMAT and PTR_FORMAT: > > > > #ifdef _LP64 > > #define INTPTR_FORMAT "0x%016" PRIxPTR > > #define PTR_FORMAT "0x%016" PRIxPTR > > #else // !_LP64 > > #define INTPTR_FORMAT "0x%08" PRIxPTR > > #define PTR_FORMAT "0x%08" PRIxPTR > > #endif // _LP64 > > > > I guess this was different in the past. I don't know why we still have both. > > Sorry about that - was confused by the reported error message. > > David > > > Best regards, > > Martin > > > > > >> -----Original Message----- > >> From: ppc-aix-port-dev > On > >> Behalf Of David Holmes > >> Sent: Donnerstag, 18. 
Juli 2019 12:04 > >> To: Baesken, Matthias ; Langer, Christoph > >> ; 'hotspot-dev at openjdk.java.net' >> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >> port-dev at openjdk.java.net> > >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >> > >> On 18/07/2019 6:25 pm, Baesken, Matthias wrote: > >>> Hi David, do you see an issue using p2i with char* pointers , should I > add > >> a cast or some other conversion ? > >>> (afaik it is usually used without other casts/conversions in the codebase) > >>> > >>> jdk/src/hotspot/share/utilities/globalDefinitions.hpp : > >>> > >>> 1055 // Convert pointer to intptr_t, for use in printing pointers. > >>> 1056 inline intptr_t p2i(const void * p) { > >>> 1057 return (intptr_t) p; > >>> 1058 } > >> > >> p2i is what you should always use when printing a pointer to convert it > >> to an integral type. But it should really be used with INTPTR_FORMAT. It > >> will work with PTR_FORMAT due to other integral conversions. > >> > >>>> If this fixes things on AIX then that's fine. > >>> > >>> Yes it does . > >>> But I have to agree with you it feels a bit shaky ... > >> > >> Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness > >> IMHO. :) > >> > >> Cheers, > >> David > >> > >>> > >>> Regards, Matthias > >>> > >>> > >>> > >>>> -----Original Message----- > >>>> From: David Holmes > >>>> Sent: Donnerstag, 18. Juli 2019 10:05 > >>>> To: Baesken, Matthias ; Langer, > Christoph > >>>> ; 'hotspot-dev at openjdk.java.net' > >>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' aix- > >>>> port-dev at openjdk.java.net> > >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>> > >>>> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: > >>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > with > >>>>>> INTPTR_FORMAT. So the above looks like it was already correct and > >> now > >>>> is > >>>>>> not correct. 
> >>>>> > >>>>> Hi David, I noticed p2i is used together with PTR_FORMAT at > >> dozens > >>>> locations in the HS code , did I miss something ? > >>>> > >>>> Okay our usage is a bit of a historical mess. :( > >>>> > >>>>> In os_aix.cpp we currently get these warnings , seems > PTR_FORMAT > >> is > >>>> unsigned long , that?s why we see these warnings : > >>>> > >>>> Defining PTR_FORMAT as an integral format it just broken - but dates > >>>> back forever because %p wasn't portable. > >>>> > >>>> If this fixes things on AIX then that's fine. For new code I'd recommend > >>>> use of INTPTR_FORMAT and p2i to print pointers. > >>>> > >>>> Thanks, > >>>> David > >>>> > >>>>> > >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> p, p + s, addr, addr + size); > >>>>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > >> from > >>>> macro 'trcVerbose' > >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>>>> ^~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> p, p + s, addr, addr + size); > >>>>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > >> from > >>>> macro 'trcVerbose' > >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>>>> ^~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> p, p + s, addr, addr + size); > >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > >> from > >>>> macro 'trcVerbose' > >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>>>> ^~~~~~~~~~~ > >>>>> 
/nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> p, p + s, addr, addr + size); > >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > >> from > >>>> macro 'trcVerbose' > >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>>>> ^~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) > pagesize); > >>>>> > >>>> > >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ > >>>> ~~~~~~~~~~~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded > >> from > >>>> macro 'trcVerbose' > >>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ > >>>>> ^~~~~~~~~~~ > >>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format > >>>> specifies type 'unsigned long' but the argument has type 'char *' [- > >> Wformat] > >>>>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) > pagesize); > >>>>> > >>>> > >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ > >>>> ~~~~~~~~~~~~~~~~~~~~ > >>>>> > >>>>> Best regards, Matthias > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: David Holmes > >>>>>> Sent: Donnerstag, 18. Juli 2019 09:08 > >>>>>> To: Baesken, Matthias ; Langer, > >> Christoph > >>>>>> ; 'hotspot-dev at openjdk.java.net' > >> >>>>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > >> aix- > >>>>>> port-dev at openjdk.java.net> > >>>>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>>>> > >>>>>> Hi Matthias, > >>>>>> > >>>>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: > >>>>>>> Thanks ! May I get a second review please ? 
> >>>>>> > >>>>>> @@ -1888,12 +1887,12 @@ > >>>>>> if (!contains_range(p, s)) { > >>>>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " > >>>>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", > >>>>>> - p, p + s, addr, addr + size); > >>>>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); > >>>>>> > >>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used > with > >>>>>> INTPTR_FORMAT. So the above looks like it was already correct and > >> now > >>>> is > >>>>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to > me. > >>>>>> > >>>>>> Cheers, > >>>>>> David > >>>>>> ----- > >>>>>> > >>>>>>> Best regards, Matthias > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> -----Original Message----- > >>>>>>>> From: Langer, Christoph > >>>>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 > >>>>>>>> To: Baesken, Matthias ; 'hotspot- > >>>>>>>> dev at openjdk.java.net' ; 'ppc- > aix- > >>>> port- > >>>>>>>> dev at openjdk.java.net' > >>>>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in > os_aix.cpp > >>>>>>>> > >>>>>>>> Hi Matthias, > >>>>>>>> > >>>>>>>> thanks for this tedious cleanup. Looks good to me. > >>>>>>>> > >>>>>>>> Best regards > >>>>>>>> Christoph > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: hotspot-dev > On > >>>>>> Behalf > >>>>>>>> Of > >>>>>>>>> Baesken, Matthias > >>>>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 > >>>>>>>>> To: 'hotspot-dev at openjdk.java.net' >>>> dev at openjdk.java.net>; > >>>>>>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>>>>>> dev at openjdk.java.net> > >>>>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp > >>>>>>>>> > >>>>>>>>> Hello, there are a couple of non matching format specifiers in > >>>> os_aix.cpp > >>>>>> . > >>>>>>>>> I adjust them with my change . > >>>>>>>>> > >>>>>>>>> Please review ! 
> >>>>>>>>> > >>>>>>>>> Bug/webrev : > >>>>>>>>> > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 > >>>>>>>>> > >>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ > >>>>>>>>> > >>>>>>>>> Thanks, Matthias From david.holmes at oracle.com Thu Jul 18 13:08:13 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 23:08:13 +1000 Subject: The default choice in setup_large_page_type() if set -XX:+UseLargePages only In-Reply-To: References: Message-ID: Hi Patrick, On 18/07/2019 9:06 pm, Patrick Zhang OS wrote: > I found a weird "issue" when setting up an env with -XX:+UseLargePages only. I knew later on that at least one of UseHugeTLBFS/UseSHM/UseTransparentHugePages, but in the beginning everything worked well without any warnings. The default choice is UseHugeTLBFS behind this. However when I added -XX:-UseTransparentHugePages, the function got completely disabled and setup_large_page_type() returned false. Is this an expected behavior? or any warnings ought to show in console? If -XX:+UseLargePages is allowed to be specified alone, perhaps disabling the default UseHugeTLBFS choice can be less misleading? You can see in the logic that if you enable large pages only then it configures the other flags for you in the way that makes most sense: try UseHugeTLBFS and then UseSHM, but don't try UseTransparentHugePages since there are known performance issues with it turned on. But if you enable large pages and explicitly set/clear at least one of the other flags, then it assumes you've set everything yourself as needed and only does some basic sanity checks whilst trying select the right mode. I agree it seems odd that explicitly disabling a flag that would be disabled anyway changes the behaviour, but you've basically switched things from "configure things for me" mode, to "manual" mode by specifying any other flag explicitly. As a result explicitly disabling any of the flags results in all flags being off. 
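The two modes described above can be sketched roughly as follows. This is an assumed simplification for illustration only; the struct, function names, and control flow are invented and are not the actual os::Linux::setup_large_page_type implementation.

```cpp
#include <cassert>

// Invented stand-in for the relevant -XX flags.
struct LargePageFlags {
  bool use_large_pages = false;
  bool backing_flag_on_cmdline = false;  // any of the three set/cleared explicitly
  bool use_huge_tlbfs = false;
  bool use_shm = false;
  bool use_thp = false;
};

// Returns true if some large-page backing was selected.
bool setup_large_page_type_sketch(LargePageFlags& f, bool hugetlbfs_available) {
  if (!f.use_large_pages) return false;
  if (!f.backing_flag_on_cmdline) {
    // "Configure for me" mode: try HugeTLBFS first, fall back to SHM,
    // and deliberately skip THP because of its known performance issues.
    if (hugetlbfs_available) { f.use_huge_tlbfs = true; return true; }
    f.use_shm = true;
    return true;
  }
  // "Manual" mode: only honour what the user enabled explicitly.
  if ((f.use_huge_tlbfs && hugetlbfs_available) || f.use_shm || f.use_thp) {
    return true;
  }
  f.use_large_pages = false;  // nothing usable: large pages end up disabled
  return false;
}
```

This reproduces the surprising case from the original mail: setting -XX:+UseLargePages alone selects HugeTLBFS, while additionally passing -XX:-UseTransparentHugePages flips the logic into manual mode and everything ends up off.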
Cheers, David > Thanks for any comments. > > > Here is the related source code: > bool os::Linux::setup_large_page_type(size_t page_size) > > https://hg.openjdk.java.net/jdk/jdk/file/065142ace8e9/src/hotspot/os/linux/os_linux.cpp#l3764 > > > > java -Xmx512m -XX:+PrintFlagsFinal -version (default values) > > bool UseHugeTLBFS = false {product} {default} > > bool UseLargePages = false {pd product} {default} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -version > > bool UseHugeTLBFS = true {product} {command line} > > bool UseLargePages = true {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:-UseTransparentHugePages -version > > bool UseHugeTLBFS = false {product} {command line} > > bool UseLargePages = false {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:-UseSHM -version > > bool UseHugeTLBFS = false {product} {command line} > > bool UseLargePages = false {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > Regards > Patrick > From david.holmes at oracle.com Thu Jul 18 13:08:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Jul 2019 23:08:55 +1000 Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp In-Reply-To: References: <481446f4-3303-1ff5-27b0-d42d13fd38d9@oracle.com> <61cd310d-1b06-e400-a05a-3885aaa0d175@oracle.com> <39b081bf-d298-e515-3311-02d4c4a51db2@oracle.com> <367d65ad-6df7-65f6-ef6c-d153e6977b9a@oracle.com> Message-ID: On 18/07/2019 9:50 pm, Baesken, 
Matthias wrote: > Hi David may I add you as a reviewer too ? Yes. Thanks, David > Best regards, Matthias > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 18. Juli 2019 12:52 >> To: Doerr, Martin ; Baesken, Matthias >> ; Langer, Christoph >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > port-dev at openjdk.java.net> >> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >> >> On 18/07/2019 8:15 pm, Doerr, Martin wrote: >>> Hi David, >>> >>> there's no difference between INTPTR_FORMAT and PTR_FORMAT: >>> >>> #ifdef _LP64 >>> #define INTPTR_FORMAT "0x%016" PRIxPTR >>> #define PTR_FORMAT "0x%016" PRIxPTR >>> #else // !_LP64 >>> #define INTPTR_FORMAT "0x%08" PRIxPTR >>> #define PTR_FORMAT "0x%08" PRIxPTR >>> #endif // _LP64 >>> >>> I guess this was different in the past. I don't know why we still have both. >> >> Sorry about that - was confused by the reported error message. >> >> David >> >>> Best regards, >>> Martin >>> >>> >>>> -----Original Message----- >>>> From: ppc-aix-port-dev >> On >>>> Behalf Of David Holmes >>>> Sent: Donnerstag, 18. Juli 2019 12:04 >>>> To: Baesken, Matthias ; Langer, Christoph >>>> ; 'hotspot-dev at openjdk.java.net' >>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >>> port-dev at openjdk.java.net> >>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>> >>>> On 18/07/2019 6:25 pm, Baesken, Matthias wrote: >>>>> Hi David, do you see an issue using p2i with char* pointers , should I >> add >>>> a cast or some other conversion ? >>>>> (afaik it is usually used without other casts/conversions in the codebase) >>>>> >>>>> jdk/src/hotspot/share/utilities/globalDefinitions.hpp : >>>>> >>>>> 1055 // Convert pointer to intptr_t, for use in printing pointers. 
>>>>> 1056 inline intptr_t p2i(const void * p) { >>>>> 1057 return (intptr_t) p; >>>>> 1058 } >>>> >>>> p2i is what you should always use when printing a pointer to convert it >>>> to an integral type. But it should really be used with INTPTR_FORMAT. It >>>> will work with PTR_FORMAT due to other integral conversions. >>>> >>>>>> If this fixes things on AIX then that's fine. >>>>> >>>>> Yes it does . >>>>> But I have to agree with you it feels a bit shaky ... >>>> >>>> Changing PTR_FORMAT to INTPTR_FORMAT would remove that shakiness >>>> IMHO. :) >>>> >>>> Cheers, >>>> David >>>> >>>>> >>>>> Regards, Matthias >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Donnerstag, 18. Juli 2019 10:05 >>>>>> To: Baesken, Matthias ; Langer, >> Christoph >>>>>> ; 'hotspot-dev at openjdk.java.net' >> >>>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' > aix- >>>>>> port-dev at openjdk.java.net> >>>>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>> >>>>>> On 18/07/2019 5:40 pm, Baesken, Matthias wrote: >>>>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used >> with >>>>>>>> INTPTR_FORMAT. So the above looks like it was already correct and >>>> now >>>>>> is >>>>>>>> not correct. >>>>>>> >>>>>>> Hi David, I noticed p2i is used together with PTR_FORMAT at >>>> dozens >>>>>> locations in the HS code , did I miss something ? >>>>>> >>>>>> Okay our usage is a bit of a historical mess. :( >>>>>> >>>>>>> In os_aix.cpp we currently get these warnings , seems >> PTR_FORMAT >>>> is >>>>>> unsigned long , that's why we see these warnings : >>>>>> >>>>>> Defining PTR_FORMAT as an integral format is just broken - but dates >>>>>> back forever because %p wasn't portable. >>>>>> >>>>>> If this fixes things on AIX then that's fine. For new code I'd recommend >>>>>> use of INTPTR_FORMAT and p2i to print pointers. 
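As a standalone illustration of that recommendation — a sketch, not HotSpot code; the macro value mirrors the LP64 definition quoted earlier in the thread:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Fixed-width hex format for a pointer-sized integer (LP64 shape).
#define INTPTR_FORMAT "0x%016" PRIxPTR

// Convert a pointer to intptr_t for printing (same shape as HotSpot's p2i).
inline intptr_t p2i(const void* p) { return (intptr_t) p; }

// Format a pointer portably. Passing a raw char* straight to a
// %lu conversion is exactly what triggered the -Wformat warnings
// on AIX; converting through p2i matches the format specifier.
int format_ptr(char* buf, size_t n, const void* p) {
  return snprintf(buf, n, INTPTR_FORMAT, p2i(p));
}
```

A null pointer formats as "0x0000000000000000", since the width 016 is fixed in the macro.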
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:15: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> p, p + s, addr, addr + size); >>>>>>> ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >>>> from >>>>>> macro 'trcVerbose' >>>>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>>>> ^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:18: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> p, p + s, addr, addr + size); >>>>>>> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >>>> from >>>>>> macro 'trcVerbose' >>>>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>>>> ^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:25: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> p, p + s, addr, addr + size); >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >>>> from >>>>>> macro 'trcVerbose' >>>>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>>>> ^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1894:31: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> p, p + s, addr, addr + size); >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >>>> from >>>>>> macro 'trcVerbose' >>>>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>>>> ^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:45: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> " aligned to pagesize (%lu)", p, p + s, 
(unsigned long) >> pagesize); >>>>>>> >>>>>> >>>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ >>>>>> ~~~~~~~~~~~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/misc_aix.hpp:40:28: note: expanded >>>> from >>>>>> macro 'trcVerbose' >>>>>>> fprintf(stderr, fmt, ##__VA_ARGS__); \ >>>>>>> ^~~~~~~~~~~ >>>>>>> /nightly/jdk/src/hotspot/os/aix/os_aix.cpp:1899:48: warning: format >>>>>> specifies type 'unsigned long' but the argument has type 'char *' [- >>>> Wformat] >>>>>>> " aligned to pagesize (%lu)", p, p + s, (unsigned long) >> pagesize); >>>>>>> >>>>>> >>>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ >>>>>> ~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> Best regards, Matthias >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: David Holmes >>>>>>>> Sent: Donnerstag, 18. Juli 2019 09:08 >>>>>>>> To: Baesken, Matthias ; Langer, >>>> Christoph >>>>>>>> ; 'hotspot-dev at openjdk.java.net' >>>> >>>>>>> dev at openjdk.java.net>; 'ppc-aix-port-dev at openjdk.java.net' >> >>> aix- >>>>>>>> port-dev at openjdk.java.net> >>>>>>>> Subject: Re: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>>>> >>>>>>>> Hi Matthias, >>>>>>>> >>>>>>>> On 18/07/2019 5:00 pm, Baesken, Matthias wrote: >>>>>>>>> Thanks ! May I get a second review please ? >>>>>>>> >>>>>>>> @@ -1888,12 +1887,12 @@ >>>>>>>> if (!contains_range(p, s)) { >>>>>>>> trcVerbose("[" PTR_FORMAT " - " PTR_FORMAT "] is not a sub " >>>>>>>> "range of [" PTR_FORMAT " - " PTR_FORMAT "].", >>>>>>>> - p, p + s, addr, addr + size); >>>>>>>> + p2i(p), p2i(p + s), p2i(addr), p2i(addr + size)); >>>>>>>> >>>>>>>> pointers should be used with PTR_FORMAT. p2i(p) should be used >> with >>>>>>>> INTPTR_FORMAT. So the above looks like it was already correct and >>>> now >>>>>> is >>>>>>>> not correct. Using p2i with UINTX_FORMAT also looks dubious to >> me. 
>>>>>>>> >>>>>>>> Cheers, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>> Best regards, Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Langer, Christoph >>>>>>>>>> Sent: Mittwoch, 17. Juli 2019 18:45 >>>>>>>>>> To: Baesken, Matthias ; 'hotspot- >>>>>>>>>> dev at openjdk.java.net' ; 'ppc- >> aix- >>>>>> port- >>>>>>>>>> dev at openjdk.java.net' >>>>>>>>>> Subject: RE: RFR : 8227869: fix wrong format specifiers in >> os_aix.cpp >>>>>>>>>> >>>>>>>>>> Hi Matthias, >>>>>>>>>> >>>>>>>>>> thanks for this tedious cleanup. Looks good to me. >>>>>>>>>> >>>>>>>>>> Best regards >>>>>>>>>> Christoph >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: hotspot-dev >> On >>>>>>>> Behalf >>>>>>>>>> Of >>>>>>>>>>> Baesken, Matthias >>>>>>>>>>> Sent: Mittwoch, 17. Juli 2019 17:07 >>>>>>>>>>> To: 'hotspot-dev at openjdk.java.net' >>>>> dev at openjdk.java.net>; >>>>>>>>>>> 'ppc-aix-port-dev at openjdk.java.net' >>>>>>>>> dev at openjdk.java.net> >>>>>>>>>>> Subject: RFR : 8227869: fix wrong format specifiers in os_aix.cpp >>>>>>>>>>> >>>>>>>>>>> Hello, there are a couple of non matching format specifiers in >>>>>> os_aix.cpp >>>>>>>> . >>>>>>>>>>> I adjust them with my change . >>>>>>>>>>> >>>>>>>>>>> Please review ! 
>>>>>>>>>>> >>>>>>>>>>> Bug/webrev : >>>>>>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227869 >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8227869.0/ >>>>>>>>>>> >>>>>>>>>>> Thanks, Matthias From sgehwolf at redhat.com Thu Jul 18 15:14:47 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 18 Jul 2019 17:14:47 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> Message-ID: <0147c2dfb1fec34bce4128ec8af2ca5fc725d79c.camel@redhat.com> Hi Igor, On Wed, 2019-07-17 at 11:37 -0700, Igor Ignatyev wrote: > > Hi Severin, > > the updated webrev looks good to me, please see a couple comments below. Thanks. More below. > Cheers, > -- Igor > > > On Jul 17, 2019, at 10:34 AM, Severin Gehwolf wrote: > > > > Hi Misha, > > > > On Wed, 2019-07-17 at 10:22 -0700, mikhailo.seledtsov at oracle.com wrote: > > > Hi Severin, > > > > > > On 7/17/19 5:44 AM, Severin Gehwolf wrote: > > > > Hi Igor, Misha, > > > > > > > > On Tue, 2019-07-16 at 11:49 -0700, Igor Ignatyev wrote: > > > > > Hi Severin, > > > > > > > > > > I don't think that tests (or test libraries for that matter) should > > > > > be responsible for setting correct PATH value, it should be a part of > > > > > host configuration procedure (tests can/should check that all > > > > > required bins are available though). in other words, I'd prefer if > > > > > you remove 'env.put("PATH", ...)' lines from both DockerTestUtils and > > > > > TestJFREvents. the rest looks good to me. > > > > Updated webrev: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/02/webrev/ > > > > > > > > No more additions to PATH are being done. 
> > > > > > > > I've discovered that VMProps.java which defines "docker.required", used > > > > the "docker" binary even for podman test runs. This ended up not > > > > running most of the tests even with -Djdk.test.docker.command=podman > > > > specified. > > > Good catch. > should we rename docker.support and DOCKER_COMMAND to something more abstract? container.support and CONTAINER_ENGINE_COMMAND perhaps? > > > > I've fixed that by moving DOCKER_COMMAND to Platform.java so > > > > that it can be used in both places. > > > > > > Sounds good to me. > > > > > > (of course, the alternative would be to import > > > jdk.test.lib.containers.docker.DockerTestUtils into VMProps.java -- not > > > sure if there are any potential problems doing it this way) > > > > I've tried that but for some reason this was a problem and VMProps > > failed to compile. I don't know exactly how those jtreg extensions work > > and went with the Platform approach. Hope that's OK. > > all files needed for VMProps (or other @requires expression class) > have to be listed in requires.extraPropDefns or > requires.extraPropDefns.bootlibs property in TEST.ROOT file in all > the test suites which use these extensions. we are trying to be very > cautious in what is used by VMProps (directly and indirectly) so > these lists won't grow and we won't require any modules other than > java.base, given DockerTestUtils has dependencies on a number of > other library classes, the Platform approach is much better from that > point of view. I had a feeling it was something like that. Thanks for the explanation! 
Cheers, Severin From kim.barrett at oracle.com Thu Jul 18 16:05:07 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 18 Jul 2019 12:05:07 -0400 Subject: 8227652: SetupOperatorNewDeleteCheck should discuss deleting destructors In-Reply-To: <08ef9d8e-f74d-83cb-4a9a-ac04364c2b0f@oracle.com> References: <40590A26-1A32-4B3F-B1D8-55A56090C5F4@oracle.com> <08ef9d8e-f74d-83cb-4a9a-ac04364c2b0f@oracle.com> Message-ID: <51392BF2-CF69-4FC9-828A-0237425DAD8A@oracle.com> > On Jul 17, 2019, at 10:28 PM, David Holmes wrote: > > Looks fine and trivial to me. Thanks. > Thanks, > David > > On 16/07/2019 5:51 am, Kim Barrett wrote: >> Please review this explanatory comment being added to the description >> of the check for using global operator new/delete in Hotspot code. >> The described situation is somewhat obscure, and encountering it for >> the first time (or again after a long time, as happened to me recently) >> can be quite puzzling. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8227652 >> Webrev: >> http://cr.openjdk.java.net/~kbarrett/8227652/open.00/ From vladimir.kozlov at oracle.com Thu Jul 18 16:34:44 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 18 Jul 2019 09:34:44 -0700 Subject: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with Graal In-Reply-To: References: Message-ID: <35bf9c4b-75ad-7a19-a802-12a279cbc286@oracle.com> Yes, I think it is possible. You need 2 @test blocks. Something like next: http://hg.openjdk.java.net/jdk/jdk/file/aeb124322000/test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java Vladimir On 7/18/19 1:57 AM, Langer, Christoph wrote: > Hi, > > we observe this issue on some of our platforms (ppc64, ppc64le) where graal/jdk.internal.vm.compiler is not available. So a good fix would either be to have `@requires !vm.graal.enabled` or, if jtreg supports it, we'd need two sets of @modules directives and VM Options (--limit-modules) to cover both cases, with or without aot. 
Does anybody know if this is possible? > > Thanks > Christoph > >> -----Original Message----- >> From: hotspot-dev On Behalf Of >> Pengfei Li (Arm Technology China) >> Sent: Donnerstag, 18. Juli 2019 08:52 >> To: Alan Bateman >> Cc: nd ; compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net >> Subject: RE: RFR(trivial): 8227512: [TESTBUG] Fix JTReg javac test failures with >> Graal >> >> Hi Alan, >> >>> I see this has been pushed but it looks like it is missing `@modules >>> jdk.internal.vm.compiler` as the test now requires this module to be in the >>> run-time image under test. As the test is not interesting when testing with >> the >>> Graal compiler then maybe an alternative is to add >>> `@requires !vm.graal.enabled` so that the test is not selected when >>> exercising Graal - we've done this in a few other tests that run with >>> `--limit-modules`. >> >> Thanks for the reply. I've used this alternative approach before when I tried to >> clean up other false failures in Graal jtreg (see >> http://hg.openjdk.java.net/jdk/jdk/rev/206afa6372ae). This time I chose to >> add the missing module because I thought the javac test would be >> interesting when Graal is used since javac is also written in Java. This change >> is already pushed, but it's fine to me if you would like to submit another >> patch to disable these two cases with Graal. 
>> >> -- >> Thanks, >> Pengfei > From mikhailo.seledtsov at oracle.com Thu Jul 18 17:14:47 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 18 Jul 2019 10:14:47 -0700 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> <2F2CE24E-9DDB-489D-9CC6-3296C0149B9A@oracle.com> Message-ID: <7277225c-bd32-5be5-83a1-bc285e3436e8@oracle.com> +1 On 7/17/19 9:43 PM, David Holmes wrote: > Hi Igor, > > This seems fine to me. > > Thanks, > David > > On 17/07/2019 7:35 am, Igor Ignatyev wrote: >> can I get a review for this patch? >> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.01/index.html >> >> Thanks, >> -- Igor >> >>> On Jul 6, 2019, at 11:50 AM, Igor Ignatyev >> > wrote: >>> >>> Hi David, >>> >>>> On Jul 6, 2019, at 1:58 AM, David Holmes >>> > wrote: >>>> >>>> Hi Igor, >>>> >>>> On 6/07/2019 1:09 pm, Igor Ignatyev wrote: >>>>> ping? >>>>> -- Igor >>>>>> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev >>>>>> > wrote: >>>>>> >>>>>> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>>>> 25 lines changed: 18 ins; 3 del; 4 mod; >>>>>> >>>>>> Hi all, >>>>>> >>>>>> could you please review this small patch which adds >>>>>> JTREG_RUN_PROBLEM_LISTS options to run-test framework? when >>>>>> JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem >>>>>> lists as values of -match: instead of -exclude, which effectively >>>>>> means it will run only problem listed tests. >>>> >>>> doc/testing.md >>>> >>>> + Set to `true` of `false`. >>>> >>>> typo: s/of/or/ >>> fixed .md, regenerated .html. >>>> >>>> Build changes seem okay - I can't attest to the operation of the flag. 
>>> >>> here is how I verified that it does that it supposed to: >>> >>> $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" >>> TEST=open/test/hotspot/jtreg/:hotspot_all >>> lists 53 tests, the same command w/o RUN_PROBLEM_LISTS (or w/ >>> RUN_PROBLEM_LISTS=false) lists 6698 tests. >>> >>> $ make test >>> "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt >>> lists 81 tests, the same command w/o RUN_PROBLEM_LISTS lists 6670 >>> tests. >>> >>>> >>>>>> doc/building.html got changed when I ran update-build-docs, I can >>>>>> exclude it from the patch, but it seems it will keep changing >>>>>> every time we run update-build-docs, so I decided to at least >>>>>> bring it up. >>>> >>>> Weird it seems to have removed line-breaks in that paragraph. What >>>> platform did you build on? >>> I built on macos. now when I wrote that, I remember pandoc used to >>> produce different results on macos. so I've rerun it on linux on the >>> source w/o my change, and doc/building.html still got changed in the >>> exact same way. >>> >>>> David >>>> ----- >>>> >>>>>> >>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8226910 >>>>>> webrev:http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>>> >>>>>> >>>>>> Thanks, >>>>>> -- Igor >> From igor.ignatyev at oracle.com Thu Jul 18 18:50:21 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 18 Jul 2019 11:50:21 -0700 Subject: RFR(S) [13] : 8226910 : make it possible to use jtreg's -match via run-test framework In-Reply-To: <7277225c-bd32-5be5-83a1-bc285e3436e8@oracle.com> References: <8B6A5349-A39A-4AE0-980D-5C336C339DE7@oracle.com> <9DA3B077-FFE6-472E-B3EA-7C4CFFDB45EB@oracle.com> <5b10f093-8aa8-4b5f-14bf-a9b7c5704381@oracle.com> <2F2CE24E-9DDB-489D-9CC6-3296C0149B9A@oracle.com> <7277225c-bd32-5be5-83a1-bc285e3436e8@oracle.com> Message-ID: <0C4C276A-FE8B-47A8-BEED-6F29C01982A2@oracle.com> David, Misha, thanks for your review, pushed. 
-- Igor > On Jul 18, 2019, at 10:14 AM, mikhailo.seledtsov at oracle.com wrote: > > +1 > > On 7/17/19 9:43 PM, David Holmes wrote: >> Hi Igor, >> >> This seems fine to me. >> >> Thanks, >> David >> >> On 17/07/2019 7:35 am, Igor Ignatyev wrote: >>> can I get a review for this patch? http://cr.openjdk.java.net/~iignatyev//8226910/webrev.01/index.html >>> >>> Thanks, >>> -- Igor >>> >>>> On Jul 6, 2019, at 11:50 AM, Igor Ignatyev > wrote: >>>> >>>> Hi David, >>>> >>>>> On Jul 6, 2019, at 1:58 AM, David Holmes > wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> On 6/07/2019 1:09 pm, Igor Ignatyev wrote: >>>>>> ping? >>>>>> -- Igor >>>>>>> On Jun 27, 2019, at 3:25 PM, Igor Ignatyev > wrote: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>>>>> 25 lines changed: 18 ins; 3 del; 4 mod; >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> could you please review this small patch which adds JTREG_RUN_PROBLEM_LISTS options to run-test framework? when JTREG_RUN_PROBLEM_LISTS is set to true, jtreg will use problem lists as values of -match: instead of -exclude, which effectively means it will run only problem listed tests. >>>>> >>>>> doc/testing.md >>>>> >>>>> + Set to `true` of `false`. >>>>> >>>>> typo: s/of/or/ >>>> fixed .md, regenerated .html. >>>>> >>>>> Build changes seem okay - I can't attest to the operation of the flag. >>>> >>>> here is how I verified that it does that it supposed to: >>>> >>>> $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true" TEST=open/test/hotspot/jtreg/:hotspot_all >>>> lists 53 tests, the same command w/o RUN_PROBLEM_LISTS (or w/ RUN_PROBLEM_LISTS=false) lists 6698 tests. >>>> >>>> $ make test "JTREG=OPTIONS=-l;RUN_PROBLEM_LISTS=true;EXTRA_PROBLEM_LISTS=ProblemList-aot.txt >>>> lists 81 tests, the same command w/o RUN_PROBLEM_LISTS lists 6670 tests. 
>>>> >>>>> >>>>>>> doc/building.html got changed when I ran update-build-docs, I can exclude it from the patch, but it seems it will keep changing every time we run update-build-docs, so I decided to at least bring it up. >>>>> >>>>> Weird it seems to have removed line-breaks in that paragraph. What platform did you build on? >>>> I built on macos. now when I wrote that, I remember pandoc used to produce different results on macos. so I've rerun it on linux on the source w/o my change, and doc/building.html still got changed in the exact same way. >>>> >>>>> David >>>>> ----- >>>>> >>>>>>> >>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8226910 >>>>>>> webrev:http://cr.openjdk.java.net/~iignatyev//8226910/webrev.00/index.html >>>>>>> >>>>>>> Thanks, >>>>>>> -- Igor >>> From igor.ignatyev at oracle.com Fri Jul 19 01:29:59 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 18 Jul 2019 18:29:59 -0700 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <0147c2dfb1fec34bce4128ec8af2ca5fc725d79c.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <5bc3ac00-6ac9-99aa-052d-0a4aa6b04f8f@oracle.com> <47390A32-BD5B-4FF3-B93B-69ACECBC3E78@oracle.com> <243091d0e29604851d100b94d5ad777d9cf59127.camel@redhat.com> <60f8f5a9003dd199f2384360c16032d21c881dbb.camel@redhat.com> <0147c2dfb1fec34bce4128ec8af2ca5fc725d79c.camel@redhat.com> Message-ID: <271B3583-A5E1-4562-B728-1208F4476731@oracle.com> > On Jul 18, 2019, at 8:14 AM, Severin Gehwolf wrote: > >> should we rename docker.support and DOCKER_COMMAND to something more abstract? > > container.support and CONTAINER_ENGINE_COMMAND perhaps? sounds good to me, but as Misha suggested (and I originally intended to write), let's do this renaming in a separate RFE. that's to say, you can consider 8227642 as fully reviewed. 
Thanks, -- Igor From patrick at os.amperecomputing.com Fri Jul 19 03:07:42 2019 From: patrick at os.amperecomputing.com (Patrick Zhang OS) Date: Fri, 19 Jul 2019 03:07:42 +0000 Subject: The default choice in setup_large_page_type() if set -XX:+UseLargePages only In-Reply-To: References: Message-ID: According to the logic, setting "+UseLargePages only" is equivalent to "+UseHugeTLBFS +UseSHM", since UseHugeTLBFS is of the higher priority (to try). What I learnt here is we'd better not set +UseLargePages since explicitly setting any of UseHugeTLBFS/UseSHM/UseTransparentHugePages would implicitly reconfigure it as true/false, while setting -UseLargePages can play a key role to disable all no matter what other three options are used. Thanks David. Regards Patrick -----Original Message----- From: David Holmes Sent: Thursday, July 18, 2019 9:08 PM To: Patrick Zhang OS ; 'hotspot-dev at openjdk.java.net' Subject: Re: The default choice in setup_large_page_type() if set -XX:+UseLargePages only Hi Patrick, On 18/07/2019 9:06 pm, Patrick Zhang OS wrote: > I found a weird "issue" when setting up an env with -XX:+UseLargePages only. I knew later on that at least one of UseHugeTLBFS/UseSHM/UseTransparentHugePages needs to be set as well, but in the beginning everything worked well without any warnings. The default choice is UseHugeTLBFS behind this. However when I added -XX:-UseTransparentHugePages, the function got completely disabled and setup_large_page_type() returned false. Is this an expected behavior? Or should a warning show in the console? If -XX:+UseLargePages is allowed to be specified alone, perhaps disabling the default UseHugeTLBFS choice can be less misleading? You can see in the logic that if you enable large pages only then it configures the other flags for you in the way that makes most sense: try UseHugeTLBFS and then UseSHM, but don't try UseTransparentHugePages since there are known performance issues with it turned on. 
But if you enable large pages and explicitly set/clear at least one of the other flags, then it assumes you've set everything yourself as needed and only does some basic sanity checks whilst trying select the right mode. I agree it seems odd that explicitly disabling a flag that would be disabled anyway changes the behaviour, but you've basically switched things from "configure things for me" mode, to "manual" mode by specifying any other flag explicitly. As a result explicitly disabling any of the flags results in all flags being off. Cheers, David > Thanks for any comments. > > > Here is the related source code: > bool os::Linux::setup_large_page_type(size_t page_size) > > https://hg.openjdk.java.net/jdk/jdk/file/065142ace8e9/src/hotspot/os/l > inux/os_linux.cpp#l3764 > > > > java -Xmx512m -XX:+PrintFlagsFinal -version (default values) > > bool UseHugeTLBFS = false {product} {default} > > bool UseLargePages = false {pd product} {default} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m > -XX:+UseLargePages -version > > bool UseHugeTLBFS = true {product} {command line} > > bool UseLargePages = true {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m > -XX:+UseLargePages -XX:-UseTransparentHugePages -version > > bool UseHugeTLBFS = false {product} {command line} > > bool UseLargePages = false {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool UseTransparentHugePages = false {product} {default} > > java -Xmx512m -XX:+PrintFlagsFinal -XX:LargePageSizeInBytes=2m > -XX:+UseLargePages -XX:-UseSHM -version > > bool UseHugeTLBFS = false {product} {command line} > > bool UseLargePages = false {pd product} {command line} > > bool UseSHM = false {product} {default} > > bool 
UseTransparentHugePages = false {product} {default} > > Regards > Patrick > From david.holmes at oracle.com Fri Jul 19 03:28:29 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Jul 2019 13:28:29 +1000 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> Message-ID: Hi Ty, I'm moving this discussion to hotspot-dev as it's more appropriate. On 19/07/2019 12:46 pm, Ty Young wrote: > Hi, > > > I'm requesting that the long unimplemented "client" java switch be > implemented in Java 14. Background: the client VM is historically only supported on 32-bit platforms explicitly, so the memory issues you are seeing are a combination of factors based on the ergonomic selections made by the VM during startup. The "client VM" is predominantly a 32-bit JVM that only supports the C1 JIT-compiler. The "server VM" in contrast supports the C2 JIT-compiler. For a while now this distinction has blurred because the JIT uses tiered-compilation so that it starts by acting similar to the C1 compiler (for faster startup) and progresses into a mode that acts like C2 (for throughput optimisation). Though there are flags you can set to get it to act just like C1 or just like C2. Whether a machine is considered "server class" only partially relates to this. The startup ergonomics for a "server class" machine will configure subsystems to use more memory than a "non-server class" machine. Again these days (and for a while) we do not use this classification when starting the JVM. Various ergonomic selections are made based on the default settings for a range of components (mainly GC and JIT) together with the characteristics of the actual runtime environment (available memory and processors etc). 
The JVM is highly tunable in this regard, but of course it needs to have a reasonable out-of-the-box configuration - and that has evolved over the years, but is, at least for 64-bit systems, skewed towards server-style systems. So we cannot please everybody with the out-of-box default configuration. It's been suggested in the past that perhaps we should support a number of different initial configurations to make it easy(er) to adapt to specific user requirements, but this quickly breaks down as you can't get consensus on what those settings should be, and anyone who really cares will do their own tuning anyway. I can't go through your email point by point in detail sorry. Perhaps others can focus on specific memory issues. In particular if JavaFX is a source of problems then that will need to be discussed with the JavaFX folk. A very strong "business case" would need to be made for the community to look at supporting something like "-client" in the current OpenJDK. Cheers, David > > (Note: this entire request is based on the assumption that a JVM with > -client is equivalent to a client JVM variant. If this is wrong, I > apologies. There isn't much documentation to go on.) > > > Since there aren't many google results or any kind of mention of this > feature/ability even existing, i'll give an explanation to the best of > my knowledge and personal observations: > > > A "client" JVM variant is geared towards graphical end-user > applications. According to a URL link found in the man entry for java[1] > this supposedly results in faster startups. While this *may* be true, a > much larger and more important benefit is a massive committed memory > reduction in the range of about 25% to 50% when running a JavaFX > application. At minimum with similar heap sizes, that is a 75 MB memory > savings at 300MB (a somewhat typical peak usage with JavaFX > applications) with a typical server JVM. That's huge. 
> > > The downside to this however is that at most, the maximum amount of > (committed?) memory that a client JVM variant can use is somewhere > around 300MB by default. For the intended purpose of the client JVM > switch/variant this is *probably* fine. Server JVM variants only seem > to allocate more memory to boost performance, which really isn't that > much of a difference with the intended use case of the client JVM > switch/variant, especially considering the more appealing memory savings. > > > So why should this be implemented? > > > The answer is simple: using more memory than is necessary is bad, angers > users, and frustrates developers who want to be responsible by not > wanting to eat up their users' memory[2] when it isn't needed. > > > Even if you have never heard anyone complain about Java's memory > usage, you've most likely heard someone complain about a similar > cross-platform software: Electron. People hate Electron applications for > their absurd memory usage and will actively avoid them by using > alternatives if possible. 
This would allow all existing Java > applications to continue to work as expected. > > > The only other issue that I can think of is people launching > applications with -client without knowing the limitations of it and > filing bogus bug reports to app developers. This can be mitigated with > better documentation and awareness in places like the man page for Java. > Since no one seems to really have used or knew about it before it's more > likely end developers that will be passing the switch to their > applications via scripts then end users will be. > > > All in all, this is pretty safe as long as server JVM switch/variant > remains the default. Maybe others can think of other > risks/impacts/problems. > > > And finally addressing the two questions/comments I imagine someone at > some point are going to ask/say: > > > Why not just compile a client JVM variant from source and use jLink? > > > and/or > > > If heap and garbage collection is healthy, who cares? > > > For the first one, yes, this is a route that could be taken. It has some > problems however, namely: > > > - You have to be the developer or have source code access to use jLink. > > > - jLink -from my understanding- requires a **fully** modular Java > application. Some used libraries may not be modular yet. > > > - A full JDK source code compile is required - something that is really > easy to do under Linux but might not be under Windows and takes > considerable CPU power to do. No one that I?m aware of (on Linux anyway) > provides client JVM variant builds. Presumably This is because the > server JVM variant is the most versatile. > > > and as for the second: just because there is say, 5.8GB out of 8GB > available doesn't mean you should or have the right to use it as you see > fit. People do more than use Java applications. If you are running a web > browser with lots of tabs open, a Java application could realistically > cause major system stuttering as memory is moved to swap/pagefile. 
While > I used 300MB above as an easy realistic example, i've seen JavaFX > applications consume as much as 700MB and even 1GB committed memory. > Just opening Scene Builder and playing around with the GUI consumes > 400MB easily on a server JVM variant(Oracle JDK/JRE 10 to be exact). > While memory usage may never be as good as native, the current amount of > memory being consumed is insane and any normal user with standard amount > of memory(6-8GB) *will* feel this. Adding this switch could potentially > help a lot here and give Java a slight edge over similar software > solutions. > > > Can this feature please be implemented? Likewise, could the > documentation on what a "client" JVM and other JVM variants be updated > and improved? > > > [1] > https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html > > > [2] > https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs > > From kim.barrett at oracle.com Fri Jul 19 06:26:56 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 19 Jul 2019 02:26:56 -0400 Subject: RFR: 8227653: Add VM Global OopStorage In-Reply-To: <6dbf15d1-e340-1541-3704-a05c376be7b1@oracle.com> References: <16F0945B-E74B-472A-ADCF-5363FAAC9461@oracle.com> <6dbf15d1-e340-1541-3704-a05c376be7b1@oracle.com> Message-ID: > On Jul 17, 2019, at 7:36 PM, Vladimir Kozlov wrote: > > Thank you, Kim > > Good. Please file bug for JDK 13 and assign it to me. I will port your JVMCI fix. Thanks, Vladimir. And thanks to Thomas and Lois for reviews. For the record, I pushed the original change (open.00) to jdk/jdk. Vladimir pushed my open.01.inc change to jdk/jdk13 as a fix for JDK-8228340, and it will get forward ported to jdk/jdk as part of the usual process. 
From matthias.baesken at sap.com Fri Jul 19 07:46:18 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 19 Jul 2019 07:46:18 +0000 Subject: 8228420: compile error in shenandoahSupport.cpp with clang 9 Message-ID: Hello, on OSX we recently run into this compile error (see below ). We use clang 9 : configure: Using clang C compiler version 9.0.0 [Apple LLVM version 9.0.0 (clang-900.0.39.2) /nightly/jdk/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp:3019:33: error: operator '?:' has lower precedence than '+'; '+' will be evaluated first [-Werror,-Wparentheses] return Node::hash() + _native ? 1 : 0; ~~~~~~~~~~~~~~~~~~~~~~ ^ /nightly/jdk/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp:3019:33: note: place parentheses around the '+' expression to silence this warning return Node::hash() + _native ? 1 : 0; ^ ( ) This might be related to : 8227677: Shenandoah: C2: Make in-native LRB special case of normal LRB I opened the bug : https://bugs.openjdk.java.net/browse/JDK-8228420 Should we go for return Node::hash() + ( _native ? 1 : 0 ); to please the compiler ? Is the issue on present on more recent Xcode / clang versions on OSX ? Thanks, Matthias From shade at redhat.com Fri Jul 19 07:57:04 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 19 Jul 2019 09:57:04 +0200 Subject: 8228420: compile error in shenandoahSupport.cpp with clang 9 In-Reply-To: References: Message-ID: On 7/19/19 9:46 AM, Baesken, Matthias wrote: > Should we go for > > return Node::hash() + (_native ? 1 : 0); > > to please the compiler ? Yes, please do. 
-- Thanks, -Aleksey From matthias.baesken at sap.com Fri Jul 19 08:20:00 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 19 Jul 2019 08:20:00 +0000 Subject: 8228420: compile error in shenandoahSupport.cpp with clang 9 In-Reply-To: References: Message-ID: Hi Aleksey, here is my change : Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228420 http://cr.openjdk.java.net/~mbaesken/webrevs/8228420/ Thanks, Matthias > > * PGP Signed by an unknown key > > On 7/19/19 9:46 AM, Baesken, Matthias wrote: > > Should we go for > > > > return Node::hash() + (_native ? 1 : 0); > > > > to please the compiler ? > > Yes, please do. > > -- > Thanks, > -Aleksey > > > * Unknown Key > * 0x62A119A7 From shade at redhat.com Fri Jul 19 08:23:05 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 19 Jul 2019 10:23:05 +0200 Subject: 8228420: compile error in shenandoahSupport.cpp with clang 9 In-Reply-To: References: Message-ID: <1f14645b-e481-9ba4-d8ff-9a03c7067521@redhat.com> On 7/19/19 10:20 AM, Baesken, Matthias wrote: > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228420 > http://cr.openjdk.java.net/~mbaesken/webrevs/8228420/ Looks good and trivial. Please push! -Aleksey From matthias.baesken at sap.com Fri Jul 19 10:42:02 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 19 Jul 2019 10:42:02 +0000 Subject: 8228420: compile error in shenandoahSupport.cpp with clang 9 In-Reply-To: <1f14645b-e481-9ba4-d8ff-9a03c7067521@redhat.com> References: <1f14645b-e481-9ba4-d8ff-9a03c7067521@redhat.com> Message-ID: Thanks for the review ! > -----Original Message----- > From: Aleksey Shipilev > Sent: Freitag, 19. 
Juli 2019 10:23 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: 8228420: compile error in shenandoahSupport.cpp with clang 9 > > * PGP Signed by an unknown key > > On 7/19/19 10:20 AM, Baesken, Matthias wrote: > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8228420 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228420/ > > Looks good and trivial. Please push! > > -Aleksey > > > * Unknown Key > * 0x62A119A7 From Alan.Bateman at oracle.com Fri Jul 19 10:58:15 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 19 Jul 2019 11:58:15 +0100 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> Message-ID: <2bd4e000-6008-a0f0-d17e-aeaf5336569b@oracle.com> On 12/07/2019 19:08, Severin Gehwolf wrote: > Hi, > > There is an alternative container engine which is being used by Fedora > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > looks like OpenJDK docker tests can be made podman compatible with a > few little tweaks. One "interesting" one is to not assert "Successfully > built" in the build output but only rely on the exit code, which seems > to be OK for my testing. Interestingly the test would be skipped in > that case. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > Just looking at 02/webrev and I see it adds a System.getProperty to test/lib/jdk/test/lib/Platform.java. That may cause issues for tests that use this test infrastructure and have their own security policy (we've run into issues in the past with test infrastructure requiring permissions that the tests using the test library don't know about). In this case it might be better to create Platform.Docker.COMMAND so that only the container tests need to be concerned with it. 
-Alan From sgehwolf at redhat.com Fri Jul 19 11:24:10 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 19 Jul 2019 13:24:10 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <2bd4e000-6008-a0f0-d17e-aeaf5336569b@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <2bd4e000-6008-a0f0-d17e-aeaf5336569b@oracle.com> Message-ID: <514c5818de749950d15da01adcae8fec720c726c.camel@redhat.com> Hi Alan, On Fri, 2019-07-19 at 11:58 +0100, Alan Bateman wrote: > On 12/07/2019 19:08, Severin Gehwolf wrote: > > Hi, > > > > There is an alternative container engine which is being used by Fedora > > and RHEL 8, called podman[1]. It's mostly compatible with docker. It > > looks like OpenJDK docker tests can be made podman compatible with a > > few little tweaks. One "interesting" one is to not assert "Successfully > > built" in the build output but only rely on the exit code, which seems > > to be OK for my testing. Interestingly the test would be skipped in > > that case. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227642 > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8227642/01/webrev/ > > > Just looking at 02/webrev and I see it adds a System.getProperty to > test/lib/jdk/test/lib/Platform.java. That may cause issues for tests > that use this test infrastructure and have their own security policy > (we've run into issues in the past with test infrastructure requiring > permissions that the tests using the test library don't know about). In > this case it might be better to create Platform.Docker.COMMAND so that > only the container tests need to be concerned with it. Thanks for the heads-up. Unfortunately, it's too late. I've pushed it already: https://hg.openjdk.java.net/jdk/jdk/rev/709913d8ace9 Your comment leaves me confused, though. VMProps.java (which is being used for all @requires foo extensions) does System.getProperty() calls too but that's not a problem? If so why? 
Is there a way to reproduce this? Once I've got a reproducer I'd be happy to fix it. FWIW, jdk/submit came back green before this was pushed. Thanks, Severin From Alan.Bateman at oracle.com Fri Jul 19 11:48:18 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 19 Jul 2019 12:48:18 +0100 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <514c5818de749950d15da01adcae8fec720c726c.camel@redhat.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <2bd4e000-6008-a0f0-d17e-aeaf5336569b@oracle.com> <514c5818de749950d15da01adcae8fec720c726c.camel@redhat.com> Message-ID: <675ce319-00cc-099b-a287-9c3cb97a8653@oracle.com> On 19/07/2019 12:24, Severin Gehwolf wrote: > : > Thanks for the heads-up. > > Unfortunately, it's too late. I've pushed it already: > https://hg.openjdk.java.net/jdk/jdk/rev/709913d8ace9 > > Your comment leaves me confused, though. VMProps.java (which is being > used for all @requires foo extensions) does System.getProperty() calls > too but that's not a problem? If so why? Is there a way to reproduce > this? Once I've got a reproducer I'd be happy to fix it. > > FWIW, jdk/submit came back green before this was pushed. > I didn't realize this was pushed (I'm just catching up after being out > for a few days). I just checked the CI system in Oracle and it looks > like jdk/net/Sockets/Test.java is failing because of this. I just > checked it locally and it duplicates readily, so it should help with the > follow-up issue. There may be more but can't confirm that until all > tests have run. -Alan. 
From sgehwolf at redhat.com Fri Jul 19 11:56:28 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 19 Jul 2019 13:56:28 +0200 Subject: RFR: 8227642: [TESTBUG] Make docker tests podman compatible In-Reply-To: <675ce319-00cc-099b-a287-9c3cb97a8653@oracle.com> References: <32c8a1934bf07e4c9c6a961e60dcb7abd9931fe1.camel@redhat.com> <2bd4e000-6008-a0f0-d17e-aeaf5336569b@oracle.com> <514c5818de749950d15da01adcae8fec720c726c.camel@redhat.com> <675ce319-00cc-099b-a287-9c3cb97a8653@oracle.com> Message-ID: <3b9638b4321f6460a5b63d72247942c86d3b464a.camel@redhat.com> On Fri, 2019-07-19 at 12:48 +0100, Alan Bateman wrote: > On 19/07/2019 12:24, Severin Gehwolf wrote: > > : > > Thanks for the heads-up. > > > > Unfortunately, it's too late. I've pushed it already: > > https://hg.openjdk.java.net/jdk/jdk/rev/709913d8ace9 > > > > Your comment leaves me confused, though. VMProps.java (which is being > > used for all @requires foo extensions) does System.getProperty() calls > > too but that's not a problem? If so why? Is there a way to reproduce > > this? Once I've got a reproducer I'd be happy to fix it. > > > > FWIW, jdk/submit came back green before this was pushed. > > > I didn't realize this was pushed (I'm just catching up after being out > for a few days). I just checked the CI system in Oracle and it looks > like jdk/net/Sockets/Test.java is failing because of this. I just > checked it locally and it duplicate readily so should help with the > follow-up issue. There may be more but can't confirm that until all > tests have run. Thanks, reproduced. I'll have a look. 
Cheers, Severin From boris.ulasevich at bell-sw.com Fri Jul 19 17:09:41 2019 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Fri, 19 Jul 2019 20:09:41 +0300 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> Message-ID: Hi Ty, If I understood you correctly, you are looking for a client VM because it is not so greedy for memory, and you are asking for this feature (a client VM) to be implemented (maybe I misunderstood something?). But VM variants are not a feature to implement - they are already implemented and work well on many platforms. You just need to build OpenJDK yourself or find a JVM vendor who provides JDK binaries with support for client/server variants on your target platform. I know at least one vendor, I bet Liberica JDK from BellSoft should fit your request :) regards, Boris 19.07.2019 6:28, David Holmes wrote: > Hi Ty, > > I'm moving this discussion to hotspot-dev as it's more appropriate. > > On 19/07/2019 12:46 pm, Ty Young wrote: >> Hi, >> >> >> I'm requesting that the long unimplemented "client" java switch be >> implemented in Java 14. > > Background: the client VM is historically only supported on 32-bit > platforms explicitly, so the memory issues you are seeing are a > combination of factors based on the ergonomic selections made by the > VM during startup. The "client VM" is predominantly a 32-bit JVM that > only supports the C1 JIT-compiler. The "server VM" in contrast > supports the C2 JIT-compiler. For a while now this distinction has > blurred because the JIT uses tiered-compilation so that it starts by > acting similar to the C1 compiler (for faster startup) and progresses > into a mode that acts like C2 (for throughput optimisation). Though > there are flags you can set to get it to act just like C1 or just like > C2. > > Whether a machine is considered "server class" only partially relates > to this. 
The startup ergonomics for a "server class" machine will > configure subsystems to use more memory than a "non-server class" > machine. Again these days (and for a while) we do not use this > classification when starting the JVM. Various ergonomic selections are > made based on the default settings for a range of components (mainly > GC and JIT) together with the characteristics of the actual runtime > environment (available memory and processors etc). > > The JVM is highly tunable in this regard, but of course it needs to > have a reasonable out-of-the-box configuration - and that has evolved > over the years, but is, at least for 64-bit systems, skewed towards > server-style systems. So we cannot please everybody with the > out-of-box default configuration. It's been suggested in the past that > perhaps we should support a number of different initial configurations > to make it easy(er) to adapt to specific user requirements, but this > quickly breaks down as you can't get consensus on what those settings > should be, and anyone who really cares will do their own tuning anyway. > > I can't go through your email point by point in detail sorry. Perhaps > others can focus on specific memory issues. In particular if JavaFX is > a source of problems then that will need to be discussed with the > JavaFX folk. > > A very strong "business case" would need to be made for the community > to look at supporting something like "-client" in the current OpenJDK. > > Cheers, > David > >> >> (Note: this entire request is based on the assumption that a JVM with >> -client is equivalent to a client JVM variant. If this is wrong, I >> apologies. There isn't much documentation to go on.) >> >> >> Since there aren't many google results or any kind of mention of this >> feature/ability even existing, i'll give an explanation to the best >> of my knowledge and personal observations: >> >> >> A "client" JVM variant is geared towards graphical end-user >> applications. 
According to a URL link found in the man entry for >> java[1] this supposedly results in faster startups. While this *may* >> be true, a much larger and more important benefit is a massive >> committed memory reduction in the range of about 25% to 50% when >> running a JavaFX application. At minimum with similar heap sizes, >> that is a 75 MB memory savings at 300MB (a somewhat typical peak >> usage with JavaFX applications) with a typical server JVM. That's huge. >> >> >> The downside to this however is that at most, the maximum amount of >> (committed?) memory that a client JVM variant can use is somewhere >> around 300MB by default. For the intended purpose of the client JVM >> switch/variant this is *probably* fine. Server JVM variants only >> seem to allocate more memory to boost performance, which really >> isn't that much of a difference with the intended use case of the >> client JVM switch/variant, especially considering the more appealing >> memory savings. >> >> >> So why should this be implemented? >> >> >> The answer is simple: using more memory than is necessary is bad, >> angers users, and frustrates developers who want to be responsible by >> not wanting to eat up their users' memory[2] when it isn't needed. >> >> >> Even if you have never heard anyone complain about Java's memory >> usage, you've most likely heard someone complain about a similar >> cross-platform software: Electron. People hate Electron applications >> for their absurd memory usage and will actively avoid them by using >> alternatives if possible. >> >> >> For reference, Etcher, an Electron application that allows users to >> easily create bootable USB drives on Windows, Linux, and probably Mac >> OS, uses around 298 MB just at launch on Linux. Electron is >> comparable in both goals (cross-platform solutions, JavaFX vs. >> Electron) and in memory usage. 
>> >> >> Java may not be a native language and there may be *some* unavoidable >> penalty for that but being wasteful and consuming resources where not >> necessary is, well, unnecessary. This can help reduce the amount of >> memory a java application uses significantly when used. >> >> >> With that all said, since JEPs include risks/impact/problems, it's >> best to mention some that come to mind: >> >> >> Because of the default lower memory limit, applications which go >> beyond this will fail. The easiest and best workaround would be to >> simply make the client JVM switch/variant opt-in. This would allow >> all existing Java applications to continue to work as expected. >> >> >> The only other issue that I can think of is people launching >> applications with -client without knowing the limitations of it and >> filing bogus bug reports to app developers. This can be mitigated >> with better documentation and awareness in places like the man page >> for Java. Since no one seems to really have used or knew about it >> before it's more likely end developers that will be passing the >> switch to their applications via scripts then end users will be. >> >> >> All in all, this is pretty safe as long as server JVM switch/variant >> remains the default. Maybe others can think of other >> risks/impacts/problems. >> >> >> And finally addressing the two questions/comments I imagine someone >> at some point are going to ask/say: >> >> >> Why not just compile a client JVM variant from source and use jLink? >> >> >> and/or >> >> >> If heap and garbage collection is healthy, who cares? >> >> >> For the first one, yes, this is a route that could be taken. It has >> some problems however, namely: >> >> >> - You have to be the developer or have source code access to use jLink. >> >> >> - jLink -from my understanding- requires a **fully** modular Java >> application. Some used libraries may not be modular yet. 
>> >> >> - A full JDK source code compile is required - something that is >> really easy to do under Linux but might not be under Windows and >> takes considerable CPU power to do. No one that I'm aware of (on >> Linux anyway) provides client JVM variant builds. Presumably this is >> because the server JVM variant is the most versatile. >> >> >> and as for the second: just because there is, say, 5.8GB out of 8GB >> available doesn't mean you should or have the right to use it as you >> see fit. People do more than use Java applications. If you are >> running a web browser with lots of tabs open, a Java application >> could realistically cause major system stuttering as memory is moved >> to swap/pagefile. While I used 300MB above as an easy realistic >> example, I've seen JavaFX applications consume as much as 700MB and >> even 1GB committed memory. Just opening Scene Builder and playing >> around with the GUI consumes 400MB easily on a server JVM >> variant (Oracle JDK/JRE 10 to be exact). While memory usage may never >> be as good as native, the current amount of memory being consumed is >> insane and any normal user with a standard amount of memory (6-8GB) >> *will* feel this. Adding this switch could potentially help a lot >> here and give Java a slight edge over similar software solutions. >> >> >> Can this feature please be implemented? Likewise, could the >> documentation on what a "client" JVM and other JVM variants are be >> updated and improved? 
>> >> >> [1] >> https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html >> >> >> [2] >> https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs >> >> From vladimir.kozlov at oracle.com Fri Jul 19 17:25:03 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 19 Jul 2019 10:25:03 -0700 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> Message-ID: <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com> Ty, you can try -XX:+NeverActAsServerClassMachine flag which sets configuration similar to old Client VM (C1 JIT + SerialGC): http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/compiler/compilerDefinitions.cpp#l116 http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/gc/shared/gcConfig.cpp#l109 Vladimir On 7/18/19 8:28 PM, David Holmes wrote: > Hi Ty, > > I'm moving this discussion to hotspot-dev as it's more appropriate. > > On 19/07/2019 12:46 pm, Ty Young wrote: >> Hi, >> >> >> I'm requesting that the long unimplemented "client" java switch be implemented in Java 14. > > Background: the client VM is historically only supported on 32-bit platforms explicitly, so the memory issues you are > seeing are a combination of factors based on the ergonomic selections made by the VM during startup. The "client VM" is > predominantly a 32-bit JVM that only supports the C1 JIT-compiler. The "server VM" in contrast supports the C2 > JIT-compiler. For a while now this distinction has blurred because the JIT uses tiered-compilation so that it starts by > acting similar to the C1 compiler (for faster startup) and progresses into a mode that acts like C2 (for throughput > optimisation). Though there are flags you can set to get it to act just like C1 or just like C2. > > Whether a machine is considered "server class" only partially relates to this. 
The startup ergonomics for a "server > class" machine will configure subsystems to use more memory than a "non-server class" machine. Again these days (and for > a while) we do not use this classification when starting the JVM. Various ergonomic selections are made based on the > default settings for a range of components (mainly GC and JIT) together with the characteristics of the actual runtime > environment (available memory and processors etc). > > The JVM is highly tunable in this regard, but of course it needs to have a reasonable out-of-the-box configuration - and > that has evolved over the years, but is, at least for 64-bit systems, skewed towards server-style systems. So we cannot > please everybody with the out-of-box default configuration. It's been suggested in the past that perhaps we should > support a number of different initial configurations to make it easy(er) to adapt to specific user requirements, but > this quickly breaks down as you can't get consensus on what those settings should be, and anyone who really cares will > do their own tuning anyway. > > I can't go through your email point by point in detail sorry. Perhaps others can focus on specific memory issues. In > particular if JavaFX is a source of problems then that will need to be discussed with the JavaFX folk. > > A very strong "business case" would need to be made for the community to look at supporting something like "-client" in > the current OpenJDK. > > Cheers, > David > >> >> (Note: this entire request is based on the assumption that a JVM with -client is equivalent to a client JVM variant. >> If this is wrong, I apologies. There isn't much documentation to go on.) >> >> >> Since there aren't many google results or any kind of mention of this feature/ability even existing, i'll give an >> explanation to the best of my knowledge and personal observations: >> >> >> A "client" JVM variant is geared towards graphical end-user applications. 
According to a URL link found in the man >> entry for java[1] this supposedly results in faster startups. While this *may* be true, a much larger and more >> important benefit is a massive committed memory reduction in the range of about 25% to 50% when running a JavaFX >> application. At minimum with similar heap sizes, that is a 75 MB memory savings at 300MB (a somewhat typical peak >> usage with JavaFX applications) with a typical server JVM. That's huge. >> >> >> The downside to this however is that at most, the maximum amount of (committed?) memory that a client JVM variant can >> use is somewhere around 300MB by default. For the intended purpose of the client JVM switch/variant this is *probably* >> fine. Server JVM variants only seems to allocate more memory to boost performance, which really isn?t that much of a >> difference with the intended use case of the client JVM switch/variant? especially considering the more appealing >> memory savings. >> >> >> So why should this be implemented? >> >> >> The answer is simple: using more memory then is necessary is bad, angers users, and frustrates developers who want to >> be responsible by not wanting to eat up their users's memory[2] when it isn't needed. >> >> >> Even if you've have never heard anyone complain about Java's memory usage, you've most likely heard someone complain >> about a similar cross-platform software: Electron. People hate Electron applications for their absurd memory usage and >> will actively avoid them by using alternatives if possible. >> >> >> For reference, Etcher, an Electron application that allows users to easily create bootable USB drives on Windows, >> Linux, and probably Mac OS uses around 298 MB just at launch on Linux. Electron is both comparable in both >> goals(cross-platform solutions, JavaFX vs. Electron) and in memory usage. 
>> >> >> Java may not be a native language and there may be *some* unavoidable penalty for that but being wasteful and >> consuming resources where not necessary is, well, unnecessary. This can help reduce the amount of memory a java >> application uses significantly when used. >> >> >> With that all said, since JEPs include risks/impact/problems, it's best to mention some that come to mind: >> >> >> Because of the default lower memory limit, applications which go beyond this will fail. The easiest and best >> workaround would be to simply make the client JVM switch/variant opt-in. This would allow all existing Java >> applications to continue to work as expected. >> >> >> The only other issue that I can think of is people launching applications with -client without knowing the limitations >> of it and filing bogus bug reports to app developers. This can be mitigated with better documentation and awareness in >> places like the man page for Java. Since no one seems to really have used or knew about it before it's more likely end >> developers that will be passing the switch to their applications via scripts then end users will be. >> >> >> All in all, this is pretty safe as long as server JVM switch/variant remains the default. Maybe others can think of >> other risks/impacts/problems. >> >> >> And finally addressing the two questions/comments I imagine someone at some point are going to ask/say: >> >> >> Why not just compile a client JVM variant from source and use jLink? >> >> >> and/or >> >> >> If heap and garbage collection is healthy, who cares? >> >> >> For the first one, yes, this is a route that could be taken. It has some problems however, namely: >> >> >> - You have to be the developer or have source code access to use jLink. >> >> >> - jLink -from my understanding- requires a **fully** modular Java application. Some used libraries may not be modular >> yet. 
>> >> >> - A full JDK source code compile is required - something that is really easy to do under Linux but might not be under >> Windows and takes considerable CPU power to do. No one that I?m aware of (on Linux anyway) provides client JVM variant >> builds. Presumably This is because the server JVM variant is the most versatile. >> >> >> and as for the second: just because there is say, 5.8GB out of 8GB available doesn't mean you should or have the right >> to use it as you see fit. People do more than use Java applications. If you are running a web browser with lots of >> tabs open, a Java application could realistically cause major system stuttering as memory is moved to swap/pagefile. >> While I used 300MB above as an easy realistic example, i've seen JavaFX applications consume as much as 700MB and even >> 1GB committed memory. Just opening Scene Builder and playing around with the GUI consumes 400MB easily on a server JVM >> variant(Oracle JDK/JRE 10 to be exact). While memory usage may never be as good as native, the current amount of >> memory being consumed is insane and any normal user with standard amount of memory(6-8GB) *will* feel this. Adding >> this switch could potentially help a lot here and give Java a slight edge over similar software solutions. >> >> >> Can this feature please be implemented? Likewise, could the documentation on what a "client" JVM and other JVM >> variants be updated and improved? 
>> >> >> [1] https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html >> >> >> [2] https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs >> From youngty1997 at gmail.com Fri Jul 19 21:35:50 2019 From: youngty1997 at gmail.com (Ty Young) Date: Fri, 19 Jul 2019 16:35:50 -0500 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com> References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com> Message-ID: Yes that works nicely. Are there any other hotspot switches to reduce memory usage or is that it? On 7/19/19 12:25 PM, Vladimir Kozlov wrote: > Ty, you can try -XX:+NeverActAsServerClassMachine flag which sets > configuration similar to old Client VM (C1 JIT + SerialGC): > > http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/compiler/compilerDefinitions.cpp#l116 > > http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/gc/shared/gcConfig.cpp#l109 > > > Vladimir > > On 7/18/19 8:28 PM, David Holmes wrote: >> Hi Ty, >> >> I'm moving this discussion to hotspot-dev as it's more appropriate. >> >> On 19/07/2019 12:46 pm, Ty Young wrote: >>> Hi, >>> >>> >>> I'm requesting that the long unimplemented "client" java switch be >>> implemented in Java 14. >> >> Background: the client VM is historically only supported on 32-bit >> platforms explicitly, so the memory issues you are seeing are a >> combination of factors based on the ergonomic selections made by the >> VM during startup. The "client VM" is predominantly a 32-bit JVM that >> only supports the C1 JIT-compiler. The "server VM" in contrast >> supports the C2 JIT-compiler. 
From youngty1997 at gmail.com  Fri Jul 19 21:39:13 2019
From: youngty1997 at gmail.com (Ty Young)
Date: Fri, 19 Jul 2019 16:39:13 -0500
Subject: Please implement client switch in 64-bit server JDK 14 builds
In-Reply-To: 
References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com>
Message-ID: 

Compiling from source is what I've been doing; however, it takes CPU power
and time. It would be nice if this was a standard JDK feature on the more
common JDK builds (server).

On 7/19/19 12:09 PM, Boris Ulasevich wrote:
> Hi Ty,
>
> If I understood you correctly, you are looking for a client VM as it is
> not so greedy for memory. And you are asking for this feature (client
> VM) to be implemented (maybe I misunderstand something?). But VM
> variants are not a feature to implement - they are already implemented
> and work well on many platforms. You just need to build OpenJDK yourself
> or find a JVM vendor who provides JDK binaries with support of
> client/server variants for your target platform. I know at least one
> vendor, I bet Liberica JDK from BellSoft should fit your request :)
>
> regards,
> Boris
>
> 19.07.2019 6:28, David Holmes wrote:
>> Hi Ty,
>>
>> I'm moving this discussion to hotspot-dev as it's more appropriate.
>>
>> On 19/07/2019 12:46 pm, Ty Young wrote:
>>> Hi,
>>>
>>> I'm requesting that the long unimplemented "client" java switch be
>>> implemented in Java 14.
>>
>> Background: the client VM is historically only supported on 32-bit
>> platforms explicitly, so the memory issues you are seeing are a
>> combination of factors based on the ergonomic selections made by the
>> VM during startup.
The "client VM" is predominantly a 32-bit JVM that >> only supports the C1 JIT-compiler. The "server VM" in contrast >> supports the C2 JIT-compiler. For a while now this distinction has >> blurred because the JIT uses tiered-compilation so that it starts by >> acting similar to the C1 compiler (for faster startup) and progresses >> into a mode that acts like C2 (for throughput optimisation). Though >> there are flags you can set to get it to act just like C1 or just >> like C2. >> >> Whether a machine is considered "server class" only partially relates >> to this. The startup ergonomics for a "server class" machine will >> configure subsystems to use more memory than a "non-server class" >> machine. Again these days (and for a while) we do not use this >> classification when starting the JVM. Various ergonomic selections >> are made based on the default settings for a range of components >> (mainly GC and JIT) together with the characteristics of the actual >> runtime environment (available memory and processors etc). >> >> The JVM is highly tunable in this regard, but of course it needs to >> have a reasonable out-of-the-box configuration - and that has evolved >> over the years, but is, at least for 64-bit systems, skewed towards >> server-style systems. So we cannot please everybody with the >> out-of-box default configuration. It's been suggested in the past >> that perhaps we should support a number of different initial >> configurations to make it easy(er) to adapt to specific user >> requirements, but this quickly breaks down as you can't get consensus >> on what those settings should be, and anyone who really cares will do >> their own tuning anyway. >> >> I can't go through your email point by point in detail sorry. Perhaps >> others can focus on specific memory issues. In particular if JavaFX >> is a source of problems then that will need to be discussed with the >> JavaFX folk. 
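David's aside that flags can make the JVM "act just like C1 or just like C2" corresponds to the real HotSpot options -XX:TieredStopAtLevel=1 (stop compilation at the C1 tier) and -XX:-TieredCompilation (C2 only). A minimal probe, with a class name of our choosing, prints what the runtime reports about its VM and JIT (the exact strings vary by JDK build), so the effect of such flags can be inspected:

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Sketch: run under different launch flags, e.g.
//   java JitProbe
//   java -XX:TieredStopAtLevel=1 JitProbe    (C1-like behaviour)
//   java -XX:-TieredCompilation JitProbe     (C2-only behaviour)
public class JitProbe {
    public static void main(String[] args) {
        System.out.println("vm:  " + System.getProperty("java.vm.name"));
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        // getCompilationMXBean() returns null when no JIT is present (-Xint).
        System.out.println("jit: " + (jit == null ? "none" : jit.getName()));
    }
}
```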
>>
>> A very strong "business case" would need to be made for the community
>> to look at supporting something like "-client" in the current OpenJDK.
>>
>> Cheers,
>> David
>>
>>>
>>> (Note: this entire request is based on the assumption that a JVM
>>> with -client is equivalent to a client JVM variant. If this is
>>> wrong, I apologize. There isn't much documentation to go on.)
>>>
>>>
>>> Since there aren't many google results or any kind of mention of
>>> this feature/ability even existing, I'll give an explanation to the
>>> best of my knowledge and personal observations:
>>>
>>>
>>> A "client" JVM variant is geared towards graphical end-user
>>> applications. According to a URL link found in the man entry for
>>> java[1] this supposedly results in faster startups. While this *may*
>>> be true, a much larger and more important benefit is a massive
>>> committed memory reduction in the range of about 25% to 50% when
>>> running a JavaFX application. At minimum with similar heap sizes,
>>> that is a 75 MB memory savings at 300MB (a somewhat typical peak
>>> usage with JavaFX applications) with a typical server JVM. That's huge.
>>>
>>>
>>> The downside to this however is that at most, the maximum amount of
>>> (committed?) memory that a client JVM variant can use is somewhere
>>> around 300MB by default. For the intended purpose of the client JVM
>>> switch/variant this is *probably* fine. Server JVM variants only
>>> seem to allocate more memory to boost performance, which really
>>> isn't that much of a difference with the intended use case of the
>>> client JVM switch/variant, especially considering the more appealing
>>> memory savings.
>>>
>>>
>>> So why should this be implemented?
>>>
>>>
>>> The answer is simple: using more memory than is necessary is bad,
>>> angers users, and frustrates developers who want to be responsible
>>> by not wanting to eat up their users' memory[2] when it isn't needed.
>>>
>>>
>>> Even if you've never heard anyone complain about Java's memory
>>> usage, you've most likely heard someone complain about similar
>>> cross-platform software: Electron. People hate Electron applications
>>> for their absurd memory usage and will actively avoid them by using
>>> alternatives if possible.
>>>
>>>
>>> For reference, Etcher, an Electron application that allows users to
>>> easily create bootable USB drives on Windows, Linux, and probably
>>> Mac OS, uses around 298 MB just at launch on Linux. Electron is
>>> comparable both in goals (cross-platform solutions, JavaFX vs.
>>> Electron) and in memory usage.
>>>
>>>
>>> Java may not be a native language and there may be *some*
>>> unavoidable penalty for that, but being wasteful and consuming
>>> resources where not necessary is, well, unnecessary. This can help
>>> reduce the amount of memory a Java application uses significantly
>>> when used.
>>>
>>>
>>> With that all said, since JEPs include risks/impact/problems, it's
>>> best to mention some that come to mind:
>>>
>>>
>>> Because of the default lower memory limit, applications which go
>>> beyond this will fail. The easiest and best workaround would be to
>>> simply make the client JVM switch/variant opt-in. This would allow
>>> all existing Java applications to continue to work as expected.
>>>
>>>
>>> The only other issue that I can think of is people launching
>>> applications with -client without knowing its limitations and
>>> filing bogus bug reports to app developers. This can be mitigated
>>> with better documentation and awareness in places like the man page
>>> for Java. Since no one seems to really have used or known about it
>>> before, it's more likely end developers will be passing the switch
>>> to their applications via scripts than end users.
>>>
>>>
>>> All in all, this is pretty safe as long as the server JVM
>>> switch/variant remains the default.
Maybe others can think of other
>>> risks/impacts/problems.
>>>
>>>
>>> And finally, addressing the two questions/comments I imagine someone
>>> at some point is going to ask/say:
>>>
>>>
>>> Why not just compile a client JVM variant from source and use jLink?
>>>
>>>
>>> and/or
>>>
>>>
>>> If heap and garbage collection is healthy, who cares?
>>>
>>>
>>> For the first one, yes, this is a route that could be taken. It has
>>> some problems however, namely:
>>>
>>>
>>> - You have to be the developer or have source code access to use jLink.
>>>
>>>
>>> - jLink, from my understanding, requires a **fully** modular Java
>>> application. Some used libraries may not be modular yet.
>>>
>>>
>>> - A full JDK source code compile is required - something that is
>>> really easy to do under Linux but might not be under Windows and
>>> takes considerable CPU power to do. No one that I'm aware of (on
>>> Linux anyway) provides client JVM variant builds. Presumably this is
>>> because the server JVM variant is the most versatile.
>>>
>>>
>>> and as for the second: just because there is, say, 5.8GB out of 8GB
>>> available doesn't mean you should or have the right to use it as you
>>> see fit. People do more than use Java applications. If you are
>>> running a web browser with lots of tabs open, a Java application
>>> could realistically cause major system stuttering as memory is moved
>>> to swap/pagefile. While I used 300MB above as an easy realistic
>>> example, I've seen JavaFX applications consume as much as 700MB and
>>> even 1GB committed memory. Just opening Scene Builder and playing
>>> around with the GUI consumes 400MB easily on a server JVM
>>> variant (Oracle JDK/JRE 10, to be exact). While memory usage may
>>> never be as good as native, the current amount of memory being
>>> consumed is insane and any normal user with a standard amount of
>>> memory (6-8GB) *will* feel this.
Adding this switch could potentially help a lot
>>> here and give Java a slight edge over similar software solutions.
>>>
>>>
>>> Can this feature please be implemented? Likewise, could the
>>> documentation on what a "client" JVM and other JVM variants be
>>> updated and improved?
>>>
>>>
>>> [1]
>>> https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html
>>>
>>>
>>> [2]
>>> https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs
>>>

From vladimir.kozlov at oracle.com  Fri Jul 19 21:56:05 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 19 Jul 2019 14:56:05 -0700
Subject: Please implement client switch in 64-bit server JDK 14 builds
In-Reply-To: 
References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com>
 <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com>
Message-ID: <2ecca81e-a4d5-06b4-49f3-4e83c353979d@oracle.com>

It does not reduce the static size of the JVM itself - it is still the
Server VM. You have to rebuild it as others suggested if you want to reduce
its size.

The default memory allocated during execution is reduced with this flag, as
you can see in my first link: ReservedCodeCacheSize (compiled code) and
MetaspaceSize (class metadata). The default Java heap size is also reduced
but, as I understand from your previous comments, you want to use your own
-Xmx value. These three values are the main settings which control memory
usage by the VM.

I would suggest to do experiments and see.

Vladimir

On 7/19/19 2:35 PM, Ty Young wrote:
> Yes that works nicely. Are there any other hotspot switches to reduce
> memory usage or is that it?
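Vladimir's "do experiments and see" can be made concrete: the three knobs he names (-Xmx, -XX:ReservedCodeCacheSize, -XX:MetaspaceSize) all surface through the standard java.lang.management API. A sketch (the class name is ours; the flags and MXBeans are standard HotSpot/JDK features):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Sketch: compare e.g.
//   java ErgoProbe
//   java -XX:+NeverActAsServerClassMachine ErgoProbe
// and watch the heap maximum and the non-heap pool sizes change.
public class ErgoProbe {
    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.printf("max heap: %d MB%n",
                Runtime.getRuntime().maxMemory() / mb);
        // Non-heap pools include the code cache (ReservedCodeCacheSize)
        // and metaspace (MetaspaceSize / MaxMetaspaceSize).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getUsage() == null) continue;  // pool may be invalid
            System.out.printf("%-40s committed %4d MB%n",
                    pool.getName(), pool.getUsage().getCommitted() / mb);
        }
    }
}
```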
>
> On 7/19/19 12:25 PM, Vladimir Kozlov wrote:
>> Ty, you can try -XX:+NeverActAsServerClassMachine flag which sets
>> configuration similar to old Client VM (C1 JIT + SerialGC):
>>
>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/compiler/compilerDefinitions.cpp#l116
>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/gc/shared/gcConfig.cpp#l109
>>
>> Vladimir

From youngty1997 at gmail.com  Sat Jul 20 17:53:16 2019
From: youngty1997 at gmail.com (Ty Young)
Date: Sat, 20 Jul 2019 12:53:16 -0500
Subject: Please implement client switch in 64-bit server JDK 14 builds
In-Reply-To: <2ecca81e-a4d5-06b4-49f3-4e83c353979d@oracle.com>
References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com>
 <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com>
 <2ecca81e-a4d5-06b4-49f3-4e83c353979d@oracle.com>
Message-ID: <84157ea2-f645-0652-7ccb-aba41620135e@gmail.com>

Can you clarify on what you mean by "static"?
For clarification, what seems to be happening (according to NetBeans'
heap monitor) to cause this massive spike in committed memory usage is an
explosion of objects created by JavaFX when resizing the window and
switching content, which in turn causes about 8-10 GC runs in the span of a
few seconds. Eventually the heap goes back to about normal, but Java never
lets go of the memory it needlessly allocated besides a few MBs. When the
application first launches and all content is visible at least once, 150MB
seems to be the norm on a client JVM, which is still way higher than it
should be but is way lower than the 280MB high I've seen even with the
client JVM. Committed memory size is easily about 6x what the heap size is.

Yes, the -XX:+NeverActAsServerClassMachine switch/flag helps a lot and
seems to regain most if not all of the memory savings of a client JVM, but
now I'm wondering/asking what other hidden JVM switches/flags exist that
may help here. Again, the documentation of these switches/flags is very
poor. Even if you find a website that mentions them, the chances of those
websites providing meaningful documentation on what they do is basically
zero.

No, I'm not all that interested in setting a custom max heap size. I don't
even think it would be safe to do so given JavaFX's explosive memory
allocation, nor would it be super beneficial as the heap size is already
under 75MB used (max might be like 100MB or something) and healthy. The
problem isn't really heap AFAIK, it's the 6x committed memory Java is
allocating for no good reason and never really letting go of. A client JVM
allocates about 25% to 50% of what a server JVM does, which helps, but the
underlying problem is that Java just isn't letting go of memory it doesn't
really need.

On 7/19/19 4:56 PM, Vladimir Kozlov wrote:
> It does not reduce static size of JVM itself - it is still Server VM.
> You have to rebuild it as others suggested if you want to reduce its size.
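The "never lets go" behaviour described here is governed in part by the real HotSpot options -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio, which tell the collector how much committed-but-unused heap to keep around after a spike (whether the memory is actually returned to the OS is GC-dependent). The committed-versus-used gap can be watched from inside the application; a sketch, with a class name of our choosing:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Sketch: allocate a burst of garbage, collect, and see how much heap
// stays committed. Try e.g. -XX:MaxHeapFreeRatio=20 and compare runs.
public class CommittedVsUsed {
    static void report(String label) {
        MemoryUsage heap =
                ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024 * 1024;
        System.out.printf("%-12s used %4d MB, committed %4d MB%n",
                label, heap.getUsed() / mb, heap.getCommitted() / mb);
    }

    public static void main(String[] args) {
        report("start");
        byte[][] burst = new byte[64][];           // ~64 MB of short-lived data
        for (int i = 0; i < burst.length; i++) {
            burst[i] = new byte[1024 * 1024];
        }
        report("after burst");
        burst = null;                              // make the burst unreachable
        System.gc();                               // request (not force) a collection
        report("after gc");
    }
}
```

Whether "committed" shrinks after the collection depends on the GC in use and the free-ratio settings, which is exactly the experiment being asked about.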
> > The default memory allocated during execution is reduced with this > flag as you can see in my first link. Such as ReservedCodeCacheSize > (compiled code), MetaspaceSize (classes metadata). Default Java heap > size is also reduced but, as I understand from your previous comments, > you want to use your own -Xmx value. These 3 values are main settings > which control memory usage by VM. > > I would suggest to do experiments and see. > > Vladimir > > > On 7/19/19 2:35 PM, Ty Young wrote: >> Yes that works nicely. Are there any other hotspot switches to reduce >> memory usage or is that it? >> >> >> On 7/19/19 12:25 PM, Vladimir Kozlov wrote: >>> Ty, you can try -XX:+NeverActAsServerClassMachine flag which sets >>> configuration similar to old Client VM (C1 JIT + SerialGC): >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/compiler/compilerDefinitions.cpp#l116 >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/gc/shared/gcConfig.cpp#l109 >>> >>> >>> Vladimir >>> >>> On 7/18/19 8:28 PM, David Holmes wrote: >>>> Hi Ty, >>>> >>>> I'm moving this discussion to hotspot-dev as it's more appropriate. >>>> >>>> On 19/07/2019 12:46 pm, Ty Young wrote: >>>>> Hi, >>>>> >>>>> >>>>> I'm requesting that the long unimplemented "client" java switch be >>>>> implemented in Java 14. >>>> >>>> Background: the client VM is historically only supported on 32-bit >>>> platforms explicitly, so the memory issues you are seeing are a >>>> combination of factors based on the ergonomic selections made by >>>> the VM during startup. The "client VM" is predominantly a 32-bit >>>> JVM that only supports the C1 JIT-compiler. The "server VM" in >>>> contrast supports the C2 JIT-compiler. For a while now this >>>> distinction has blurred because the JIT uses tiered-compilation so >>>> that it starts by acting similar to the C1 compiler (for faster >>>> startup) and progresses into a mode that acts like C2 (for >>>> throughput optimisation). 
Though there are flags you can set to get >>>> it to act just like C1 or just like C2. >>>> >>>> Whether a machine is considered "server class" only partially >>>> relates to this. The startup ergonomics for a "server class" >>>> machine will configure subsystems to use more memory than a >>>> "non-server class" machine. Again these days (and for a while) we >>>> do not use this classification when starting the JVM. Various >>>> ergonomic selections are made based on the default settings for a >>>> range of components (mainly GC and JIT) together with the >>>> characteristics of the actual runtime environment (available memory >>>> and processors etc). >>>> >>>> The JVM is highly tunable in this regard, but of course it needs to >>>> have a reasonable out-of-the-box configuration - and that has >>>> evolved over the years, but is, at least for 64-bit systems, skewed >>>> towards server-style systems. So we cannot please everybody with >>>> the out-of-box default configuration. It's been suggested in the >>>> past that perhaps we should support a number of different initial >>>> configurations to make it easy(er) to adapt to specific user >>>> requirements, but this quickly breaks down as you can't get >>>> consensus on what those settings should be, and anyone who really >>>> cares will do their own tuning anyway. >>>> >>>> I can't go through your email point by point in detail sorry. >>>> Perhaps others can focus on specific memory issues. In particular >>>> if JavaFX is a source of problems then that will need to be >>>> discussed with the JavaFX folk. >>>> >>>> A very strong "business case" would need to be made for the >>>> community to look at supporting something like "-client" in the >>>> current OpenJDK. >>>> >>>> Cheers, >>>> David >>>> >>>>> >>>>> (Note: this entire request is based on the assumption that a JVM >>>>> with -client is equivalent to a client JVM variant. If this is >>>>> wrong, I apologies. There isn't much documentation to go on.) 
>>>>> >>>>> >>>>> Since there aren't many google results or any kind of mention of >>>>> this feature/ability even existing, i'll give an explanation to >>>>> the best of my knowledge and personal observations: >>>>> >>>>> >>>>> A "client" JVM variant is geared towards graphical end-user >>>>> applications. According to a URL link found in the man entry for >>>>> java[1] this supposedly results in faster startups. While this >>>>> *may* be true, a much larger and more important benefit is a >>>>> massive committed memory reduction in the range of about 25% to >>>>> 50% when running a JavaFX application. At minimum with similar >>>>> heap sizes, that is a 75 MB memory savings at 300MB (a somewhat >>>>> typical peak usage with JavaFX applications) with a typical server >>>>> JVM. That's huge. >>>>> >>>>> >>>>> The downside to this however is that at most, the maximum amount >>>>> of (committed?) memory that a client JVM variant can use is >>>>> somewhere around 300MB by default. For the intended purpose of the >>>>> client JVM switch/variant this is *probably* fine. Server JVM >>>>> variants only seems to allocate more memory to boost performance, >>>>> which really isn?t that much of a difference with the intended use >>>>> case of the client JVM switch/variant? especially considering the >>>>> more appealing memory savings. >>>>> >>>>> >>>>> So why should this be implemented? >>>>> >>>>> >>>>> The answer is simple: using more memory then is necessary is bad, >>>>> angers users, and frustrates developers who want to be responsible >>>>> by not wanting to eat up their users's memory[2] when it isn't >>>>> needed. >>>>> >>>>> >>>>> Even if you've have never heard anyone complain about Java's >>>>> memory usage, you've most likely heard someone complain about a >>>>> similar cross-platform software: Electron. People hate Electron >>>>> applications for their absurd memory usage and will actively avoid >>>>> them by using alternatives if possible. 
>>>>> >>>>> >>>>> For reference, Etcher, an Electron application that allows users >>>>> to easily create bootable USB drives on Windows, Linux, and >>>>> probably Mac OS, uses around 298 MB just at launch on Linux. >>>>> Electron is comparable both in goals (cross-platform >>>>> solutions: JavaFX vs. Electron) and in memory usage. >>>>> >>>>> >>>>> Java may not be a native language and there may be *some* >>>>> unavoidable penalty for that, but being wasteful and consuming >>>>> resources where not necessary is, well, unnecessary. This can help >>>>> reduce the amount of memory a Java application uses significantly >>>>> when used. >>>>> >>>>> >>>>> With that all said, since JEPs include risks/impact/problems, it's >>>>> best to mention some that come to mind: >>>>> >>>>> >>>>> Because of the default lower memory limit, applications which go >>>>> beyond this will fail. The easiest and best workaround would be to >>>>> simply make the client JVM switch/variant opt-in. This would allow >>>>> all existing Java applications to continue to work as expected. >>>>> >>>>> >>>>> The only other issue that I can think of is people launching >>>>> applications with -client without knowing the limitations of it >>>>> and filing bogus bug reports to app developers. This can be >>>>> mitigated with better documentation and awareness in places like >>>>> the man page for Java. Since no one seems to really have used or >>>>> known about it before, it's more likely end developers will be >>>>> passing the switch to their applications via scripts than end >>>>> users will be. >>>>> >>>>> >>>>> All in all, this is pretty safe as long as the server JVM >>>>> switch/variant remains the default. Maybe others can think of >>>>> other risks/impacts/problems. >>>>> >>>>> >>>>> And finally, addressing the two questions/comments I imagine >>>>> someone at some point is going to ask/say: >>>>> >>>>> >>>>> Why not just compile a client JVM variant from source and use jLink?
>>>>> >>>>> >>>>> and/or >>>>> >>>>> >>>>> If heap and garbage collection is healthy, who cares? >>>>> >>>>> >>>>> For the first one, yes, this is a route that could be taken. It >>>>> has some problems however, namely: >>>>> >>>>> >>>>> - You have to be the developer or have source code access to use >>>>> jLink. >>>>> >>>>> >>>>> - jLink, from my understanding, requires a **fully** modular Java >>>>> application. Some used libraries may not be modular yet. >>>>> >>>>> >>>>> - A full JDK source code compile is required - something that is >>>>> really easy to do under Linux but might not be under Windows and >>>>> takes considerable CPU power to do. No one that I'm aware of (on >>>>> Linux anyway) provides client JVM variant builds. Presumably this >>>>> is because the server JVM variant is the most versatile. >>>>> >>>>> >>>>> and as for the second: just because there is, say, 5.8GB out of 8GB >>>>> available doesn't mean you should or have the right to use it as >>>>> you see fit. People do more than use Java applications. If you are >>>>> running a web browser with lots of tabs open, a Java application >>>>> could realistically cause major system stuttering as memory is >>>>> moved to swap/pagefile. While I used 300MB above as an easy >>>>> realistic example, I've seen JavaFX applications consume as much >>>>> as 700MB and even 1GB committed memory. Just opening Scene Builder >>>>> and playing around with the GUI consumes 400MB easily on a server >>>>> JVM variant (Oracle JDK/JRE 10, to be exact). While memory usage may >>>>> never be as good as native, the current amount of memory being >>>>> consumed is insane and any normal user with a standard amount of >>>>> memory (6-8GB) *will* feel this. Adding this switch could >>>>> potentially help a lot here and give Java a slight edge over >>>>> similar software solutions.
Likewise, could the >>>>> documentation on what a "client" JVM and other JVM variants be >>>>> updated and improved? >>>>> >>>>> >>>>> [1] >>>>> https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html >>>>> >>>>> >>>>> [2] >>>>> https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs >>>>> From youngty1997 at gmail.com Sun Jul 21 19:05:01 2019 From: youngty1997 at gmail.com (Ty Young) Date: Sun, 21 Jul 2019 14:05:01 -0500 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: <84157ea2-f645-0652-7ccb-aba41620135e@gmail.com> References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com> <2ecca81e-a4d5-06b4-49f3-4e83c353979d@oracle.com> <84157ea2-f645-0652-7ccb-aba41620135e@gmail.com> Message-ID: Never mind, I had a client VM selected as the running VM. Memory usage starts off low then goes to the same amount. Is there really nothing that can be done to take the committed memory size? On 7/20/19 12:53 PM, Ty Young wrote: > Can you clarify on what you mean by "static"? > > > For clarification, what seems to be happening(according to Netbean's > heap monitor) to cause this massive spike in committed memory usage is > an explosion of objects created by JavaFX when resizing the window and > switching content which in turn causes about 8-10 GC runs in the span > of a few seconds. Eventually the heap goes back to about normal but > Java never lets go of the memory it needlessly allocated besides a few > MBs. When the application first launches and all content is visible at > least once, 150MB seems to be the norm on a client JVM which is still > way higher than it should be but is way lower than the 280MB high I've > seen even with the client JVM. Committed memory size is easily about > 6x what the heap size is. 
> > > Yes, the -XX:+NeverActAsServerClassMachine switch/flag helps a lot and > seems to regain most if not all of the memory savings of a client JVM > but now i'm wondering/asking what other hidden JVM switches/flags > exist that may help here. Again, the documentation of these > switches/flags is very poor. Even if you find a website that mentions > them, the chances of those website providing meaningful documentation > on what they do is basically zero. > > > No, I'm not all that interested in setting a custom max heap size. I > don't even think it would be safe to do so given JavaFX explosive > memory allocation nor would it be super beneficial as the heap size is > already under 75MB used(max might be like 100MB or something) and > healthy. The problem isn't really heap AFAIK, it's the 6x committed > memory Java is allocating for no good reason and never really letting > go of it. A client JVM allocates about 25% to 50% of what a server JVM > does which helps but the underlying problem is that Java just isn't > letting go of memory it doesn't really need. > > > On 7/19/19 4:56 PM, Vladimir Kozlov wrote: >> It does not reduce static size of JVM itself - it is still Server VM. >> You have to rebuild it as other suggested if you want to reduce its >> size. >> >> The default memory allocated during execution is reduced with this >> flag as you can see in my first link. Such as ReservedCodeCacheSize >> (compiled code), MetaspaceSize (classes metadata). Default Java heap >> size is also reduced but, as I understand from your previous >> comments, you want to use your own -Xmx value. These 3 values are >> main settings which control memory usage by VM. >> >> I would suggest to do experiments and see. >> >> Vladimir >> >> >> On 7/19/19 2:35 PM, Ty Young wrote: >>> Yes that works nicely. Are there any other hotspot switches to >>> reduce memory usage or is that it? 
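For reference, the knobs Vladimir lists above (the code cache, metaspace, and heap limits, plus -XX:+NeverActAsServerClassMachine) can be combined on one command line. The sketch below is illustrative only: the jar name and every size value are assumptions for the example, not settings taken from this thread, and would need tuning per application.

```shell
# Sketch: approximating an old "-client"-style footprint on a 64-bit server JVM.
# All sizes and the jar name are illustrative placeholders.
JVM_OPTS="-XX:+NeverActAsServerClassMachine"        # C1-style tiering + Serial GC ergonomics
JVM_OPTS="$JVM_OPTS -XX:ReservedCodeCacheSize=64m"  # cap JIT-compiled code
JVM_OPTS="$JVM_OPTS -XX:MaxMetaspaceSize=96m"       # cap class metadata
JVM_OPTS="$JVM_OPTS -Xms32m -Xmx256m"               # cap the Java heap
echo java $JVM_OPTS -jar app.jar                    # echoed here rather than executed
```

Running with -XX:+PrintFlagsFinal shows the values the VM actually settled on, which is the easiest way to check what a given flag combination ended up doing.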
>>> >>> >>> On 7/19/19 12:25 PM, Vladimir Kozlov wrote: >>>> Ty, you can try -XX:+NeverActAsServerClassMachine flag which sets >>>> configuration similar to old Client VM (C1 JIT + SerialGC): >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/compiler/compilerDefinitions.cpp#l116 >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/014decdb5086/src/hotspot/share/gc/shared/gcConfig.cpp#l109 >>>> >>>> >>>> Vladimir >>>> >>>> On 7/18/19 8:28 PM, David Holmes wrote: >>>>> Hi Ty, >>>>> >>>>> I'm moving this discussion to hotspot-dev as it's more appropriate. >>>>> >>>>> On 19/07/2019 12:46 pm, Ty Young wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> I'm requesting that the long unimplemented "client" java switch >>>>>> be implemented in Java 14. >>>>> >>>>> Background: the client VM is historically only supported on 32-bit >>>>> platforms explicitly, so the memory issues you are seeing are a >>>>> combination of factors based on the ergonomic selections made by >>>>> the VM during startup. The "client VM" is predominantly a 32-bit >>>>> JVM that only supports the C1 JIT-compiler. The "server VM" in >>>>> contrast supports the C2 JIT-compiler. For a while now this >>>>> distinction has blurred because the JIT uses tiered-compilation so >>>>> that it starts by acting similar to the C1 compiler (for faster >>>>> startup) and progresses into a mode that acts like C2 (for >>>>> throughput optimisation). Though there are flags you can set to >>>>> get it to act just like C1 or just like C2. >>>>> >>>>> Whether a machine is considered "server class" only partially >>>>> relates to this. The startup ergonomics for a "server class" >>>>> machine will configure subsystems to use more memory than a >>>>> "non-server class" machine. Again these days (and for a while) we >>>>> do not use this classification when starting the JVM. 
Various >>>>> ergonomic selections are made based on the default settings for a >>>>> range of components (mainly GC and JIT) together with the >>>>> characteristics of the actual runtime environment (available >>>>> memory and processors etc). >>>>> >>>>> The JVM is highly tunable in this regard, but of course it needs >>>>> to have a reasonable out-of-the-box configuration - and that has >>>>> evolved over the years, but is, at least for 64-bit systems, >>>>> skewed towards server-style systems. So we cannot please everybody >>>>> with the out-of-box default configuration. It's been suggested in >>>>> the past that perhaps we should support a number of different >>>>> initial configurations to make it easy(er) to adapt to specific >>>>> user requirements, but this quickly breaks down as you can't get >>>>> consensus on what those settings should be, and anyone who really >>>>> cares will do their own tuning anyway. >>>>> >>>>> I can't go through your email point by point in detail sorry. >>>>> Perhaps others can focus on specific memory issues. In particular >>>>> if JavaFX is a source of problems then that will need to be >>>>> discussed with the JavaFX folk. >>>>> >>>>> A very strong "business case" would need to be made for the >>>>> community to look at supporting something like "-client" in the >>>>> current OpenJDK. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> >>>>>> (Note: this entire request is based on the assumption that a JVM >>>>>> with -client is equivalent to a client JVM variant. If this is >>>>>> wrong, I apologies. There isn't much documentation to go on.) >>>>>> >>>>>> >>>>>> Since there aren't many google results or any kind of mention of >>>>>> this feature/ability even existing, i'll give an explanation to >>>>>> the best of my knowledge and personal observations: >>>>>> >>>>>> >>>>>> A "client" JVM variant is geared towards graphical end-user >>>>>> applications. 
According to a URL link found in the man entry for >>>>>> java[1] this supposedly results in faster startups. While this >>>>>> *may* be true, a much larger and more important benefit is a >>>>>> massive committed memory reduction in the range of about 25% to >>>>>> 50% when running a JavaFX application. At minimum with similar >>>>>> heap sizes, that is a 75 MB memory savings at 300MB (a somewhat >>>>>> typical peak usage with JavaFX applications) with a typical >>>>>> server JVM. That's huge. >>>>>> >>>>>> >>>>>> The downside to this however is that at most, the maximum amount >>>>>> of (committed?) memory that a client JVM variant can use is >>>>>> somewhere around 300MB by default. For the intended purpose of >>>>>> the client JVM switch/variant this is *probably* fine. Server JVM >>>>>> variants only seem to allocate more memory to boost performance, >>>>>> which really isn't that much of a difference with the intended >>>>>> use case of the client JVM switch/variant, especially considering >>>>>> the more appealing memory savings. >>>>>> >>>>>> >>>>>> So why should this be implemented? >>>>>> >>>>>> >>>>>> The answer is simple: using more memory than is necessary is bad, >>>>>> angers users, and frustrates developers who want to be >>>>>> responsible by not wanting to eat up their users' memory[2] when >>>>>> it isn't needed. >>>>>> >>>>>> >>>>>> Even if you've never heard anyone complain about Java's >>>>>> memory usage, you've most likely heard someone complain about >>>>>> similar cross-platform software: Electron. People hate Electron >>>>>> applications for their absurd memory usage and will actively >>>>>> avoid them by using alternatives if possible. >>>>>> >>>>>> >>>>>> For reference, Etcher, an Electron application that allows users >>>>>> to easily create bootable USB drives on Windows, Linux, and >>>>>> probably Mac OS, uses around 298 MB just at launch on Linux.
>>>>>> Electron is comparable both in goals (cross-platform >>>>>> solutions: JavaFX vs. Electron) and in memory usage. >>>>>> >>>>>> >>>>>> Java may not be a native language and there may be *some* >>>>>> unavoidable penalty for that, but being wasteful and consuming >>>>>> resources where not necessary is, well, unnecessary. This can >>>>>> help reduce the amount of memory a Java application uses >>>>>> significantly when used. >>>>>> >>>>>> >>>>>> With that all said, since JEPs include risks/impact/problems, >>>>>> it's best to mention some that come to mind: >>>>>> >>>>>> >>>>>> Because of the default lower memory limit, applications which go >>>>>> beyond this will fail. The easiest and best workaround would be >>>>>> to simply make the client JVM switch/variant opt-in. This would >>>>>> allow all existing Java applications to continue to work as >>>>>> expected. >>>>>> >>>>>> >>>>>> The only other issue that I can think of is people launching >>>>>> applications with -client without knowing the limitations of it >>>>>> and filing bogus bug reports to app developers. This can be >>>>>> mitigated with better documentation and awareness in places like >>>>>> the man page for Java. Since no one seems to really have used or >>>>>> known about it before, it's more likely end developers will be >>>>>> passing the switch to their applications via scripts than end >>>>>> users will be. >>>>>> >>>>>> >>>>>> All in all, this is pretty safe as long as the server JVM >>>>>> switch/variant remains the default. Maybe others can think of >>>>>> other risks/impacts/problems. >>>>>> >>>>>> >>>>>> And finally, addressing the two questions/comments I imagine >>>>>> someone at some point is going to ask/say: >>>>>> >>>>>> >>>>>> Why not just compile a client JVM variant from source and use jLink? >>>>>> >>>>>> >>>>>> and/or >>>>>> >>>>>> >>>>>> If heap and garbage collection is healthy, who cares? >>>>>> >>>>>> >>>>>> For the first one, yes, this is a route that could be taken.
It >>>>>> has some problems however, namely: >>>>>> >>>>>> >>>>>> - You have to be the developer or have source code access to use >>>>>> jLink. >>>>>> >>>>>> >>>>>> - jLink, from my understanding, requires a **fully** modular Java >>>>>> application. Some used libraries may not be modular yet. >>>>>> >>>>>> >>>>>> - A full JDK source code compile is required - something that is >>>>>> really easy to do under Linux but might not be under Windows and >>>>>> takes considerable CPU power to do. No one that I'm aware of (on >>>>>> Linux anyway) provides client JVM variant builds. Presumably this >>>>>> is because the server JVM variant is the most versatile. >>>>>> >>>>>> >>>>>> and as for the second: just because there is, say, 5.8GB out of >>>>>> 8GB available doesn't mean you should or have the right to use it >>>>>> as you see fit. People do more than use Java applications. If you >>>>>> are running a web browser with lots of tabs open, a Java >>>>>> application could realistically cause major system stuttering as >>>>>> memory is moved to swap/pagefile. While I used 300MB above as an >>>>>> easy realistic example, I've seen JavaFX applications consume as >>>>>> much as 700MB and even 1GB committed memory. Just opening Scene >>>>>> Builder and playing around with the GUI consumes 400MB easily on >>>>>> a server JVM variant (Oracle JDK/JRE 10, to be exact). While memory >>>>>> usage may never be as good as native, the current amount of >>>>>> memory being consumed is insane and any normal user with a standard >>>>>> amount of memory (6-8GB) *will* feel this. Adding this switch >>>>>> could potentially help a lot here and give Java a slight edge >>>>>> over similar software solutions. >>>>>> >>>>>> >>>>>> Can this feature please be implemented? Likewise, could the >>>>>> documentation on "client" JVMs and other JVM variants be >>>>>> updated and improved?
>>>>>> >>>>>> >>>>>> [1] >>>>>> https://docs.oracle.com/javase/8/docs/technotes/guides/vm/server-class.html >>>>>> >>>>>> >>>>>> [2] >>>>>> https://stackoverflow.com/questions/13692206/high-java-memory-usage-even-for-small-programs >>>>>> From david.holmes at oracle.com Mon Jul 22 01:25:42 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Jul 2019 11:25:42 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: Message-ID: Hi Adam, Adding in serviceability-dev as you've made changes in that area too. Will take a closer look at the changes soon. David ----- On 18/07/2019 2:05 am, Adam Farley8 wrote: > Hey All, > > Reviewers and sponsors requested to inspect the following. > > I've re-written the code change, as discussed with David Holes in emails > last week, and now the webrev changes do this: > > - Cause the VM to shut down with a relevant error message if one or more > of the sun.boot.library.path paths is too long for the system. > - Apply similar error-producing code to the (legacy?) code in linker_md.c. > - Allow the numerical parameter for split_path to indicate anything we > plan to add to the path once split, allowing for more accurate path length > detection. > - Add an optional parameter to the os::split_path function that specifies > where the paths came from, for a better error message. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 > > New Webrev: http://cr.openjdk.java.net/~afarley/8227021.1/webrev/ > > Best Regards > > Adam Farley > IBM Runtimes > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > From david.holmes at oracle.com Mon Jul 22 02:34:37 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Jul 2019 12:34:37 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: Message-ID: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> Hi Adam, Some higher-level issues/concerns ... On 22/07/2019 11:25 am, David Holmes wrote: > Hi Adam, > > Adding in serviceability-dev as you've made changes in that area too. > > Will take a closer look at the changes soon. > > David > ----- > > On 18/07/2019 2:05 am, Adam Farley8 wrote: >> Hey All, >> >> Reviewers and sponsors requested to inspect the following. >> >> I've re-written the code change, as discussed with David Holmes in emails >> last week, and now the webrev changes do this: >> >> - Cause the VM to shut down with a relevant error message if one or more >> of the sun.boot.library.path paths is too long for the system. I'm not seeing that implemented at the moment. Nor am I clear that such an error will always be detected during VM initialization. The code paths look fairly general purpose, but perhaps that is an illusion and we will always check this during initialization? (also see discussion at end) >> - Apply similar error-producing code to the (legacy?) code in >> linker_md.c. I think the JDWP changes need to be split off and handled under their own issue. It's a similar issue but not directly related. Also the change to sys.h raises the need for a CSR request as it seems to be exported for external use - though I can't find any existing test code that includes it, or which uses the affected code (which is another reason to split this off and let serviceability folk consider it). >> - Allow the numerical parameter for split_path to indicate anything we >> plan to add to the path once split, allowing for more accurate path >> length detection.
This is a bit icky but I understand your desire to be more accurate with the checking - as otherwise you would still need to keep overflow checks in other places once the full path+name is assembled. But then such checks must be missing in places now ?? I'm not clear why you have implemented the path check the way you did instead of simply augmenting the existing code, i.e. where we have:

1347 // do the actual splitting
1348 p = inpath;
1349 for (int i = 0 ; i < count ; i++) {
1350   size_t len = strcspn(p, os::path_separator());
1351   if (len > JVM_MAXPATHLEN) {
1352     return NULL;
1353   }

why not just change the calculation at line 1351 to include the prefix length, and then report the error rather than return NULL? BTW the existing code fails to free opath before returning NULL. >> - Add an optional parameter to the os::split_path function that specifies >> where the paths came from, for a better error message. It's not appropriate to set that up in os::dll_locate_lib, hardwired as "sun.boot.library.path". os::dll_locate_lib doesn't know where it is being asked to look, it is the callers that usually use Arguments::get_dll_dir(), but in one case in jvmciEnv.cpp we have: os::dll_locate_lib(path, sizeof(path), JVMCILibPath, ... so the error message would be wrong in that case. If you want to pass through this diagnostic help information then it needs to be set by the callers of, and passed into, os::dll_locate_lib. Looking at all the callers of os::dll_locate_lib that all pass Arguments::get_dll_dir, it seems quite inefficient that we will potentially split the same set of paths multiple times. I wonder whether we can do this specifically during VM initialization and cache the split paths instead? That doesn't address the problem of a path element that only exceeds the maximum length when a specific library name is added, but I'm trying to see how to minimise the splitting and put the onus for the checking back on the code creating the paths.
Lets see if others have comments/suggestions here. Thanks, David >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 >> >> New Webrev: http://cr.openjdk.java.net/~afarley/8227021.1/webrev/ >> >> Best Regards >> >> Adam Farley >> IBM Runtimes >> >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with number >> 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 >> 3AU >> From david.holmes at oracle.com Mon Jul 22 06:49:46 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 22 Jul 2019 16:49:46 +1000 Subject: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: <40bd126f-ca71-4a71-8fda-552cf8f289ad.kuaiwei.kw@alibaba-inc.com> References: <8cfbaa83-c50f-61c4-5336-5f30b3885d45@oracle.com> <26f88253-deea-64d5-714c-28bb73989c62@oracle.com> <40bd126f-ca71-4a71-8fda-552cf8f289ad.kuaiwei.kw@alibaba-inc.com> Message-ID: <6c74ebb1-1292-43ae-86d2-9c8be14af0e9@oracle.com> Hi Kuai, On 21/06/2019 5:18 pm, Kuai Wei wrote: > Hi David, > > Sorry for the late reply. Sorry for my even later one. I was traveling and then had vacation, and have had other things to look at. > We plan to create a wiki page on OpenJDK website and put the design documents there. How do you think about it? That sounds like a good idea. > Here are the answers to some questions in your last message: I haven't had time to context switch in everything sorry. So just a couple of responses below. > - "source file" in JWarmUp record: > Application will load same class from multiple places. For example, logging jar will be packaged by different web apps. So we record this property. > > - super class resolution > It's used for diagnostic. Same class loaded by different loaders will cause a warning message in PreloadClassChain::record_loaded_class(). We use the super class resolve mark to reduce warning messages when resolving super class. We are thinking to refine it. 
If "refine" means remove then I encourage your thinking :) > > - dummy method > It's hard to know whether JWarmUp compilations are completed or not. The dummy method is used as the last method compiled due to JWarmUp. We are able to check its entry to see whether all compilations are finished. > > - native entry in jvm.cpp > JWarmUp defined some jvm entries which are invoked by Java. We assume all jvm entries are put into jvm.cpp. Would you give us some reference we can follow? jvm.cpp contains the definitions of the JVM entry point methods but it doesn't (as you can see in the existing file) contain code for registering those methods:

#define CC (char*)

static JNINativeMethod jdk_jwarmup_JWarmUp_methods[] = {
  { CC "notifyApplicationStartUpIsDone0", CC "()V", (void *)&JVM_NotifyApplicationStartUpIsDone},
  { CC "checkIfCompilationIsComplete0", CC "()Z", (void *)&JVM_CheckJWarmUpCompilationIsComplete},
  { CC "notifyJVMDeoptWarmUpMethods0", CC "()V", (void *)&JVM_NotifyJVMDeoptWarmUpMethods}
};

JVM_ENTRY(void, JVM_RegisterJWarmUpMethods(JNIEnv *env, jclass jwarmupclass))
  JVMWrapper("JVM_RegisterJWarmUpMethods");
  ThreadToNativeFromVM ttnfv(thread); // can't be in VM when we call JNI
  int ok = env->RegisterNatives(jwarmupclass, jdk_jwarmup_JWarmUp_methods, sizeof(jdk_jwarmup_JWarmUp_methods)/sizeof(JNINativeMethod));
  guarantee(ok == 0, "register jdk.jwarmup.JWarmUp natives");
JVM_END
#undef CC

That is typically done by the C code in the JDK. See for example src/java.base/share/native/libjava/System.c > - logging flags > JWarmUp was initially developed for JDK8. A flag was used to print out trace. When we ported the patch to JDK tip, we changed code to use the new log utility but with the legacy flag kept. Please remove the legacy flag. > - VM flags > We will check and remove unnecessary flags. Thank you. > - init.cpp and mutex initialization > We will modify that. > > - Deoptimization change > I'm not clear about that. Would you like to provide more details?
We will check the impact on our patch. I forget the exact context now sorry. If you've rebased to latest code and everything builds and runs then that should suffice. I have a lot of general concerns about the impact of this work on various areas of the JVM. It really needs to be as unobtrusive as possible and ideally causing no changes to executed code unless enabled. Potentially/possibly it might even need to be selectable at build-time, as to whether this feature is included. And I apologise in advance because I don't have a lot of time to deep dive into all the details of this proposed feature. Thanks, David ----- > Thanks, > Kuai Wei > > > > > ------------------------------------------------------------------ > From:David Holmes > Send Time: 2019-6-10 (Mon) 15:18 > To:yumin qi ; hotspot-runtim. > Cc:hotspot-dev > Subject:Re: RFC: JWarmup precompile java hot methods at application startup > > Hi Yumin, > > On 8/06/2019 3:25 am, yumin qi wrote: >> Hi, David and all >> >> Can I have one more comment from runtime expert for the JEP? >> David, can you comment for the changes? Really appreciate your last >> comment. It is best if you follow the comment. >> Looking forward to having your comment. > > I still have a lot of trouble understanding the overall design here. The > JEP is very high-level; the webrev is very low-level; and there's > nothing in between to explain the details of the design - the kind of > document you would produce for a design review/walkthrough. For example > I can't see why you need to record the "source file"; I can't see why > you need to make changes to the superclass resolution. I can't tell when > changes outside of Jwarmup may need to make changes to the Jwarmup code > - the dependencies are unclear. I'm unclear on the role of the "dummy > method" - is it just a sentinel? Why do we need it versus using some > state in the JitWarmup instance? > > Some further code comments, but not a file by file review by any means ...
> > The code split between the JDK and JVM doesn't seem quite right to me. > registerNatives is something usually done by the JDK .c files > corresponding to the classes defining the native method; it's not > something done in jvm.cpp. Or if this is meant to be a special case like > JVM_RegisterMethodHandleMethods then probably it should be in the > jwarmup.cpp file. Also if you pass the necessary objects through the API > you won't need to jump back to native to call a JNI function. > > AliasedLoggingFlags are for converting legacy flags to unified logging. > You should just be using UL and not introducing the > PrintCompilationWarmUpDetail pseudo-flag. > > This work introduces thirteen new VM flags! That's very excessive. > Perhaps you should look at defining something more like -Xlog that > encodes all the options? (And this will need a very lengthy CSR request!). > > The init.cpp code should be factored out into jwarmup_init functions in > jwarmup.cpp. > > Mutex initialization should be conditional on jwarmup being enabled. > > Deoptimization has been changed lately to avoid use of safepoints so you > may need to re-examine that aspect. > > You have a number of uses of patterns like this (but not everywhere): > > + JitWarmUp* jwp = JitWarmUp::instance(); > + assert(jwp != NULL, "sanity check"); > + jwp->preloader()->jvm_booted_is_done(); > > The assertion should be inside instance() so that these collapse to a > single line: > > JitWarmup::instance()->preloader()->whatever(); > > Your Java files have @version 1.8. > > --- > > Cheers, > David > ----- > > > >> Thanks >> Yumin >> >> On Sun, May 19, 2019 at 10:28 AM yumin qi > > wrote: >> >> Hi, Tobias and all >> Have done changes based on Tobias' comments. New webrev based on >> most recent base is updated at: >> http://cr.openjdk.java.net/~minqi/8220692/webrev-03/ >> >> Tested locally for jwarmup and compiler.
>> >> Thanks >> Yumin >> >> On Tue, May 14, 2019 at 11:26 AM yumin qi > > wrote: >> >> Hi, Tobias >> >> Thanks very much for the comments. >> >> On Mon, May 13, 2019 at 2:58 AM Tobias Hartmann >> > >> wrote: >> >> Hi Yumin, >> >> > In this version, the profiled method data is not used at >> > precompilation, it will be addressed in a follow-up bug fix. >> After the >> > first version is integrated, we will file a bug for it. >> >> Why is that? I think it would be good to have everything in >> one JEP. >> >> >> We have done some tests on adding profiling data and found the >> result is not as expected, and the current version is working >> well for internal online applications. There is no other reason >> for not adding it to this patch now; we would like to study further to >> see if we can improve that for a better performance. >> >> I've looked at the compiler related changes. Here are some >> comments/questions. >> >> ciMethod.cpp >> - So CompilationWarmUp is not using any profile information? >> Not even the profile obtained in the >> current execution? >> >> >> Yes. This is also related to the previous question. >> >> compile.cpp >> - line 748: Why is that required? Couldn't it happen that a >> method is never compiled because the >> code that would resolve a field is never executed? >> >> >> This is a very aggressive decision --- avoiding compilation failure >> requires that all fields have already been resolved. >> >> graphKit.cpp >> - line 2832: please add a comment >> - line 2917: checks should be merged into one if and please >> add a comment >> >> >> Will fix it. >> >> jvm.cpp >> - Could you explain why it's guaranteed that warmup >> compilation is completed once the dummy method >> is compiled? And why is it hardcoded to print >> "com.alibaba.jwarmup.JWarmUp"? >> >> >> This is from practical testing of real applications.
Due to the >> parallelism of compilation works, it should check if >> compilation queue contains any of those methods --- completed if >> no any of them on the queue and it is not economic. By using of >> a dummy method as a simplified version for that, in real case, >> it is not observed that dummy method is not the last compilation >> for warmup. Do you have suggestion of a way to do that? The >> dummy way is not strictly a guaranteed one theoretically. >> Forgot to change the print to new name after renaming package, >> thanks for the catching. >> >> - What is test_symbol_matcher() used for? >> >> >> This is a leftover(used to test matching patterns), will remove >> it from the file. >> >> jitWarmUp.cpp: >> - line 146: So what about methods that are only ever >> compiled at C1 level? Wouldn't it make sense to >> keep track of the comp_level during CompilationWarmUpRecording? >> >> >> Will consider your suggestion in future work on it. >> >> I also found several typos while reading through the code >> (listed in random order): >> >> globals.hpp >> - "flushing profling" -> "flushing profiling" >> >> method.hpp >> - "when this method first been invoked" >> >> templateInterpreterGenerator_x86.cpp >> - initializition -> initialization >> >> dict.cpp >> - initializated -> initialized >> >> jitWarmUp.cpp >> - uninitilaized -> uninitialized >> - inited -> should be initialized, right? >> >> jitWarmUp.hpp >> . nofityApplicationStartUpIsDone -> >> notifyApplicationStartUpIsDone >> >> constantPool.cpp >> - recusive -> recursive >> >> JWarmUp.java >> - appliacation -> application >> >> TestThrowInitializaitonException.java -> >> TestThrowInitializationException.java >> >> These tests should be renamed (it's not clear what issue the >> number refers to): >> - issue9780156.sh >> - Issue11272598.java >> >> >> Will fix all above suggestions. >> >> Thanks! 
>> >> Yumin >> > From thomas.schatzl at oracle.com Mon Jul 22 12:39:08 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 22 Jul 2019 14:39:08 +0200 Subject: Please implement client switch in 64-bit server JDK 14 builds In-Reply-To: References: <36722c0b-1383-f2ab-315e-060aad41a5c2@gmail.com> <1ed0b633-640b-f8c5-ed8d-815d4a15add7@oracle.com> <2ecca81e-a4d5-06b4-49f3-4e83c353979d@oracle.com> <84157ea2-f645-0652-7ccb-aba41620135e@gmail.com> Message-ID: <8e5c1661bcfba8f4a742b7b10e763a7d1ece5b9d.camel@oracle.com> Hi, On Sun, 2019-07-21 at 14:05 -0500, Ty Young wrote: > Never mind, I had a client VM selected as the running VM. Memory > usage starts off low then goes to the same amount. Is there really > nothing that can be done to take the committed memory size? it depends on the garbage collector when they uncommit memory, and there are configurable options how much memory is taken away. Serial, Parallel and CMS afaik (not sure about CMS actually, likely it also gives back memory on background whole-heap collections) only give back memory on a full GC. G1, the default collector since JDK9, until JDK12 gave back memory only during a foreground whole-heap collection. With JDK12 it also gives back memory during background collections. That version also adds a feature to regularly start background collections after some inactivity [0]. Shenandoah and ZGC have similar functionality. How much memory is given back is controlled by the "MinHeapFreeRatio" and "MaxHeapFreeRatio" flags, which give percentages (yeah, misnomer, they are not ratios) of the whole heap. MinHeapFreeRatio (default 40) is the minimum percentage of free heap after GC to avoid expansion. I.e. if there is less than 40% of free heap after GC, the collectors will expand. MaxHeapFreeRatio (default 70) is the maximum percentage of free heap after GC to avoid shrinking. I.e. if there is more than that free heap after GC, they will shrink the heap. 
Currently running Netbeans IDE with only -J-XX:G1PeriodicGCInterval=30000 (using G1, keeping the default), and it idles at 245MB used/730MB committed which given above default values for Min/MaxHeapFreeRatio seems within specs. (There is also a short section about the periodic garbage collections feature in the gc tuning guide) Hope this helps, Thomas [0] http://openjdk.java.net/jeps/346 [1] https://docs.oracle.com/en/java/javase/12/gctuning/garbage-first-garbage-collector.html#GUID-DA6296DD-9AAB-4955-8B5B-683651936155 From coleen.phillimore at oracle.com Mon Jul 22 18:45:54 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 22 Jul 2019 14:45:54 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) Message-ID: Summary: Increase max size for SymbolTable and fix experimental option range. Make experimental options trueInDebug so they're tested by the command line option testing open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8227123 Tested locally with default and -XX:+UseZGC since ZGC has a lot of experimental options. I didn't test with shenandoah. I will test with hs-tier1-3 before checking in. Thanks, Coleen From jianglizhou at google.com Mon Jul 22 19:40:54 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Mon, 22 Jul 2019 12:40:54 -0700 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: Message-ID: Looks good. Best regards, Jiangli On Mon, Jul 22, 2019 at 11:46 AM wrote: > > Summary: Increase max size for SymbolTable and fix experimental option > range.
Make experimental options trueInDebug so they're tested by the > command line option testing > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8227123 > > Tested locally with default and -XX:+UseZGC since ZGC has a lot of > experimental options. I didn't test with shenandoah. > > I will test with hs-tier1-3 before checking in. > > Thanks, > Coleen From coleen.phillimore at oracle.com Mon Jul 22 19:51:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 22 Jul 2019 15:51:48 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: Message-ID: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> Thanks Jiangli, and for the suggestion to increase the max size. I was waffling about removing the experimental option completely. Coleen On 7/22/19 3:40 PM, Jiangli Zhou wrote: > Looks good. > > Best regards, > Jiangli > > On Mon, Jul 22, 2019 at 11:46 AM wrote: >> Summary: Increase max size for SymbolTable and fix experimental option >> range. Make experimental options trueInDebug so they're tested by the >> command line option testing >> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >> >> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >> experimental options. I didn't test with shenandoah. >> >> I will test with hs-tier1-3 before checking in.
>> >> Thanks, >> Coleen From kuaiwei.kw at alibaba-inc.com Tue Jul 23 01:46:43 2019 From: kuaiwei.kw at alibaba-inc.com (Kuai Wei) Date: Tue, 23 Jul 2019 09:46:43 +0800 Subject: Re: RFC: JWarmup precompile java hot methods at application startup In-Reply-To: <6c74ebb1-1292-43ae-86d2-9c8be14af0e9@oracle.com> References: <8cfbaa83-c50f-61c4-5336-5f30b3885d45@oracle.com> <26f88253-deea-64d5-714c-28bb73989c62@oracle.com> <40bd126f-ca71-4a71-8fda-552cf8f289ad.kuaiwei.kw@alibaba-inc.com>, <6c74ebb1-1292-43ae-86d2-9c8be14af0e9@oracle.com> Message-ID: Hi David, Thanks for the clarification. We will update the wiki and patch for the next round of review. Kuai Wei ------------------------------------------------------------------ From: David Holmes Send Time: 2019-07-22 (Mon) 14:49 To: Kuai Wei ; yumin qi ; hotspot-runtime-dev at openjdk.java.net Cc: hotspot-dev Subject: Re: RFC: JWarmup precompile java hot methods at application startup Hi Kuai, On 21/06/2019 5:18 pm, Kuai Wei wrote: > Hi David, > > Sorry for the late reply. Sorry for my even later one. I was traveling and then had vacation, and have had other things to look at. > We plan to create a wiki page on the OpenJDK website and put the design documents there. What do you think about it? That sounds like a good idea. > Here are the answers to some questions in your last message: I haven't had time to context switch in everything, sorry. So just a couple of responses below. > - "source file" in JWarmUp record: > An application will load the same class from multiple places. For example, a logging jar will be packaged by different web apps. So we record this property. > > - super class resolution > It's used for diagnostics. The same class loaded by different loaders will cause a warning message in PreloadClassChain::record_loaded_class(). We use the super class resolve mark to reduce warning messages when resolving the super class. We are thinking to refine it.
If "refine" means remove then I encourage your thinking :) > > - dummy method > It's hard to know whether JWarmUp compilations are completed or not. The dummy method is used as the last method compiled due to JWarmUp. We are able to check its entry to see whether all compilations are finished. > > - native entry in jvm.cpp > JWarmUp defined some jvm entries which are invoked by java. We assume all jvm entries are put into jvm.cpp. Would you give us some reference we can follow? jvm.cpp contains the definitions of the JVM entry point methods but it doesn't (as you can see in existing file) contain code for registering those methods: #define CC (char*) static JNINativeMethod jdk_jwarmup_JWarmUp_methods[] = { { CC "notifyApplicationStartUpIsDone0", CC "()V", (void *)&JVM_NotifyApplicationStartUpIsDone}, { CC "checkIfCompilationIsComplete0", CC "()Z", (void *)&JVM_CheckJWarmUpCompilationIsComplete}, { CC "notifyJVMDeoptWarmUpMethods0", CC "()V", (void *)&JVM_NotifyJVMDeoptWarmUpMethods} }; JVM_ENTRY(void, JVM_RegisterJWarmUpMethods(JNIEnv *env, jclass jwarmupclass)) JVMWrapper("JVM_RegisterJWarmUpMethods"); ThreadToNativeFromVM ttnfv(thread); // can't be in VM when we call JNI int ok = env->RegisterNatives(jwarmupclass, jdk_jwarmup_JWarmUp_methods, sizeof(jdk_jwarmup_JWarmUp_methods)/sizeof(JNINativeMethod)); guarantee(ok == 0, "register jdk.jwarmup.JWarmUp natives"); JVM_END #undef CC That is typically done by the C code in the JDK. See for example src/java.base/share/native/libjava/System.c > - logging flags > JWarmUp was initially developed for JDK8. A flag was used to print out trace. When we ported the patch to JDK tip, we changed code to use the new log utility but with the legacy flag kept. Please remove legacy flag. > - VM flags > We will check and remove unnecessary flags. Thank you. > - init.cpp and mutex initialization > We will modify that. > > - Deoptimization change > I'm not clear about that. Would you like to provide more details? 
We will check the impact on our patch. I forget the exact context now sorry. If you've rebased to latest code and everything builds and runs then that should suffice. I have a lot of general concerns about the impact of this work on various areas of the JVM. It really needs to be as unobtrusive as possible and ideally causing no changes to executed code unless enabled. Potentially/possibly it might even need to be selectable at build-time, as to whether this feature is included. And I apologise in advance because I don't have a lot of time to deep dive into all the details of this proposed feature. Thanks, David ----- > Thanks, > Kuai Wei > > > > > ------------------------------------------------------------------ > From:David Holmes > Send Time:2019?6?10?(???) 15:18 > To:yumin qi ; hotspot-runtim. > Cc:hotspot-dev > Subject:Re: RFC: JWarmup precompile java hot methods at application startup > > Hi Yumin, > > On 8/06/2019 3:25 am, yumin qi wrote: >> Hi, David and all >> >> Can I have one more comment from runtime expert for the JEP? >> David, can you comment for the changes? Really appreciate your last >> comment. It is best if you follow the comment. >> Looking forward to having your comment. > > I still have a lot of trouble understanding the overall design here. The > JEP is very high-level; the webrev is very low-level; and there's > nothing in between to explain the details of the design - the kind of > document you would produce for a design review/walkthrough. For example > I can't see why you need to record the "source file"; I can't see why > you need to make changes to the superclass resolution. I can't tell when > changes outside of Jwarmup may need to make changes to the Jwarmup code > - the dependencies are unclear. I'm unclear on the role of the "dummy > method" - is it just a sentinel? Why do we need it versus using some > state in the JitWarmup instance? > > Some further code comments, but not a file by file review by any means ... 
> > The code split between the JDK and JVM doesn't seem quite right to me. > registerNatives is something usually done by the JDK .c files > corresponding to the classes defining the native method; it's not > something done in jvm.cpp. Or if this is meant to be a special case like > JVM_RegisterMethodHandleMethods then probably it should be in the > jwarmup.cpp file. Also if you pass the necessary objects through the API > you won't need to jump back to native to call a JNI function. > > AliasedLoggingFlags are for converting legacy flags to unified logging. > You should just be using UL and not introducing the > PrintCompilationWarmUpDetail psuedo-flag. > > This work introduces thirteen new VM flags! That's very excessive. > Perhaps you should look at defining something more like -Xlog that > encodes all the options? (And this will need a very lengthy CSR request!). > > The init.cpp code should be factored out into jwarmup_init functions in > jwarmup.cpp. > > Mutex initialization should be conditional on jwarmup being enabled. > > Deoptimization has been changed lately to avoid use of safepoints so you > may need to re-examine that aspect. > > You have a number of uses of patterns like this (but not everywhere): > > + JitWarmUp* jwp = JitWarmUp::instance(); > + assert(jwp != NULL, "sanity check"); > + jwp->preloader()->jvm_booted_is_done(); > > The assertion should be inside instance() so that these collapse to a > single line: > > JitWarmup::instance()->preloader->whatever(); > > Your Java files have @version 1.8. > > --- > > Cheers, > David > ----- > > > >> Thanks >> Yumin >> >> On Sun, May 19, 2019 at 10:28 AM yumin qi > > wrote: >> >> Hi, Tobias and all >> Have done changes based on Tobias' comments. New webrev based on >> most recent base is updated at: >> http://cr.openjdk.java.net/~minqi/8220692/webrev-03/ >> >> Tested local for jwarmup and compiler. 
>> >> Thanks >> Yumin >> >> On Tue, May 14, 2019 at 11:26 AM yumin qi > > wrote: >> >> HI, Tobias >> >> Thanks very much for the comments. >> >> On Mon, May 13, 2019 at 2:58 AM Tobias Hartmann >> > >> wrote: >> >> Hi Yumin, >> >> > In this version, the profiled method data is not used at >> > precomilation, it will be addressed in followed bug fix. >> After the >> > first version integrated, will file bug for it. >> >> Why is that? I think it would be good to have everything in >> one JEP. >> >> >> We have done some tests on adding profiling data and found the >> result is not as expected, and the current version is working >> well for internal online applications. There is no other reason >> not adding to this patch now, we will like to study further to >> see if we can improve that for a better performance. >> >> I've looked at the compiler related changes. Here are some >> comments/questions. >> >> ciMethod.cpp >> - So CompilationWarmUp is not using any profile information? >> Not even the profile obtained in the >> current execution? >> >> >> Yes. This is also related to previous question. >> >> compile.cpp >> - line 748: Why is that required? Couldn't it happen that a >> method is never compiled because the >> code that would resolve a field is never executed? >> >> >> Here a very aggressive decision --- to avoid compilation failure >> requires that all fields have already been resolved. >> >> graphKit.cpp >> - line 2832: please add a comment >> - line 2917: checks should be merged into one if and please >> add a comment >> >> >> Will fix it. >> >> jvm.cpp >> - Could you explain why it's guaranteed that warmup >> compilation is completed once the dummy method >> is compiled? And why is it hardcoded to print >> "com.alibaba.jwarmup.JWarmUp"? >> >> >> This is from practical testing of real applications. 
Due to the >> parallelism of compilation works, it should check if >> compilation queue contains any of those methods --- completed if >> no any of them on the queue and it is not economic. By using of >> a dummy method as a simplified version for that, in real case, >> it is not observed that dummy method is not the last compilation >> for warmup. Do you have suggestion of a way to do that? The >> dummy way is not strictly a guaranteed one theoretically. >> Forgot to change the print to new name after renaming package, >> thanks for the catching. >> >> - What is test_symbol_matcher() used for? >> >> >> This is a leftover(used to test matching patterns), will remove >> it from the file. >> >> jitWarmUp.cpp: >> - line 146: So what about methods that are only ever >> compiled at C1 level? Wouldn't it make sense to >> keep track of the comp_level during CompilationWarmUpRecording? >> >> >> Will consider your suggestion in future work on it. >> >> I also found several typos while reading through the code >> (listed in random order): >> >> globals.hpp >> - "flushing profling" -> "flushing profiling" >> >> method.hpp >> - "when this method first been invoked" >> >> templateInterpreterGenerator_x86.cpp >> - initializition -> initialization >> >> dict.cpp >> - initializated -> initialized >> >> jitWarmUp.cpp >> - uninitilaized -> uninitialized >> - inited -> should be initialized, right? >> >> jitWarmUp.hpp >> . nofityApplicationStartUpIsDone -> >> notifyApplicationStartUpIsDone >> >> constantPool.cpp >> - recusive -> recursive >> >> JWarmUp.java >> - appliacation -> application >> >> TestThrowInitializaitonException.java -> >> TestThrowInitializationException.java >> >> These tests should be renamed (it's not clear what issue the >> number refers to): >> - issue9780156.sh >> - Issue11272598.java >> >> >> Will fix all above suggestions. >> >> Thanks! 
>> >> Yumin >> > From david.holmes at oracle.com Tue Jul 23 04:27:25 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2019 14:27:25 +1000 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: Message-ID: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> Hi Coleen, - experimental(bool, UnlockExperimentalVMOptions, false, \ + experimental(bool, UnlockExperimentalVMOptions, trueInDebug, \ I can't quite convince myself this is harmless nor necessary. Functional change seems fine. Is it worth adding a clarifying comment to: + range(minimumSymbolTableSize, 16777216ul) \ with: + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) \ Thanks, David On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: > Summary: Increase max size for SymbolTable and fix experimental option > range. Make experimental options trueInDebug so they're tested by the > command line option testing > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8227123 > > Tested locally with default and -XX:+UseZGC since ZGC has a lot of > experimental options. I didn't test with shenandoah. > > I will test with hs-tier1-3 before checking in. > > Thanks, > Coleen From coleen.phillimore at oracle.com Tue Jul 23 11:03:14 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 07:03:14 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> References: Message-ID: On 7/23/19 12:27 AM, David Holmes wrote: > Hi Coleen, > > - experimental(bool, UnlockExperimentalVMOptions, false, \ > + experimental(bool, UnlockExperimentalVMOptions, trueInDebug, \ > > I can't quite convince myself this is harmless nor necessary.
Well if it's added, then the option range test would test the option. Otherwise, I think it's benign. In debug mode, one would no longer have to specify -XX:+UnlockExperimental options, just like UnlockDiagnosticVMOptions. The option is there either way. > > Functional change seems fine. Is it worth adding a clarifying comment to: > > + range(minimumSymbolTableSize, 16777216ul) \ > > with: > > + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) \ Let me see if the X macro allows that and I could also add that to StringTableSize (which is not an experimental option). Thanks, Coleen > > Thanks, > David > > On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >> Summary: Increase max size for SymbolTable and fix experimental >> option range. Make experimental options trueInDebug so they're >> tested by the command line option testing >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >> >> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >> experimental options. I didn't test with shenandoah. >> >> I will test with hs-tier1-3 before checking in. >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Tue Jul 23 11:19:02 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 07:19:02 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> Message-ID: <3afae19f-71fe-ab80-49cd-a0489bd44da0@oracle.com> Let me edit this: On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: > > > On 7/23/19 12:27 AM, David Holmes wrote: >> Hi Coleen, >> >> - experimental(bool, UnlockExperimentalVMOptions, false, \ >> + experimental(bool, UnlockExperimentalVMOptions, trueInDebug,
\ >> >> I can't quite convince myself this is harmless nor necessary. > > Well if it's added, then the option range test would test all the > experimental options that have a range. Otherwise, I think it's > harmless. In debug mode, one would no longer have to specify > -XX:+UnlockExperimental options, just like > UnlockDiagnosticVMOptions. The experimental options are compiled in > the sources either way. >> >> Functional change seems fine. Is it worth adding a clarifying comment >> to: >> >> + range(minimumSymbolTableSize, 16777216ul) \ >> >> with: >> >> + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >> \ > > Let me see if the X macro allows that and I could also add that to > StringTableSize (which is not an experimental option). > Thanks, > Coleen >> >> Thanks, >> David >> >> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>> Summary: Increase max size for SymbolTable and fix experimental >>> option range. Make experimental options trueInDebug so they're >>> tested by the command line option testing >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>> >>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>> experimental options. I didn't test with shenandoah. >>> >>> I will test with hs-tier1-3 before checking in. >>> >>> Thanks, >>> Coleen > From daniel.daugherty at oracle.com Tue Jul 23 13:45:18 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 23 Jul 2019 09:45:18 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> Message-ID: <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: > > > On 7/23/19 12:27 AM, David Holmes wrote: >> Hi Coleen, >> >> -
experimental(bool, UnlockExperimentalVMOptions, false, \ >> + experimental(bool, UnlockExperimentalVMOptions, trueInDebug, \ >> >> I can't quite convince myself this is harmless nor necessary. > > Well if it's added, then the option range test would test the option. > Otherwise, I think it's benign. In debug mode, one would no longer > have to specify -XX:+UnlockExperimental options, just like > UnlockDiagnosticVMOptions. The option is there either way. Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs in tests that are runnable in all build configs: 'release', 'fastdebug' and 'slowdebug'. Folks use an option in a test that requires '-XX:+UnlockDiagnosticVMOptions', but forget to include it in the test's run statement and we end up with a test failure in 'release' bits. I would prefer that 'UnlockExperimentalVMOptions' did not introduce the same path to failing tests. Dan >> >> Functional change seems fine. Is it worth adding a clarifying comment >> to: >> >> + range(minimumSymbolTableSize, 16777216ul) \ >> >> with: >> >> + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >> \ > > Let me see if the X macro allows that and I could also add that to > StringTableSize (which is not an experimental option). > Thanks, > Coleen >> >> Thanks, >> David >> >> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>> Summary: Increase max size for SymbolTable and fix experimental >>> option range. Make experimental options trueInDebug so they're >>> tested by the command line option testing >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>> >>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>> experimental options. I didn't test with shenandoah. >>> >>> I will test with hs-tier1-3 before checking in.
>>> >>> Thanks, >>> Coleen > From coleen.phillimore at oracle.com Tue Jul 23 15:09:42 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 11:09:42 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> Message-ID: On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: > On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/23/19 12:27 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, ??? \ >>> >>> I can't quite convince myself this is harmless nor necessary. >> >> Well if it's added, then the option range test would test the >> option.? Otherwise, I think it's benign.? In debug mode, one would no >> longer have to specify -XX:+UnlockExperimental options, just like >> UnlockDiagnosticVMOptions.?? The option is there either way. > > Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think > that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs in > tests > that are runnable in all build configs: 'release', 'fastdebug' and > 'slowdebug'. > Folks use an option in a test that requires > '-XX:+UnlockDiagnosticVMOptions', > but forget to include it in the test's run statement and we end up with a > test failure in 'release' bits. > > I would prefer that 'UnlockExperimentalVMOptions' did not introduce > the same > path to failing tests. I tried to change UnlockDiagnosticVMOptions to be false, and got a wall of opposition: See: https://bugs.openjdk.java.net/browse/JDK-8153783 http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html I think the same exact arguments should apply to UnlockExperimentalVMOptions.? 
I'd like to hear from someone that uses experimental options on ZGC or shenandoah, since those have the most experimental options. The reason that I made it trueInDebug is so that the command line option range test would test these options. Otherwise a more hacky solution could be done, including adding the parameter -XX:+UnlockExperimentalVMOptions to all the VM option range tests. I'd rather not do this. Thanks, Coleen > > Dan > > >>> >>> Functional change seems fine. Is it worth adding a clarifying >>> comment to: >>> >>> + range(minimumSymbolTableSize, 16777216ul) \ >>> >>> with: >>> >>> + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>> \ >> >> Let me see if the X macro allows that and I could also add that to >> StringTableSize (which is not an experimental option). >> Thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>> Summary: Increase max size for SymbolTable and fix experimental >>>> option range. Make experimental options trueInDebug so they're >>>> tested by the command line option testing >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>> >>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>>> experimental options. I didn't test with shenandoah. >>>> >>>> I will test with hs-tier1-3 before checking in. >>>> >>>> Thanks, >>>> Coleen >> > From matthias.baesken at sap.com Tue Jul 23 15:14:52 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 23 Jul 2019 15:14:52 +0000 Subject: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings Message-ID: Hello, please review this patch. It fixes a couple of xlc16/xlclang warnings, especially comparison of distinct pointer types and string literal conversion warnings.
When building with xlc16/xlclang, we still have a couple of warnings that have to be fixed : warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] for example : /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:18: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:29: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ warning: comparison of distinct pointer types, for example : /nightly/jdk/src/java.desktop/aix/native/libawt/porting_aix.c:50:14: warning: comparison of distinct pointer types ('void *' and 'char *') [-Wcompare-distinct-pointer-types] addr < (((char*)p->ldinfo_textorg) + p->ldinfo_textsize)) { ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228482 http://cr.openjdk.java.net/~mbaesken/webrevs/8228482.1/ Thanks, Matthias From jianglizhou at google.com Tue Jul 23 15:25:57 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Tue, 23 Jul 2019 08:25:57 -0700 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> References: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> Message-ID: Hi Coleen, On Mon, Jul 22, 2019 at 12:51 PM wrote: > > > Thanks Jiangli, and for the suggestion to increase the max size. I was > waffling about removing the experimental option completely. Would it be worth considering making SymbolTableSize a product flag? There was a noticeable performance issue for a large application with older JDK versions. It was found to be related to slow SymbolTable::lookup due to too many collisions. 
Setting a large initial symbol table size worked around the performance issue and the improvement was significant. With newer JDK (> 12), user may still prefer setting a large initial size to avoid any potential overhead (avoid resizing) for large applications. Best regards, Jiangli > Coleen > > On 7/22/19 3:40 PM, Jiangli Zhou wrote: > > Looks good. > > > > Best regards, > > Jiangli > > > > On Mon, Jul 22, 2019 at 11:46 AM wrote: > >> Summary: Increase max size for SymbolTable and fix experimental option > >> range. Make experimental options trueInDebug so they're tested by the > >> command line option testing > >> > >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev > >> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 > >> > >> Tested locally with default and -XX:+UseZGC since ZGC has a lot of > >> experimental options. I didn't test with shenanodoah. > >> > >> I will test with hs-tier1-3 before checking in. > >> > >> Thanks, > >> Coleen > From christoph.langer at sap.com Tue Jul 23 15:29:54 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 23 Jul 2019 15:29:54 +0000 Subject: [11u] RFR: 8227041: runtime/memory/RunUnitTestsConcurrently.java has a memory leak Message-ID: Hi, please review backport of this test fix. Bug: https://bugs.openjdk.java.net/browse/JDK-8227041 Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8227041.11u-dev.0/ The test is a real resource drain and causes OOMs - while not really testing something useful. It was already removed in jdk/jdk - so requesting to remove it from JDK11u as well (to fix sporadic test failures). In jdk11 the relevant source files look a bit different, so I had to modify the original changeset a bit. Thanks Christoph From daniel.daugherty at oracle.com Tue Jul 23 15:30:53 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 23 Jul 2019 11:30:53 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> Message-ID: On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: > > > On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/23/19 12:27 AM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, ??? \ >>>> >>>> I can't quite convince myself this is harmless nor necessary. >>> >>> Well if it's added, then the option range test would test the >>> option.? Otherwise, I think it's benign.? In debug mode, one would >>> no longer have to specify -XX:+UnlockExperimental options, just like >>> UnlockDiagnosticVMOptions.?? The option is there either way. >> >> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think >> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs >> in tests >> that are runnable in all build configs: 'release', 'fastdebug' and >> 'slowdebug'. >> Folks use an option in a test that requires >> '-XX:+UnlockDiagnosticVMOptions', >> but forget to include it in the test's run statement and we end up >> with a >> test failure in 'release' bits. >> >> I would prefer that 'UnlockExperimentalVMOptions' did not introduce >> the same >> path to failing tests. > > I tried to change UnlockDiagnosticVMOptions to be false, and got a > wall of opposition: > > See: https://bugs.openjdk.java.net/browse/JDK-8153783 > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html > I would not say "a wall of opposition". You got almost equal amounts of "yea" and "nay". 
I was a "yea" and I have been continuing to train my fingers (and my scripts) to do the right thing. Interestingly, David H was a "nay" on changing UnlockDiagnosticVMOptions to be 'false', but appears to be leaning toward "nay" on changing UnlockExperimentalVMOptions to 'trueInDebug'... > I think the same exact arguments should apply to > UnlockExperimentalVMOptions.? I'd like to hear from someone that uses > experimental options on ZGC or shenandoah, since those have the most > experimental options. I agree that the same arguments apply to UnlockExperimentalVMOptions. For consistency's sake if anything, they should be the same. > The reason that I made it trueInDebug is so that the command line > option range test would test these options.? Otherwise a more hacky > solution could be done, including adding the parameter > -XX:+UnlockExperimentalVMOptions to all the VM option range tests. I'd > rather not do this. Can explain this a bit more? Why would a default value of 'false' mean that the command line option range test would not test these options? In any case, I'm fine if you want to move forward with changing the default of UnlockExperimentalVMOptions to 'trueInDebug'. Dan > > Thanks, > Coleen > >> >> Dan >> >> >>>> >>>> Functional change seems fine. Is it worth adding a clarifying >>>> comment to: >>>> >>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>> >>>> with: >>>> >>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>> ?????????????? \ >>> >>> Let me see if the X macro allows that and I could also add that to >>> StringTableSize (which is not experimental option). >>> Thanks, >>> Coleen >>>> >>>> Thanks, >>>> David >>>> >>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>> Summary: Increase max size for SymbolTable and fix experimental >>>>> option range.? 
Make experimental options trueInDebug so they're >>>>> tested by the command line option testing >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>> >>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>>>> experimental options.? I didn't test with shenanodoah. >>>>> >>>>> I will test with hs-tier1-3 before checking in. >>>>> >>>>> Thanks, >>>>> Coleen >>> >> > From coleen.phillimore at oracle.com Tue Jul 23 15:36:34 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 11:36:34 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> Message-ID: <00538bce-7322-e289-6716-3366cd664ae6@oracle.com> On 7/23/19 11:25 AM, Jiangli Zhou wrote: > Hi Coleen, > > On Mon, Jul 22, 2019 at 12:51 PM wrote: >> >> Thanks Jiangli, and for the suggestion to increase the max size. I was >> waffling about removing the experimental option completely. > Would it be worth considering making SymbolTableSize a product flag? > There was a noticeable performance issue for a large application with > older JDK versions. It was found to be related to slow > SymbolTable::lookup due to too many collisions. Setting a large > initial symbol table size worked around the performance issue and the > improvement was significant. With newer JDK (> 12), user may still > prefer setting a large initial size to avoid any potential overhead > (avoid resizing) for large applications. I think the cost of resizing is low enough that it's not worth making it a product flag.? I know it doesn't match StringTableSize in this way.? Ideally, customers should not have knowledge that these things are hashtables and should not have controls for internal implementation decisions. 
With resizing, both of these tables should perform acceptably out of the box.?? Do we have any recent cases of customers needing to set this value to a large size, that may have prompted your bug report? Coleen > > Best regards, > Jiangli > >> Coleen >> >> On 7/22/19 3:40 PM, Jiangli Zhou wrote: >>> Looks good. >>> >>> Best regards, >>> Jiangli >>> >>> On Mon, Jul 22, 2019 at 11:46 AM wrote: >>>> Summary: Increase max size for SymbolTable and fix experimental option >>>> range. Make experimental options trueInDebug so they're tested by the >>>> command line option testing >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>> >>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>>> experimental options. I didn't test with shenanodoah. >>>> >>>> I will test with hs-tier1-3 before checking in. >>>> >>>> Thanks, >>>> Coleen From jianglizhou at google.com Tue Jul 23 15:45:27 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Tue, 23 Jul 2019 08:45:27 -0700 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <00538bce-7322-e289-6716-3366cd664ae6@oracle.com> References: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> <00538bce-7322-e289-6716-3366cd664ae6@oracle.com> Message-ID: On Tue, Jul 23, 2019 at 8:35 AM wrote: > > > > On 7/23/19 11:25 AM, Jiangli Zhou wrote: > > Hi Coleen, > > > > On Mon, Jul 22, 2019 at 12:51 PM wrote: > >> > >> Thanks Jiangli, and for the suggestion to increase the max size. I was > >> waffling about removing the experimental option completely. > > Would it be worth considering making SymbolTableSize a product flag? > > There was a noticeable performance issue for a large application with > > older JDK versions. It was found to be related to slow > > SymbolTable::lookup due to too many collisions. 
Setting a large > > initial symbol table size worked around the performance issue and the > > improvement was significant. With newer JDK (> 12), user may still > > prefer setting a large initial size to avoid any potential overhead > > (avoid resizing) for large applications. > > I think the cost of resizing is low enough that it's not worth making it > a product flag. I know it doesn't match StringTableSize in this way. > Ideally, customers should not have knowledge that these things are > hashtables and should not have controls for internal implementation > decisions. > > With resizing, both of these tables should perform acceptably out of the > box. Do we have any recent cases of customers needing to set this > value to a large size, that may have prompted your bug report? That was indeed the case, though it was with older JDK. Best regards, Jiangli > > Coleen > > > > > Best regards, > > Jiangli > > > >> Coleen > >> > >> On 7/22/19 3:40 PM, Jiangli Zhou wrote: > >>> Looks good. > >>> > >>> Best regards, > >>> Jiangli > >>> > >>> On Mon, Jul 22, 2019 at 11:46 AM wrote: > >>>> Summary: Increase max size for SymbolTable and fix experimental option > >>>> range. Make experimental options trueInDebug so they're tested by the > >>>> command line option testing > >>>> > >>>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev > >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 > >>>> > >>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of > >>>> experimental options. I didn't test with shenanodoah. > >>>> > >>>> I will test with hs-tier1-3 before checking in. 
> >>>> > >>>> Thanks, > >>>> Coleen > From coleen.phillimore at oracle.com Tue Jul 23 15:48:59 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 11:48:59 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> Message-ID: <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: > On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, ??? \ >>>>> >>>>> I can't quite convince myself this is harmless nor necessary. >>>> >>>> Well if it's added, then the option range test would test the >>>> option.? Otherwise, I think it's benign.? In debug mode, one would >>>> no longer have to specify -XX:+UnlockExperimental options, just >>>> like UnlockDiagnosticVMOptions.?? The option is there either way. >>> >>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think >>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs >>> in tests >>> that are runnable in all build configs: 'release', 'fastdebug' and >>> 'slowdebug'. >>> Folks use an option in a test that requires >>> '-XX:+UnlockDiagnosticVMOptions', >>> but forget to include it in the test's run statement and we end up >>> with a >>> test failure in 'release' bits. >>> >>> I would prefer that 'UnlockExperimentalVMOptions' did not introduce >>> the same >>> path to failing tests. 
>> >> I tried to change UnlockDiagnosticVMOptions to be false, and got a >> wall of opposition: >> >> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >> >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >> > > I would not say "a wall of opposition". You got almost equal amounts > of "yea" and "nay". I was a "yea" and I have been continuing to train > my fingers (and my scripts) to do the right thing. You should have seen my slack channel at that time. :) Maybe the "wall" was primarily from a couple of people who strongly objected. > > Interestingly, David H was a "nay" on changing UnlockDiagnosticVMOptions > to be 'false', but appears to be leaning toward "nay" on changing > UnlockExperimentalVMOptions to 'trueInDebug'... > I think he's mostly just asking the question. We'll see what he answers later. > >> I think the same exact arguments should apply to >> UnlockExperimentalVMOptions. I'd like to hear from someone that uses >> experimental options on ZGC or shenandoah, since those have the most >> experimental options. > > I agree that the same arguments apply to UnlockExperimentalVMOptions. > For consistency's sake if anything, they should be the same. > > >> The reason that I made it trueInDebug is so that the command line >> option range test would test these options. Otherwise a more hacky >> solution could be done, including adding the parameter >> -XX:+UnlockExperimentalVMOptions to all the VM option range tests. >> I'd rather not do this. > > Can you explain this a bit more? Why would a default value of 'false' mean > that > the command line option range test would not test these options? So the command line option tests do - java -XX:+PrintFlagsRanges -version and collect the flags that come out, parse the ranges, and then run java with each of these flags with the limits of the range (unless the limit is INT_MAX). Some flags are excluded explicitly because they cause problems.
The reason that SymbolTableSize escaped the testing is that it wasn't reported with -XX:+PrintFlagsRanges. You'd need -XX:+UnlockExperimentalVMOptions in the java command to gather the flags, and then pass it to all the java commands to test the ranges. It's not that bad, just a bit gross. In any case, I think the experimental flags ranges should be tested. I'm glad/amazed that more didn't fail when I turned it on in my testing. > > In any case, I'm fine if you want to move forward with changing the > default of UnlockExperimentalVMOptions to 'trueInDebug'. > Okay, we'll wait to see whether I get a wall of opposition or support. I still think it should be by default the same as UnlockDiagnosticVMOptions. Thanks! Coleen > Dan > > >> >> Thanks, >> Coleen >> >>> >>> Dan >>> >>> >>>>> >>>>> Functional change seems fine. Is it worth adding a clarifying >>>>> comment to: >>>>> >>>>> +          range(minimumSymbolTableSize, 16777216ul)    \ >>>>> >>>>> with: >>>>> >>>>> +          range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>>>               \ >>>> >>>> Let me see if the X macro allows that and I could also add that to >>>> StringTableSize (which is not an experimental option). >>>> Thanks, >>>> Coleen >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Increase max size for SymbolTable and fix experimental >>>>>> option range.
>>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> >>> >> > From coleen.phillimore at oracle.com Tue Jul 23 17:21:52 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 23 Jul 2019 13:21:52 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> <00538bce-7322-e289-6716-3366cd664ae6@oracle.com> Message-ID: <891f2995-12c7-2d71-2c45-754babb18a16@oracle.com> On 7/23/19 11:45 AM, Jiangli Zhou wrote: > On Tue, Jul 23, 2019 at 8:35 AM wrote: >> >> >> On 7/23/19 11:25 AM, Jiangli Zhou wrote: >>> Hi Coleen, >>> >>> On Mon, Jul 22, 2019 at 12:51 PM wrote: >>>> Thanks Jiangli, and for the suggestion to increase the max size. I was >>>> waffling about removing the experimental option completely. >>> Would it be worth considering making SymbolTableSize a product flag? >>> There was a noticeable performance issue for a large application with >>> older JDK versions. It was found to be related to slow >>> SymbolTable::lookup due to too many collisions. Setting a large >>> initial symbol table size worked around the performance issue and the >>> improvement was significant. With newer JDK (> 12), user may still >>> prefer setting a large initial size to avoid any potential overhead >>> (avoid resizing) for large applications. >> I think the cost of resizing is low enough that it's not worth making it >> a product flag. I know it doesn't match StringTableSize in this way. >> Ideally, customers should not have knowledge that these things are >> hashtables and should not have controls for internal implementation >> decisions. >> >> With resizing, both of these tables should perform acceptably out of the >> box. Do we have any recent cases of customers needing to set this >> value to a large size, that may have prompted your bug report? > That was indeed the case, though it was with older JDK. 
Okay, that's good that it wasn't for the latest JDK. https://bugs.openjdk.java.net/browse/JDK-8019375 This bug has the history of why we made SymbolTableSize experimental, which still holds today, so I want to leave it as experimental. I'm thinking in the future we could make StringTableSize experimental to match, and someday remove both, but not just now. Thanks, Coleen > > Best regards, > Jiangli >> Coleen >> >>> Best regards, >>> Jiangli >>> >>>> Coleen >>>> >>>> On 7/22/19 3:40 PM, Jiangli Zhou wrote: >>>>> Looks good. >>>>> >>>>> Best regards, >>>>> Jiangli >>>>> >>>>> On Mon, Jul 22, 2019 at 11:46 AM wrote: >>>>>> Summary: Increase max size for SymbolTable and fix experimental option >>>>>> range. Make experimental options trueInDebug so they're tested by the >>>>>> command line option testing >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>> >>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of >>>>>> experimental options. I didn't test with shenandoah. >>>>>> >>>>>> I will test with hs-tier1-3 before checking in.
>>>>>> >>>>>> Thanks, >>>>>> Coleen From jianglizhou at google.com Wed Jul 24 00:54:11 2019 From: jianglizhou at google.com (Jiangli Zhou) Date: Tue, 23 Jul 2019 17:54:11 -0700 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <891f2995-12c7-2d71-2c45-754babb18a16@oracle.com> References: <834c9d78-c05d-41e9-9dae-4b4e4cb43a8f@oracle.com> <00538bce-7322-e289-6716-3366cd664ae6@oracle.com> <891f2995-12c7-2d71-2c45-754babb18a16@oracle.com> Message-ID: Hi Coleen, On Tue, Jul 23, 2019 at 10:23 AM wrote: > > > > On 7/23/19 11:45 AM, Jiangli Zhou wrote: > > On Tue, Jul 23, 2019 at 8:35 AM wrote: > >> > >> > >> On 7/23/19 11:25 AM, Jiangli Zhou wrote: > >>> Hi Coleen, > >>> > >>> On Mon, Jul 22, 2019 at 12:51 PM wrote: > >>>> Thanks Jiangli, and for the suggestion to increase the max size. I was > >>>> waffling about removing the experimental option completely. > >>> Would it be worth considering making SymbolTableSize a product flag? > >>> There was a noticeable performance issue for a large application with > >>> older JDK versions. It was found to be related to slow > >>> SymbolTable::lookup due to too many collisions. Setting a large > >>> initial symbol table size worked around the performance issue and the > >>> improvement was significant. With newer JDK (> 12), user may still > >>> prefer setting a large initial size to avoid any potential overhead > >>> (avoid resizing) for large applications. > >> I think the cost of resizing is low enough that it's not worth making it > >> a product flag. I know it doesn't match StringTableSize in this way. > >> Ideally, customers should not have knowledge that these things are > >> hashtables and should not have controls for internal implementation > >> decisions. > >> > >> With resizing, both of these tables should perform acceptably out of the > >> box. 
Do we have any recent cases of customers needing to set this > >> value to a large size, that may have prompted your bug report? > > That was indeed the case, though it was with older JDK. > > Okay, that's good that it wasn't for the latest JDK. > https://bugs.openjdk.java.net/browse/JDK-8019375 This bug has the > history of why we made SymbolTableSize experimental, which still holds > today, so I want to leave it as experimental. I'm thinking in the > future we could make StringTableSize experimental to match, and someday > remove both, but not just now. Thanks for the pointer. It might be beneficial to also measure the potential resizing overhead in the future. Best regards, Jiangli > > Thanks, > Coleen > > > > > Best regards, > > Jiangli > >> Coleen > >> > >>> Best regards, > >>> Jiangli > >>> > >>>> Coleen > >>>> > >>>> On 7/22/19 3:40 PM, Jiangli Zhou wrote: > >>>>> Looks good. > >>>>> > >>>>> Best regards, > >>>>> Jiangli > >>>>> > >>>>> On Mon, Jul 22, 2019 at 11:46 AM wrote: > >>>>>> Summary: Increase max size for SymbolTable and fix experimental option > >>>>>> range. Make experimental options trueInDebug so they're tested by the > >>>>>> command line option testing > >>>>>> > >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev > >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 > >>>>>> > >>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of > >>>>>> experimental options. I didn't test with shenanodoah. > >>>>>> > >>>>>> I will test with hs-tier1-3 before checking in. 
> >>>>>> > >>>>>> Thanks, > >>>>>> Coleen > From david.holmes at oracle.com Wed Jul 24 02:20:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Jul 2019 12:20:36 +1000 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> Message-ID: <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: > On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, ??? \ >>>>>> >>>>>> I can't quite convince myself this is harmless nor necessary. >>>>> >>>>> Well if it's added, then the option range test would test the >>>>> option.? Otherwise, I think it's benign.? In debug mode, one would >>>>> no longer have to specify -XX:+UnlockExperimental options, just >>>>> like UnlockDiagnosticVMOptions.?? The option is there either way. >>>> >>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think >>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs >>>> in tests >>>> that are runnable in all build configs: 'release', 'fastdebug' and >>>> 'slowdebug'. >>>> Folks use an option in a test that requires >>>> '-XX:+UnlockDiagnosticVMOptions', >>>> but forget to include it in the test's run statement and we end up >>>> with a test failure in 'release' bits. 
>>>> >>>> I would prefer that 'UnlockExperimentalVMOptions' did not introduce >>>> the same path to failing tests. >>> >>> I tried to change UnlockDiagnosticVMOptions to be false, and got a >>> wall of opposition: >>> >>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>> >> >> I would not say "a wall of opposition". You got almost equal amounts >> of "yea" and "nay". I was a "yea" and I have been continuing to train >> my fingers (and my scripts) to do the right thing. > > You should have seen my slack channel at that time. :)? Maybe the "wall" > was primarily from a couple of people who strongly objected. >> >> Interestingly, David H was a "nay" on changing UnlockDiagnosticVMOptions >> to be 'false', but appears to be leaning toward "nay" on changing >> UnlockExperimentalVMOptions to 'trueInDebug'... >> > > I think he's mostly just asking the question.? We'll see what he answers > later. Yes I'm just asking the question. I don't think changing this buys us much other than "it's now the same as for diagnostic flags". Testing these flags can (and probably should) be handled explicitly. I looked back at the discussion on JDK-8153783 (sorry can't recall what may have been said in slack) and I'm not sure what my specific concern was then. From a testing perspective if you use an experimental or diagnostic flag then you should remember to explicitly unlock it in the test setup. Not having trueInDebug catches when you forget that and only test in a debug build. Cheers, David ----- >> >>> I think the same exact arguments should apply to >>> UnlockExperimentalVMOptions.? I'd like to hear from someone that uses >>> experimental options on ZGC or shenandoah, since those have the most >>> experimental options. >> >> I agree that the same arguments apply to UnlockExperimentalVMOptions. >> For consistency's sake if anything, they should be the same. 
>> >> >>> The reason that I made it trueInDebug is so that the command line >>> option range test would test these options.? Otherwise a more hacky >>> solution could be done, including adding the parameter >>> -XX:+UnlockExperimentalVMOptions to all the VM option range tests. >>> I'd rather not do this. >> >> Can explain this a bit more? Why would a default value of 'false' mean >> that >> the command line option range test would not test these options? > > So the command line option tests do - java -XX:+PrintFlagsRanges > -version and collect the flags that come out, parse the ranges, and then > run java with each of these flags with the limits of the range (unless > the limit is INT_MAX).? Some flags are excluded explicitly because they > cause problems. > > The reason that SymbolTableSize escaped the testing, is because it > wasn't reported with -XX:+PrintFlagsRanges.? You'd need > -XX:+UnlockExperimentalVMOptions in the java command to gather the > flags, and then pass it to all the java commands to test the ranges. > It's not that bad, just a bit gross. > > In any case, I think the experimental flags ranges should be tested. I'm > glad/amazed that more didn't fail when I turned it on in my testing. > >> >> In any case, I'm fine if you want to move forward with changing the >> default of UnlockExperimentalVMOptions to 'trueInDebug'. >> > > Okay, we'll wait to see whether I get a wall of opposition or support. I > still think it should be by default the same as UnlockDiagnosticVMoptions. > > Thanks! > Coleen > >> Dan >> >> >>> >>> Thanks, >>> Coleen >>> >>>> >>>> Dan >>>> >>>> >>>>>> >>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>> comment to: >>>>>> >>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>> >>>>>> with: >>>>>> >>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>>>> ?????????????? 
\ >>>>> >>>>> Let me see if the X macro allows that and I could also add that to >>>>> StringTableSize (which is not an experimental option). >>>>> Thanks, >>>>> Coleen >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Increase max size for SymbolTable and fix experimental >>>>>>> option range. Make experimental options trueInDebug so they're >>>>>>> tested by the command line option testing >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>> >>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot >>>>>>> of experimental options. I didn't test with shenandoah. >>>>>>> >>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>> >>>> >>> >> > From martin.doerr at sap.com Wed Jul 24 10:14:17 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 24 Jul 2019 10:14:17 +0000 Subject: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings In-Reply-To: References: Message-ID: Hi Matthias, I wouldn't say "looks good", but I think it's the right thing to do. The type casts look correct to fit to the AIX headers. libodm_aix: Good. Maybe we should open a new issue for freeing what's returned by odm_set_path and we could also remove the AIX 5 support. NetworkInterface.c: Strange, but ok. Should be reviewed by somebody else in addition. Other files: No comments. Best regards, Martin From: ppc-aix-port-dev On Behalf Of Baesken, Matthias Sent: Dienstag, 23. Juli 2019 17:15 To: 'hotspot-dev at openjdk.java.net' ; core-libs-dev at openjdk.java.net; net-dev at openjdk.java.net Cc: 'ppc-aix-port-dev at openjdk.java.net' Subject: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings Hello, please review this patch.
It fixes a couple of xlc16/xlclang warnings , especially comparison of distinct pointer types and string literal conversion warnings . When building with xlc16/xlclang, we still have a couple of warnings that have to be fixed : warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] for example : /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:18: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:29: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ warning: comparison of distinct pointer types, for example : /nightly/jdk/src/java.desktop/aix/native/libawt/porting_aix.c:50:14: warning: comparison of distinct pointer types ('void *' and 'char *') [-Wcompare-distinct-pointer-types] addr < (((char*)p->ldinfo_textorg) + p->ldinfo_textsize)) { ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228482 http://cr.openjdk.java.net/~mbaesken/webrevs/8228482.1/ Thanks, Matthias From christian.hagedorn at oracle.com Wed Jul 24 11:57:26 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 24 Jul 2019 13:57:26 +0200 Subject: [14] RFR(XS): 8071275: remove AbstractAssembler::update_delayed_values dead code Message-ID: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> Hi Please review the following enhancement: https://bugs.openjdk.java.net/browse/JDK-8071275 http://cr.openjdk.java.net/~thartmann/8071275/webrev.00/ This just removes some dead code. Thanks! 
Best regards, Christian From tobias.hartmann at oracle.com Wed Jul 24 12:04:24 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 24 Jul 2019 14:04:24 +0200 Subject: [14] RFR(XS): 8071275: remove AbstractAssembler::update_delayed_values dead code In-Reply-To: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> References: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> Message-ID: Hi Christian, looks good to me. Best regards, Tobias On 24.07.19 13:57, Christian Hagedorn wrote: > Hi > > Please review the following enhancement: > https://bugs.openjdk.java.net/browse/JDK-8071275 > http://cr.openjdk.java.net/~thartmann/8071275/webrev.00/ > > This just removes some dead code. > > Thanks! > > Best regards, > Christian From martin.doerr at sap.com Wed Jul 24 12:16:42 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 24 Jul 2019 12:16:42 +0000 Subject: [14] RFR(XS): 8071275: remove AbstractAssembler::update_delayed_values dead code In-Reply-To: References: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> Message-ID: Hi Christian, please also remove check_method_handle_type from macroAssembler_ppc.hpp: diff -r 4c1fc3947383 src/hotspot/cpu/ppc/macroAssembler_ppc.hpp --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:06:44 2019 +0200 +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:13:27 2019 +0200 @@ -565,8 +565,6 @@ Label* L_slow_path = NULL); // Method handle support (JSR 292). - void check_method_handle_type(Register mtype_reg, Register mh_reg, Register temp_reg, Label& wrong_method_type); - RegisterOrConstant argument_offset(RegisterOrConstant arg_slot, Register temp_reg, int extra_slot_offset = 0); // Biased locking support Looks good otherwise. I don't need to see another webrev for that. Thanks for cleaning this up. Best regards, Martin > -----Original Message----- > From: hotspot-dev On Behalf Of > Tobias Hartmann > Sent: Mittwoch, 24. 
Juli 2019 14:04 > To: Christian Hagedorn ; hotspot- > dev at openjdk.java.net > Subject: Re: [14] RFR(XS): 8071275: remove > AbstractAssembler::update_delayed_values dead code > > Hi Christian, > > looks good to me. > > Best regards, > Tobias > > On 24.07.19 13:57, Christian Hagedorn wrote: > > Hi > > > > Please review the following enhancement: > > https://bugs.openjdk.java.net/browse/JDK-8071275 > > http://cr.openjdk.java.net/~thartmann/8071275/webrev.00/ > > > > This just removes some dead code. > > > > Thanks! > > > > Best regards, > > Christian From coleen.phillimore at oracle.com Wed Jul 24 13:04:20 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 24 Jul 2019 09:04:20 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> Message-ID: <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> On 7/23/19 10:20 PM, David Holmes wrote: > On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, >>>>>>> ??? \ >>>>>>> >>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>> >>>>>> Well if it's added, then the option range test would test the >>>>>> option.? Otherwise, I think it's benign.? 
In debug mode, one >>>>>> would no longer have to specify -XX:+UnlockExperimental options, >>>>>> just like UnlockDiagnosticVMOptions.?? The option is there either >>>>>> way. >>>>> >>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks >>>>> think >>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>> bugs in tests >>>>> that are runnable in all build configs: 'release', 'fastdebug' and >>>>> 'slowdebug'. >>>>> Folks use an option in a test that requires >>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>> but forget to include it in the test's run statement and we end up >>>>> with a test failure in 'release' bits. >>>>> >>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>> introduce the same path to failing tests. >>>> >>>> I tried to change UnlockDiagnosticVMOptions to be false, and got a >>>> wall of opposition: >>>> >>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>> >>> >>> I would not say "a wall of opposition". You got almost equal amounts >>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>> my fingers (and my scripts) to do the right thing. >> >> You should have seen my slack channel at that time. :)? Maybe the >> "wall" was primarily from a couple of people who strongly objected. >>> >>> Interestingly, David H was a "nay" on changing >>> UnlockDiagnosticVMOptions >>> to be 'false', but appears to be leaning toward "nay" on changing >>> UnlockExperimentalVMOptions to 'trueInDebug'... >>> >> >> I think he's mostly just asking the question.? We'll see what he >> answers later. > > Yes I'm just asking the question. I don't think changing this buys us > much other than "it's now the same as for diagnostic flags". Testing > these flags can (and probably should) be handled explicitly. I disagree.? 
I don't think we should test these flags explicitly when we have a perfectly good test for all the flags, that should be enabled. Which is what my change does. > > I looked back at the discussion on JDK-8153783 (sorry can't recall > what may have been said in slack) and I'm not sure what my specific > concern was then. From a testing perspective if you use an > experimental or diagnostic flag then you should remember to explicitly > unlock it in the test setup. Not having trueInDebug catches when you > forget that and only test in a debug build. Yes, that was the rationale for making it 'false' rather than 'trueInDebug'. People were adding tests with a diagnostic option and it was failing in product mode because the Unlock flag wasn't present. The more vocal side of the question didn't want to have to add the Unlock flag for all their day-to-day local testing. I assume the same argument can be made for the experimental options. It would be good to hear the opinion from someone who uses these options. This has degenerated into an opinion question, and besides being able to cleanly test these options, neither one of us uses or tests experimental options as far as I can tell. I see tests from the Compiler and GC components. What do other people think? Thanks, Coleen > > Cheers, > David > ----- > >>> >>>>> I think the same exact arguments should apply to >>>>> UnlockExperimentalVMOptions. I'd like to hear from someone that >>>>> uses experimental options on ZGC or Shenandoah, since those have >>>>> the most experimental options. >>>> >>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>> For consistency's sake if anything, they should be the same. >>>> >>>> >>>>> The reason that I made it trueInDebug is so that the command line >>>>> option range test would test these options. Otherwise a more hacky >>>>> solution could be done, including adding the parameter >>>>> -XX:+UnlockExperimentalVMOptions to all the VM option range tests.
>>>> I'd rather not do this. >>> >>> Can explain this a bit more? Why would a default value of 'false' >>> mean that >>> the command line option range test would not test these options? >> >> So the command line option tests do - java -XX:+PrintFlagsRanges >> -version and collect the flags that come out, parse the ranges, and >> then run java with each of these flags with the limits of the range >> (unless the limit is INT_MAX).? Some flags are excluded explicitly >> because they cause problems. >> >> The reason that SymbolTableSize escaped the testing, is because it >> wasn't reported with -XX:+PrintFlagsRanges.? You'd need >> -XX:+UnlockExperimentalVMOptions in the java command to gather the >> flags, and then pass it to all the java commands to test the ranges. >> It's not that bad, just a bit gross. >> >> In any case, I think the experimental flags ranges should be tested. >> I'm glad/amazed that more didn't fail when I turned it on in my testing. >> >>> >>> In any case, I'm fine if you want to move forward with changing the >>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>> >> >> Okay, we'll wait to see whether I get a wall of opposition or >> support. I still think it should be by default the same as >> UnlockDiagnosticVMoptions. >> >> Thanks! >> Coleen >> >>> Dan >>> >>> >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> >>>>> Dan >>>>> >>>>> >>>>>>> >>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>> comment to: >>>>>>> >>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>> >>>>>>> with: >>>>>>> >>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>>>>> ?????????????? \ >>>>>> >>>>>> Let me see if the X macro allows that and I could also add that >>>>>> to StringTableSize (which is not experimental option). 
>>>>>> Thanks, >>>>>> Coleen >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: Increase max size for SymbolTable and fix experimental >>>>>>>> option range.? Make experimental options trueInDebug so they're >>>>>>>> tested by the command line option testing >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>> >>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot >>>>>>>> of experimental options.? I didn't test with shenanodoah. >>>>>>>> >>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>> >>>>> >>>> >>> >> From daniel.daugherty at oracle.com Wed Jul 24 13:07:33 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 24 Jul 2019 09:07:33 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> Message-ID: On 7/24/19 9:04 AM, coleen.phillimore at oracle.com wrote: > > > On 7/23/19 10:20 PM, David Holmes wrote: >> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>> +? 
experimental(bool, UnlockExperimentalVMOptions, trueInDebug, >>>>>>>> ??? \ >>>>>>>> >>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>> >>>>>>> Well if it's added, then the option range test would test the >>>>>>> option.? Otherwise, I think it's benign.? In debug mode, one >>>>>>> would no longer have to specify -XX:+UnlockExperimental options, >>>>>>> just like UnlockDiagnosticVMOptions.?? The option is there >>>>>>> either way. >>>>>> >>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks >>>>>> think >>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>> bugs in tests >>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>> and 'slowdebug'. >>>>>> Folks use an option in a test that requires >>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>> but forget to include it in the test's run statement and we end >>>>>> up with a test failure in 'release' bits. >>>>>> >>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>> introduce the same path to failing tests. >>>>> >>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got a >>>>> wall of opposition: >>>>> >>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>> >>>> >>>> I would not say "a wall of opposition". You got almost equal amounts >>>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>>> my fingers (and my scripts) to do the right thing. >>> >>> You should have seen my slack channel at that time. :)? Maybe the >>> "wall" was primarily from a couple of people who strongly objected. >>>> >>>> Interestingly, David H was a "nay" on changing >>>> UnlockDiagnosticVMOptions >>>> to be 'false', but appears to be leaning toward "nay" on changing >>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>> >>> >>> I think he's mostly just asking the question.? 
We'll see what he >>> answers later. >> >> Yes I'm just asking the question. I don't think changing this buys us >> much other than "it's now the same as for diagnostic flags". Testing >> these flags can (and probably should) be handled explicitly. > > I disagree.? I don't think we should test these flags explicitly when > we have a perfectly good test for all the flags, that should be > enabled.?? Which is what my change does. >> >> I looked back at the discussion on JDK-8153783 (sorry can't recall >> what may have been said in slack) and I'm not sure what my specific >> concern was then. From a testing perspective if you use an >> experimental or diagnostic flag then you should remember to >> explicitly unlock it in the test setup. Not having trueInDebug >> catches when you forget that and only test in a debug build. > > Yes, that was the rationale for making it 'false' rather than > 'trueInDebug'.? People were adding tests with a diagnostic option and > it was failing in product mode because the Unlock flag wasn't > present.? The more vocal side of the question didn't want to have to > add the Unlock flag for all their day to day local testing.?? I assume > the same argument can be made for the experimental options. > > It would be good to hear the opinion from someone who uses these > options.?? This is degenerated into an opinion question, and besides > being able to cleanly test these options, neither one of us uses or > tests experimental options as far as I can tell.? I see tests from the > Compiler and GC components.? What do other people think? I use experimental options in the various subsystems that I maintain, but as I said, I'm training my fingers and my scripts to include the various Unlock options... I think the consistency argument is a winner as is the argument that folks need to test 'release' bits in addition to 'fastdebug' bits. 
It's interesting that non-HotSpot folks typically test with 'release' bits and have a hard time seeing why HotSpot folks always test with 'fastdebug'. It seems like at least some HotSpot folks only test with 'fastdebug' and not with 'release'... Dan P.S. I suspect that I'm one of the few anal retentive folks that tests all three build configs: 'release', 'fastdebug' and 'slowdebug', but I have the advantage of having my own lab with various (somewhat fast) machines... Of course that monthly power bill is a pain since it seems that I need A/C here in Orlando... Can't get away with no A/C like in Colorado... > > Thanks, > Coleen > >> >> Cheers, >> David >> ----- >> >>>> >>>>> I think the same exact arguments should apply to >>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone that >>>>> uses experimental options on ZGC or shenandoah, since those have >>>>> the most experimental options. >>>> >>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>> For consistency's sake if anything, they should be the same. >>>> >>>> >>>>> The reason that I made it trueInDebug is so that the command line >>>>> option range test would test these options.? Otherwise a more >>>>> hacky solution could be done, including adding the parameter >>>>> -XX:+UnlockExperimentalVMOptions to all the VM option range tests. >>>>> I'd rather not do this. >>>> >>>> Can explain this a bit more? Why would a default value of 'false' >>>> mean that >>>> the command line option range test would not test these options? >>> >>> So the command line option tests do - java -XX:+PrintFlagsRanges >>> -version and collect the flags that come out, parse the ranges, and >>> then run java with each of these flags with the limits of the range >>> (unless the limit is INT_MAX).? Some flags are excluded explicitly >>> because they cause problems. >>> >>> The reason that SymbolTableSize escaped the testing, is because it >>> wasn't reported with -XX:+PrintFlagsRanges.? 
You'd need >>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>> flags, and then pass it to all the java commands to test the ranges. >>> It's not that bad, just a bit gross. >>> >>> In any case, I think the experimental flags ranges should be tested. >>> I'm glad/amazed that more didn't fail when I turned it on in my >>> testing. >>> >>>> >>>> In any case, I'm fine if you want to move forward with changing the >>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>> >>> >>> Okay, we'll wait to see whether I get a wall of opposition or >>> support. I still think it should be by default the same as >>> UnlockDiagnosticVMoptions. >>> >>> Thanks! >>> Coleen >>> >>>> Dan >>>> >>>> >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>>> >>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>> comment to: >>>>>>>> >>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>> >>>>>>>> with: >>>>>>>> >>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>>>>>> ?????????????? \ >>>>>>> >>>>>>> Let me see if the X macro allows that and I could also add that >>>>>>> to StringTableSize (which is not experimental option). >>>>>>> Thanks, >>>>>>> Coleen >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>> experimental option range.? Make experimental options >>>>>>>>> trueInDebug so they're tested by the command line option testing >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>> >>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>> lot of experimental options.? I didn't test with shenanodoah. >>>>>>>>> >>>>>>>>> I will test with hs-tier1-3 before checking in. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>> >>>>>> >>>>> >>>> >>> > From david.holmes at oracle.com Wed Jul 24 13:20:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Jul 2019 23:20:00 +1000 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> Message-ID: <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: > On 7/23/19 10:20 PM, David Holmes wrote: >> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, trueInDebug, >>>>>>>> ??? \ >>>>>>>> >>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>> >>>>>>> Well if it's added, then the option range test would test the >>>>>>> option.? Otherwise, I think it's benign.? In debug mode, one >>>>>>> would no longer have to specify -XX:+UnlockExperimental options, >>>>>>> just like UnlockDiagnosticVMOptions.?? The option is there either >>>>>>> way. 
>>>>>> >>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks >>>>>> think >>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>> bugs in tests >>>>>> that are runnable in all build configs: 'release', 'fastdebug' and >>>>>> 'slowdebug'. >>>>>> Folks use an option in a test that requires >>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>> but forget to include it in the test's run statement and we end up >>>>>> with a test failure in 'release' bits. >>>>>> >>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>> introduce the same path to failing tests. >>>>> >>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got a >>>>> wall of opposition: >>>>> >>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>> >>>> >>>> I would not say "a wall of opposition". You got almost equal amounts >>>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>>> my fingers (and my scripts) to do the right thing. >>> >>> You should have seen my slack channel at that time. :)? Maybe the >>> "wall" was primarily from a couple of people who strongly objected. >>>> >>>> Interestingly, David H was a "nay" on changing >>>> UnlockDiagnosticVMOptions >>>> to be 'false', but appears to be leaning toward "nay" on changing >>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>> >>> >>> I think he's mostly just asking the question.? We'll see what he >>> answers later. >> >> Yes I'm just asking the question. I don't think changing this buys us >> much other than "it's now the same as for diagnostic flags". Testing >> these flags can (and probably should) be handled explicitly. > > I disagree.? I don't think we should test these flags explicitly when we > have a perfectly good test for all the flags, that should be enabled. > Which is what my change does. 
Your change only causes the experimental flags to be tested in debug builds. I would argue they should also be tested in product builds, hence the need to be explicit about it. David ----- >> >> I looked back at the discussion on JDK-8153783 (sorry can't recall >> what may have been said in slack) and I'm not sure what my specific >> concern was then. From a testing perspective if you use an >> experimental or diagnostic flag then you should remember to explicitly >> unlock it in the test setup. Not having trueInDebug catches when you >> forget that and only test in a debug build. > > Yes, that was the rationale for making it 'false' rather than > 'trueInDebug'.? People were adding tests with a diagnostic option and it > was failing in product mode because the Unlock flag wasn't present.? The > more vocal side of the question didn't want to have to add the Unlock > flag for all their day to day local testing.?? I assume the same > argument can be made for the experimental options. > > It would be good to hear the opinion from someone who uses these > options.?? This is degenerated into an opinion question, and besides > being able to cleanly test these options, neither one of us uses or > tests experimental options as far as I can tell.? I see tests from the > Compiler and GC components.? What do other people think? > > Thanks, > Coleen > >> >> Cheers, >> David >> ----- >> >>>> >>>>> I think the same exact arguments should apply to >>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone that >>>>> uses experimental options on ZGC or shenandoah, since those have >>>>> the most experimental options. >>>> >>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>> For consistency's sake if anything, they should be the same. >>>> >>>> >>>>> The reason that I made it trueInDebug is so that the command line >>>>> option range test would test these options.? 
Otherwise a more hacky >>>>> solution could be done, including adding the parameter >>>>> -XX:+UnlockExperimentalVMOptions to all the VM option range tests. >>>>> I'd rather not do this. >>>> >>>> Can explain this a bit more? Why would a default value of 'false' >>>> mean that >>>> the command line option range test would not test these options? >>> >>> So the command line option tests do - java -XX:+PrintFlagsRanges >>> -version and collect the flags that come out, parse the ranges, and >>> then run java with each of these flags with the limits of the range >>> (unless the limit is INT_MAX).? Some flags are excluded explicitly >>> because they cause problems. >>> >>> The reason that SymbolTableSize escaped the testing, is because it >>> wasn't reported with -XX:+PrintFlagsRanges.? You'd need >>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>> flags, and then pass it to all the java commands to test the ranges. >>> It's not that bad, just a bit gross. >>> >>> In any case, I think the experimental flags ranges should be tested. >>> I'm glad/amazed that more didn't fail when I turned it on in my testing. >>> >>>> >>>> In any case, I'm fine if you want to move forward with changing the >>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>> >>> >>> Okay, we'll wait to see whether I get a wall of opposition or >>> support. I still think it should be by default the same as >>> UnlockDiagnosticVMoptions. >>> >>> Thanks! >>> Coleen >>> >>>> Dan >>>> >>>> >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>>> >>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>> comment to: >>>>>>>> >>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>> >>>>>>>> with: >>>>>>>> >>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 */) >>>>>>>> ?????????????? 
\ >>>>>>> >>>>>>> Let me see if the X macro allows that and I could also add that >>>>>>> to StringTableSize (which is not experimental option). >>>>>>> Thanks, >>>>>>> Coleen >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Increase max size for SymbolTable and fix experimental >>>>>>>>> option range.? Make experimental options trueInDebug so they're >>>>>>>>> tested by the command line option testing >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>> >>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot >>>>>>>>> of experimental options.? I didn't test with shenanodoah. >>>>>>>>> >>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>> >>>>>> >>>>> >>>> >>> > From christian.hagedorn at oracle.com Wed Jul 24 13:29:35 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Wed, 24 Jul 2019 15:29:35 +0200 Subject: [14] RFR(XS): 8071275: remove AbstractAssembler::update_delayed_values dead code In-Reply-To: References: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> Message-ID: <6689ef18-fb51-ffbc-3847-6896c8b17f2e@oracle.com> Hi Martin, hi Tobias Thanks for the reviews, I missed that. I updated the webrev in place. Best regards, Christian On 24.07.19 14:16, Doerr, Martin wrote: > Hi Christian, > > please also remove check_method_handle_type from macroAssembler_ppc.hpp: > diff -r 4c1fc3947383 src/hotspot/cpu/ppc/macroAssembler_ppc.hpp > --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:06:44 2019 +0200 > +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:13:27 2019 +0200 > @@ -565,8 +565,6 @@ > Label* L_slow_path = NULL); > > // Method handle support (JSR 292). 
> - void check_method_handle_type(Register mtype_reg, Register mh_reg, Register temp_reg, Label& wrong_method_type); > - > RegisterOrConstant argument_offset(RegisterOrConstant arg_slot, Register temp_reg, int extra_slot_offset = 0); > > // Biased locking support > > Looks good otherwise. I don't need to see another webrev for that. Thanks for cleaning this up. > > Best regards, > Martin > > >> -----Original Message----- >> From: hotspot-dev On Behalf Of >> Tobias Hartmann >> Sent: Mittwoch, 24. Juli 2019 14:04 >> To: Christian Hagedorn ; hotspot- >> dev at openjdk.java.net >> Subject: Re: [14] RFR(XS): 8071275: remove >> AbstractAssembler::update_delayed_values dead code >> >> Hi Christian, >> >> looks good to me. >> >> Best regards, >> Tobias >> >> On 24.07.19 13:57, Christian Hagedorn wrote: >>> Hi >>> >>> Please review the following enhancement: >>> https://bugs.openjdk.java.net/browse/JDK-8071275 >>> http://cr.openjdk.java.net/~thartmann/8071275/webrev.00/ >>> >>> This just removes some dead code. >>> >>> Thanks! >>> >>> Best regards, >>> Christian From coleen.phillimore at oracle.com Wed Jul 24 13:47:03 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 24 Jul 2019 09:47:03 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> Message-ID: <3bbf5c0b-7e20-f471-2c5e-15777e553499@oracle.com> On 7/24/19 9:07 AM, Daniel D. Daugherty wrote: > On 7/24/19 9:04 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/23/19 10:20 PM, David Holmes wrote: >>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>> On 7/23/19 11:30 AM, Daniel D. 
Daugherty wrote: >>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>> trueInDebug, ??? \ >>>>>>>>> >>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>> >>>>>>>> Well if it's added, then the option range test would test the >>>>>>>> option.? Otherwise, I think it's benign. In debug mode, one >>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option is >>>>>>>> there either way. >>>>>>> >>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>> folks think >>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>> bugs in tests >>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>> and 'slowdebug'. >>>>>>> Folks use an option in a test that requires >>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>> but forget to include it in the test's run statement and we end >>>>>>> up with a test failure in 'release' bits. >>>>>>> >>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>> introduce the same path to failing tests. >>>>>> >>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got >>>>>> a wall of opposition: >>>>>> >>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>> >>>>> >>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>>>> my fingers (and my scripts) to do the right thing. 
>>>> >>>> You should have seen my slack channel at that time. :) Maybe the >>>> "wall" was primarily from a couple of people who strongly objected. >>>>> >>>>> Interestingly, David H was a "nay" on changing >>>>> UnlockDiagnosticVMOptions >>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>> >>>> >>>> I think he's mostly just asking the question. We'll see what he >>>> answers later. >>> >>> Yes, I'm just asking the question. I don't think changing this buys >>> us much other than "it's now the same as for diagnostic flags". >>> Testing these flags can (and probably should) be handled explicitly. >> >> I disagree. I don't think we should test these flags explicitly when >> we have a perfectly good test for all the flags, that should be >> enabled, which is what my change does. >>> >>> I looked back at the discussion on JDK-8153783 (sorry, can't recall >>> what may have been said in slack) and I'm not sure what my specific >>> concern was then. From a testing perspective, if you use an >>> experimental or diagnostic flag then you should remember to >>> explicitly unlock it in the test setup. Not having trueInDebug >>> catches when you forget that and only test in a debug build. >> >> Yes, that was the rationale for making it 'false' rather than >> 'trueInDebug'. People were adding tests with a diagnostic option and >> it was failing in product mode because the Unlock flag wasn't >> present. The more vocal side of the question didn't want to have to >> add the Unlock flag for all their day-to-day local testing. I >> assume the same argument can be made for the experimental options. >> >> It would be good to hear the opinion from someone who uses these >> options. This has degenerated into an opinion question, and besides >> being able to cleanly test these options, neither one of us uses or >> tests experimental options as far as I can tell.
I see tests from >> the Compiler and GC components. What do other people think? > I use experimental options in the various subsystems that I maintain, but > as I said, I'm training my fingers and my scripts to include the various > Unlock options... > > I think the consistency argument is a winner, as is the argument that > folks > need to test 'release' bits in addition to 'fastdebug' bits. It's > interesting > that non-HotSpot folks typically test with 'release' bits and have a hard > time seeing why HotSpot folks always test with 'fastdebug'. It seems > like at > least some HotSpot folks only test with 'fastdebug' and not with > 'release'... fastdebug has the assertions, so it seems the most profitable to test with. For most changes, testing with product/release yields no more information, unless you get lucky with a race. It's a low percentage. Don't misunderstand me though: I think product testing needs to be done, but not for individual changes unless you are testing racy code. Which consistency argument? That it should be 'false' rather than 'trueInDebug'. Ok, so 2 to 1 in votes. Coleen > > Dan > > P.S. > I suspect that I'm one of the few anal-retentive folks that tests all > three > build configs: 'release', 'fastdebug' and 'slowdebug', but I have the > advantage > of having my own lab with various (somewhat fast) machines... Of > course that > monthly power bill is a pain since it seems that I need A/C here in > Orlando... > Can't get away with no A/C like in Colorado... > > >> >> Thanks, >> Coleen >> >>> >>> Cheers, >>> David >>> ----- >>> >>>>> >>>>>> I think the same exact arguments should apply to >>>>>> UnlockExperimentalVMOptions. I'd like to hear from someone that >>>>>> uses experimental options on ZGC or Shenandoah, since those have >>>>>> the most experimental options. >>>>> >>>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>>> For consistency's sake if anything, they should be the same.
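The option-range test referred to throughout this thread boils down to parsing the output of `java -XX:+PrintFlagsRanges -version` and then re-running the VM with each flag pinned to the limits of its range. A rough Python sketch of the collection step (illustrative only: the sample output lines and their column layout are assumptions, not copied from a real JVM run, and this is not the actual jtreg test):

```python
import re

# Illustrative sample of -XX:+PrintFlagsRanges output.  The exact layout
# a real JVM prints is an assumption here, made up for demonstration.
SAMPLE_OUTPUT = """\
intx CICompilerCount [ 1 ... 1024 ] {product}
size_t SymbolTableSize [ 125 ... 16777216 ] {experimental}
"""

# Matches "<type> <name> [ <min> ... <max> ]" at the start of a line.
FLAG_RE = re.compile(r"^\s*\S+\s+(\w+)\s+\[\s*(-?\d+)\s*\.\.\.\s*(-?\d+)\s*\]")

def parse_flag_ranges(text):
    """Return {flag_name: (min, max)} for each flag that printed a range."""
    ranges = {}
    for line in text.splitlines():
        match = FLAG_RE.match(line)
        if match:
            name, lo, hi = match.groups()
            ranges[name] = (int(lo), int(hi))
    return ranges

if __name__ == "__main__":
    # Each (min, max) pair would then be fed back as -XX:<name>=<min> and
    # -XX:<name>=<max> to exercise the limits; experimental flags only show
    # up at all if -XX:+UnlockExperimentalVMOptions was also given, which is
    # exactly how SymbolTableSize escaped the testing.
    print(parse_flag_ranges(SAMPLE_OUTPUT))
```

The 16777216 (2^24) upper bound matches the SymbolTableSize maximum discussed in this thread; the lower bound of 125 is a made-up stand-in for minimumSymbolTableSize.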
>>>>> >>>>> >>>>>> The reason that I made it trueInDebug is so that the command line >>>>>> option range test would test these options. Otherwise a more >>>>>> hacky solution could be done, including adding the parameter >>>>>> -XX:+UnlockExperimentalVMOptions to all the VM option range >>>>>> tests. I'd rather not do this. >>>>> >>>>> Can you explain this a bit more? Why would a default value of 'false' >>>>> mean that >>>>> the command line option range test would not test these options? >>>> >>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>> -version and collect the flags that come out, parse the ranges, and >>>> then run java with each of these flags with the limits of the range >>>> (unless the limit is INT_MAX). Some flags are excluded explicitly >>>> because they cause problems. >>>> >>>> The reason that SymbolTableSize escaped the testing is because it >>>> wasn't reported with -XX:+PrintFlagsRanges. You'd need >>>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>>> flags, and then pass it to all the java commands to test the >>>> ranges. It's not that bad, just a bit gross. >>>> >>>> In any case, I think the experimental flag ranges should be >>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>> in my testing. >>>> >>>>> >>>>> In any case, I'm fine if you want to move forward with changing the >>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>> >>>> >>>> Okay, we'll wait to see whether I get a wall of opposition or >>>> support. I still think it should be by default the same as >>>> UnlockDiagnosticVMOptions. >>>> >>>> Thanks! >>>> Coleen >>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>>> comment to: >>>>>>>>> >>>>>>>>> +          range(minimumSymbolTableSize, 16777216ul)
\ >>>>>>>>> >>>>>>>>> with: >>>>>>>>> >>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>> */) ?????????????? \ >>>>>>>> >>>>>>>> Let me see if the X macro allows that and I could also add that >>>>>>>> to StringTableSize (which is not experimental option). >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>> experimental option range.? Make experimental options >>>>>>>>>> trueInDebug so they're tested by the command line option testing >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>> >>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>> lot of experimental options.? I didn't test with shenanodoah. >>>>>>>>>> >>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> > From daniel.daugherty at oracle.com Wed Jul 24 13:52:11 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 24 Jul 2019 09:52:11 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <3bbf5c0b-7e20-f471-2c5e-15777e553499@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <3bbf5c0b-7e20-f471-2c5e-15777e553499@oracle.com> Message-ID: On 7/24/19 9:47 AM, coleen.phillimore at oracle.com wrote: > > > On 7/24/19 9:07 AM, Daniel D. 
Daugherty wrote: >> On 7/24/19 9:04 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/23/19 10:20 PM, David Holmes wrote: >>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>>> trueInDebug, ??? \ >>>>>>>>>> >>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>> >>>>>>>>> Well if it's added, then the option range test would test the >>>>>>>>> option.? Otherwise, I think it's benign. In debug mode, one >>>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option is >>>>>>>>> there either way. >>>>>>>> >>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>>> folks think >>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>>> bugs in tests >>>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>>> and 'slowdebug'. >>>>>>>> Folks use an option in a test that requires >>>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>>> but forget to include it in the test's run statement and we end >>>>>>>> up with a test failure in 'release' bits. >>>>>>>> >>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>>> introduce the same path to failing tests. 
>>>>>>> >>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got >>>>>>> a wall of opposition: >>>>>>> >>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>> >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>> >>>>>> >>>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to >>>>>> train >>>>>> my fingers (and my scripts) to do the right thing. >>>>> >>>>> You should have seen my slack channel at that time. :) Maybe the >>>>> "wall" was primarily from a couple of people who strongly objected. >>>>>> >>>>>> Interestingly, David H was a "nay" on changing >>>>>> UnlockDiagnosticVMOptions >>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>>> >>>>> >>>>> I think he's mostly just asking the question.? We'll see what he >>>>> answers later. >>>> >>>> Yes I'm just asking the question. I don't think changing this buys >>>> us much other than "it's now the same as for diagnostic flags". >>>> Testing these flags can (and probably should) be handled explicitly. >>> >>> I disagree.? I don't think we should test these flags explicitly >>> when we have a perfectly good test for all the flags, that should be >>> enabled.?? Which is what my change does. >>>> >>>> I looked back at the discussion on JDK-8153783 (sorry can't recall >>>> what may have been said in slack) and I'm not sure what my specific >>>> concern was then. From a testing perspective if you use an >>>> experimental or diagnostic flag then you should remember to >>>> explicitly unlock it in the test setup. Not having trueInDebug >>>> catches when you forget that and only test in a debug build. >>> >>> Yes, that was the rationale for making it 'false' rather than >>> 'trueInDebug'.? 
People were adding tests with a diagnostic option >>> and it was failing in product mode because the Unlock flag wasn't >>> present.? The more vocal side of the question didn't want to have to >>> add the Unlock flag for all their day to day local testing.?? I >>> assume the same argument can be made for the experimental options. >>> >>> It would be good to hear the opinion from someone who uses these >>> options.?? This is degenerated into an opinion question, and besides >>> being able to cleanly test these options, neither one of us uses or >>> tests experimental options as far as I can tell.? I see tests from >>> the Compiler and GC components.? What do other people think? >> >> I use experimental options in the various subsystems that I maintain, >> but >> as I said, I'm training my fingers and my scripts to include the various >> Unlock options... >> >> I think the consistency argument is a winner as is the argument that >> folks >> need to test 'release' bits in addition to 'fastdebug' bits. It's >> interesting >> that non-HotSpot folks typically test with 'release' bits and have a >> hard >> time seeing why HotSpot folks always test with 'fastdebug'. It seems >> like at >> least some HotSpot folks only test with 'fastdebug' and not with >> 'release'... > > fastdebug has the assertions so seems most profitable to test with. > For most changes, testing with product/release yields no more > information, unless you get lucky with a race.? It's low percentage.?? > Don't misunderstand me though, I think product testing needs to be > done but not for individual changes unless you are testing racy code. > > Which consistency argument?? That it should be 'false' rather than > 'trueInDebug'. UnlockDiagnosticVMOptions is 'trueInDebug' and that's been settled before so UnlockExperimentalVMOptions should also be 'trueInDebug'. Dan > > Ok, so 2 to 1 in votes. > > Coleen >> >> Dan >> >> P.S. 
>> I suspect that I'm one of the few anal retentive folks that tests all >> three >> build configs: 'release', 'fastdebug' and 'slowdebug', but I have the >> advantage >> of having my own lab with various (somewhat fast) machines... Of >> course that >> monthly power bill is a pain since it seems that I need A/C here in >> Orlando... >> Can't get away with no A/C like in Colorado... >> >> >>> >>> Thanks, >>> Coleen >>> >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>>> >>>>>>> I think the same exact arguments should apply to >>>>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone that >>>>>>> uses experimental options on ZGC or shenandoah, since those have >>>>>>> the most experimental options. >>>>>> >>>>>> I agree that the same arguments apply to >>>>>> UnlockExperimentalVMOptions. >>>>>> For consistency's sake if anything, they should be the same. >>>>>> >>>>>> >>>>>>> The reason that I made it trueInDebug is so that the command >>>>>>> line option range test would test these options.? Otherwise a >>>>>>> more hacky solution could be done, including adding the >>>>>>> parameter -XX:+UnlockExperimentalVMOptions to all the VM option >>>>>>> range tests. I'd rather not do this. >>>>>> >>>>>> Can explain this a bit more? Why would a default value of 'false' >>>>>> mean that >>>>>> the command line option range test would not test these options? >>>>> >>>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>>> -version and collect the flags that come out, parse the ranges, >>>>> and then run java with each of these flags with the limits of the >>>>> range (unless the limit is INT_MAX).? Some flags are excluded >>>>> explicitly because they cause problems. >>>>> >>>>> The reason that SymbolTableSize escaped the testing, is because it >>>>> wasn't reported with -XX:+PrintFlagsRanges. 
You'd need >>>>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>>>> flags, and then pass it to all the java commands to test the >>>>> ranges. It's not that bad, just a bit gross. >>>>> >>>>> In any case, I think the experimental flags ranges should be >>>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>>> in my testing. >>>>> >>>>>> >>>>>> In any case, I'm fine if you want to move forward with changing the >>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>> >>>>> >>>>> Okay, we'll wait to see whether I get a wall of opposition or >>>>> support. I still think it should be by default the same as >>>>> UnlockDiagnosticVMoptions. >>>>> >>>>> Thanks! >>>>> Coleen >>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>>>> comment to: >>>>>>>>>> >>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>>>> >>>>>>>>>> with: >>>>>>>>>> >>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>>> */) ?????????????? \ >>>>>>>>> >>>>>>>>> Let me see if the X macro allows that and I could also add >>>>>>>>> that to StringTableSize (which is not experimental option). >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>>> experimental option range.? 
Make experimental options >>>>>>>>>>> trueInDebug so they're tested by the command line option >>>>>>>>>>> testing >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>> >>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>>> lot of experimental options. I didn't test with shenanodoah. >>>>>>>>>>> >>>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> >> > From coleen.phillimore at oracle.com Wed Jul 24 13:52:45 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 24 Jul 2019 09:52:45 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> Message-ID: <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> On 7/24/19 9:20 AM, David Holmes wrote: > On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >> On 7/23/19 10:20 PM, David Holmes wrote: >>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>> +? 
experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>> trueInDebug, ??? \ >>>>>>>>> >>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>> >>>>>>>> Well if it's added, then the option range test would test the >>>>>>>> option.? Otherwise, I think it's benign. In debug mode, one >>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option is >>>>>>>> there either way. >>>>>>> >>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>> folks think >>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>> bugs in tests >>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>> and 'slowdebug'. >>>>>>> Folks use an option in a test that requires >>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>> but forget to include it in the test's run statement and we end >>>>>>> up with a test failure in 'release' bits. >>>>>>> >>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>> introduce the same path to failing tests. >>>>>> >>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got >>>>>> a wall of opposition: >>>>>> >>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>> >>>>> >>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>>>> my fingers (and my scripts) to do the right thing. >>>> >>>> You should have seen my slack channel at that time. :) Maybe the >>>> "wall" was primarily from a couple of people who strongly objected. >>>>> >>>>> Interestingly, David H was a "nay" on changing >>>>> UnlockDiagnosticVMOptions >>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>> UnlockExperimentalVMOptions to 'trueInDebug'... 
>>>>> >>>> >>>> I think he's mostly just asking the question.? We'll see what he >>>> answers later. >>> >>> Yes I'm just asking the question. I don't think changing this buys >>> us much other than "it's now the same as for diagnostic flags". >>> Testing these flags can (and probably should) be handled explicitly. >> >> I disagree.? I don't think we should test these flags explicitly when >> we have a perfectly good test for all the flags, that should be >> enabled. Which is what my change does. > > Your change only causes the experimental flags to be tested in debug > builds. I would argue they should also be tested in product builds, > hence the need to be explicit about it. The same is true for diagnostic options.? I'd be surprised if testing in release made a difference though, except taking more time. Coleen > > David > ----- > >>> >>> I looked back at the discussion on JDK-8153783 (sorry can't recall >>> what may have been said in slack) and I'm not sure what my specific >>> concern was then. From a testing perspective if you use an >>> experimental or diagnostic flag then you should remember to >>> explicitly unlock it in the test setup. Not having trueInDebug >>> catches when you forget that and only test in a debug build. >> >> Yes, that was the rationale for making it 'false' rather than >> 'trueInDebug'.? People were adding tests with a diagnostic option and >> it was failing in product mode because the Unlock flag wasn't >> present.? The more vocal side of the question didn't want to have to >> add the Unlock flag for all their day to day local testing.?? I >> assume the same argument can be made for the experimental options. >> >> It would be good to hear the opinion from someone who uses these >> options.?? This is degenerated into an opinion question, and besides >> being able to cleanly test these options, neither one of us uses or >> tests experimental options as far as I can tell.? I see tests from >> the Compiler and GC components.? 
What do other people think? >> >> Thanks, >> Coleen >> >>> >>> Cheers, >>> David >>> ----- >>> >>>>> >>>>>> I think the same exact arguments should apply to >>>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone that >>>>>> uses experimental options on ZGC or shenandoah, since those have >>>>>> the most experimental options. >>>>> >>>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>>> For consistency's sake if anything, they should be the same. >>>>> >>>>> >>>>>> The reason that I made it trueInDebug is so that the command line >>>>>> option range test would test these options.? Otherwise a more >>>>>> hacky solution could be done, including adding the parameter >>>>>> -XX:+UnlockExperimentalVMOptions to all the VM option range >>>>>> tests. I'd rather not do this. >>>>> >>>>> Can explain this a bit more? Why would a default value of 'false' >>>>> mean that >>>>> the command line option range test would not test these options? >>>> >>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>> -version and collect the flags that come out, parse the ranges, and >>>> then run java with each of these flags with the limits of the range >>>> (unless the limit is INT_MAX).? Some flags are excluded explicitly >>>> because they cause problems. >>>> >>>> The reason that SymbolTableSize escaped the testing, is because it >>>> wasn't reported with -XX:+PrintFlagsRanges. You'd need >>>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>>> flags, and then pass it to all the java commands to test the >>>> ranges. It's not that bad, just a bit gross. >>>> >>>> In any case, I think the experimental flags ranges should be >>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>> in my testing. >>>> >>>>> >>>>> In any case, I'm fine if you want to move forward with changing the >>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. 
>>>>> >>>> >>>> Okay, we'll wait to see whether I get a wall of opposition or >>>> support. I still think it should be by default the same as >>>> UnlockDiagnosticVMoptions. >>>> >>>> Thanks! >>>> Coleen >>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>>> comment to: >>>>>>>>> >>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>>> >>>>>>>>> with: >>>>>>>>> >>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>> */) ?????????????? \ >>>>>>>> >>>>>>>> Let me see if the X macro allows that and I could also add that >>>>>>>> to StringTableSize (which is not experimental option). >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>> experimental option range.? Make experimental options >>>>>>>>>> trueInDebug so they're tested by the command line option testing >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>> >>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>> lot of experimental options.? I didn't test with shenanodoah. >>>>>>>>>> >>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> From shade at redhat.com Wed Jul 24 14:40:41 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 24 Jul 2019 16:40:41 +0200 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8228400 There is a lot of code in AArch64 port that hooks up to the built-in simulator. 
That simulator was used to bootstrap/develop the port when hardware was not available. This simulator is not needed now, and we should remove it to unclutter the code. Removal webrev: https://cr.openjdk.java.net/~shade/8228400/webrev.02/ The only thing that feels risky for me is removal of call_VM_leaf_base1 in templateTable_aarch64.cpp, please take a thorough look. I am planning to backport it to 11u and 8u-aarch64 too. Testing: linux-aarch64-fastdebug tier1, tier2; linux-x86_64-fastdebug builds; jdk-submit (running) -- Thanks, -Aleksey From adam.farley at uk.ibm.com Wed Jul 24 17:57:47 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 24 Jul 2019 18:57:47 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> References: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> Message-ID: Hi David, Welcome back. :) David Holmes wrote on 22/07/2019 03:34:37: > From: David Holmes > To: Adam Farley8 , hotspot- > dev at openjdk.java.net, serviceability-dev > Date: 22/07/2019 03:34 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > Some higher-level issues/concerns ... > > On 22/07/2019 11:25 am, David Holmes wrote: > > Hi Adam, > > > > Adding in serviceability-dev as you've made changes in that area too. > > > > Will take a closer look at the changes soon. > > > > David > > ----- > > > > On 18/07/2019 2:05 am, Adam Farley8 wrote: > >> Hey All, > >> > >> Reviewers and sponsors requested to inspect the following. > >> > >> I've re-written the code change, as discussed with David Holmes in emails > >> last week, and now the webrev changes do this: > >> > >> - Cause the VM to shut down with a relevant error message if one or more > >> of the sun.boot.library.path paths is too long for the system. > > I'm not seeing that implemented at the moment. 
Nor am I clear that such > an error will always be detected during VM initialization. The code > paths look fairly general purpose, but perhaps that is an illusion and > we will always check this during initialization? (also see discussion at > end) This is implemented in the ".1" webrev, though I did comment out a necessary line to attempt to test the linker_md changes. I've removed the "//" and re-uploaded. It's the added line in the os.cpp file that begins "vm_exit_during_initialization". You're correct in that this code would only be triggered if we're loading a library, though I'm not sure that's a problem. We seem to load a couple of libraries every time we run even the most minimalist of classes, and if we somehow manage not to load any libraries *at all*, the contents of a library path property seem irrelevant. > > >> - Apply similar error-producing code to the (legacy?) code in >> linker_md.c. > I think the JDWP changes need to be split off and handled under their > own issue. It's a similar issue but not directly related. Also the > change to sys.h raises the need for a CSR request as it seems to be > exported for external use - though I can't find any existing test code > that includes it, or which uses the affected code (which is another > reason to split this off and let serviceability folk consider it). A reasonable suggestion. Thanks for the tip about sys.h. Seemed cleaner to change sys.h, but this change isn't worth a CSR. The jdwp changes were removed from the new ".2" webrev. http://cr.openjdk.java.net/~afarley/8227021.2/webrev > > >> - Allow the numerical parameter for split_path to indicate anything we >> plan to add to the path once split, allowing for more accurate path >> length detection. > > This is a bit icky, but I understand your desire to be more accurate with > the checking - as otherwise you would still need to keep overflow checks > in other places once the full path+name is assembled. But then such
But then such > checks must be missing in places now ?? Correct, to my understanding. Likely more a problem on Windows than Linux. > > I'm not clear why you have implemented the path check the way you > instead of simply augmenting the existing code ie. where we have: > > 1347 // do the actual splitting > 1348 p = inpath; > 1349 for (int i = 0 ; i < count ; i++) { > 1350 size_t len = strcspn(p, os::path_separator()); > 1351 if (len > JVM_MAXPATHLEN) { > 1352 return NULL; > 1353 } > > why not just change the calculation at line 1351 to include the prefix > length, and then report the error rather than return NULL? You're right. The code was originally changed to enable the "skip too-long paths" logic, and then when we went to a "fail immediately" policy, I tweaked the modified code rather than start over again. See the .2 webrev for this change. http://cr.openjdk.java.net/~afarley/8227021.2/webrev > > BTW the existing code fails to free opath before returning NULL. True. I added a fix to free the memory in the two cases we do that. Though not strictly needed in the vm-exit case, the internet suggested it was bad practice to assume the os would handle it. > > >> - Add an optional parameter to the os::split_path function that specifies > >> where the paths came from, for a better error message. > > It's not appropriate to set that up in os::dll_locate_lib, hardwired as > "sun.boot.library.path". os::dll_locate_lib doesn't know where it is > being asked to look, it is the callers that usually use > Arguments::get_dll_dir(), but in one case in jvmciEnv.cpp we have: > > os::dll_locate_lib(path, sizeof(path), JVMCILibPath, ... > > so the error message would be wrong in that case. If you want to pass > through this diagnostic help information then it needs to be set by the > callers of, and passed into, os::dll_locate_lib. Hmm, perhaps a simpler solution would be to make the error message more vague and remove the passing-in of the path source. E.g. 
"The VM tried to use a path that exceeds the maximum path length for " "this system. Review path-containing parameters and properties, such as " "sun.boot.library.path, to identify potential sources for this path." That way we're covered no matter where the path comes from. > > Looking at all the callers of os::dll_locate_lib that all pass > Arguments::get_dll_dir, it seems quite inefficient that we will > potentially split the same set of paths multiple times. I wonder whether > we can do this specifically during VM initialization and cache the split > paths instead? That doesn't address the problem of a path element that > only exceeds the maximum length when a specific library name is added, > but I'm trying to see how to minimise the splitting and put the onus for > the checking back on the code creating the paths. > We'd have to check for changes to the source property every time we used the value. E.g. copy the property into another string, split the paths, cache the split, and compare that to the "live" property storage string before using the cache. That, or assume that sun.boot.library.path could never change after being "split", an assumption which feels unsafe. > Lets see if others have comments/suggestions here. > > Thanks, > David Sure thing. - Adam > > >> > >> Bug: https://urldefense.proofpoint.com/v2/url? > u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8227021&d=DwICaQ&c=jf_iaSHvJObTbx- > siA1ZOg&r=P5m8KWUXJf- > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=eUr83eH1dOFWQ7zfl1cSle0RxJM9Ayl9AszJYR45Gvo&s=dAT1OR_BIZPvCjoGtIlC2J1CCoCB4n43JKHFLfuHrjA&e= > >> > >> New Webrev: https://urldefense.proofpoint.com/v2/url? > u=http-3A__cr.openjdk.java.net_-7Eafarley_8227021. 
> 1_webrev_&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf- > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=eUr83eH1dOFWQ7zfl1cSle0RxJM9Ayl9AszJYR45Gvo&s=mGM6YxmVHe2xW8mlGgI0i7XBLCqdyHN0J1ECgZ8QuRo&e= (Superseded by the .2 version) > >> > >> Best Regards > >> > >> Adam Farley > >> IBM Runtimes > >> > >> Unless stated otherwise above: > >> IBM United Kingdom Limited - Registered in England and Wales with number > >> 741598. > >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 > >> 3AU > >> > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Wed Jul 24 22:52:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Jul 2019 08:52:00 +1000 Subject: Fwd: Verify error in hg:jdk/jdk -- repository now READ-ONLY In-Reply-To: <1d13b87c-55d4-4665-aeb3-1b3bc4323a08@default> References: <1d13b87c-55d4-4665-aeb3-1b3bc4323a08@default> Message-ID: <5895de6f-e445-9df9-b5e7-7c665ee7ddec@oracle.com> FYI in case this was not seen on jdk-dev list. David -------------- next part -------------- An embedded message was scrubbed... From: Iris Clark Subject: Verify error in hg:jdk/jdk -- repository now READ-ONLY Date: Wed, 24 Jul 2019 10:04:32 -0700 (PDT) Size: 5933 URL: From david.holmes at oracle.com Thu Jul 25 01:12:12 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Jul 2019 11:12:12 +1000 Subject: Fwd: Re: Verify error in hg:jdk/jdk -- repository now READ-ONLY In-Reply-To: <3bfb52e7-0eb4-4047-bb44-a42ba9059d71@default> References: <3bfb52e7-0eb4-4047-bb44-a42ba9059d71@default> Message-ID: FYI open again -------------- next part -------------- An embedded message was scrubbed... 
From: Iris Clark Subject: Re: Verify error in hg:jdk/jdk -- repository now READ-ONLY Date: Wed, 24 Jul 2019 16:02:22 -0700 (PDT) Size: 6546 URL: From tobias.hartmann at oracle.com Thu Jul 25 06:05:18 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 25 Jul 2019 08:05:18 +0200 Subject: [14] RFR(XS): 8071275: remove AbstractAssembler::update_delayed_values dead code In-Reply-To: <6689ef18-fb51-ffbc-3847-6896c8b17f2e@oracle.com> References: <9dc4aa33-d65f-cef5-2541-06f3446eac99@oracle.com> <6689ef18-fb51-ffbc-3847-6896c8b17f2e@oracle.com> Message-ID: <41a1a980-1bbe-4095-240c-05de08ea299f@oracle.com> Hi Christian, still looks good. Pushed. Best regards, Tobias On 24.07.19 15:29, Christian Hagedorn wrote: > Hi Martin, hi Tobias > > Thanks for the reviews, I missed that. I updated the webrev in place. > > Best regards, > Christian > > On 24.07.19 14:16, Doerr, Martin wrote: >> Hi Christian, >> >> please also remove check_method_handle_type from macroAssembler_ppc.hpp: >> diff -r 4c1fc3947383 src/hotspot/cpu/ppc/macroAssembler_ppc.hpp >> --- a/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:06:44 2019 +0200 >> +++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp Wed Jul 24 14:13:27 2019 +0200 >> @@ -565,8 +565,6 @@ >> Label* L_slow_path = NULL); >> >> // Method handle support (JSR 292). >> - void check_method_handle_type(Register mtype_reg, Register mh_reg, Register temp_reg, Label& >> wrong_method_type); >> - >> RegisterOrConstant argument_offset(RegisterOrConstant arg_slot, Register temp_reg, int >> extra_slot_offset = 0); >> >> // Biased locking support >> >> Looks good otherwise. I don't need to see another webrev for that. Thanks for cleaning this up. >> >> Best regards, >> Martin >> >> >>> -----Original Message----- >>> From: hotspot-dev On Behalf Of >>> Tobias Hartmann >>> Sent: Mittwoch, 24. 
Juli 2019 14:04 >>> To: Christian Hagedorn ; hotspot- >>> dev at openjdk.java.net >>> Subject: Re: [14] RFR(XS): 8071275: remove >>> AbstractAssembler::update_delayed_values dead code >>> >>> Hi Christian, >>> >>> looks good to me. >>> >>> Best regards, >>> Tobias >>> >>> On 24.07.19 13:57, Christian Hagedorn wrote: >>>> Hi >>>> >>>> Please review the following enhancement: >>>> https://bugs.openjdk.java.net/browse/JDK-8071275 >>>> http://cr.openjdk.java.net/~thartmann/8071275/webrev.00/ >>>> >>>> This just removes some dead code. >>>> >>>> Thanks! >>>> >>>> Best regards, >>>> Christian From matthias.baesken at sap.com Thu Jul 25 07:47:29 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 25 Jul 2019 07:47:29 +0000 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) Message-ID: Hello, please review this small test related fix . On some linux x86_64 machine we run in the test "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this NumberFormatException : java.lang.NumberFormatException: For input string: "18446744073709551615" at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68) at java.base/java.lang.Long.parseLong(Long.java:699) at java.base/java.lang.Long.parseLong(Long.java:824) at jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsTester.java:160) at jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(MetricsTester.java:223) at TestCgroupMetrics.main(TestCgroupMetrics.java:50) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:567) at 
com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.base/java.lang.Thread.run(Thread.java:830) I checked the number "18446744073709551615" and it seems to be larger than Long.MAX_VALUE . Background is that we seem to deal with unsigned long long ints where Java Long is not always sufficient . There has been similar handling done here where in case of overflow we "round" to Long.MAX_VALUE : java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java 127 public static long convertStringToLong(String strval) { 128 long retval = 0; 129 if (strval == null) return 0L; 130 131 try { 132 retval = Long.parseLong(strval); 133 } catch (NumberFormatException e) { 134 // For some properties (e.g. memory.limit_in_bytes) we may overflow the range of signed long. 135 // In this case, return Long.max And I do the same now in the test coding . Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228585 http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ Thanks, Matthias From david.holmes at oracle.com Thu Jul 25 08:21:27 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Jul 2019 18:21:27 +1000 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: References: Message-ID: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> Hi Matthias, Looks like a good fix. Minor nit: ! // In this case, return Long.max s/max/MAX_VALUE Thanks, David On 25/07/2019 5:47 pm, Baesken, Matthias wrote: > Hello, please review this small test related fix . 
> > On some linux x86_64 machine we run in the test "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this NumberFormatException : > > java.lang.NumberFormatException: For input string: "18446744073709551615" > at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68) > at java.base/java.lang.Long.parseLong(Long.java:699) > at java.base/java.lang.Long.parseLong(Long.java:824) > at jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsTester.java:160) > at jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(MetricsTester.java:223) > at TestCgroupMetrics.main(TestCgroupMetrics.java:50) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:567) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.base/java.lang.Thread.run(Thread.java:830) > > I checked the number "18446744073709551615" and it seems to be larger than Long.MAX_VALUE . > Background is that we seem to deal with unsigned long long ints where Java Long is not always sufficient . > > > There has been similar handling done here where in case of overflow we "round" to Long.MAX_VALUE : > > java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java > > 127 public static long convertStringToLong(String strval) { > 128 long retval = 0; > 129 if (strval == null) return 0L; > 130 > 131 try { > 132 retval = Long.parseLong(strval); > 133 } catch (NumberFormatException e) { > 134 // For some properties (e.g. memory.limit_in_bytes) we may overflow the range of signed long. > 135 // In this case, return Long.max > > And I do the same now in the test coding . 
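The fallback pattern under discussion can be sketched as follows (hypothetical class name, modeled on the SubSystem.convertStringToLong snippet quoted above):

```java
// Hypothetical class name, for illustration only.
public class CgroupValueParser {
    static long convertStringToLong(String strval) {
        if (strval == null) return 0L;
        try {
            return Long.parseLong(strval);
        } catch (NumberFormatException e) {
            // e.g. memory.limit_in_bytes == 18446744073709551615 (2^64-1):
            // values past the signed range mean "unlimited", so clamp.
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        // parseLong overflows on the unsigned "unlimited" sentinel...
        System.out.println(convertStringToLong("18446744073709551615")); // prints 9223372036854775807
        // ...but plain values pass through unchanged.
        System.out.println(convertStringToLong("4096")); // prints 4096
    }
}
```

Long.parseUnsignedLong("18446744073709551615") would also accept the raw value, but it returns the two's-complement bit pattern (-1 as a signed long), so catching the overflow and clamping to Long.MAX_VALUE maps "unlimited" more usefully.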
> > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228585 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ > > > Thanks, Matthias > From matthias.baesken at sap.com Thu Jul 25 09:06:32 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 25 Jul 2019 09:06:32 +0000 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> References: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> Message-ID: Thanks for the review. Bob, are you fine with my change too ? Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 25. Juli 2019 10:21 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: RFR [XS]: 8228585: > jdk/internal/platform/cgroup/TestCgroupMetrics.java - > NumberFormatException because of large long values (memory > limit_in_bytes) > > Hi Matthias, > > Looks like a good fix. > > Minor nit: > > ! // In this case, return Long.max > > s/max/MAX_VALUE > > Thanks, > David > > On 25/07/2019 5:47 pm, Baesken, Matthias wrote: > > Hello, please review this small test related fix . 
> > > > On some linux x86_64 machine we run in the test > "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this > NumberFormatException : > > > > java.lang.NumberFormatException: For input string: > "18446744073709551615" > > at > java.base/java.lang.NumberFormatException.forInputString(NumberFormat > Exception.java:68) > > at java.base/java.lang.Long.parseLong(Long.java:699) > > at java.base/java.lang.Long.parseLong(Long.java:824) > > at > jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsT > ester.java:160) > > at > jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(Metrics > Tester.java:223) > > at TestCgroupMetrics.main(TestCgroupMetrics.java:50) > > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMet > hodAccessorImpl.java:62) > > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Delega > tingMethodAccessorImpl.java:43) > > at java.base/java.lang.reflect.Method.invoke(Method.java:567) > > at > com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapp > er.java:127) > > at java.base/java.lang.Thread.run(Thread.java:830) > > > > I checked the number "18446744073709551615" and it seems to be larger > than Long.MAX_VALUE . > > Background is that we seem to deal with unsigned long long ints where > Java Long is not always sufficient . > > > > > > There has been similar handling done here where in case of overflow we > "round" to Long.MAX_VALUE : > > > > java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java > > > > 127 public static long convertStringToLong(String strval) { > > 128 long retval = 0; > > 129 if (strval == null) return 0L; > > 130 > > 131 try { > > 132 retval = Long.parseLong(strval); > > 133 } catch (NumberFormatException e) { > > 134 // For some properties (e.g. memory.limit_in_bytes) we may > overflow the range of signed long. 
> > 135 // In this case, return Long.max > > > > And I do the same now in the test coding . > > > > > > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8228585 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ > > > > > > Thanks, Matthias > > From sgehwolf at redhat.com Thu Jul 25 09:18:43 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 25 Jul 2019 11:18:43 +0200 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: References: Message-ID: Hi Matthias, On Thu, 2019-07-25 at 07:47 +0000, Baesken, Matthias wrote: > Hello, please review this small test related fix . > > On some linux x86_64 machine we run in the test "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this NumberFormatException : > > java.lang.NumberFormatException: For input string: "18446744073709551615" > at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68) > at java.base/java.lang.Long.parseLong(Long.java:699) > at java.base/java.lang.Long.parseLong(Long.java:824) > at jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsTester.java:160) > at jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(MetricsTester.java:223) > at TestCgroupMetrics.main(TestCgroupMetrics.java:50) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:567) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.base/java.lang.Thread.run(Thread.java:830) > > I checked the number "18446744073709551615" and it seems to be larger than Long.MAX_VALUE . 
> Background is that we seem to deal with unsigned long long ints where Java Long is not always sufficient . > > > There has been similar handling done here where in case of overflow we "round" to Long.MAX_VALUE : > > java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java > > 127 public static long convertStringToLong(String strval) { > 128 long retval = 0; > 129 if (strval == null) return 0L; > 130 > 131 try { > 132 retval = Long.parseLong(strval); > 133 } catch (NumberFormatException e) { > 134 // For some properties (e.g. memory.limit_in_bytes) we may overflow the range of signed long. > 135 // In this case, return Long.max > > And I do the same now in the test coding . > > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228585 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ This looks good (with the change David pointed out). Thanks, Severin From adinn at redhat.com Thu Jul 25 12:22:47 2019 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 Jul 2019 13:22:47 +0100 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: References: Message-ID: On 24/07/2019 15:40, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8228400 > > There is a lot of code in AArch64 port that hooks up to the built-in simulator. That simulator was > used to bootstrap/develop the port when hardware was not available. This simulator is not needed > now, and we should remove it to unclutter the code. > > Removal webrev: > https://cr.openjdk.java.net/~shade/8228400/webrev.02/ > > The only thing that feels risky for me is removal of call_VM_leaf_base1 in > templateTable_aarch64.cpp, please take a thorough look. I think this is right (although it is a long time ago since I looked at it). On AArch64 there is no need to consider the num_args argument so it can be omitted and default to 0. The same applies on x86_64 -- num_args is only relevant on x86_32. > I am planning to backport it to 11u and 8u-aarch64 too. 
> > Testing: linux-aarch64-fastdebug tier1, tier2; linux-x86_64-fastdebug builds; jdk-submit (running) Well, this all looks very good and could go in as is. However, I think you may have missed some opportunities for removal: 1) File cpustate_aarch64.hpp exists primarily to declare a class CPUState. This was needed to save/restore AArch64 register state on exit from/re-entry into the simulator. I don't think anything else ought to be using class CPUState or any of the other types it defines. Was there any good reason not simply to delete this file? (if so perhaps whatever is keeping that file alive needs to be relocated to a home that corresponds to the x86 file layout). 2) File decode_aarch64.hpp contains almost entirely redundant stuff. I believe the only code that is referenced from another file is the suite of various pickbit* functions and their underlying mask* functions, the client being code in file immediate_aarch64.cpp. All the enums are redundant. So, I think this needs fixing by removing everything but the pickbit* and mask* fns. It would probably be better to move these to file immediate_aarch64.hpp and delete file decode_aarch64.hpp. regards, Andrew Dinn ----------- From matthias.baesken at sap.com Thu Jul 25 13:45:11 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 25 Jul 2019 13:45:11 +0000 Subject: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings In-Reply-To: References: Message-ID: Thanks Martin . May I get a second review ? Best regards, Matthias From: Doerr, Martin Sent: Mittwoch, 24. Juli 2019 12:14 To: Baesken, Matthias ; 'hotspot-dev at openjdk.java.net' ; core-libs-dev at openjdk.java.net; net-dev at openjdk.java.net Cc: 'ppc-aix-port-dev at openjdk.java.net' Subject: RE: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings Hi Matthias, I wouldn't say "looks good", but I think it's the right thing to do ;-) 
The type casts look correct to fit to the AIX headers. libodm_aix: Good. Maybe we should open a new issue for freeing what's returned by odm_set_path and we could also remove the AIX 5 support. NetworkInterface.c: Strange, but ok. Should be reviewed by somebody else in addition. Other files: No comments. Best regards, Martin From: ppc-aix-port-dev > On Behalf Of Baesken, Matthias Sent: Dienstag, 23. Juli 2019 17:15 To: 'hotspot-dev at openjdk.java.net' >; core-libs-dev at openjdk.java.net; net-dev at openjdk.java.net Cc: 'ppc-aix-port-dev at openjdk.java.net' > Subject: RFR: 8228482: fix xlc16/xlclang comparison of distinct pointer types and string literal conversion warnings Hello please review this patch . It fixes a couple of xlc16/xlclang warnings , especially comparison of distinct pointer types and string literal conversion warnings . When building with xlc16/xlclang, we still have a couple of warnings that have to be fixed : warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] for example : /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:18: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ /nightly/jdk/src/hotspot/os/aix/libodm_aix.cpp:81:29: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings] odmWrapper odm("product", "/usr/lib/objrepos"); // could also use "lpp" ^ warning: comparison of distinct pointer types, for example : /nightly/jdk/src/java.desktop/aix/native/libawt/porting_aix.c:50:14: warning: comparison of distinct pointer types ('void *' and 'char *') [-Wcompare-distinct-pointer-types] addr < (((char*)p->ldinfo_textorg) + p->ldinfo_textsize)) { ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228482 http://cr.openjdk.java.net/~mbaesken/webrevs/8228482.1/ Thanks, Matthias 
From bob.vandette at oracle.com Thu Jul 25 17:08:56 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 25 Jul 2019 13:08:56 -0400 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: References: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> Message-ID: <48A0FF82-B41F-4C2F-90D8-A6796EA81261@oracle.com> Matthias, Does this issue impact the VM Container code "jdk/open/src/hotspot/os/linux/osContainer_linux.cpp"? Do we properly detect unlimited? Try running "java -XshowSettings:system -version" Bob. > On Jul 25, 2019, at 5:06 AM, Baesken, Matthias wrote: > > Thanks for the review. > > Bob, are you fine with my change too ? > > Best regards, Matthias > > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 25. Juli 2019 10:21 >> To: Baesken, Matthias ; 'hotspot- >> dev at openjdk.java.net' >> Subject: Re: RFR [XS]: 8228585: >> jdk/internal/platform/cgroup/TestCgroupMetrics.java - >> NumberFormatException because of large long values (memory >> limit_in_bytes) >> >> Hi Matthias, >> >> Looks like a good fix. >> >> Minor nit: >> >> ! // In this case, return Long.max >> >> s/max/MAX_VALUE >> >> Thanks, >> David >> >> On 25/07/2019 5:47 pm, Baesken, Matthias wrote: >>> Hello, please review this small test related fix . 
>>> >>> On some linux x86_64 machine we run in the test >> "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this >> NumberFormatException : >>> >>> java.lang.NumberFormatException: For input string: >> "18446744073709551615" >>> at >> java.base/java.lang.NumberFormatException.forInputString(NumberFormat >> Exception.java:68) >>> at java.base/java.lang.Long.parseLong(Long.java:699) >>> at java.base/java.lang.Long.parseLong(Long.java:824) >>> at >> jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsT >> ester.java:160) >>> at >> jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(Metrics >> Tester.java:223) >>> at TestCgroupMetrics.main(TestCgroupMetrics.java:50) >>> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >> Method) >>> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMet >> hodAccessorImpl.java:62) >>> at >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Delega >> tingMethodAccessorImpl.java:43) >>> at java.base/java.lang.reflect.Method.invoke(Method.java:567) >>> at >> com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapp >> er.java:127) >>> at java.base/java.lang.Thread.run(Thread.java:830) >>> >>> I checked the number "18446744073709551615" and it seems to be larger >> than Long.MAX_VALUE . >>> Background is that we seem to deal with unsigned long long ints where >> Java Long is not always sufficient . >>> >>> >>> There has been similar handling done here where in case of overflow we >> "round" to Long.MAX_VALUE : >>> >>> java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java >>> >>> 127 public static long convertStringToLong(String strval) { >>> 128 long retval = 0; >>> 129 if (strval == null) return 0L; >>> 130 >>> 131 try { >>> 132 retval = Long.parseLong(strval); >>> 133 } catch (NumberFormatException e) { >>> 134 // For some properties (e.g. 
memory.limit_in_bytes) we may >> overflow the range of signed long. >>> 135 // In this case, return Long.max >>> >>> And I do the same now in the test coding . >>> >>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8228585 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ >>> >>> >>> Thanks, Matthias >>> From coleen.phillimore at oracle.com Thu Jul 25 21:53:28 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 25 Jul 2019 17:53:28 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> Message-ID: <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> After some offline polling of various people, I'm going to withdraw the UnlockExperimentalOptions change to trueInDebug, and fixed the options test. http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html These test changes would have found the original bug. Thanks, Coleen On 7/24/19 9:52 AM, coleen.phillimore at oracle.com wrote: > > > On 7/24/19 9:20 AM, David Holmes wrote: >> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >>> On 7/23/19 10:20 PM, David Holmes wrote: >>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>> On 7/23/19 9:45 AM, Daniel D. 
Daugherty wrote: >>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>> Hi Coleen, >>>>>>>>>> >>>>>>>>>> - experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>> + experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>>> trueInDebug, \ >>>>>>>>>> >>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>> >>>>>>>>> Well if it's added, then the option range test would test the >>>>>>>>> option. Otherwise, I think it's benign. In debug mode, one >>>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>>> options, just like UnlockDiagnosticVMOptions. The option is >>>>>>>>> there either way. >>>>>>>> >>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>>> folks think >>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>>> bugs in tests >>>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>>> and 'slowdebug'. >>>>>>>> Folks use an option in a test that requires >>>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>>> but forget to include it in the test's run statement and we end >>>>>>>> up with a test failure in 'release' bits. >>>>>>>> >>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>>> introduce the same path to failing tests. >>>>>>> >>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got >>>>>>> a wall of opposition: >>>>>>> >>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>> >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>> >>>>>> >>>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to >>>>>> train >>>>>> my fingers (and my scripts) to do the right thing. >>>>> >>>>> You should have seen my slack channel at that time. 
:) Maybe the >>>>> "wall" was primarily from a couple of people who strongly objected. >>>>>> >>>>>> Interestingly, David H was a "nay" on changing >>>>>> UnlockDiagnosticVMOptions >>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>>> >>>>> >>>>> I think he's mostly just asking the question. We'll see what he >>>>> answers later. >>>> >>>> Yes I'm just asking the question. I don't think changing this buys >>>> us much other than "it's now the same as for diagnostic flags". >>>> Testing these flags can (and probably should) be handled explicitly. >>> >>> I disagree. I don't think we should test these flags explicitly >>> when we have a perfectly good test for all the flags, that should be >>> enabled. Which is what my change does. >> >> Your change only causes the experimental flags to be tested in debug >> builds. I would argue they should also be tested in product builds, >> hence the need to be explicit about it. > > The same is true for diagnostic options. I'd be surprised if testing > in release made a difference though, except taking more time. > > Coleen > >> >> David >> ----- >> >>>> >>>> I looked back at the discussion on JDK-8153783 (sorry can't recall >>>> what may have been said in slack) and I'm not sure what my specific >>>> concern was then. From a testing perspective if you use an >>>> experimental or diagnostic flag then you should remember to >>>> explicitly unlock it in the test setup. Not having trueInDebug >>>> catches when you forget that and only test in a debug build. >>> >>> Yes, that was the rationale for making it 'false' rather than >>> 'trueInDebug'. People were adding tests with a diagnostic option >>> and it was failing in product mode because the Unlock flag wasn't >>> present. The more vocal side of the question didn't want to have to >>> add the Unlock flag for all their day to day local testing. 
I >>> assume the same argument can be made for the experimental options. >>> >>> It would be good to hear the opinion from someone who uses these >>> options. This has degenerated into an opinion question, and besides >>> being able to cleanly test these options, neither one of us uses or >>> tests experimental options as far as I can tell. I see tests from >>> the Compiler and GC components. What do other people think? >>> >>> Thanks, >>> Coleen >>> >>>> >>>> Cheers, >>>> David >>>> ----- >>>> >>>>>> >>>>>>> I think the same exact arguments should apply to >>>>>>> UnlockExperimentalVMOptions. I'd like to hear from someone that >>>>>>> uses experimental options on ZGC or shenandoah, since those have >>>>>>> the most experimental options. >>>>>> >>>>>> I agree that the same arguments apply to >>>>>> UnlockExperimentalVMOptions. >>>>>> For consistency's sake if anything, they should be the same. >>>>>> >>>>>> >>>>>>> The reason that I made it trueInDebug is so that the command >>>>>>> line option range test would test these options. Otherwise a >>>>>>> more hacky solution could be done, including adding the >>>>>>> parameter -XX:+UnlockExperimentalVMOptions to all the VM option >>>>>>> range tests. I'd rather not do this. >>>>>> >>>>>> Can you explain this a bit more? Why would a default value of 'false' >>>>>> mean that >>>>>> the command line option range test would not test these options? >>>>> >>>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>>> -version and collect the flags that come out, parse the ranges, >>>>> and then run java with each of these flags with the limits of the >>>>> range (unless the limit is INT_MAX). Some flags are excluded >>>>> explicitly because they cause problems. >>>>> >>>>> The reason that SymbolTableSize escaped the testing is because it >>>>> wasn't reported with -XX:+PrintFlagsRanges. 
You'd need >>>>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>>>> flags, and then pass it to all the java commands to test the >>>>> ranges. It's not that bad, just a bit gross. >>>>> >>>>> In any case, I think the experimental flag ranges should be >>>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>>> in my testing. >>>>> >>>>>> >>>>>> In any case, I'm fine if you want to move forward with changing the >>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>> >>>>> >>>>> Okay, we'll wait to see whether I get a wall of opposition or >>>>> support. I still think it should be by default the same as >>>>> UnlockDiagnosticVMOptions. >>>>> >>>>> Thanks! >>>>> Coleen >>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>>>> comment to: >>>>>>>>>> >>>>>>>>>> +          range(minimumSymbolTableSize, 16777216ul)     \ >>>>>>>>>> >>>>>>>>>> with: >>>>>>>>>> >>>>>>>>>> +          range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>>> */)                \ >>>>>>>>> >>>>>>>>> Let me see if the X macro allows that and I could also add >>>>>>>>> that to StringTableSize (which is not an experimental option). >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>>> experimental option range.
Make experimental options >>>>>>>>>>> trueInDebug so they're tested by the command line option >>>>>>>>>>> testing >>>>>>>>>>> >>>>>>>>>>> open webrev at >>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>> >>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>>> lot of experimental options. I didn't test with Shenandoah. >>>>>>>>>>> >>>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> > From kim.barrett at oracle.com Thu Jul 25 22:59:22 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 25 Jul 2019 18:59:22 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects Message-ID: 8227054: ServiceThread needs to know about all OopStorage objects 8227053: ServiceThread cleanup of OopStorage is missing some Please review this change in how OopStorage objects are managed and accessed. There is a new (all static) class, OopStorages, which provides infrastructure for creating all the storage objects, access via an enum-based id, and iterating over them. Various components that previously managed their own storage objects now obtain them from OopStorages. A number of access functions have been eliminated as part of that, though some have been retained for internal convenience of a component. The set of OopStorage objects is now declared in one place, using x-macros, with collective definitions and usages ultimately driven off those macros. This includes the ServiceThread (which no longer needs explicit knowledge of the set, and is no longer missing any) and the OopStorage portion of WeakProcessorPhases. For now, the various GCs still have explicit knowledge of the set; that will be addressed in followup changes specific to each collector.
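The scheme described above (one declaration list driving the enum-based ids, the definitions, and the iteration) is the classic x-macro pattern. A minimal self-contained sketch of the idea follows; the storage names here are invented for the illustration and are not the actual OopStorages declarations.

```cpp
#include <cassert>
#include <cstring>

// The list is declared exactly once; everything else is generated from it
// by passing different "f" macros. Adding a storage here automatically
// updates the enum, the count, and the name table.
#define STORAGE_LIST(f) \
  f(jni_global)         \
  f(vm_global)          \
  f(string_table)

enum StorageId {
#define MAKE_ENUM(name) id_##name,
  STORAGE_LIST(MAKE_ENUM)
#undef MAKE_ENUM
  storage_count          // trailing member gives the total for iteration
};

const char* const storage_names[storage_count] = {
#define MAKE_NAME(name) #name,
  STORAGE_LIST(MAKE_NAME)
#undef MAKE_NAME
};

// Enum-based-id access, as in the OopStorages description above.
const char* storage_name(StorageId id) { return storage_names[id]; }
```

A consumer such as a service thread can then loop `for (int i = 0; i < storage_count; i++)` and can never miss a storage that was added to the list, which is exactly the bug class 8227053 describes.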
(This delay minimizes the impact on Leo's in-progress review that changes ParallelGC to use WorkGangs.) This change also includes a couple of utility macros for working with x-macros. CR: https://bugs.openjdk.java.net/browse/JDK-8227054 https://bugs.openjdk.java.net/browse/JDK-8227053 Webrev: http://cr.openjdk.java.net/~kbarrett/8227054/open.00/ Testing: mach5 tier1-3 From matthias.baesken at sap.com Fri Jul 26 06:55:26 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 26 Jul 2019 06:55:26 +0000 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: <48A0FF82-B41F-4C2F-90D8-A6796EA81261@oracle.com> References: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> <48A0FF82-B41F-4C2F-90D8-A6796EA81261@oracle.com> Message-ID: Hi Bob, my change only touches the test coding . test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java I am not aware of related issues in the HS code, but I have not checked much . This is what java gives me on the system : java -XshowSettings:system -version Operating System Metrics: Provider: cgroupv1 Effective CPU Count: 8 CPU Period: 100000us CPU Quota: -1 CPU Shares: -1 List of Processors, 8 total: 0 1 2 3 4 5 6 7 List of Effective Processors, 0 total: List of Memory Nodes, 1 total: 0 List of Available Memory Nodes, 0 total: CPUSet Memory Pressure Enabled: false Memory Limit: Unlimited Memory Soft Limit: Unlimited Memory & Swap Limit: 0.00K Kernel Memory Limit: 0.00K TCP Memory Limit: 0.00K Out Of Memory Killer Enabled: true Best regards, Matthias > > Matthias, > > Does this issue impact the VM Container code > "jdk/open/src/hotspot/os/linux/osContainer_linux.cpp"? > > Do we properly detect unlimited? > > Try running "java -XshowSettings:system -version" > > Bob. > > > > On Jul 25, 2019, at 5:06 AM, Baesken, Matthias > wrote: > > > > Thanks for the review . > > > > Bob, are you fine with my change too ?
> > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 25. Juli 2019 10:21 > >> To: Baesken, Matthias ; 'hotspot- > >> dev at openjdk.java.net' > >> Subject: Re: RFR [XS]: 8228585: > >> jdk/internal/platform/cgroup/TestCgroupMetrics.java - > >> NumberFormatException because of large long values (memory > >> limit_in_bytes) > >> > >> Hi Matthias, > >> > >> Looks like a good fix. > >> > >> Minor nit: > >> > >> ! // In this case, return Long.max > >> > >> s/max/MAX_VALUE > >> > >> Thanks, > >> David > >> > >> On 25/07/2019 5:47 pm, Baesken, Matthias wrote: > >>> Hello, please review this small test related fix . > >>> > >>> On some linux x86_64 machine we run in the test > >> "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this > >> NumberFormatException : > >>> > >>> java.lang.NumberFormatException: For input string: > >> "18446744073709551615" > >>> at > >> > java.base/java.lang.NumberFormatException.forInputString(NumberFormat > >> Exception.java:68) > >>> at java.base/java.lang.Long.parseLong(Long.java:699) > >>> at java.base/java.lang.Long.parseLong(Long.java:824) > >>> at > >> > jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsT > >> ester.java:160) > >>> at > >> > jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(Metrics > >> Tester.java:223) > >>> at TestCgroupMetrics.main(TestCgroupMetrics.java:50) > >>> at > >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > >> Method) > >>> at > >> > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMet > >> hodAccessorImpl.java:62) > >>> at > >> > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Delega > >> tingMethodAccessorImpl.java:43) > >>> at java.base/java.lang.reflect.Method.invoke(Method.java:567) > >>> at > >> > com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapp > >> er.java:127) > >>> at 
java.base/java.lang.Thread.run(Thread.java:830) > >>> > >>> I checked the number "18446744073709551615" and it seems to be > larger > >> than Long.MAX_VALUE . > >>> Background is that we seem to deal with unsigned long long ints where > >> Java Long is not always sufficient . > >>> > >>> > >>> There has been similar handling done here where in case of overflow > we > >> "round" to Long.MAX_VALUE : > >>> > >>> java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java > >>> > >>> 127 public static long convertStringToLong(String strval) { > >>> 128 long retval = 0; > >>> 129 if (strval == null) return 0L; > >>> 130 > >>> 131 try { > >>> 132 retval = Long.parseLong(strval); > >>> 133 } catch (NumberFormatException e) { > >>> 134 // For some properties (e.g. memory.limit_in_bytes) we may > >> overflow the range of signed long. > >>> 135 // In this case, return Long.max > >>> > >>> And I do the same now in the test coding . > >>> > >>> > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8228585 > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ > >>> > >>> > >>> Thanks, Matthias > >>> From shade at redhat.com Fri Jul 26 06:56:13 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 26 Jul 2019 08:56:13 +0200 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: References: Message-ID: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> On 7/25/19 2:22 PM, Andrew Dinn wrote: > 1) File cpustate_aarch64.hpp exists primarily to declare a class > CPUState. This was needed to save/restore AArch64 register state on exit > from/re-entry into the simulator. I don't think anything else ought to > be using class CPUState or any of the other types it defines. > > Was there any good reason not simply to delete this file? (if so perhaps > whatever is keeping that file alive needs to be relocated to a home that > corresponds to the x86 file layout). Right on, removed cpustate_aarch64.hpp. 
> 2) File decode_aarch64.hpp contains almost entirely redundant stuff. I > believe the only code that is referenced from another file is the suite > of various pickbit* functions and their underlying mask* functions, the > client being code in file immediate_aarch64.cpp. All the enums are > redundant. So, I think this needs fixing by removing everything but the > pickbit* and mask* fns. It would probably be better to move these to > file immediate_aarch64.hpp and delete file decode_aarch64.hpp. Right. I moved required definitions to immediate_aarch64.cpp and removed decode_aarch64.hpp. New webrev: http://cr.openjdk.java.net/~shade/8228400/webrev.03/ Testing: aarch64 cross-build; (gonna test tier1 once aarch64 box is free) -- Thanks, -Aleksey From adinn at redhat.com Fri Jul 26 08:28:48 2019 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 Jul 2019 09:28:48 +0100 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> Message-ID: <974255ac-ef96-860f-929e-14cc1a1be1ff@redhat.com> On 26/07/2019 07:56, Aleksey Shipilev wrote: > New webrev: > http://cr.openjdk.java.net/~shade/8228400/webrev.03/ > > Testing: aarch64 cross-build; (gonna test tier1 once aarch64 box is free) That looks good. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From christian.hagedorn at oracle.com Fri Jul 26 12:23:00 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Fri, 26 Jul 2019 14:23:00 +0200 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily Message-ID: Hi Please review the following patch: http://cr.openjdk.java.net/~thartmann/8156207/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8156207 This is just a completion of the suggested change in the comments. Thank you! Best regards, Christian From bob.vandette at oracle.com Fri Jul 26 12:44:38 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 26 Jul 2019 08:44:38 -0400 Subject: RFR [XS]: 8228585: jdk/internal/platform/cgroup/TestCgroupMetrics.java - NumberFormatException because of large long values (memory limit_in_bytes) In-Reply-To: References: <4fd6c6aa-aee1-7d8a-2dfe-9a3b1beb82db@oracle.com> <48A0FF82-B41F-4C2F-90D8-A6796EA81261@oracle.com> Message-ID: <08776D43-43DB-4E6C-A9F7-845971646187@oracle.com> Ok, looks like the problem doesn't impact the hotspot code. Your change looks good to me. Thanks for checking, Bob. > On Jul 26, 2019, at 2:55 AM, Baesken, Matthias wrote: > > Hi Bob, my change only touches the test coding . > > test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java > > I am not aware of related issues in the HS code, but I have not checked much .
> This is what java gives me on the system : > > java -XshowSettings:system -version > > Operating System Metrics: > Provider: cgroupv1 > Effective CPU Count: 8 > CPU Period: 100000us > CPU Quota: -1 > CPU Shares: -1 > List of Processors, 8 total: > 0 1 2 3 4 5 6 7 > List of Effective Processors, 0 total: > List of Memory Nodes, 1 total: > 0 > List of Available Memory Nodes, 0 total: > CPUSet Memory Pressure Enabled: false > Memory Limit: Unlimited > Memory Soft Limit: Unlimited > Memory & Swap Limit: 0.00K > Kernel Memory Limit: 0.00K > TCP Memory Limit: 0.00K > Out Of Memory Killer Enabled: true > > Best regards, Matthias > > >> >> Matthias, >> >> Does this issue impact the VM Container code >> ?jdk/open/src/hotspot/os/linux/osContainer_linux.cpp?? >> >> Do we properly detect unlimited? >> >> Try running ?java -XshowSettings:system -version >> >> Bob. >> >> >>> On Jul 25, 2019, at 5:06 AM, Baesken, Matthias >> wrote: >>> >>> Thank's fort he review . >>> >>> Bob, are you fine with my change too ? >>> >>> Best regards, Matthias >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Donnerstag, 25. Juli 2019 10:21 >>>> To: Baesken, Matthias ; 'hotspot- >>>> dev at openjdk.java.net' >>>> Subject: Re: RFR [XS]: 8228585: >>>> jdk/internal/platform/cgroup/TestCgroupMetrics.java - >>>> NumberFormatException because of large long values (memory >>>> limit_in_bytes) >>>> >>>> Hi Matthias, >>>> >>>> Looks like a good fix. >>>> >>>> Minor nit: >>>> >>>> ! // In this case, return Long.max >>>> >>>> s/max/MAX_VALUE >>>> >>>> Thanks, >>>> David >>>> >>>> On 25/07/2019 5:47 pm, Baesken, Matthias wrote: >>>>> Hello, please review this small test related fix . 
>>>>> >>>>> On some linux x86_64 machine we run in the test >>>> "jdk/internal/platform/cgroup/TestCgroupMetrics.java" into this >>>> NumberFormatException : >>>>> >>>>> java.lang.NumberFormatException: For input string: >>>> "18446744073709551615" >>>>> at >>>> >> java.base/java.lang.NumberFormatException.forInputString(NumberFormat >>>> Exception.java:68) >>>>> at java.base/java.lang.Long.parseLong(Long.java:699) >>>>> at java.base/java.lang.Long.parseLong(Long.java:824) >>>>> at >>>> >> jdk.test.lib.containers.cgroup.MetricsTester.getLongValueFromFile(MetricsT >>>> ester.java:160) >>>>> at >>>> >> jdk.test.lib.containers.cgroup.MetricsTester.testMemorySubsystem(Metrics >>>> Tester.java:223) >>>>> at TestCgroupMetrics.main(TestCgroupMetrics.java:50) >>>>> at >>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >>>> Method) >>>>> at >>>> >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMet >>>> hodAccessorImpl.java:62) >>>>> at >>>> >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Delega >>>> tingMethodAccessorImpl.java:43) >>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:567) >>>>> at >>>> >> com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapp >>>> er.java:127) >>>>> at java.base/java.lang.Thread.run(Thread.java:830) >>>>> >>>>> I checked the number "18446744073709551615" and it seems to be >> larger >>>> than Long.MAX_VALUE . >>>>> Background is that we seem to deal with unsigned long long ints where >>>> Java Long is not always sufficient . 
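The clamping idea discussed in this thread (cgroup limit files can hold 18446744073709551615, i.e. 2^64-1, the kernel's "unlimited" sentinel, which does not fit a signed 64-bit value) can be sketched on the native side as below. This is an illustration of the approach, not the actual osContainer_linux.cpp code; the function name is invented.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parse a cgroup limit value as an unsigned 64-bit integer; anything that
// overflows the signed range is clamped, effectively meaning "no limit".
long long read_limit(const char* text) {
    errno = 0;
    char* end = nullptr;
    unsigned long long v = std::strtoull(text, &end, 10);
    if (end == text) {
        return 0;                 // no digits parsed at all
    }
    if (errno == ERANGE || v > (unsigned long long)LLONG_MAX) {
        return LLONG_MAX;         // clamp: treat as unlimited
    }
    return (long long)v;
}
```

This mirrors what the quoted SubSystem.java does on the Java side by catching NumberFormatException and returning Long.MAX_VALUE.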
>>>>> >>>>> >>>>> There has been similar handling done here where in case of overflow >> we >>>> "round" to Long.MAX_VALUE : >>>>> >>>>> java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java >>>>> >>>>> 127 public static long convertStringToLong(String strval) { >>>>> 128 long retval = 0; >>>>> 129 if (strval == null) return 0L; >>>>> 130 >>>>> 131 try { >>>>> 132 retval = Long.parseLong(strval); >>>>> 133 } catch (NumberFormatException e) { >>>>> 134 // For some properties (e.g. memory.limit_in_bytes) we may >>>> overflow the range of signed long. >>>>> 135 // In this case, return Long.max >>>>> >>>>> And I do the same now in the test coding . >>>>> >>>>> >>>>> >>>>> Bug/webrev : >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8228585 >>>>> >>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228585.0/ >>>>> >>>>> >>>>> Thanks, Matthias >>>>> > From harold.seigel at oracle.com Fri Jul 26 12:56:35 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 26 Jul 2019 08:56:35 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> Message-ID: Looks good! Thanks, Harold On 7/25/2019 5:53 PM, coleen.phillimore at oracle.com wrote: > > After some offline polling of various people, I'm going to withdraw > the UnlockExperimentalOptions change to trueInDebug, and fixed the > options test. > > http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html > > These test changes would have found the original bug. 
> > Thanks, > Coleen > > > On 7/24/19 9:52 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/24/19 9:20 AM, David Holmes wrote: >>> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >>>> On 7/23/19 10:20 PM, David Holmes wrote: >>>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>>>> trueInDebug, \ >>>>>>>>>>> >>>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>>> >>>>>>>>>> Well if it's added, then the option range test would test the >>>>>>>>>> option.? Otherwise, I think it's benign. In debug mode, one >>>>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option is >>>>>>>>>> there either way. >>>>>>>>> >>>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>>>> folks think >>>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>>>> bugs in tests >>>>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>>>> and 'slowdebug'. >>>>>>>>> Folks use an option in a test that requires >>>>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>>>> but forget to include it in the test's run statement and we >>>>>>>>> end up with a test failure in 'release' bits. >>>>>>>>> >>>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>>>> introduce the same path to failing tests. 
>>>>>>>> >>>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and >>>>>>>> got a wall of opposition: >>>>>>>> >>>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>>> >>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>>> >>>>>>> >>>>>>> I would not say "a wall of opposition". You got almost equal >>>>>>> amounts >>>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to >>>>>>> train >>>>>>> my fingers (and my scripts) to do the right thing. >>>>>> >>>>>> You should have seen my slack channel at that time. :) Maybe the >>>>>> "wall" was primarily from a couple of people who strongly objected. >>>>>>> >>>>>>> Interestingly, David H was a "nay" on changing >>>>>>> UnlockDiagnosticVMOptions >>>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>>>> >>>>>> >>>>>> I think he's mostly just asking the question.? We'll see what he >>>>>> answers later. >>>>> >>>>> Yes I'm just asking the question. I don't think changing this buys >>>>> us much other than "it's now the same as for diagnostic flags". >>>>> Testing these flags can (and probably should) be handled explicitly. >>>> >>>> I disagree.? I don't think we should test these flags explicitly >>>> when we have a perfectly good test for all the flags, that should >>>> be enabled. Which is what my change does. >>> >>> Your change only causes the experimental flags to be tested in debug >>> builds. I would argue they should also be tested in product builds, >>> hence the need to be explicit about it. >> >> The same is true for diagnostic options.? I'd be surprised if testing >> in release made a difference though, except taking more time. >> >> Coleen >> >>> >>> David >>> ----- >>> >>>>> >>>>> I looked back at the discussion on JDK-8153783 (sorry can't recall >>>>> what may have been said in slack) and I'm not sure what my >>>>> specific concern was then. 
From a testing perspective if you use >>>>> an experimental or diagnostic flag then you should remember to >>>>> explicitly unlock it in the test setup. Not having trueInDebug >>>>> catches when you forget that and only test in a debug build. >>>> >>>> Yes, that was the rationale for making it 'false' rather than >>>> 'trueInDebug'.? People were adding tests with a diagnostic option >>>> and it was failing in product mode because the Unlock flag wasn't >>>> present.? The more vocal side of the question didn't want to have >>>> to add the Unlock flag for all their day to day local testing.?? I >>>> assume the same argument can be made for the experimental options. >>>> >>>> It would be good to hear the opinion from someone who uses these >>>> options.?? This is degenerated into an opinion question, and >>>> besides being able to cleanly test these options, neither one of us >>>> uses or tests experimental options as far as I can tell.? I see >>>> tests from the Compiler and GC components.? What do other people >>>> think? >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>>> >>>>>>>> I think the same exact arguments should apply to >>>>>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone >>>>>>>> that uses experimental options on ZGC or shenandoah, since >>>>>>>> those have the most experimental options. >>>>>>> >>>>>>> I agree that the same arguments apply to >>>>>>> UnlockExperimentalVMOptions. >>>>>>> For consistency's sake if anything, they should be the same. >>>>>>> >>>>>>> >>>>>>>> The reason that I made it trueInDebug is so that the command >>>>>>>> line option range test would test these options.? Otherwise a >>>>>>>> more hacky solution could be done, including adding the >>>>>>>> parameter -XX:+UnlockExperimentalVMOptions to all the VM option >>>>>>>> range tests. I'd rather not do this. >>>>>>> >>>>>>> Can explain this a bit more? 
Why would a default value of >>>>>>> 'false' mean that >>>>>>> the command line option range test would not test these options? >>>>>> >>>>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>>>> -version and collect the flags that come out, parse the ranges, >>>>>> and then run java with each of these flags with the limits of the >>>>>> range (unless the limit is INT_MAX).? Some flags are excluded >>>>>> explicitly because they cause problems. >>>>>> >>>>>> The reason that SymbolTableSize escaped the testing, is because >>>>>> it wasn't reported with -XX:+PrintFlagsRanges. You'd need >>>>>> -XX:+UnlockExperimentalVMOptions in the java command to gather >>>>>> the flags, and then pass it to all the java commands to test the >>>>>> ranges. It's not that bad, just a bit gross. >>>>>> >>>>>> In any case, I think the experimental flags ranges should be >>>>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>>>> in my testing. >>>>>> >>>>>>> >>>>>>> In any case, I'm fine if you want to move forward with changing the >>>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>>> >>>>>> >>>>>> Okay, we'll wait to see whether I get a wall of opposition or >>>>>> support. I still think it should be by default the same as >>>>>> UnlockDiagnosticVMoptions. >>>>>> >>>>>> Thanks! >>>>>> Coleen >>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>>> >>>>>>>>> Dan >>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Functional change seems fine. Is it worth adding a >>>>>>>>>>> clarifying comment to: >>>>>>>>>>> >>>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>>>>> >>>>>>>>>>> with: >>>>>>>>>>> >>>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>>>> */) ?????????????? \ >>>>>>>>>> >>>>>>>>>> Let me see if the X macro allows that and I could also add >>>>>>>>>> that to StringTableSize (which is not experimental option). 
>>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>>>> experimental option range.? Make experimental options >>>>>>>>>>>> trueInDebug so they're tested by the command line option >>>>>>>>>>>> testing >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>>> >>>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>>>> lot of experimental options. I didn't test with shenanodoah. >>>>>>>>>>>> >>>>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> > From matthias.baesken at sap.com Fri Jul 26 13:25:27 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 26 Jul 2019 13:25:27 +0000 Subject: RFR: [XS] 8228650: runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX Message-ID: Hello, please review this small CDS test related fix . Currently the runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX . 
It runs into this NPE : java.lang.NullPointerException at java.base/sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:75) at java.base/sun.nio.fs.UnixPath.(UnixPath.java:69) at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279) at java.base/java.nio.file.Path.of(Path.java:147) at java.base/java.nio.file.Paths.get(Paths.java:69) at CheckDefaultArchiveFile.main(CheckDefaultArchiveFile.java:51) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:567) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) at java.base/java.lang.Thread.run(Thread.java:830) Reason is that the path to the classes.jsa is null, which is correct on a non CDS platform (like AIX). See the coding : jdk/src/hotspot/share/runtime/arguments.hpp : static char* get_default_shared_archive_path() NOT_CDS_RETURN_(NULL); jdk/src/hotspot/share/runtime/arguments.cpp : 3477 #if INCLUDE_CDS 3478 // Sharing support 3479 // Construct the path to the archive 3480 char* Arguments::get_default_shared_archive_path() { However the test does not handle this case correctly. AIX has CDS disabled, it is currently not supported on the platform . jdk/make/autoconf/hotspot.m4 : 495 # Disable CDS on AIX. 496 if test "x$OPENJDK_TARGET_OS" = "xaix"; then 497 ENABLE_CDS="false" 498 if test "x$enable_cds" = "xyes"; then 499 AC_MSG_ERROR([CDS is currently not supported on AIX. 
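The NOT_CDS_RETURN_(NULL) declaration quoted above follows hotspot's usual feature-stub macro pattern: when the feature is compiled out, the macro expands into an inline body returning a default value, so callers (like this test) must cope with that default. A minimal self-contained sketch of the idea; the macro and function names with the _DEMO suffix are invented for the illustration, not the real macros.hpp definitions.

```cpp
#include <cstddef>

// 0 models a platform like AIX where the feature is compiled out.
#define INCLUDE_CDS_DEMO 0

#if INCLUDE_CDS_DEMO
// With the feature built in, the macro leaves a plain declaration and the
// real definition lives in the .cpp file.
#define NOT_CDS_RETURN_DEMO(code) ;
#else
// With the feature compiled out, the declaration becomes a stub body.
#define NOT_CDS_RETURN_DEMO(code) { return code; }
#endif

char* default_archive_path() NOT_CDS_RETURN_DEMO(NULL)

// A caller, like the fixed test, has to handle the NULL result instead of
// passing it straight to something like Paths.get().
bool have_default_archive() {
    return default_archive_path() != NULL;
}
```

On a non-CDS build the stub returns NULL, which is exactly the value the unguarded test fed into Paths.get() to produce the NullPointerException above.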
Remove --enable-cds.]) 500 fi 501 fi Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228650 http://cr.openjdk.java.net/~mbaesken/webrevs/8228650.0/ Thanks, Matthias From coleen.phillimore at oracle.com Fri Jul 26 13:57:54 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 Jul 2019 09:57:54 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> Message-ID: <5f357d3f-ab4e-7705-7b5c-9eee10cdfec9@oracle.com> Thanks, Harold! Coleen On 7/26/19 8:56 AM, Harold Seigel wrote: > Looks good! > > Thanks, Harold > > On 7/25/2019 5:53 PM, coleen.phillimore at oracle.com wrote: >> >> After some offline polling of various people, I'm going to withdraw >> the UnlockExperimentalOptions change to trueInDebug, and fixed the >> options test. >> >> http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html >> >> These test changes would have found the original bug. >> >> Thanks, >> Coleen >> >> >> On 7/24/19 9:52 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/24/19 9:20 AM, David Holmes wrote: >>>> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >>>>> On 7/23/19 10:20 PM, David Holmes wrote: >>>>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>> On 7/23/19 9:45 AM, Daniel D. 
Daugherty wrote: >>>>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>>>>> trueInDebug, \ >>>>>>>>>>>> >>>>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>>>> >>>>>>>>>>> Well if it's added, then the option range test would test >>>>>>>>>>> the option.? Otherwise, I think it's benign. In debug mode, >>>>>>>>>>> one would no longer have to specify -XX:+UnlockExperimental >>>>>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option >>>>>>>>>>> is there either way. >>>>>>>>>> >>>>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>>>>> folks think >>>>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can >>>>>>>>>> cause bugs in tests >>>>>>>>>> that are runnable in all build configs: 'release', >>>>>>>>>> 'fastdebug' and 'slowdebug'. >>>>>>>>>> Folks use an option in a test that requires >>>>>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>>>>> but forget to include it in the test's run statement and we >>>>>>>>>> end up with a test failure in 'release' bits. >>>>>>>>>> >>>>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>>>>> introduce the same path to failing tests. >>>>>>>>> >>>>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and >>>>>>>>> got a wall of opposition: >>>>>>>>> >>>>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>>>> >>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>>>> >>>>>>>> >>>>>>>> I would not say "a wall of opposition". You got almost equal >>>>>>>> amounts >>>>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to >>>>>>>> train >>>>>>>> my fingers (and my scripts) to do the right thing. 
>>>>>>> You should have seen my slack channel at that time. :) Maybe the >>>>>>> "wall" was primarily from a couple of people who strongly objected. >>>>>>>> >>>>>>>> Interestingly, David H was a "nay" on changing >>>>>>>> UnlockDiagnosticVMOptions >>>>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>>>>> >>>>>>> >>>>>>> I think he's mostly just asking the question. We'll see what he >>>>>>> answers later. >>>>>> >>>>>> Yes I'm just asking the question. I don't think changing this >>>>>> buys us much other than "it's now the same as for diagnostic >>>>>> flags". Testing these flags can (and probably should) be handled >>>>>> explicitly. >>>>> >>>>> I disagree. I don't think we should test these flags explicitly >>>>> when we have a perfectly good test for all the flags, that should >>>>> be enabled. Which is what my change does. >>>> >>>> Your change only causes the experimental flags to be tested in >>>> debug builds. I would argue they should also be tested in product >>>> builds, hence the need to be explicit about it. >>> >>> The same is true for diagnostic options. I'd be surprised if >>> testing in release made a difference though, except taking more time. >>> >>> Coleen >>> >>>> >>>> David >>>> ----- >>>> >>>>>> >>>>>> I looked back at the discussion on JDK-8153783 (sorry can't >>>>>> recall what may have been said in slack) and I'm not sure what my >>>>>> specific concern was then. From a testing perspective if you use >>>>>> an experimental or diagnostic flag then you should remember to >>>>>> explicitly unlock it in the test setup. Not having trueInDebug >>>>>> catches when you forget that and only test in a debug build. >>>>> >>>>> Yes, that was the rationale for making it 'false' rather than >>>>> 'trueInDebug'. People were adding tests with a diagnostic option >>>>> and it was failing in product mode because the Unlock flag wasn't >>>>> present.
The more vocal side of the question didn't want to have >>>>> to add the Unlock flag for all their day to day local testing. I >>>>> assume the same argument can be made for the experimental options. >>>>> >>>>> It would be good to hear the opinion from someone who uses these >>>>> options. This is degenerated into an opinion question, and >>>>> besides being able to cleanly test these options, neither one of >>>>> us uses or tests experimental options as far as I can tell. I see >>>>> tests from the Compiler and GC components. What do other people >>>>> think? >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> ----- >>>>>> >>>>>>>> >>>>>>>>> I think the same exact arguments should apply to >>>>>>>>> UnlockExperimentalVMOptions. I'd like to hear from someone >>>>>>>>> that uses experimental options on ZGC or shenandoah, since >>>>>>>>> those have the most experimental options. >>>>>>>> >>>>>>>> I agree that the same arguments apply to >>>>>>>> UnlockExperimentalVMOptions. >>>>>>>> For consistency's sake if anything, they should be the same. >>>>>>>> >>>>>>>> >>>>>>>>> The reason that I made it trueInDebug is so that the command >>>>>>>>> line option range test would test these options. Otherwise a >>>>>>>>> more hacky solution could be done, including adding the >>>>>>>>> parameter -XX:+UnlockExperimentalVMOptions to all the VM >>>>>>>>> option range tests. I'd rather not do this. >>>>>>>> >>>>>>>> Can explain this a bit more? Why would a default value of >>>>>>>> 'false' mean that >>>>>>>> the command line option range test would not test these options? >>>>>>> >>>>>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>>>>> -version and collect the flags that come out, parse the ranges, >>>>>>> and then run java with each of these flags with the limits of >>>>>>> the range (unless the limit is INT_MAX). Some flags are >>>>>>> excluded explicitly because they cause problems.
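The range-test flow Coleen describes — run java -XX:+PrintFlagsRanges -version, parse each flag's range, then rerun the VM with the flag pinned to each end of its range unless the upper limit is INT_MAX — boils down to a boundary-value selection step. A minimal C++ sketch of that step (illustrative names; the real harness is jtreg Java test code, not this):

```cpp
#include <climits>
#include <string>
#include <vector>

// Sketch of the boundary-value selection Coleen describes: for each flag
// reported by -XX:+PrintFlagsRanges, the harness reruns the VM with the
// flag set to each end of its declared range, skipping upper limits that
// are INT_MAX.  Names here are illustrative, not the actual test code.
struct FlagRange {
  std::string name;
  long min;
  long max;
};

std::vector<long> boundary_values(const FlagRange& r) {
  std::vector<long> vals;
  vals.push_back(r.min);          // always exercise the lower limit
  if (r.max != INT_MAX) {         // INT_MAX-sized limits are not exercised
    vals.push_back(r.max);
  }
  return vals;
}
```

For SymbolTableSize with range(minimumSymbolTableSize, 16777216) this yields both endpoints, while a flag whose upper limit is INT_MAX is only exercised at its minimum — which is why a flag hidden from -XX:+PrintFlagsRanges escapes the test entirely.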
>>>>>>> The reason that SymbolTableSize escaped the testing, is because >>>>>>> it wasn't reported with -XX:+PrintFlagsRanges. You'd need >>>>>>> -XX:+UnlockExperimentalVMOptions in the java command to gather >>>>>>> the flags, and then pass it to all the java commands to test the >>>>>>> ranges. It's not that bad, just a bit gross. >>>>>>> >>>>>>> In any case, I think the experimental flags ranges should be >>>>>>> tested. I'm glad/amazed that more didn't fail when I turned it >>>>>>> on in my testing. >>>>>>> >>>>>>>> >>>>>>>> In any case, I'm fine if you want to move forward with changing >>>>>>>> the >>>>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>>>> >>>>>>> >>>>>>> Okay, we'll wait to see whether I get a wall of opposition or >>>>>>> support. I still think it should be by default the same as >>>>>>> UnlockDiagnosticVMOptions. >>>>>>> >>>>>>> Thanks! >>>>>>> Coleen >>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Dan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Functional change seems fine. Is it worth adding a >>>>>>>>>>>> clarifying comment to: >>>>>>>>>>>> >>>>>>>>>>>> +          range(minimumSymbolTableSize, 16777216ul) \ >>>>>>>>>>>> >>>>>>>>>>>> with: >>>>>>>>>>>> >>>>>>>>>>>> +          range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>>>>> */) \ >>>>>>>>>>> >>>>>>>>>>> Let me see if the X macro allows that and I could also add >>>>>>>>>>> that to StringTableSize (which is not experimental option). >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>>>>> experimental option range.
Make experimental options >>>>>>>>>>>>> trueInDebug so they're tested by the command line option >>>>>>>>>>>>> testing >>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at >>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>>>> >>>>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has >>>>>>>>>>>>> a lot of experimental options. I didn't test with >>>>>>>>>>>>> shenandoah. >>>>>>>>>>>>> >>>>>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> >> From coleen.phillimore at oracle.com Fri Jul 26 14:38:59 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 26 Jul 2019 10:38:59 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: Message-ID: I like this change! http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.inline.hpp.udiff.html Can you remove some #include directives from the GC code since the oop storages are coming from oopStorages? http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/prims/resolvedMethodTable.cpp.udiff.html Does this still need to #include oopStorage.inline.hpp ? http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/prims/resolvedMethodTable.hpp.frames.html This doesn't seem to need to include oopStorage.hpp. Might be others too. http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/gc/shared/oopStorages.hpp.html I thought our compilers were tolerant of a trailing comma in enumerations? The macros aren't bad though.
It seems like it would be easy to add a new OopStorage if we wanted to, but it would be better to use an existing vm weak or vm strong oop storage, if we wanted to move more oops into an oop storage (say for JFR). This change looks great apart from trying to remove more #includes. Thanks! Coleen On 7/25/19 6:59 PM, Kim Barrett wrote: > 8227054: ServiceThread needs to know about all OopStorage objects > 8227053: ServiceThread cleanup of OopStorage is missing some > > Please review this change in how OopStorage objects are managed and > accessed. There is a new (all static) class, OopStorages, which > provides infrastructure for creating all the storage objects, access > via an enum-based id, and iterating over them. > > Various components that previously managed their own storage objects > now obtain them from OopStorages. A number of access functions have > been eliminated as part of that, though some have been retained for > internal convenience of a component. > > The set of OopStorage objects is now declared in one place, using > x-macros, with collective definitions and usages ultimately driven off > those macros. This includes the ServiceThread (which no longer needs > explicit knowledge of the set, and is no longer missing any) and the > OopStorage portion of WeakProcessorPhases. For now, the various GCs > still have explicit knowledge of the set; that will be addressed in > followup changes specific to each collector. (This delay minimizes > the impact on Leo's in-progress review that changes ParallelGC to use > WorkGangs.) > > This change also includes a couple of utility macros for working with > x-macros. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8227054 > https://bugs.openjdk.java.net/browse/JDK-8227053 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/ > > Testing: > mach5 tier1-3 > From gerard.ziemski at oracle.com Fri Jul 26 14:47:50 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Fri, 26 Jul 2019 09:47:50 -0500 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> Message-ID: <803024d4-bfd8-c58a-814d-8aaf49eecc3c@oracle.com> Thank you for fixing this! Just a small nick pick - this comment in symbolTable.cpp: +// 2^24 is max size, like StringTable. doesn't particularly strike me as all that useful. cheers On 7/25/19 4:53 PM, coleen.phillimore at oracle.com wrote: > > After some offline polling of various people, I'm going to withdraw > the UnlockExperimentalOptions change to trueInDebug, and fixed the > options test. > > http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html > > These test changes would have found the original bug. > > Thanks, > Coleen From mandy.chung at oracle.com Fri Jul 26 16:06:46 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 26 Jul 2019 09:06:46 -0700 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: Message-ID: Hi Kim, A passing comment (not a review): 40 // For serviceability agent. ServiceThread is used for serviceability such as JVM TI and M&M but not serviceability agent. 
Mandy On 7/25/19 3:59 PM, Kim Barrett wrote: > 8227054: ServiceThread needs to know about all OopStorage objects > 8227053: ServiceThread cleanup of OopStorage is missing some > > Please review this change in how OopStorage objects are managed and > accessed. There is a new (all static) class, OopStorages, which > provides infrastructure for creating all the storage objects, access > via an enum-based id, and iterating over them. > > Various components that previously managed their own storage objects > now obtain them from OopStorages. A number of access functions have > been eliminated as part of that, though some have been retained for > internal convenience of a component. > > The set of OopStorage objects is now declared in one place, using > x-macros, with collective definitions and usages ultimately driven off > those macros. This includes the ServiceThread (which no longer needs > explicit knowledge of the set, and is no longer missing any) and the > OopStorage portion of WeakProcessorPhases. For now, the various GCs > still have explicit knowledge of the set; that will be addressed in > followup changes specific to each collector. (This delay minimizes > the impact on Leo's in-progress review that changes ParallelGC to use > WorkGangs.) > > This change also includes a couple of utility macros for working with > x-macros. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8227054 > https://bugs.openjdk.java.net/browse/JDK-8227053 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/ > > Testing: > mach5 tier1-3 > From kim.barrett at oracle.com Fri Jul 26 16:38:54 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 Jul 2019 12:38:54 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: Message-ID: <5C4FAAEA-3C4C-495B-AF58-4744B47E6CE4@oracle.com> > On Jul 26, 2019, at 12:06 PM, Mandy Chung wrote: > > Hi Kim, > > A passing comment (not a review): > 40 // For serviceability agent. > > > > ServiceThread is used for serviceability such as JVM TI and M&M but not serviceability agent. The comment is correct. The variables and their initialization that the comment applies to are referred to by the serviceability agent. The ServiceThread doesn't care about these at all. Indeed, with these changes, nothing in the VM cares about them anymore, and I was originally going to delete them. From mandy.chung at oracle.com Fri Jul 26 17:31:52 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 26 Jul 2019 10:31:52 -0700 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: <5C4FAAEA-3C4C-495B-AF58-4744B47E6CE4@oracle.com> References: <5C4FAAEA-3C4C-495B-AF58-4744B47E6CE4@oracle.com> Message-ID: <96f25960-65a0-d25f-3873-a78663edfef2@oracle.com> On 7/26/19 9:38 AM, Kim Barrett wrote: >> On Jul 26, 2019, at 12:06 PM, Mandy Chung wrote: >> >> Hi Kim, >> >> A passing comment (not a review): >> 40 // For serviceability agent. >> >> >> >> ServiceThread is used for serviceability such as JVM TI and M&M but not serviceability agent. > The comment is correct. The variables and their initialization that the comment > applies to are referred to by the serviceability agent. The ServiceThread doesn't > care about these at all.
Indeed, with these changes, nothing in the VM cares about > them anymore, and I was originally going to delete them. > I see what you meant. It might be clearer to say these fields are read by serviceability agent. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JNIHandles.java Mandy From kim.barrett at oracle.com Fri Jul 26 18:05:12 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 Jul 2019 14:05:12 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: <96f25960-65a0-d25f-3873-a78663edfef2@oracle.com> References: <5C4FAAEA-3C4C-495B-AF58-4744B47E6CE4@oracle.com> <96f25960-65a0-d25f-3873-a78663edfef2@oracle.com> Message-ID: > On Jul 26, 2019, at 1:31 PM, Mandy Chung wrote: > > > > On 7/26/19 9:38 AM, Kim Barrett wrote: >>> On Jul 26, 2019, at 12:06 PM, Mandy Chung wrote: >>> >>> Hi Kim, >>> >>> A passing comment (not a review): >>> 40 // For serviceability agent. >>> >>> >>> >>> ServiceThread is used for serviceability such as JVM TI and M&M but not serviceability agent. >> The comment is correct. The variables and their initialization that the comment >> applies to are referred to by the serviceability agent. The ServiceThread doesn't >> care about these at all. Indeed, with these changes, nothing in the VM cares about >> them anymore, and I was originally going to delete them. >> > > I see what you meant. It might be clearer to say these fields are read by serviceability agent. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JNIHandles.java > > Mandy I changed the comment to say "These are used by the serviceability agent."
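The x-macro scheme Kim describes — the set of storages declared once, with the enum-based ids and the iteration table both generated from that single list — can be sketched roughly as follows (illustrative names, not the actual OopStorages declarations):

```cpp
#include <cstddef>

// Illustrative x-macro list of storage names (not the real HotSpot set).
// The list is the single point of truth; everything else expands from it.
#define EXAMPLE_STORAGES_DO(f) \
  f(VMGlobal)                  \
  f(VMWeak)                    \
  f(StringTableWeak)

// Generate one enum id per storage.  A final "Count" sentinel doubles as
// the number of storages and absorbs the trailing comma, which some
// compilers reject inside an enum.
#define MAKE_ID(name) name##_id,
enum StorageId {
  EXAMPLE_STORAGES_DO(MAKE_ID)
  StorageIdCount
};
#undef MAKE_ID

// Generate a parallel table of names, usable for iterating over the set.
#define MAKE_NAME(name) #name,
static const char* const storage_names[StorageIdCount] = {
  EXAMPLE_STORAGES_DO(MAKE_NAME)
};
#undef MAKE_NAME
```

Adding or removing a storage then means editing only the list macro; the enum, the count, and the name table stay in sync automatically, which is how a client like the ServiceThread can cover the whole set without naming each storage.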
From stefan.karlsson at oracle.com Fri Jul 26 20:18:54 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 26 Jul 2019 22:18:54 +0200 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: References: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> Message-ID: <032d94a2-d2a9-7d59-92c9-47dcd2acc2de@oracle.com> FWIW, I have a prototype that rewrites markOopDesc that gets rid of this undefined behavior. I got some internal feedback that it was a worthwhile change, but I didn't have time to get this into JDK 13, and postponed it. The first patch renames the markOopDesc to MarkWord and removes the inheritances from oopDesc: https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.simpleRename/ On top of that I have the patch that makes MarkWord an AllStatic class and removes the UB: https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.makeStatic.delta/ https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.makeStatic/ This was written in May and hasn't been rebased against the latest changes. StefanK On 2019-07-12 16:46, Erik Österlund wrote: > Hi Harold, > > It's worse than that though, unfortunately. You are not allowed to > have "this" equal to NULL, whether you perform such explicit NULL > comparisons or not. > > The implication is that as long as "inflating" is NULL, we kind of > can't use any of the functions on markOop and hence must rewrite > pretty much all uses of markOop to do something else. > The same goes for things like Register, where rax == NULL. To be > compliant, we would similarly have to rewrite all uses of Register. > > In other words, if we are to really hunt down uses of this == NULL > and remove them, we will find ourselves with a mountain of work. > > Again, just gonna drop that here and run.
> > /Erik > > On 2019-07-12 14:14, Harold Seigel wrote: >> The functions that compare 'this' to NULL could be changed from >> instance to static functions where 'this' is explicitly passed as a >> parameter. Then you could keep the equivalent NULL checks. >> >> Harold >> >> On 7/12/2019 4:22 AM, Erik Österlund wrote: >>> Hi Matthias, >>> >>> Removing such NULL checks seems like a good idea in general due to >>> the undefined behaviour. >>> Worth mentioning though that there are some tricky ones, like in >>> markOopDesc* where this == NULL >>> means that the mark word has the "inflating" value. So we explicitly >>> check if this == NULL and >>> hope the compiler will not elide the check. Just gonna drop that one >>> here and run for it. >>> >>> Thanks, >>> /Erik >>> >>> On 2019-07-12 09:48, Baesken, Matthias wrote: >>>> Hello , when looking into the recent xlc16 / xlclang warnings I >>>> came across those 3 : >>>> >>>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: >>>> 'this' pointer cannot be null in well-defined C++ code; >>>> comparison may be assumed to always evaluate to true >>>> [-Wtautological-undefined-compare] >>>> if( this != NULL ) { >>>> ^~~~ ~~~~ >>>> >>>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: >>>> 'this' pointer cannot be null in well-defined C++ code; >>>> comparison may be assumed to always evaluate to false >>>> [-Wtautological-undefined-compare] >>>> if( this == NULL ) return; >>>> >>>> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: 'this' >>>> pointer cannot be null in well-defined C++ code; >>>> comparison may be assumed to always evaluate to false >>>> [-Wtautological-undefined-compare] >>>> if( this == NULL ) return os::strdup("{no set}"); >>>> >>>> >>>> Do you think the NULL-checks can be removed or is there still some >>>> value in doing them ?
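Harold's suggestion — replace the instance methods that test 'this' against NULL with static functions that take the pointer explicitly — can be sketched like this (an illustrative Set type, not the actual libadt/adlc code):

```cpp
#include <cstddef>

// The pattern the xlclang warning flags is a member function guarding on
// its own 'this' pointer, e.g.:
//
//   const char* format() { if (this == NULL) return "{no set}"; ... }
//
// which is undefined behaviour: the compiler may assume 'this' is never
// null and elide the check entirely.  The alternative Harold describes
// passes the pointer explicitly, so the null check is well-defined:
struct Set {
  static const char* format(const Set* s) {
    return (s == NULL) ? "{no set}" : "set";
  }
};
```

Callers change from s->format() to Set::format(s), and the "no set" case keeps working even when s is genuinely null — without relying on the compiler not optimizing the check away.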
>>>> >>>> Best regards, Matthias >>> > From kim.barrett at oracle.com Fri Jul 26 20:39:19 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 Jul 2019 16:39:19 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: Message-ID: <8181924C-36E6-4B91-9567-150F00286241@oracle.com> > On Jul 26, 2019, at 10:38 AM, coleen.phillimore at oracle.com wrote: > > > I like this change! > > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.inline.hpp.udiff.html > > Can you remove some #include directives from the GC code since the oop storages are coming from oopStorages? > > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/prims/resolvedMethodTable.cpp.udiff.html > > Does this still need to #include oopStorage.inline.hpp ? > > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/prims/resolvedMethodTable.hpp.frames.html > > This doesn't seem to need to include oopStorage.hpp. Might be others too. Thanks for suggesting some #include trimming. I did some cleanup. > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/src/hotspot/share/gc/shared/oopStorages.hpp.html > > I thought our compilers were tolerant of a trailing comma in enumerations? Solaris Studio doesn't like trailing commas in enums. > The macros aren't bad though. It seems like it would be easy to add a new OopStorage if we wanted to, but it would be better to use an existing vm weak or vm strong oop storage, if we wanted to move more oops into an oop storage (say for JFR). Right. Also easy to remove one. (Someone told me there are ideas afloat that might remove the need for the ResolvedMethodTableWeak storage object.) I don't expect the set to be changing frequently. But it's a bit of a pain to track down all the right places. (Even still, because the GCs still have explicit knowledge of the full set, though this change provides tools to address that.) 
> This change looks great apart from trying to remove more #includes. Thanks. New webrevs: full: http://cr.openjdk.java.net/~kbarrett/8227054/open.01/ incr: http://cr.openjdk.java.net/~kbarrett/8227054/open.01.inc/ From mikhailo.seledtsov at oracle.com Fri Jul 26 22:57:01 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Fri, 26 Jul 2019 15:57:01 -0700 Subject: RFR: [XS] 8228650: runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX In-Reply-To: References: Message-ID: Looks good to me, Misha On 7/26/19 6:25 AM, Baesken, Matthias wrote: > Hello, please review this small CDS test related fix . > > Currently the runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX . > It runs into this NPE : > > java.lang.NullPointerException > at java.base/sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:75) > at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69) > at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279) > at java.base/java.nio.file.Path.of(Path.java:147) > at java.base/java.nio.file.Paths.get(Paths.java:69) > at CheckDefaultArchiveFile.main(CheckDefaultArchiveFile.java:51) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:567) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) > at java.base/java.lang.Thread.run(Thread.java:830) > > Reason is that the path to the classes.jsa is null, which is correct on a non CDS platform (like AIX).
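That NULL result on a non-CDS platform comes from a conditional-compilation stub idiom: when a feature is excluded from the build, a macro turns the function declaration into an inline body returning a fixed value. A simplified sketch (the MY_-prefixed names are illustrative stand-ins, not the real HotSpot macros):

```cpp
#include <cstddef>

// Simplified imitation of HotSpot's feature-conditional stub macros:
// with the feature compiled in, the macro leaves a plain declaration;
// with it compiled out, it expands to an inline body returning the
// given value.  That is why get_default_shared_archive_path() yields
// NULL on a platform built without CDS, such as AIX.
#define MY_INCLUDE_CDS 0   // pretend we are on a platform without CDS

#if MY_INCLUDE_CDS
#define MY_NOT_CDS_RETURN_(code) ;
#else
#define MY_NOT_CDS_RETURN_(code) { return code; }
#endif

struct Arguments {
  static char* get_default_shared_archive_path() MY_NOT_CDS_RETURN_(NULL)
};
```

A caller that does not expect the stubbed-out value — here, passing the NULL path straight to Paths.get on the Java side — is exactly how the test tripped over the NPE.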
> See the coding : > > jdk/src/hotspot/share/runtime/arguments.hpp : > > static char* get_default_shared_archive_path() NOT_CDS_RETURN_(NULL); > > jdk/src/hotspot/share/runtime/arguments.cpp : > > 3477 #if INCLUDE_CDS > 3478 // Sharing support > 3479 // Construct the path to the archive > 3480 char* Arguments::get_default_shared_archive_path() { > > > However the test does not handle this case correctly. > AIX has CDS disabled, it is currently not supported on the platform . > > jdk/make/autoconf/hotspot.m4 : > > 495 # Disable CDS on AIX. > 496 if test "x$OPENJDK_TARGET_OS" = "xaix"; then > 497 ENABLE_CDS="false" > 498 if test "x$enable_cds" = "xyes"; then > 499 AC_MSG_ERROR([CDS is currently not supported on AIX. Remove --enable-cds.]) > 500 fi > 501 fi > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228650 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228650.0/ > > > Thanks, Matthias From vladimir.kozlov at oracle.com Sat Jul 27 01:14:27 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 26 Jul 2019 18:14:27 -0700 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: References: Message-ID: <746abe9f-1839-a2fd-027e-3a3e7f94e016@oracle.com> Looks good. Thanks, Vladimir On 7/26/19 5:23 AM, Christian Hagedorn wrote: > Hi > > Please review the following patch: > http://cr.openjdk.java.net/~thartmann/8156207/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8156207 > > This is just a completion of the suggested change in the comments. > > Thank you! 
> > Best regards, > Christian From kim.barrett at oracle.com Sat Jul 27 23:42:33 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Sat, 27 Jul 2019 19:42:33 -0400 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: References: Message-ID: > On Jul 26, 2019, at 8:23 AM, Christian Hagedorn wrote: > > Hi > > Please review the following patch: > http://cr.openjdk.java.net/~thartmann/8156207/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8156207 > > This is just a completion of the suggested change in the comments. > > Thank you! > > Best regards, > Christian ------------------------------------------------------------------------------ src/hotspot/share/utilities/bitMap.hpp 332 // Clears the bitmap memory. 333 ResourceBitMap(idx_t size_in_bits, bool clear = true); The comment should be updated to indicate the clearing is conditional. ------------------------------------------------------------------------------ Otherwise, looks good. I don't need a new webrev for a fix of that comment. While reviewing this change I noticed JDK-8228692. From dms at samersoff.net Sun Jul 28 16:52:37 2019 From: dms at samersoff.net (Dmitry Samersoff) Date: Sun, 28 Jul 2019 19:52:37 +0300 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> Message-ID: Hello Aleksey, macroAssembler_aarch64.cpp:1414 call_VM_leaf_base 1. Do I understand correctly that we no longer use number_of_arguments parameter? Should we remove it and version of call_VM_leaf on l. 1430 2. Do we still need to stp/ldp rscratch1? The rest looks good for me. -Dmitry On 26.07.2019 9:56, Aleksey Shipilev wrote: > On 7/25/19 2:22 PM, Andrew Dinn wrote: >> 1) File cpustate_aarch64.hpp exists primarily to declare a class >> CPUState. This was needed to save/restore AArch64 register state on exit >> from/re-entry into the simulator. 
I don't think anything else ought to >> be using class CPUState or any of the other types it defines. >> >> Was there any good reason not simply to delete this file? (if so perhaps >> whatever is keeping that file alive needs to be relocated to a home that >> corresponds to the x86 file layout). > > Right on, removed cpustate_aarch64.hpp. > >> 2) File decode_aarch64.hpp contains almost entirely redundant stuff. I >> believe the only code that is referenced from another file is the suite >> of various pickbit* functions and their underlying mask* functions, the >> client being code in file immediate_aarch64.cpp. All the enums are >> redundant. So, I think this needs fixing by removing everything but the >> pickbit* and mask* fns. It would probably be better to move these to >> file immediate_aarch64.hpp and delete file decode_aarch64.hpp. > > Right. I moved required definitions to immediate_aarch64.cpp and removed decode_aarch64.hpp. > > New webrev: > http://cr.openjdk.java.net/~shade/8228400/webrev.03/ > > Testing: aarch64 cross-build; (gonna test tier1 once aarch64 box is free) > From shade at redhat.com Sun Jul 28 20:03:10 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Sun, 28 Jul 2019 22:03:10 +0200 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> Message-ID: <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> On 7/28/19 6:52 PM, Dmitry Samersoff wrote: > 1. Do I understand correctly that we no longer use number_of_arguments > parameter? Yes, I think so. > Should we remove it and version of call_VM_leaf on l. 1430 Maybe? I would leave it as follow-up. The change would be local and easy to test separately. Unfortunately, it would invalidate lots of testing already done for this patch. I can see how much hassle that would be, and maybe fold that improvement here... > 2. Do we still need to stp/ldp rscratch1? I think so. 
AFAIU, it caller-saves rscratch1+rmethod, which is what call_VM_leaf_base might expect? The call is still pretty much there, so calling convention should be kept intact. -Aleksey From christian.hagedorn at oracle.com Mon Jul 29 07:34:10 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Mon, 29 Jul 2019 09:34:10 +0200 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: <746abe9f-1839-a2fd-027e-3a3e7f94e016@oracle.com> References: <746abe9f-1839-a2fd-027e-3a3e7f94e016@oracle.com> Message-ID: <133b3675-4bf6-eee6-24db-17a3b85fa5ad@oracle.com> Hi Vladimir Thanks for the review. Best regards, Christian On 27.07.19 03:14, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 7/26/19 5:23 AM, Christian Hagedorn wrote: >> Hi >> >> Please review the following patch: >> http://cr.openjdk.java.net/~thartmann/8156207/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8156207 >> >> This is just a completion of the suggested change in the comments. >> >> Thank you! >> >> Best regards, >> Christian From christian.hagedorn at oracle.com Mon Jul 29 07:38:44 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Mon, 29 Jul 2019 09:38:44 +0200 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: References: Message-ID: <9d4c63d5-0238-6ea7-f989-11fd15378d3b@oracle.com> Hi Kim On 28.07.19 01:42, Kim Barrett wrote: > Otherwise, looks good. I don't need a new webrev for a fix of that > comment. Thanks for the review. I updated the comment. > While reviewing this change I noticed JDK-8228692. Nice finding! 
Best regards, Christian From matthias.baesken at sap.com Mon Jul 29 08:11:01 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 29 Jul 2019 08:11:01 +0000 Subject: RFR: [XS] 8228650: runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX In-Reply-To: References: Message-ID: Thanks for the review ! Best regards, Matthias > -----Original Message----- > From: mikhailo.seledtsov at oracle.com > Sent: Samstag, 27. Juli 2019 00:57 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: RFR: [XS] 8228650: > runtime/SharedArchiveFile/CheckDefaultArchiveFile.java test fails on AIX > > Looks good to me, > > Misha > > On 7/26/19 6:25 AM, Baesken, Matthias wrote: > > Hello, please review this small CDS test related fix . > > > > Currently the runtime/SharedArchiveFile/CheckDefaultArchiveFile.java > test fails on AIX . > > It runs into this NPE : > > > > java.lang.NullPointerException > > at java.base/sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:75) > > at java.base/sun.nio.fs.UnixPath.(UnixPath.java:69) > > at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279) > > at java.base/java.nio.file.Path.of(Path.java:147) > > at java.base/java.nio.file.Paths.get(Paths.java:69) > > at CheckDefaultArchiveFile.main(CheckDefaultArchiveFile.java:51) > > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMet > hodAccessorImpl.java:62) > > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Delega > tingMethodAccessorImpl.java:43) > > at java.base/java.lang.reflect.Method.invoke(Method.java:567) > > at > com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapp > er.java:127) > > at java.base/java.lang.Thread.run(Thread.java:830) > > > > Reason is that the path to the classes.jsa is null, which is correct on a non > CDS platform (like AIX). 
> > See the coding : > > > > jdk/src/hotspot/share/runtime/arguments.hpp : > > > > static char* get_default_shared_archive_path() > NOT_CDS_RETURN_(NULL); > > > > jdk/src/hotspot/share/runtime/arguments.cpp : > > > > 3477 #if INCLUDE_CDS > > 3478 // Sharing support > > 3479 // Construct the path to the archive > > 3480 char* Arguments::get_default_shared_archive_path() { > > > > > > However the test does not handle this case correctly. > > AIX has CDS disabled, it is currently not supported on the platform . > > > > jdk/make/autoconf/hotspot.m4 : > > > > 495 # Disable CDS on AIX. > > 496 if test "x$OPENJDK_TARGET_OS" = "xaix"; then > > 497 ENABLE_CDS="false" > > 498 if test "x$enable_cds" = "xyes"; then > > 499 AC_MSG_ERROR([CDS is currently not supported on AIX. Remove -- > enable-cds.]) > > 500 fi > > 501 fi > > > > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8228650 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228650.0/ > > > > > > Thanks, Matthias From adinn at redhat.com Mon Jul 29 08:54:06 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Jul 2019 09:54:06 +0100 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> Message-ID: On 28/07/2019 21:03, Aleksey Shipilev wrote: > On 7/28/19 6:52 PM, Dmitry Samersoff wrote: >> 1. Do I understand correctly that we no longer use number_of_arguments >> parameter? > > Yes, I think so. Yes we don't need to use this parameter. Indeed we could probably change the signature to match the fact that we never supply an argument for it. However, ... The reason it is declared in the Aarch64 code is because it mirrors the code for x86_64. The argument is not needed for x86_64 either but the code is dual purpose for x86_32 where the number of parameters is needed. 
So, by dropping this parameter we would be choosing to diverge from x86_64. I'm not sure how much virtue there is in doing that. When we first ported the x86_64 code to Aarch64 we tried to keep the code for the two ports aligned as far as possible (i.e. only diverge when the architecture and/or performance required it). n.b. that's much the same tactic as is adopted when backporting and happens for much the same reasons -- many innovations happen first in x86, hence need /cross/ porting to AArch64. This policy has occasionally led to minor oddities like this one but it has also made development and maintenance much easier. Diverging on this specific point probably wouldn't matter too much one way or the other. So long as whoever is maintaining the code knows that it is derived from x86_64 they can easily make allowance for such minor differences. However, it gets more and more difficult to port code as these sorts of changes accumulate. Personally, I would prefer to keep the two ports aligned as far as possible because my experience is that it has made it a lot easier to avoid errors and spot defects. i.e. I think it is a benefit rather than a problem that maintainers really need to keep that alignment in mind. >> Should we remove it and version of call_VM_leaf on l. 1430 > > Maybe? I would leave it as follow-up. The change would be local and easy to test separately. > Unfortunately, it would invalidate lots of testing already done for this patch. I can see how much > hassle that would be, and maybe fold that improvement here... See above. However, Aleksey is right that this should be done as a follow-up patch if at all. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From shade at redhat.com Mon Jul 29 09:05:28 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 29 Jul 2019 11:05:28 +0200 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> Message-ID: <297dda0b-580f-dc21-0ad6-2006d58bf91e@redhat.com> On 7/29/19 10:54 AM, Andrew Dinn wrote: > See above. However, Aleksey is right that this should be done as a > follow-up patch if at all. Yes. So I would be pushing the already reviewed change soon then. -- Thanks, -Aleksey From adinn at redhat.com Mon Jul 29 09:12:53 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 29 Jul 2019 10:12:53 +0100 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: <297dda0b-580f-dc21-0ad6-2006d58bf91e@redhat.com> References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> <297dda0b-580f-dc21-0ad6-2006d58bf91e@redhat.com> Message-ID: On 29/07/2019 10:05, Aleksey Shipilev wrote: > On 7/29/19 10:54 AM, Andrew Dinn wrote: >> See above. However, Aleksey is right that this should be done as a >> follow-up patch if at all. > Yes. So I would be pushing the already reviewed change soon then. Please do. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From matthias.baesken at sap.com Mon Jul 29 10:20:42 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 29 Jul 2019 10:20:42 +0000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms Message-ID: Hello , please review this small test fix . 
The test test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java fails sometimes on fast Linux machines with this error message : java.lang.RuntimeException: Total safepoint time illegal value: 0 ms (MIN = 1; MAX = 9223372036854775807) looks like the total safepoint time is too low currently on these machines, it is < 1 ms. There might be several ways to handle this : * Change the test in a way that it might generate higher safepoint times * Allow safepoint time == 0 ms * Offer an additional interface that gives safepoint times with finer granularity ( currently the HS has safepoint time values in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp SafepointTracing::end But it is converted to ms in this code 114 jlong RuntimeService::safepoint_time_ms() { 115 return UsePerfData ? 116 Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; 117 } 2064 jlong Management::ticks_to_ms(jlong ticks) { 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); 2066 return (jlong)(((double)ticks / (double)os::elapsed_frequency()) 2067 * (double)1000.0); 2068 } Currently I go for the first attempt (and try to generate higher safepoint times in my patch) . Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8228658 http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ Thanks, Matthias From christian.hagedorn at oracle.com Mon Jul 29 13:27:06 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Mon, 29 Jul 2019 15:27:06 +0200 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once Message-ID: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> Hi Please review the following enhancement: http://cr.openjdk.java.net/~thartmann/8193042/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8193042 This avoids repeated loads/unloads of the same shared library. Thanks! 
Best regards, Christian From sgehwolf at redhat.com Mon Jul 29 14:02:35 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 29 Jul 2019 16:02:35 +0200 Subject: [8u] [RFR] 8140482: Various minor code improvements (runtime) In-Reply-To: References: Message-ID: <292ecbce64827d6f9596c85aa20587c87f0ca18d.camel@redhat.com> Hi Andrew, On Wed, 2018-11-21 at 06:45 +0000, Andrew Hughes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8140482 > Original changeset: > https://hg.openjdk.java.net/jdk-updates/jdk9u/hotspot/rev/cd86b5699825 > Webrev: https://cr.openjdk.java.net/~andrew/openjdk8/8140482/webrev.01/ > > The patch largely applies as is, with some adjustment for context and > the dropping of the changes to src/cpu/x86/vm/stubRoutines_x86.cpp, > src/share/vm/runtime/task.cpp and src/os/windows/vm/attachListener_windows.cpp > which don't exist in 8u. A clean backport of 7127191 is included, which > allows the changes to agent/src/os/linux/libproc_impl.c to apply as-is. I see that 7127191 is already part of openjdk8u212. Can you rebase your webrev to jdk8u-dev HEAD, please? Thanks, Severin > Applying the change to 8u improves the code quality there and aids > in backporting other changes, such as 8210836 [0]. > > Ok for 8u? 
> > [0] https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-November/025991.html > > Thanks, From coleen.phillimore at oracle.com Mon Jul 29 14:56:34 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 29 Jul 2019 10:56:34 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <803024d4-bfd8-c58a-814d-8aaf49eecc3c@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> <803024d4-bfd8-c58a-814d-8aaf49eecc3c@oracle.com> Message-ID: On 7/26/19 10:47 AM, gerard ziemski wrote: > Thank you for fixing this! > > Just a small nitpick - this comment in symbolTable.cpp: > > +// 2^24 is max size, like StringTable. > > doesn't particularly strike me as all that useful. Thanks Gerard. I don't see why it should be different. Most Strings come to the JVM as Symbols first, i.e. JVM_CONSTANT_Utf8, so it seems that the tables should have the same limits. Thank you for the code review. Coleen > > cheers > > > > On 7/25/19 4:53 PM, coleen.phillimore at oracle.com wrote: >> >> After some offline polling of various people, I'm going to withdraw >> the UnlockExperimentalOptions change to trueInDebug, and fixed the >> options test. >> >> http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html >> >> These test changes would have found the original bug. 
>> >> Thanks, >> Coleen > From coleen.phillimore at oracle.com Mon Jul 29 15:16:42 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 29 Jul 2019 11:16:42 -0400 Subject: this-pointer NULL-checks in hotspot codebase [-Wtautological-undefined-compare] In-Reply-To: <032d94a2-d2a9-7d59-92c9-47dcd2acc2de@oracle.com> References: <55e8bddf-3228-0fd7-3639-cc9bc920e2c5@oracle.com> <032d94a2-d2a9-7d59-92c9-47dcd2acc2de@oracle.com> Message-ID: On 7/26/19 4:18 PM, Stefan Karlsson wrote: > FWIW, I have a prototype that rewrites markOopDesc that gets rid of > this undefined behavior. I got some internal feedback that it was a > worthwhile change, but I didn't have time to get this into JDK 13, > and postponed it. > > The first patch renames the markOopDesc to MarkWord and removes the > inheritances from oopDesc: > https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.simpleRename/ > I didn't click all of it, but you could also move markOop* to markWord.* but keep it in oops, please. > > On top of that I have the patch that makes MarkWord an AllStatic class > and removes the UB: > https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.makeStatic.delta/ > > https://cr.openjdk.java.net/~stefank/prototype/markWord/webrev.markWord.makeStatic/ > > > This was written in May and hasn't been rebased against the latest > changes. I vote for you to continue this! Coleen > > StefanK > > On 2019-07-12 16:46, Erik Österlund wrote: >> Hi Harold, >> >> It's worse than that though, unfortunately. You are not allowed to >> have "this" equal to NULL, whether you perform such explicit NULL >> comparisons or not. >> >> The implication is that as long as "inflating" is NULL, we kind of >> can't use any of the functions on markOop and hence must rewrite >> pretty much all uses of markOop to do something else. >> The same goes for things like Register, where rax == NULL. 
To be >> compliant, we would similarly have to rewrite all uses of Register. >> >> In other words, if we are to really hunt down uses of this == NULL >> and remove them, we will find ourselves with a mountain of work. >> >> Again, just gonna drop that here and run. >> >> /Erik >> >> On 2019-07-12 14:14, Harold Seigel wrote: >>> The functions that compare 'this' to NULL could be changed from >>> instance to static functions where 'this' is explicitly passed as a >>> parameter. Then you could keep the equivalent NULL checks. >>> >>> Harold >>> >>> On 7/12/2019 4:22 AM, Erik Österlund wrote: >>>> Hi Matthias, >>>> >>>> Removing such NULL checks seems like a good idea in general due to >>>> the undefined behaviour. >>>> Worth mentioning though that there are some tricky ones, like in >>>> markOopDesc* where this == NULL >>>> means that the mark word has the "inflating" value. So we >>>> explicitly check if this == NULL and >>>> hope the compiler will not elide the check. Just gonna drop that >>>> one here and run for it. >>>> >>>> Thanks, >>>> /Erik >>>> >>>> On 2019-07-12 09:48, Baesken, Matthias wrote: >>>>> Hello , when looking into the recent xlc16 / xlclang warnings >>>>> I came across those 3 : >>>>> >>>>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:1729:7: warning: >>>>> 'this' pointer cannot be null in well-defined C++ code; >>>>> comparison may be assumed to always evaluate to true >>>>> [-Wtautological-undefined-compare] >>>>> if( this != NULL ) { >>>>> ^~~~ ~~~~ >>>>> >>>>> /nightly/jdk/src/hotspot/share/adlc/formssel.cpp:3416:7: warning: >>>>> 'this' pointer cannot be null in well-defined C++ code; >>>>> comparison may be assumed to always evaluate to false >>>>> [-Wtautological-undefined-compare] >>>>> if( this == NULL ) return; >>>>> >>>>> /nightly/jdk/src/hotspot/share/libadt/set.cpp:46:7: warning: >>>>> 'this' pointer cannot be null in well-defined C++ code; >>>>> comparison may be assumed to always evaluate to false >>>>> [-Wtautological-undefined-compare] >>>>> if( this == NULL ) return os::strdup("{no set}"); >>>>> >>>>> >>>>> Do you think the NULL-checks can be removed or is there still >>>>> some value in doing them ? >>>>> >>>>> Best regards, Matthias >>>> >> > From shade at redhat.com Mon Jul 29 16:48:30 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 29 Jul 2019 18:48:30 +0200 Subject: RFR (S) 8228725: AArch64: Purge method call format support Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8228725 https://cr.openjdk.java.net/~shade/8228725/webrev.01/index.html This is a leftover from initial AArch64 push and recent Simulator removal. This code does not seem to be needed in current AArch64. I am planning to backport it all the way down to 8u-aarch64, where there are leftover additions to Method to store that call format, and this would eliminate parts of 8u exposure. Testing: aarch64 build, tier1 -- Thanks, -Aleksey From dean.long at oracle.com Mon Jul 29 21:21:33 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 29 Jul 2019 14:21:33 -0700 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once In-Reply-To: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> References: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> Message-ID: Looks good. dl On 7/29/19 6:27 AM, Christian Hagedorn wrote: > Hi > > Please review the following enhancement: > http://cr.openjdk.java.net/~thartmann/8193042/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8193042 > > This avoids repeated loads/unloads of the same shared library. > > Thanks! 
> > Best regards, > Christian From dean.long at oracle.com Mon Jul 29 21:38:29 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 29 Jul 2019 14:38:29 -0700 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: References: Message-ID: <2a1249d8-2656-0950-61ab-dc12acc434d3@oracle.com> I see that this has already been pushed, but I'm just curious, wouldn't this code be clearer if it used a copy ctor? ResourceBitMap g(_gen.size(), false); g.set_from(_gen); vs ResourceBitMap g(_gen); dl On 7/26/19 5:23 AM, Christian Hagedorn wrote: > Hi > > Please review the following patch: > http://cr.openjdk.java.net/~thartmann/8156207/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8156207 > > This is just a completion of the suggested change in the comments. > > Thank you! > > Best regards, > Christian From kim.barrett at oracle.com Mon Jul 29 22:10:24 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 29 Jul 2019 18:10:24 -0400 Subject: [14] RFR(XS): 8156207 : Resource allocated BitMaps are often cleared unnecessarily In-Reply-To: <2a1249d8-2656-0950-61ab-dc12acc434d3@oracle.com> References: <2a1249d8-2656-0950-61ab-dc12acc434d3@oracle.com> Message-ID: > On Jul 29, 2019, at 5:38 PM, dean.long at oracle.com wrote: > > I see that this has already been pushed, but I'm just curious, wouldn't this code be clearer if it used a copy ctor? > > ResourceBitMap g(_gen.size(), false); g.set_from(_gen); > > vs > > ResourceBitMap g(_gen); The current copy constructor definition for ResourceBitMap is the default (shallow) copy, which isn't what is wanted here. I tried experimenting with making ResourceBitMap noncopyable a while ago, but ran into uses. Changing the copy constructor to not be shallow would negatively affect the performance of such uses. Of course, it might also positively affect the correctness of some of those uses, if there are existing bugs... 
From david.holmes at oracle.com Mon Jul 29 22:22:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 08:22:36 +1000 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131,072) In-Reply-To: <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> Message-ID: Hi Coleen, Updates LGTM too! Thanks, David On 26/07/2019 7:53 am, coleen.phillimore at oracle.com wrote: > > After some offline polling of various people, I'm going to withdraw the > UnlockExperimentalOptions change to trueInDebug, and fixed the options > test. > > http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html > > These test changes would have found the original bug. > > Thanks, > Coleen > > > On 7/24/19 9:52 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/24/19 9:20 AM, David Holmes wrote: >>> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >>>> On 7/23/19 10:20 PM, David Holmes wrote: >>>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>>> Hi Coleen, >>>>>>>>>>> >>>>>>>>>>> -? experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>>> +? experimental(bool, UnlockExperimentalVMOptions, >>>>>>>>>>> trueInDebug, ??? 
\ >>>>>>>>>>> >>>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>>> >>>>>>>>>> Well if it's added, then the option range test would test the >>>>>>>>>> option.? Otherwise, I think it's benign. In debug mode, one >>>>>>>>>> would no longer have to specify -XX:+UnlockExperimental >>>>>>>>>> options, just like UnlockDiagnosticVMOptions.?? The option is >>>>>>>>>> there either way. >>>>>>>>> >>>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some >>>>>>>>> folks think >>>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause >>>>>>>>> bugs in tests >>>>>>>>> that are runnable in all build configs: 'release', 'fastdebug' >>>>>>>>> and 'slowdebug'. >>>>>>>>> Folks use an option in a test that requires >>>>>>>>> '-XX:+UnlockDiagnosticVMOptions', >>>>>>>>> but forget to include it in the test's run statement and we end >>>>>>>>> up with a test failure in 'release' bits. >>>>>>>>> >>>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not >>>>>>>>> introduce the same path to failing tests. >>>>>>>> >>>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got >>>>>>>> a wall of opposition: >>>>>>>> >>>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>>> >>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>>> >>>>>>> >>>>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to >>>>>>> train >>>>>>> my fingers (and my scripts) to do the right thing. >>>>>> >>>>>> You should have seen my slack channel at that time. :) Maybe the >>>>>> "wall" was primarily from a couple of people who strongly objected. >>>>>>> >>>>>>> Interestingly, David H was a "nay" on changing >>>>>>> UnlockDiagnosticVMOptions >>>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... 
>>>>>>> >>>>>> >>>>>> I think he's mostly just asking the question.? We'll see what he >>>>>> answers later. >>>>> >>>>> Yes I'm just asking the question. I don't think changing this buys >>>>> us much other than "it's now the same as for diagnostic flags". >>>>> Testing these flags can (and probably should) be handled explicitly. >>>> >>>> I disagree.? I don't think we should test these flags explicitly >>>> when we have a perfectly good test for all the flags, that should be >>>> enabled. Which is what my change does. >>> >>> Your change only causes the experimental flags to be tested in debug >>> builds. I would argue they should also be tested in product builds, >>> hence the need to be explicit about it. >> >> The same is true for diagnostic options.? I'd be surprised if testing >> in release made a difference though, except taking more time. >> >> Coleen >> >>> >>> David >>> ----- >>> >>>>> >>>>> I looked back at the discussion on JDK-8153783 (sorry can't recall >>>>> what may have been said in slack) and I'm not sure what my specific >>>>> concern was then. From a testing perspective if you use an >>>>> experimental or diagnostic flag then you should remember to >>>>> explicitly unlock it in the test setup. Not having trueInDebug >>>>> catches when you forget that and only test in a debug build. >>>> >>>> Yes, that was the rationale for making it 'false' rather than >>>> 'trueInDebug'.? People were adding tests with a diagnostic option >>>> and it was failing in product mode because the Unlock flag wasn't >>>> present.? The more vocal side of the question didn't want to have to >>>> add the Unlock flag for all their day to day local testing.?? I >>>> assume the same argument can be made for the experimental options. >>>> >>>> It would be good to hear the opinion from someone who uses these >>>> options.?? 
This is degenerated into an opinion question, and besides >>>> being able to cleanly test these options, neither one of us uses or >>>> tests experimental options as far as I can tell.? I see tests from >>>> the Compiler and GC components.? What do other people think? >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> >>>>> Cheers, >>>>> David >>>>> ----- >>>>> >>>>>>> >>>>>>>> I think the same exact arguments should apply to >>>>>>>> UnlockExperimentalVMOptions.? I'd like to hear from someone that >>>>>>>> uses experimental options on ZGC or shenandoah, since those have >>>>>>>> the most experimental options. >>>>>>> >>>>>>> I agree that the same arguments apply to >>>>>>> UnlockExperimentalVMOptions. >>>>>>> For consistency's sake if anything, they should be the same. >>>>>>> >>>>>>> >>>>>>>> The reason that I made it trueInDebug is so that the command >>>>>>>> line option range test would test these options.? Otherwise a >>>>>>>> more hacky solution could be done, including adding the >>>>>>>> parameter -XX:+UnlockExperimentalVMOptions to all the VM option >>>>>>>> range tests. I'd rather not do this. >>>>>>> >>>>>>> Can explain this a bit more? Why would a default value of 'false' >>>>>>> mean that >>>>>>> the command line option range test would not test these options? >>>>>> >>>>>> So the command line option tests do - java -XX:+PrintFlagsRanges >>>>>> -version and collect the flags that come out, parse the ranges, >>>>>> and then run java with each of these flags with the limits of the >>>>>> range (unless the limit is INT_MAX).? Some flags are excluded >>>>>> explicitly because they cause problems. >>>>>> >>>>>> The reason that SymbolTableSize escaped the testing, is because it >>>>>> wasn't reported with -XX:+PrintFlagsRanges. You'd need >>>>>> -XX:+UnlockExperimentalVMOptions in the java command to gather the >>>>>> flags, and then pass it to all the java commands to test the >>>>>> ranges. It's not that bad, just a bit gross. 
>>>>>> >>>>>> In any case, I think the experimental flags ranges should be >>>>>> tested. I'm glad/amazed that more didn't fail when I turned it on >>>>>> in my testing. >>>>>> >>>>>>> >>>>>>> In any case, I'm fine if you want to move forward with changing the >>>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>>> >>>>>> >>>>>> Okay, we'll wait to see whether I get a wall of opposition or >>>>>> support. I still think it should be by default the same as >>>>>> UnlockDiagnosticVMoptions. >>>>>> >>>>>> Thanks! >>>>>> Coleen >>>>>> >>>>>>> Dan >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>>> >>>>>>>>> Dan >>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Functional change seems fine. Is it worth adding a clarifying >>>>>>>>>>> comment to: >>>>>>>>>>> >>>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul) ??? \ >>>>>>>>>>> >>>>>>>>>>> with: >>>>>>>>>>> >>>>>>>>>>> +????????? range(minimumSymbolTableSize, 16777216ul /* 2^24 >>>>>>>>>>> */) ?????????????? \ >>>>>>>>>> >>>>>>>>>> Let me see if the X macro allows that and I could also add >>>>>>>>>> that to StringTableSize (which is not experimental option). >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> Summary: Increase max size for SymbolTable and fix >>>>>>>>>>>> experimental option range.? Make experimental options >>>>>>>>>>>> trueInDebug so they're tested by the command line option >>>>>>>>>>>> testing >>>>>>>>>>>> >>>>>>>>>>>> open webrev at >>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>>> >>>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a >>>>>>>>>>>> lot of experimental options. I didn't test with shenanodoah. >>>>>>>>>>>> >>>>>>>>>>>> I will test with hs-tier1-3 before checking in. 
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Coleen >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> > From david.holmes at oracle.com Mon Jul 29 22:56:10 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 08:56:10 +1000 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once In-Reply-To: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> References: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> Message-ID: <3e4dee76-c57e-2b66-d01b-6e3c0dbd630e@oracle.com> Hi Christian, Thanks for fixing this one - not sure why it was filed as a compiler bug rather than runtime :) On 29/07/2019 11:27 pm, Christian Hagedorn wrote: > Hi > > Please review the following enhancement: > http://cr.openjdk.java.net/~thartmann/8193042/webrev.00/ Changes look good. There are a few inherited style nits (missing braces on if-blocks) but not worth pointing out as that code contains a lot of them. What testing did you do for this? (I'm not sure which tests would exercise this code.) Thanks, David > https://bugs.openjdk.java.net/browse/JDK-8193042 > > This avoids repeated loads/unloads of the same shared library. > > Thanks! > > Best regards, > Christian From coleen.phillimore at oracle.com Mon Jul 29 23:41:42 2019 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 29 Jul 2019 19:41:42 -0400 Subject: RFR (S) 8227123: Assertion failure when setting SymbolTableSize larger than 2^17 (131, 072) In-Reply-To: References: <813bedf3-4689-fb6e-2516-74f505ec4774@oracle.com> <53aff5c4-3d0d-c375-a7e1-622da731d4a0@oracle.com> <4cd9d175-1946-030a-717f-022207a7bd73@oracle.com> <0abca1d5-6334-c3bb-8554-e07e03492205@oracle.com> <9ee0f48d-0215-eb57-7f6f-44f76ebfe21b@oracle.com> <367cf23d-1f89-0c68-bac8-593a4c6ed3b4@oracle.com> <301bd591-c90f-b907-2f4b-26d1cb519e49@oracle.com> <6a5877d0-890d-6498-07fe-751a214a8b04@oracle.com> Message-ID: <3BC0A9B6-C6E3-4D9E-A607-055693343F0B@oracle.com> Thanks David! 
Coleen > On Jul 29, 2019, at 6:22 PM, David Holmes wrote: > > Hi Coleen, > > Updates LGTM too! > > Thanks, > David > >> On 26/07/2019 7:53 am, coleen.phillimore at oracle.com wrote: >> After some offline polling of various people, I'm going to withdraw the UnlockExperimentalOptions change to trueInDebug, and fixed the options test. >> http://cr.openjdk.java.net/~coleenp/2019/8227123.02/webrev/index.html >> These test changes would have found the original bug. >> Thanks, >> Coleen >>> On 7/24/19 9:52 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>>> On 7/24/19 9:20 AM, David Holmes wrote: >>>>> On 24/07/2019 11:04 pm, coleen.phillimore at oracle.com wrote: >>>>>> On 7/23/19 10:20 PM, David Holmes wrote: >>>>>>> On 24/07/2019 1:48 am, coleen.phillimore at oracle.com wrote: >>>>>>>> On 7/23/19 11:30 AM, Daniel D. Daugherty wrote: >>>>>>>>> On 7/23/19 11:09 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>> On 7/23/19 9:45 AM, Daniel D. Daugherty wrote: >>>>>>>>>>> On 7/23/19 7:03 AM, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>> On 7/23/19 12:27 AM, David Holmes wrote: >>>>>>>>>>>> Hi Coleen, >>>>>>>>>>>> >>>>>>>>>>>> - experimental(bool, UnlockExperimentalVMOptions, false, \ >>>>>>>>>>>> + experimental(bool, UnlockExperimentalVMOptions, trueInDebug, \ >>>>>>>>>>>> >>>>>>>>>>>> I can't quite convince myself this is harmless nor necessary. >>>>>>>>>>> >>>>>>>>>>> Well if it's added, then the option range test would test the option. Otherwise, I think it's benign. In debug mode, one would no longer have to specify -XX:+UnlockExperimental options, just like UnlockDiagnosticVMOptions. The option is there either way. >>>>>>>>>> >>>>>>>>>> Mentioning 'UnlockDiagnosticVMOptions' reminds me that some folks think >>>>>>>>>> that 'UnlockDiagnosticVMOptions' being 'trueInDebug' can cause bugs in tests >>>>>>>>>> that are runnable in all build configs: 'release', 'fastdebug' and 'slowdebug'. 
>>>>>>>>>> Folks use an option in a test that requires '-XX:+UnlockDiagnosticVMOptions', >>>>>>>>>> but forget to include it in the test's run statement and we end up with a test failure in 'release' bits. >>>>>>>>>> >>>>>>>>>> I would prefer that 'UnlockExperimentalVMOptions' did not introduce the same path to failing tests. >>>>>>>>> >>>>>>>>> I tried to change UnlockDiagnosticVMOptions to be false, and got a wall of opposition: >>>>>>>>> >>>>>>>>> See: https://bugs.openjdk.java.net/browse/JDK-8153783 >>>>>>>>> >>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-January/029882.html >>>>>>>> >>>>>>>> I would not say "a wall of opposition". You got almost equal amounts >>>>>>>> of "yea" and "nay". I was a "yea" and I have been continuing to train >>>>>>>> my fingers (and my scripts) to do the right thing. >>>>>>> >>>>>>> You should have seen my slack channel at that time. :) Maybe the "wall" was primarily from a couple of people who strongly objected. >>>>>>>> >>>>>>>> Interestingly, David H was a "nay" on changing UnlockDiagnosticVMOptions >>>>>>>> to be 'false', but appears to be leaning toward "nay" on changing >>>>>>>> UnlockExperimentalVMOptions to 'trueInDebug'... >>>>>>>> >>>>>>> >>>>>>> I think he's mostly just asking the question. We'll see what he answers later. >>>>>> >>>>>> Yes I'm just asking the question. I don't think changing this buys us much other than "it's now the same as for diagnostic flags". Testing these flags can (and probably should) be handled explicitly. >>>>> >>>>> I disagree. I don't think we should test these flags explicitly when we have a perfectly good test for all the flags, that should be enabled. Which is what my change does. >>>> >>>> Your change only causes the experimental flags to be tested in debug builds. I would argue they should also be tested in product builds, hence the need to be explicit about it. >>> >>> The same is true for diagnostic options. 
I'd be surprised if testing in release made a difference though, except taking more time. >>> >>> Coleen >>> >>>> >>>> David >>>> ----- >>>> >>>>>> >>>>>> I looked back at the discussion on JDK-8153783 (sorry can't recall what may have been said in slack) and I'm not sure what my specific concern was then. From a testing perspective if you use an experimental or diagnostic flag then you should remember to explicitly unlock it in the test setup. Not having trueInDebug catches when you forget that and only test in a debug build. >>>>> >>>>> Yes, that was the rationale for making it 'false' rather than 'trueInDebug'. People were adding tests with a diagnostic option and it was failing in product mode because the Unlock flag wasn't present. The more vocal side of the question didn't want to have to add the Unlock flag for all their day to day local testing. I assume the same argument can be made for the experimental options. >>>>> >>>>> It would be good to hear the opinion from someone who uses these options. This is degenerated into an opinion question, and besides being able to cleanly test these options, neither one of us uses or tests experimental options as far as I can tell. I see tests from the Compiler and GC components. What do other people think? >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> ----- >>>>>> >>>>>>>> >>>>>>>>> I think the same exact arguments should apply to UnlockExperimentalVMOptions. I'd like to hear from someone that uses experimental options on ZGC or shenandoah, since those have the most experimental options. >>>>>>>> >>>>>>>> I agree that the same arguments apply to UnlockExperimentalVMOptions. >>>>>>>> For consistency's sake if anything, they should be the same. >>>>>>>> >>>>>>>> >>>>>>>>> The reason that I made it trueInDebug is so that the command line option range test would test these options. 
Otherwise a more hacky solution could be done, including adding the parameter -XX:+UnlockExperimentalVMOptions to all the VM option range tests. I'd rather not do this. >>>>>>>> >>>>>>>> Can explain this a bit more? Why would a default value of 'false' mean that >>>>>>>> the command line option range test would not test these options? >>>>>>> >>>>>>> So the command line option tests do - java -XX:+PrintFlagsRanges -version and collect the flags that come out, parse the ranges, and then run java with each of these flags with the limits of the range (unless the limit is INT_MAX). Some flags are excluded explicitly because they cause problems. >>>>>>> >>>>>>> The reason that SymbolTableSize escaped the testing, is because it wasn't reported with -XX:+PrintFlagsRanges. You'd need -XX:+UnlockExperimentalVMOptions in the java command to gather the flags, and then pass it to all the java commands to test the ranges. It's not that bad, just a bit gross. >>>>>>> >>>>>>> In any case, I think the experimental flags ranges should be tested. I'm glad/amazed that more didn't fail when I turned it on in my testing. >>>>>>> >>>>>>>> >>>>>>>> In any case, I'm fine if you want to move forward with changing the >>>>>>>> default of UnlockExperimentalVMOptions to 'trueInDebug'. >>>>>>>> >>>>>>> >>>>>>> Okay, we'll wait to see whether I get a wall of opposition or support. I still think it should be by default the same as UnlockDiagnosticVMoptions. >>>>>>> >>>>>>> Thanks! >>>>>>> Coleen >>>>>>> >>>>>>>> Dan >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Dan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Functional change seems fine. 
Is it worth adding a clarifying comment to: >>>>>>>>>>>> >>>>>>>>>>>> + range(minimumSymbolTableSize, 16777216ul) \ >>>>>>>>>>>> >>>>>>>>>>>> with: >>>>>>>>>>>> >>>>>>>>>>>> + range(minimumSymbolTableSize, 16777216ul /* 2^24 */) \ >>>>>>>>>>> >>>>>>>>>>> Let me see if the X macro allows that and I could also add that to StringTableSize (which is not experimental option). >>>>>>>>>>> Thanks, >>>>>>>>>>> Coleen >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>>> On 23/07/2019 4:45 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>>>>> Summary: Increase max size for SymbolTable and fix experimental option range. Make experimental options trueInDebug so they're tested by the command line option testing >>>>>>>>>>>>> >>>>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8227123.01/webrev >>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8227123 >>>>>>>>>>>>> >>>>>>>>>>>>> Tested locally with default and -XX:+UseZGC since ZGC has a lot of experimental options. I didn't test with shenanodoah. >>>>>>>>>>>>> >>>>>>>>>>>>> I will test with hs-tier1-3 before checking in. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Coleen >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> From david.holmes at oracle.com Tue Jul 30 02:27:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 12:27:31 +1000 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: Message-ID: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> Hi Kim, A meta-comment: "storages" is not a well formed term. Can we have something clearer, perhaps OopStorageManager, or something like that? Thanks, David On 26/07/2019 8:59 am, Kim Barrett wrote: > 8227054: ServiceThread needs to know about all OopStorage objects > 8227053: ServiceThread cleanup of OopStorage is missing some > > Please review this change in how OopStorage objects are managed and > accessed. 
There is a new (all static) class, OopStorages, which > provides infrastructure for creating all the storage objects, access > via an enum-based id, and iterating over them. > > Various components that previously managed their own storage objects > now obtain them from OopStorages. A number of access functions have > been eliminated as part of that, though some have been retained for > internal convenience of a component. > > The set of OopStorage objects is now declared in one place, using > x-macros, with collective definitions and usages ultimately driven off > those macros. This includes the ServiceThread (which no longer needs > explicit knowledge of the set, and is no longer missing any) and the > OopStorage portion of WeakProcessorPhases. For now, the various GCs > still have explicit knowledge of the set; that will be addressed in > followup changes specific to each collector. (This delay minimizes > the impact on Leo's in-progress review that changes ParallelGC to use > WorkGangs.) > > This change also includes a couple of utility macros for working with > x-macros. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8227054 > https://bugs.openjdk.java.net/browse/JDK-8227053 > > Webrev: > http://cr.openjdk.java.net/~kbarrett/8227054/open.00/ > > Testing: > mach5 tier1-3 > From david.holmes at oracle.com Tue Jul 30 02:37:53 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 12:37:53 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> Message-ID: <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> Hi Adam, On 25/07/2019 3:57 am, Adam Farley8 wrote: > Hi David, > > Welcome back. :) Thanks. Sorry for the delay in getting back to this. I like .v2 as it is much simpler (notwithstanding freeing the already allocated arrays adds some complexity - thanks for fixing that). 
I'm still not sure we can't optimise things better for unchangeable properties like the boot library path, but that's another RFE. Thanks, David > > David Holmes wrote on 22/07/2019 03:34:37: > >> From: David Holmes >> To: Adam Farley8 , hotspot- >> dev at openjdk.java.net, serviceability-dev >> Date: 22/07/2019 03:34 >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path >> paths are longer than JVM_MAXPATHLEN >> >> Hi Adam, >> >> Some higher-level issues/concerns ... >> >> On 22/07/2019 11:25 am, David Holmes wrote: >> > Hi Adam, >> > >> > Adding in serviceability-dev as you've made changes in that area too. >> > >> > Will take a closer look at the changes soon. >> > >> > David >> > ----- >> > >> > On 18/07/2019 2:05 am, Adam Farley8 wrote: >> >> Hey All, >> >> >> >> Reviewers and sponsors requested to inspect the following. >> >> >> >> I've re-written the code change, as discussed with David Holmes in emails >> >> last week, and now the webrev changes do this: >> >> >> >> - Cause the VM to shut down with a relevant error message if one or more >> >> of the sun.boot.library.path paths is too long for the system. >> >> I'm not seeing that implemented at the moment. Nor am I clear that such >> an error will always be detected during VM initialization. The code >> paths look fairly general purpose, but perhaps that is an illusion and >> we will always check this during initialization? (also see discussion at >> end) > > This is implemented in the ".1" webrev, though I did comment out a > necessary > line to attempt to test the linker_md changes. I've removed the "//" and > re-uploaded. It's the added line in the os.cpp file that begins > "vm_exit_during_initialization". > > You're correct in that this code would only be triggered if we're loading a > library, though I'm not sure that's a problem.
We seem to load a couple of > libraries every time we run even the most minimalist of classes, and if > we somehow manage not to load any libraries *at all*, the contents of a > library path property seems irrelevant. > >> >> >> - Apply similar error-producing code to the (legacy?) code in >> >> linker_md.c. >> >> I think the JDWP changes need to be split off and handled under their >> own issue. It's a similar issue but not directly related. Also the >> change to sys.h raises the need for a CSR request as it seems to be >> exported for external use - though I can't find any existing test code >> that includes it, or which uses the affected code (which is another >> reason so split this of and let serviceability folk consider it). > > A reasonable suggestion. Thanks for the tip about sys.h. Seemed cleaner to > change sys.h, but this change isn't worth a CSR. > > The jdwp changes were removed from the new ".2" webrev. > > http://cr.openjdk.java.net/~afarley/8227021.2/webrev > >> >> >> - Allow the numerical parameter for split_path to indicate anything we >> >> plan to add to the path once split, allowing for more accurate path >> >> length detection. >> >> This is a bit icky but I understand your desire to be more accurate with >> the checking - as otherwise you would still need to keep overflow checks >> in other places once the full path+name is assembled. But then such >> checks must be missing in places now ?? > > Correct, to my understanding. Likely more a problem on Windows than Linux. > >> >> I'm not clear why you have implemented the path check the way you >> instead of simply augmenting the existing code ie. where we have: >> >> 1347 ? // do the actual splitting >> 1348 ? p = inpath; >> 1349 ? for (int i = 0 ; i < count ; i++) { >> 1350 ? ? size_t len = strcspn(p, os::path_separator()); >> 1351 ? ? if (len > JVM_MAXPATHLEN) { >> 1352 ? ? ? return NULL; >> 1353 ? ? 
} >> >> why not just change the calculation at line 1351 to include the prefix >> length, and then report the error rather than return NULL? > > You're right. The code was originally changed to enable the "skip too-long > paths" logic, and then when we went to a "fail immediately" policy, I > tweaked > the modified code rather than start over again. > > See the .2 webrev for this change. > > http://cr.openjdk.java.net/~afarley/8227021.2/webrev > >> >> BTW the existing code fails to free opath before returning NULL. > > True. I added a fix to free the memory in the two cases we do that. > > Though not strictly needed in the vm-exit case, the internet suggested > it was bad practice to assume the os would handle it. > >> >> >> - Add an optional parameter to the os::split_path function that specifies >> >> where the paths came from, for a better error message. >> >> It's not appropriate to set that up in os::dll_locate_lib, hardwired as >> "sun.boot.library.path". os::dll_locate_lib doesn't know where it is >> being asked to look, it is the callers that usually use >> Arguments::get_dll_dir(), but in one case in jvmciEnv.cpp we have: >> >> os::dll_locate_lib(path, sizeof(path), JVMCILibPath, ... >> >> so the error message would be wrong in that case. If you want to pass >> through this diagnostic help information then it needs to be set by the >> callers of, and passed into, os::dll_locate_lib. > > Hmm, perhaps a simpler solution would be to make the error message more > vague and remove the passing-in of the path source. > > E.g. "The VM tried to use a path that exceeds the maximum path length for " > ? ? ?"this system. Review path-containing parameters and properties, > such as " > ? ? ?"sun.boot.library.path, to identify potential sources for this path." > > That way we're covered no matter where the path comes from. 
> >> Looking at all the callers of os::dll_locate_lib that all pass >> Arguments::get_dll_dir, it seems quite inefficient that we will >> potentially split the same set of paths multiple times. I wonder whether >> we can do this specifically during VM initialization and cache the split >> paths instead? That doesn't address the problem of a path element that >> only exceeds the maximum length when a specific library name is added, >> but I'm trying to see how to minimise the splitting and put the onus for >> the checking back on the code creating the paths. >> > > We'd have to check for changes to the source property every time we used > the value. E.g. copy the property into another string, split the paths, > cache the split, and compare that to the "live" property storage string > before using the cache. > > That, or assume that sun.boot.library.path could never change after being > "split", an assumption which feels unsafe. > >> Let's see if others have comments/suggestions here. >> >> Thanks, >> David > > Sure thing. > > - Adam > >> >> >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 >> >> >> >> New Webrev: http://cr.openjdk.java.net/~afarley/8227021.1/webrev/ > > (Superseded by the .2 version) > >> >> >> >> Best Regards >> >> >> >> Adam Farley >> >> IBM Runtimes >> >> >> >> Unless stated otherwise above: >> >> IBM United Kingdom Limited - Registered in England and Wales with number >> >> 741598.
>> >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 >> >> 3AU >> >> >> > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Tue Jul 30 02:50:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 12:50:02 +1000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: Message-ID: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Hi Matthias, On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > Hello , please review this small test fix . > > The test test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java fails sometimes on fast Linux machines with this error message : > > java.lang.RuntimeException: Total safepoint time illegal value: 0 ms (MIN = 1; MAX = 9223372036854775807) > > looks like the total safepoint time is too low currently on these machines, it is < 1 ms. > > There might be several ways to handle this : > > * Change the test in a way that it might generate higher safepoint times > * Allow safepoint time == 0 ms > * Offer an additional interface that gives safepoint times with finer granularity ( currently the HS has safepoint time values in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp SafepointTracing::end > > But it is converted to ms in this code > > 114jlong RuntimeService::safepoint_time_ms() { > 115 return UsePerfData ? > 116 Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > 117} > > 064jlong Management::ticks_to_ms(jlong ticks) { > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); > 2066 return (jlong)(((double)ticks / (double)os::elapsed_frequency()) > 2067 * (double)1000.0); > 2068} > > > > Currently I go for the first attempt (and try to generate higher safepoint times in my patch) .
Yes that's probably best. Coarse-grained timing on very fast machines was bound to eventually lead to problems. But perhaps a more future-proof approach is to just add a do-while loop around the stack dumps and only exit when we have a non-zero safepoint time? Thanks, David ----- > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228658 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > Thanks, Matthias > From jcbeyler at google.com Tue Jul 30 03:38:59 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Mon, 29 Jul 2019 20:38:59 -0700 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Message-ID: Hi Matthias, I wonder if you should not do what David is suggesting and then put that whole code (the while loop) in a helper method. Below you have a calculation again using value2 (which I wonder what the added value of it is though) but anyway, that value2 could also be 0 at some point, no? So would it not be best to just refactor the getAllStackTraces and calculate safepoint time in a helper method for both value / value2 variables? Thanks, Jc On Mon, Jul 29, 2019 at 7:50 PM David Holmes wrote: > Hi Matthias, > > On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > Hello , please review this small test fix . > > > > The test > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java > fails sometimes on fast Linux machines with this error message : > > > > java.lang.RuntimeException: Total safepoint time illegal value: 0 ms > (MIN = 1; MAX = 9223372036854775807) > > > > looks like the total safepoint time is too low currently on these > machines, it is < 1 ms. 
> > > > There might be several ways to handle this : > > > > * Change the test in a way that it might generate higher safepoint > times > > * Allow safepoint time == 0 ms > > * Offer an additional interface that gives safepoint times with > finer granularity ( currently the HS has safepoint time values in ns , see > jdk/src/hotspot/share/runtime/safepoint.cpp SafepointTracing::end > > > > But it is converted to ms in this code > > > > 114jlong RuntimeService::safepoint_time_ms() { > > 115 return UsePerfData ? > > 116 Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > > 117} > > > > 064jlong Management::ticks_to_ms(jlong ticks) { > > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); > > 2066 return (jlong)(((double)ticks / (double)os::elapsed_frequency()) > > 2067 * (double)1000.0); > > 2068} > > > > > > > > Currently I go for the first attempt (and try to generate higher > safepoint times in my patch) . > > Yes that's probably best. Coarse-grained timing on very fast machines > was bound to eventually lead to problems. > > But perhaps a more future-proof approach is to just add a do-while loop > around the stack dumps and only exit when we have a non-zero safepoint > time?
> > Thanks, > David > ----- > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8228658 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > > > > Thanks, Matthias > > > -- Thanks, Jc From tobias.hartmann at oracle.com Tue Jul 30 05:13:34 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 30 Jul 2019 07:13:34 +0200 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once In-Reply-To: <3e4dee76-c57e-2b66-d01b-6e3c0dbd630e@oracle.com> References: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> <3e4dee76-c57e-2b66-d01b-6e3c0dbd630e@oracle.com> Message-ID: On 30.07.19 00:56, David Holmes wrote: > Thanks for fixing this one - not sure why it was filed as a compiler bug rather than runtime :) I've filed it in hotspot/compiler back then because you moved JDK-8191360 to compiler ;) > What testing did you do for this? (I'm not sure which tests would exercise this code.) I think besides the regular tiers, the test(s) referenced in JDK-8191360 should be suitable. Best regards, Tobias From adinn at redhat.com Tue Jul 30 08:07:54 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 30 Jul 2019 09:07:54 +0100 Subject: RFR (S) 8228725: AArch64: Purge method call format support In-Reply-To: References: Message-ID: <514b3bde-2f3f-0c53-c90e-0c08b71b3ba8@redhat.com> On 29/07/2019 17:48, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8228725 > https://cr.openjdk.java.net/~shade/8228725/webrev.01/index.html > > This is a leftover from initial AArch64 push and recent Simulator removal. This code does not seem > to be needed in current AArch64. I am planning to backport it all the way down to 8u-aarch64, where > there are leftover additions to Method to store that call format, and this would eliminate parts of > 8u exposure. > > Testing: aarch64 build, tier1 Yes thank you, that can be removed from all ports. 
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From dms at samersoff.net Tue Jul 30 11:11:51 2019 From: dms at samersoff.net (Dmitry Samersoff) Date: Tue, 30 Jul 2019 14:11:51 +0300 Subject: RFR (M) 8228400: Remove built-in AArch64 simulator In-Reply-To: References: <000a6f10-5ea4-8d8a-1b61-ad671a501193@redhat.com> <9a21a708-5d2a-fe68-8627-71e098dd3492@redhat.com> Message-ID: Andrew, Thank you for the explanation. I see your point and OK to leave it as is. Aleksey, could you add a comment, explaining that number_of_arguments parameter is no longer used and we keep it here just to maintain x86 compatibility? -Dmitry\S On 29.07.2019 11:54, Andrew Dinn wrote: > On 28/07/2019 21:03, Aleksey Shipilev wrote: >> On 7/28/19 6:52 PM, Dmitry Samersoff wrote: >>> 1. Do I understand correctly that we no longer use number_of_arguments >>> parameter? >> >> Yes, I think so. > > Yes we don't need to use this parameter. Indeed we could probably change > the signature to match the fact that we never supply an argument for it. > However, ... The reason it is declared in the Aarch64 code is because it > mirrors the code for x86_64. The argument is not needed for x86_64 > either but the code is dual purpose for x86_32 where the number of > parameters is needed. So, by dropping this parameter we would be > choosing to diverge from x86_64. > > I'm not sure how much virtue there is in doing that. When we first > ported the x86_64 code to Aarch64 we tried to keep the code for the two > ports aligned as far as possible (i.e. only diverge when the > architecture and/or performance required it). n.b. that's much the same > tactic as is adopted when backporting and happens for much the same > reasons -- many innovations happen first in x86, hence need /cross/ > porting to AArch64. 
> > This policy has occasionally led to minor oddities like this one but it > has also made development and maintenance much easier. Diverging on this > specific point probably wouldn't matter too much one way or the other. > So long as whoever is maintaining the code knows that it is derived from > x86_64 they can easily make allowance such a minor differences. However, > it gets more and more difficult to port code as these sort of changes > accumulate. Personally, I would prefer to keep the two ports aligned as > far as possible because my experience is that it has made it a lot > easier to avoid errors and spot defects. i.e. I think it is a benefit > rather than a problem that maintainers really need to keep that > alignment in mind. > >>> Should we remove it and version of call_VM_leaf on l. 1430 >> >> Maybe? I would leave it as follow-up. The change would be local and easy to test separately. >> Unfortunately, it would invalidate lots of testing already done for this patch. I can see how much >> hassle that would be, and maybe fold that improvement here... > > See above. However, Aleksey is right that this should be done as a > follow-up patch if at all. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From matthias.baesken at sap.com Tue Jul 30 11:25:05 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 30 Jul 2019 11:25:05 +0000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Message-ID: Hello JC / David, here is a second webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ It moves the thread dump execution into a method executeThreadDumps(long) , and also adds while loops (but with a limitation for the number of thread dumps, really don't want to cause timeouts etc.). I removed a check for MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE . Hope you like this version better. Best regards, Matthias From: Jean Christophe Beyler Sent: Dienstag, 30. Juli 2019 05:39 To: David Holmes Cc: Baesken, Matthias ; hotspot-dev at openjdk.java.net; serviceability-dev Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms Hi Matthias, I wonder if you should not do what David is suggesting and then put that whole code (the while loop) in a helper method. Below you have a calculation again using value2 (which I wonder what the added value of it is though) but anyway, that value2 could also be 0 at some point, no? So would it not be best to just refactor the getAllStackTraces and calculate safepoint time in a helper method for both value / value2 variables? Thanks, Jc On Mon, Jul 29, 2019 at 7:50 PM David Holmes > wrote: Hi Matthias, On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > Hello , please review this small test fix .
> > The test test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java fails sometimes on fast Linux machines with this error message : > > java.lang.RuntimeException: Total safepoint time illegal value: 0 ms (MIN = 1; MAX = 9223372036854775807) > > looks like the total safepoint time is too low currently on these machines, it is < 1 ms. > > There might be several ways to handle this : > > * Change the test in a way that it might generate higher safepoint times > * Allow safepoint time == 0 ms > * Offer an additional interface that gives safepoint times with finer granularity ( currently the HS has safepoint time values in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp SafepointTracing::end > > But it is converted to ms in this code > > 114jlong RuntimeService::safepoint_time_ms() { > 115 return UsePerfData ? > 116 Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > 117} > > 064jlong Management::ticks_to_ms(jlong ticks) { > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); > 2066 return (jlong)(((double)ticks / (double)os::elapsed_frequency()) > 2067 * (double)1000.0); > 2068} > > > > Currently I go for the first attempt (and try to generate higher safepoint times in my patch) . Yes that's probably best. Coarse-grained timing on very fast machines was bound to eventually lead to problems. But perhaps a more future-proof approach is to just add a do-while loop around the stack dumps and only exit when we have a non-zero safepoint time?
Thanks, David ----- > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8228658 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > Thanks, Matthias > -- Thanks, Jc From david.holmes at oracle.com Tue Jul 30 12:12:13 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Jul 2019 22:12:13 +1000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Message-ID: Hi Matthias, On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > Hello JC / David, here is a second webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > > It moves the thread dump execution into a method > executeThreadDumps(long) , and also adds while loops (but with a > limitation for the number of thread dumps, really don't > want to cause timeouts etc.). I removed a check for > MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE . I don't think executeThreadDumps is worth factoring out like that. The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it remains a constant 100, and then you set a simple loop iteration count limit. Further with the proposed code when you get here: 85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; you don't even know what value you may be starting with. But I was thinking of simply: long value = 0; do { Thread.getAllStackTraces(); value = mbean.getTotalSafepointTime(); } while (value == 0); We'd only hit a timeout if something is completely broken - which is fine. Overall tests like this are not very useful, yet very fragile. Thanks, David > Hope you like this version better. > > Best regards, Matthias > > *From:*Jean Christophe Beyler > *Sent:* Dienstag, 30.
Juli 2019 05:39 > *To:* David Holmes > *Cc:* Baesken, Matthias ; > hotspot-dev at openjdk.java.net; serviceability-dev > > *Subject:* Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails > on fast Linux machines with Total safepoint time 0 ms > > Hi Matthias, > > I wonder if you should not do what David is suggesting and then put that > whole code (the while loop) in a helper method. Below you have a > calculation again using value2 (which I wonder what the added value of > it is though) but anyway, that value2 could also be 0 at some point, no? > > So would it not be best to just refactor the getAllStackTraces and > calculate safepoint time in a helper method for both value / value2 > variables? > > Thanks, > > Jc > > On Mon, Jul 29, 2019 at 7:50 PM David Holmes > wrote: > > Hi Matthias, > > On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > Hello , please review this small test fix . > > > > The test > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java > fails sometimes on fast Linux machines with this error message : > > > > java.lang.RuntimeException: Total safepoint time illegal value: 0 > ms (MIN = 1; MAX = 9223372036854775807) > > > > looks like the total safepoint time is too low currently on these > machines, it is < 1 ms. > > > > There might be several ways to handle this : > > > > * Change the test in a way that it might generate higher > safepoint times > > * Allow safepoint time == 0 ms > > * Offer an additional interface that gives safepoint times > with finer granularity ( currently the HS has safepoint time values > in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp > SafepointTracing::end > > > > But it is converted to ms in this code > > > > 114jlong RuntimeService::safepoint_time_ms() { > > 115 return UsePerfData ? > > 116 > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > > 117} > > > > 064jlong Management::ticks_to_ms(jlong ticks) { > > 2065
assert(os::elapsed_frequency() > 0, "Must be non-zero"); > > 2066 return (jlong)(((double)ticks / (double)os::elapsed_frequency()) > > 2067 * (double)1000.0); > > 2068 } > > > > > > > > Currently I go for the first attempt (and try to generate > higher safepoint times in my patch). > > Yes that's probably best. Coarse-grained timing on very fast machines > was bound to eventually lead to problems. > > But perhaps a more future-proof approach is to just add a do-while loop > around the stack dumps and only exit when we have a non-zero safepoint > time? > > Thanks, > David > ----- > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8228658 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > > > > Thanks, Matthias > > > > > -- > > Thanks, > > Jc > From matthias.baesken at sap.com Tue Jul 30 12:39:56 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 30 Jul 2019 12:39:56 +0000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Message-ID: Hi David, "put that whole code (the while loop) in a helper method." was JC's idea, and I like the idea. Let's see what others think. > > Overall tests like this are not very useful, yet very fragile. > > I am also fine with putting the test on the exclude list. Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Dienstag, 30. Juli 2019 14:12 > To: Baesken, Matthias ; Jean Christophe > Beyler > Cc: hotspot-dev at openjdk.java.net; serviceability-dev dev at openjdk.java.net> > Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast > Linux machines with Total safepoint time 0 ms > > Hi Matthias, > > On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > > Hello JC / David, here is a second webrev
: > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > > > > It moves the thread dump execution into a method > > executeThreadDumps(long), and also adds while loops (but with a > > limitation for the number of thread dumps, really don't > > want to cause timeouts etc.). I removed a check for > > MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE. > > I don't think executeThreadDumps is worth factoring out like that. > > The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it > remains a constant 100, and then you set a simple loop iteration count > limit. Further with the proposed code when you get here: > > 85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > > you don't even know what value you may be starting with. > > But I was thinking of simply: > > long value = 0; > do { > Thread.getAllStackTraces(); > value = mbean.getTotalSafepointTime(); > } while (value == 0); > > We'd only hit a timeout if something is completely broken - which is fine. > > Overall tests like this are not very useful, yet very fragile. > > Thanks, > David > > > Hope you like this version better. > > > > Best regards, Matthias > > > > *From:*Jean Christophe Beyler > > *Sent:* Dienstag, 30. Juli 2019 05:39 > > *To:* David Holmes > > *Cc:* Baesken, Matthias ; > > hotspot-dev at openjdk.java.net; serviceability-dev > > > > *Subject:* Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails > > on fast Linux machines with Total safepoint time 0 ms > > > > Hi Matthias, > > > > I wonder if you should not do what David is suggesting and then put that > > whole code (the while loop) in a helper method. Below you have a > > calculation again using value2 (which I wonder what the added value of > > it is though) but anyway, that value2 could also be 0 at some point, no?
> > > > Thanks, > > > > Jc > > > > On Mon, Jul 29, 2019 at 7:50 PM David Holmes > > wrote: > > > > Hi Matthias, > > > > On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > > Hello, please review this small test fix. > > > > > > The test > > > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. java > > fails sometimes on fast Linux machines with this error message: > > > > > > java.lang.RuntimeException: Total safepoint time illegal value: 0 > > ms (MIN = 1; MAX = 9223372036854775807) > > > > > > looks like the total safepoint time is too low currently on these > > machines, it is < 1 ms. > > > > > > There might be several ways to handle this: > > > > > > * Change the test in a way that it might generate higher > > safepoint times > > > * Allow safepoint time == 0 ms > > > * Offer an additional interface that gives safepoint times > > with finer granularity (currently the HS has safepoint time values > > in ns, see jdk/src/hotspot/share/runtime/safepoint.cpp > > SafepointTracing::end > > > > > > But it is converted to ms in this code > > > > > > 114 jlong RuntimeService::safepoint_time_ms() { > > > 115 return UsePerfData ? > > > 116 > > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > > > 117 } > > > > > > 2064 jlong Management::ticks_to_ms(jlong ticks) { > > > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); > > > 2066 return (jlong)(((double)ticks / > > (double)os::elapsed_frequency()) > > > 2067 * (double)1000.0); > > > 2068 } > > > > > > > > > > > > Currently I go for the first attempt (and try to generate > > higher safepoint times in my patch). > > > > Yes that's probably best. Coarse-grained timing on very fast machines > > was bound to eventually lead to problems. > > > > But perhaps a more future-proof approach is to just add a do-while loop > > around the stack dumps and only exit when we have a non-zero > safepoint > > time?
> > > > Thanks, > > David > > ----- > > > > > Bug/webrev : > > > > > > https://bugs.openjdk.java.net/browse/JDK-8228658 > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > > > > > > > Thanks, Matthias > > > > > > > > > -- > > > > Thanks, > > > > Jc > > From christian.hagedorn at oracle.com Tue Jul 30 12:47:09 2019 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Tue, 30 Jul 2019 14:47:09 +0200 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once In-Reply-To: References: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> <3e4dee76-c57e-2b66-d01b-6e3c0dbd630e@oracle.com> Message-ID: <274a1ced-eb0d-0eef-4e8c-47076943a301@oracle.com> Hi David, hi Dean Thanks for your reviews! On 30.07.19 00:56, David Holmes wrote: > There are a few inherited style nits (missing braces on if-blocks) but > not worth pointing out as that code contains a lot of them. I fixed the style issues where I touched the code. I updated the webrev: http://cr.openjdk.java.net/~thartmann/8193042/webrev.01/ On 30.07.19 07:13, Tobias Hartmann wrote: > > On 30.07.19 00:56, David Holmes wrote: >> What testing did you do for this? (I'm not sure which tests would exercise this code.) > > I think besides the regular tiers, the test(s) referenced in JDK-8191360 should be suitable. I tested it with hs-precheckin-comp,hs-tier1,2 and 3. hs-tier1 includes [1] which uses this code for a critical native lookup. 
Best regards, Christian [1] http://hg.openjdk.java.net/jdk/jdk/file/144585063bc8/test/hotspot/jtreg/compiler/runtime/criticalnatives/lookup From david.holmes at oracle.com Tue Jul 30 20:42:48 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 06:42:48 +1000 Subject: [14] RFR(S): 8193042: NativeLookup::lookup_critical_entry() should only load shared library once In-Reply-To: <274a1ced-eb0d-0eef-4e8c-47076943a301@oracle.com> References: <74e526f5-9036-3463-520c-bc754e1f9beb@oracle.com> <3e4dee76-c57e-2b66-d01b-6e3c0dbd630e@oracle.com> <274a1ced-eb0d-0eef-4e8c-47076943a301@oracle.com> Message-ID: <82ba405f-16fe-9b57-0476-1c89910ad0da@oracle.com> Hi Christian, On 30/07/2019 10:47 pm, Christian Hagedorn wrote: > Hi David, hi Dean > > Thanks for your reviews! > > > On 30.07.19 00:56, David Holmes wrote: > > There are a few inherited style nits (missing braces on if-blocks) but > > not worth pointing out as that code contains a lot of them. > > I fixed the style issues where I touched the code. I updated the webrev: > http://cr.openjdk.java.net/~thartmann/8193042/webrev.01/ Thanks for doing that - looks good. > > On 30.07.19 07:13, Tobias Hartmann wrote: >> >> On 30.07.19 00:56, David Holmes wrote: >>> What testing did you do for this? (I'm not sure which tests would >>> exercise this code.) >> >> I think besides the regular tiers, the test(s) referenced in >> JDK-8191360 should be suitable. > > I tested it with hs-precheckin-comp,hs-tier1,2 and 3. hs-tier1 includes > [1] which uses this code for a critical native lookup. Okay thanks. 
David ----- > Best regards, > Christian > > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/144585063bc8/test/hotspot/jtreg/compiler/runtime/criticalnatives/lookup > From kim.barrett at oracle.com Tue Jul 30 20:59:30 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 30 Jul 2019 16:59:30 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> Message-ID: > On Jul 29, 2019, at 10:27 PM, David Holmes wrote: > > Hi Kim, > > A meta-comment: "storages" is not a well formed term. Can we have something clearer, perhaps OopStorageManager, or something like that? > > Thanks, > David Coleen suggested the name OopStorages, as the plural of OopStorage. (Unpublished versions of the change had a different name that I didn't really like and Coleen actively disliked.) Coleen and I both have an antipathy toward "Manager" suffixed names, and I don't see how it's any clearer in this case. "Set" suggests a wider API. Also, drive-by name bikeshedding doesn't carry much weight. From david.holmes at oracle.com Tue Jul 30 21:11:29 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 07:11:29 +1000 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> Message-ID: <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com> On 31/07/2019 6:59 am, Kim Barrett wrote: >> On Jul 29, 2019, at 10:27 PM, David Holmes wrote: >> >> Hi Kim, >> >> A meta-comment: "storages" is not a well formed term. Can we have something clearer, perhaps OopStorageManager, or something like that? >> >> Thanks, >> David > > Coleen suggested the name OopStorages, as the plural of OopStorage. "storage" doesn't really have a plural in common use. 
> (Unpublished versions of the change had a different name that I didn't > really like and Coleen actively disliked.) Coleen and I both have an > antipathy toward "Manager" suffixed names, and I don't see how it's > any clearer in this case. "Set" suggests a wider API. > > Also, drive-by name bikeshedding doesn't carry much weight. Okay, how about: it's really poor form to have classes and files that differ by only one letter. I looked at this to see what it was about and had to keep double-checking if I was looking at OopStorage or OopStorages. In addition OopStorages conveys no semantic meaning to me. Thanks, David From david.holmes at oracle.com Tue Jul 30 21:35:12 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 07:35:12 +1000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> Message-ID: <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> On 30/07/2019 10:39 pm, Baesken, Matthias wrote: > Hi David, "put that whole code (the while loop) in a helper method." was JC's idea, and I like the idea. Regardless I think the way you are using NUM_THREAD_DUMPS is really confusing. As an all-caps static you'd expect it to be a constant. Thanks, David > Let's see what others think. >
here is a second webrev: >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ >>> >>> It moves the thread dump execution into a method >>> executeThreadDumps(long), and also adds while loops (but with a >>> limitation for the number of thread dumps, really don't >>> want to cause timeouts etc.). I removed a check for >>> MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE. >> >> I don't think executeThreadDumps is worth factoring out like that. >> >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it >> remains a constant 100, and then you set a simple loop iteration count >> limit. Further with the proposed code when you get here: >> >> 85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; >> >> you don't even know what value you may be starting with. >> >> But I was thinking of simply: >> >> long value = 0; >> do { >> Thread.getAllStackTraces(); >> value = mbean.getTotalSafepointTime(); >> } while (value == 0); >> >> We'd only hit a timeout if something is completely broken - which is fine. >> >> Overall tests like this are not very useful, yet very fragile. >> >> Thanks, >> David >> >>> Hope you like this version better. >>> >>> Best regards, Matthias >>> >>> *From:*Jean Christophe Beyler >>> *Sent:* Dienstag, 30. Juli 2019 05:39 >>> *To:* David Holmes >>> *Cc:* Baesken, Matthias ; >>> hotspot-dev at openjdk.java.net; serviceability-dev >>> >>> *Subject:* Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails >>> on fast Linux machines with Total safepoint time 0 ms >>> >>> Hi Matthias, >>> >>> I wonder if you should not do what David is suggesting and then put that >>> whole code (the while loop) in a helper method. Below you have a >>> calculation again using value2 (which I wonder what the added value of >>> it is though) but anyway, that value2 could also be 0 at some point, no?
>>> >>> So would it not be best to just refactor the getAllStackTraces and >>> calculate safepoint time in a helper method for both value / value2 >>> variables? >>> >>> Thanks, >>> >>> Jc >>> >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes >> > wrote: >>> >>> Hi Matthias, >>> >>> On 29/07/2019 8:20 pm, Baesken, Matthias wrote: >>> > Hello, please review this small test fix. >>> > >>> > The test >>> >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. >> java >>> fails sometimes on fast Linux machines with this error message: >>> > >>> > java.lang.RuntimeException: Total safepoint time illegal value: 0 >>> ms (MIN = 1; MAX = 9223372036854775807) >>> > >>> > looks like the total safepoint time is too low currently on these >>> machines, it is < 1 ms. >>> > >>> > There might be several ways to handle this: >>> > >>> > * Change the test in a way that it might generate higher >>> safepoint times >>> > * Allow safepoint time == 0 ms >>> > * Offer an additional interface that gives safepoint times >>> with finer granularity (currently the HS has safepoint time values >>> in ns, see jdk/src/hotspot/share/runtime/safepoint.cpp >>> SafepointTracing::end >>> > >>> > But it is converted to ms in this code >>> > >>> > 114 jlong RuntimeService::safepoint_time_ms() { >>> > 115 return UsePerfData ? >>> > 116 >>> Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; >>> > 117 } >>> > >>> > 2064 jlong Management::ticks_to_ms(jlong ticks) { >>> > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); >>> > 2066 return (jlong)(((double)ticks / >>> (double)os::elapsed_frequency()) >>> > 2067 * (double)1000.0); >>> > 2068 } >>> > >>> > >>> > >>> > Currently I go for the first attempt (and try to generate >>> higher safepoint times in my patch). >>> >>> Yes that's probably best. Coarse-grained timing on very fast machines >>> was bound to eventually lead to problems.
>>> >>> But perhaps a more future-proof approach is to just add a do-while loop >>> around the stack dumps and only exit when we have a non-zero >> safepoint >>> time? >>> >>> Thanks, >>> David >>> ----- >>> >>> > Bug/webrev : >>> > >>> > https://bugs.openjdk.java.net/browse/JDK-8228658 >>> > >>> > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ >>> > >>> > >>> > Thanks, Matthias >>> > >>> >>> >>> -- >>> >>> Thanks, >>> >>> Jc >>> From calvin.cheung at oracle.com Tue Jul 30 21:48:57 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Tue, 30 Jul 2019 14:48:57 -0700 Subject: CFV: New HotSpot Group Member: Jiangli Zhou Message-ID: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Greetings, I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. Jiangli is a JDK project and JDK update project reviewer. She is currently a member of Google Java Platform team and has contributed over 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent years, she has been mainly focusing on the runtime memory footprint reduction and Class Data Sharing. Votes are due by August 13, 2019, 15:00 PDT. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks! 
Calvin [1] http://openjdk.java.net/census#hotspot [2] http://openjdk.java.net/groups/#member-vote [3] http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() From david.holmes at oracle.com Tue Jul 30 21:58:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 07:58:02 +1000 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <07a66a9e-68d8-9b1b-5690-bc5801fc56c8@oracle.com> Vote: yes David On 31/07/2019 7:48 am, Calvin Cheung wrote: > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to > Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is > currently a member of Google Java Platform team and has contributed over > 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent > years, she has been mainly focusing on the runtime memory footprint > reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! 
> > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() > > From coleen.phillimore at oracle.com Tue Jul 30 21:59:56 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 30 Jul 2019 17:59:56 -0400 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <6492a4f8-9663-c6cf-e70e-e9b3784ed452@oracle.com> Vote: yes On 7/30/19 5:48 PM, Calvin Cheung wrote: > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to > Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is > currently a member of Google Java Platform team and has contributed > over 100 changesets[3] to Hotspot JVM in various areas since 2011. In > recent years, she has been mainly focusing on the runtime memory > footprint reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! 
> > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() > From vladimir.kozlov at oracle.com Tue Jul 30 22:13:01 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 30 Jul 2019 15:13:01 -0700 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <2961E787-F39F-44B8-AB1F-8AC737DC481D@oracle.com> Vote yes Thanks Vladimir > On Jul 30, 2019, at 2:48 PM, Calvin Cheung wrote: > > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is currently a member of Google Java Platform team and has contributed over 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent years, she has been mainly focusing on the runtime memory footprint reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! 
> > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() > From coleen.phillimore at oracle.com Tue Jul 30 22:15:04 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 30 Jul 2019 18:15:04 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com> References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com> Message-ID: <2a2a3aaa-8227-7455-2aa3-58b3aa3cd260@oracle.com> On 7/30/19 5:11 PM, David Holmes wrote: > On 31/07/2019 6:59 am, Kim Barrett wrote: >>> On Jul 29, 2019, at 10:27 PM, David Holmes >>> wrote: >>> >>> Hi Kim, >>> >>> A meta-comment: "storages" is not a well formed term. Can we have >>> something clearer, perhaps OopStorageManager, or something like that? >>> >>> Thanks, >>> David >> >> Coleen suggested the name OopStorages, as the plural of OopStorage. > > "storage" doesn't really have a plural in common use. Well this isn't common use. There are more than one oopStorage things in oopStorages. > >> (Unpublished versions of the change had a different name that I didn't >> really like and Coleen actively disliked.) Coleen and I both have an >> antipathy toward "Manager" suffixed names, and I don't see how it's >> any clearer in this case. "Set" suggests a wider API. >> >> Also, drive-by name bikeshedding doesn't carry much weight. > > Okay, how about: it's really poor form to have classes and files that > differ by only one letter. I looked at this to see what it was about > and had to keep double-checking if I was looking at OopStorage or > OopStorages. In addition OopStorages conveys no semantic meaning to me. > This might be confusing to someone who doesn't normally look at the code.
If you come up with a better name than Manager, it might be okay to change. So far, our other name ideas weren't better than just the succinct "Storages". Meaning multiple oopStorage objects (they're not objects, that's a bad name because it could be confusing with oops which are also called objects). Coleen > Thanks, > David From kim.barrett at oracle.com Tue Jul 30 22:27:26 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 30 Jul 2019 18:27:26 -0400 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: vote: yes > On Jul 30, 2019, at 5:48 PM, Calvin Cheung wrote: > > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is currently a member of Google Java Platform team and has contributed over 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent years, she has been mainly focusing on the runtime memory footprint reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks!
> > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() From jcbeyler at google.com Tue Jul 30 23:08:23 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Tue, 30 Jul 2019 16:08:23 -0700 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> Message-ID: FWIW, I would have done something like what David was suggesting, just slightly tweaked: public static long executeThreadDumps() { long value; long initial_value = mbean.getTotalSafepointTime(); do { Thread.getAllStackTraces(); value = mbean.getTotalSafepointTime(); } while (value == initial_value); return value; } This ensures that the value is a new value as opposed to the current value and if something goes wrong, as David said, it will timeout; which is ok. But I come back to not really understanding why we are doing this at this point of relaxing (just get a new value of safepoint time). Because, if we accept timeouts now as a failure here, then really the whole test becomes: executeThreadDumps(); executeThreadDumps(); Since the first call will return when value > 0 and the second call will return when value2 > value (I still wonder why we want to ensure it works twice...). So both failures and even testing for it is kind of redundant, once you have a do/while until a change? Thanks, Jc On Tue, Jul 30, 2019 at 2:35 PM David Holmes wrote: > On 30/07/2019 10:39 pm, Baesken, Matthias wrote: > > Hi David, "put that whole code (the while loop) in a helper method." > was JC's idea, and I like the idea . > > Regardless I think the way you are using NUM_THREAD_DUMPS is really > confusing. 
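[Editorial note] For reference, the retry pattern JC sketches above can be written out as a self-contained program. FakeSafepointMBean, its poll thresholds, and the class name below are stand-ins invented for this sketch only — the real test polls sun.management's HotspotRuntimeMBean, which only works inside a live HotSpot VM:

```java
// Standalone sketch of the "loop until the counter moves" retry pattern
// discussed in this thread. FakeSafepointMBean fakes the real MBean's
// behavior: it reports 0 ms for the first couple of polls, the way a
// fast machine might report a sub-millisecond total safepoint time.
public class SafepointRetrySketch {

    static class FakeSafepointMBean {
        private int polls = 0;

        // Monotonically non-decreasing; 0 for the first two polls.
        long getTotalSafepointTime() {
            polls++;
            return (polls < 3) ? 0 : polls * 2L;
        }
    }

    static final FakeSafepointMBean mbean = new FakeSafepointMBean();

    // Same shape as the helper proposed above: remember the value on entry,
    // trigger thread dumps, and return once the reported value has moved.
    static long executeThreadDumps() {
        long initialValue = mbean.getTotalSafepointTime();
        long value;
        do {
            Thread.getAllStackTraces(); // forces a safepoint in the real test
            value = mbean.getTotalSafepointTime();
        } while (value == initialValue);
        return value;
    }

    public static void main(String[] args) {
        long value = executeThreadDumps();
        if (value <= 0) {
            throw new RuntimeException("Total safepoint time illegal value: " + value + " ms");
        }
        System.out.println("total safepoint time (stubbed): " + value + " ms");
    }
}
```

Looping until the value moves past the value observed on entry (rather than until it is non-zero) is what makes a second call to the helper meaningful, per the value/value2 discussion above.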
As an all-caps static you'd expect it to be a constant. > > Thanks, > David > > > Let's see what others think . > > > >> > >> Overall tests like this are not very useful, yet very fragile. > >> > > > > I am also fine with putting the test on the exclude list. > > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Dienstag, 30. Juli 2019 14:12 > >> To: Baesken, Matthias ; Jean Christophe > >> Beyler > >> Cc: hotspot-dev at openjdk.java.net; serviceability-dev >> dev at openjdk.java.net> > >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails > on fast > >> Linux machines with Total safepoint time 0 ms > >> > >> Hi Matthias, > >> > >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > >>> Hello JC / David, here is a second webrev : > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > >>> > >>> It moves the thread dump execution into a method > >>> executeThreadDumps(long) , and also adds while loops (but with a > >>> limitation for the number of thread dumps, really don?t > >>> want to cause timeouts etc.). I removed a check for > >>> MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE . > >> > >> I don't think executeThreadDumps is worth factoring out like out. > >> > >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it > >> remains a constant 100, and then you set a simple loop iteration count > >> limit. Further with the proposed code when you get here: > >> > >> 85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > >> > >> you don't even know what value you may be starting with. > >> > >> But I was thinking of simply: > >> > >> long value = 0; > >> do { > >> Thread.getAllStackTraces(); > >> value = mbean.getTotalSafepointTime(); > >> } while (value == 0); > >> > >> We'd only hit a timeout if something is completely broken - which is > fine. > >> > >> Overall tests like this are not very useful, yet very fragile. 
> >> > >> Thanks, > >> David > >> > >>> Hope you like this version better. > >>> > >>> Best regards, Matthias > >>> > >>> *From:*Jean Christophe Beyler > >>> *Sent:* Dienstag, 30. Juli 2019 05:39 > >>> *To:* David Holmes > >>> *Cc:* Baesken, Matthias ; > >>> hotspot-dev at openjdk.java.net; serviceability-dev > >>> > >>> *Subject:* Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails > >>> on fast Linux machines with Total safepoint time 0 ms > >>> > >>> Hi Matthias, > >>> > >>> I wonder if you should not do what David is suggesting and then put > that > >>> whole code (the while loop) in a helper method. Below you have a > >>> calculation again using value2 (which I wonder what the added value of > >>> it is though) but anyway, that value2 could also be 0 at some point, > no? > >>> > >>> So would it not be best to just refactor the getAllStackTraces and > >>> calculate safepoint time in a helper method for both value / value2 > >>> variables? > >>> > >>> Thanks, > >>> > >>> Jc > >>> > >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes >>> > wrote: > >>> > >>> Hi Matthias, > >>> > >>> On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > >>> > Hello , please review this small test fix . > >>> > > >>> > The test > >>> > >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. > >> java > >>> fails sometimes on fast Linux machines with this error message : > >>> > > >>> > java.lang.RuntimeException: Total safepoint time illegal > value: 0 > >>> ms (MIN = 1; MAX = 9223372036854775807) > >>> > > >>> > looks like the total safepoint time is too low currently on > these > >>> machines, it is < 1 ms. 
> >>> > > >>> > There might be several ways to handle this : > >>> > > >>> > * Change the test in a way that it might generate nigher > >>> safepoint times > >>> > * Allow safepoint time == 0 ms > >>> > * Offer an additional interface that gives safepoint > times > >>> with finer granularity ( currently the HS has safepoint time > values > >>> in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp > >>> SafepointTracing::end > >>> > > >>> > But it is converted on ms in this code > >>> > > >>> > 114jlong RuntimeService::safepoint_time_ms() { > >>> > 115 return UsePerfData ? > >>> > 116 > >>> Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > >>> > 117} > >>> > > >>> > 064jlong Management::ticks_to_ms(jlong ticks) { > >>> > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero"); > >>> > 2066 return (jlong)(((double)ticks / > >>> (double)os::elapsed_frequency()) > >>> > 2067 * (double)1000.0); > >>> > 2068} > >>> > > >>> > > >>> > > >>> > Currently I go for the first attempt (and try to generate > >>> higher safepoint times in my patch) . > >>> > >>> Yes that's probably best. Coarse-grained timing on very fast > machines > >>> was bound to eventually lead to problems. > >>> > >>> But perhaps a more future-proof approach is to just add a > do-while loop > >>> around the stack dumps and only exit when we have a non-zero > >> safepoint > >>> time? 
> >>> > >>> Thanks, > >>> David > >>> ----- > >>> > >>> > Bug/webrev : > >>> > > >>> > https://bugs.openjdk.java.net/browse/JDK-8228658 > >>> > > >>> > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > >>> > > >>> > > >>> > Thanks, Matthias > >>> > > >>> > >>> > >>> -- > >>> > >>> Thanks, > >>> > >>> Jc > >>> > -- Thanks, Jc From igor.ignatyev at oracle.com Tue Jul 30 23:20:18 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 30 Jul 2019 16:20:18 -0700 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <9D45A964-1EC0-4EC2-8768-463D1F7EF117@oracle.com> Vote: yes -- Igor > On Jul 30, 2019, at 2:48 PM, Calvin Cheung wrote: > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. From daniel.daugherty at oracle.com Tue Jul 30 23:31:55 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 30 Jul 2019 19:31:55 -0400 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: Vote: yes Dan On 7/30/19 5:48 PM, Calvin Cheung wrote: > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to > Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is > currently a member of Google Java Platform team and has contributed > over 100 changesets[3] to Hotspot JVM in various areas since 2011. In > recent years, she has been mainly focusing on the runtime memory > footprint reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. 
> > Thanks! > > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() > > From hohensee at amazon.com Wed Jul 31 00:00:28 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 31 Jul 2019 00:00:28 +0000 Subject: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <2C9374AF-14D2-4838-A020-0DC50C577D71@amazon.com> Vote: yes ?On 7/30/19, 2:50 PM, "hotspot-dev on behalf of Calvin Cheung" wrote: Greetings, I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. Jiangli is a JDK project and JDK update project reviewer. She is currently a member of Google Java Platform team and has contributed over 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent years, she has been mainly focusing on the runtime memory footprint reduction and Class Data Sharing. Votes are due by August 13, 2019, 15:00 PDT. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks! 
Calvin [1] http://openjdk.java.net/census#hotspot [2] http://openjdk.java.net/groups/#member-vote [3] http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() From zgu at redhat.com Wed Jul 31 00:25:53 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 30 Jul 2019 20:25:53 -0400 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: <1d27ec1c-359d-bcca-a90d-d0746b5609ed@redhat.com> Vote: yes -Zhengyu On 7/30/19 5:48 PM, Calvin Cheung wrote: > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to > Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is > currently a member of Google Java Platform team and has contributed over > 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent > years, she has been mainly focusing on the runtime memory footprint > reduction and Class Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! 
> > Calvin
> >
> > [1] http://openjdk.java.net/census#hotspot
> >
> > [2] http://openjdk.java.net/groups/#member-vote
> >
> > [3]
> > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge()
> >

From david.holmes at oracle.com  Wed Jul 31 03:04:21 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 31 Jul 2019 13:04:21 +1000
Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
 Linux machines with Total safepoint time 0 ms
In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com>
 <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com>
Message-ID: <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com>

On 31/07/2019 9:08 am, Jean Christophe Beyler wrote:
> FWIW, I would have done something like what David was suggesting, just
> slightly tweaked:
>
> public static long executeThreadDumps() {
>   long value;
>   long initial_value = mbean.getTotalSafepointTime();
>   do {
>       Thread.getAllStackTraces();
>       value = mbean.getTotalSafepointTime();
>   } while (value == initial_value);
>   return value;
> }
>
> This ensures that the value is a new value as opposed to the current
> value and if something goes wrong, as David said, it will timeout; which
> is ok.

Works for me.

> But I come back to not really understanding why we are doing this at
> this point of relaxing (just get a new value of safepoint time).
> Because, if we accept timeouts now as a failure here, then really the
> whole test becomes:
>
> executeThreadDumps();
> executeThreadDumps();
>
> Since the first call will return when value > 0 and the second call will
> return when value2 > value (I still wonder why we want to ensure it
> works twice...).

The test is trying to sanity check that we are actually recording the
time used by safepoints. So first check is that we can get a non-zero
value; second check is we get a greater non-zero value. It's just a
sanity test to try and catch if something gets unexpectedly broken in
the time tracking code.
> So both failures and even testing for it is kind of redundant, once you
> have a do/while until a change?

Yes - the problem with the tests that try to check internal VM behaviour
is that we have no specified way to do something, in this case execute
safepoints, that relates to internal VM behaviour, so we have to do
something we know will currently work even if not specified to do so -
e.g. dumping all thread stacks uses a global safepoint. The second
problem is that the timer granularity is so coarse that we then have to
guess how many times we need to do that something before seeing a
change. To make the test robust we can keep doing stuff until we see a
change and so the only way that will fail is if the overall timeout of
the test kicks in. Or we can try and second guess how long it should
take by introducing our own internal timeout - either directly or by
limiting the number of loops in this case. That has its own problems and
in general we have tried to reduce internal test timeouts (by removing
them) and let overall timeouts take charge.

No ideal solution. And this has already consumed way too much of
everyone's time.

Cheers,
David

> Thanks,
> Jc
>
>
> On Tue, Jul 30, 2019 at 2:35 PM David Holmes wrote:
>
> > On 30/07/2019 10:39 pm, Baesken, Matthias wrote:
> > > Hi David, "put that whole code (the while loop) in a helper
> > method." was JC's idea, and I like the idea .
> >
> > Regardless I think the way you are using NUM_THREAD_DUMPS is really
> > confusing. As an all-caps static you'd expect it to be a constant.
> >
> > Thanks,
> > David
> >
> > > Let's see what others think .
> > >
> > >>
> > >> Overall tests like this are not very useful, yet very fragile.
> > >>
> > >
> > > I am also fine with putting the test on the exclude list.
> > >
> > > Best regards, Matthias
> > >
> > >
> > >> -----Original Message-----
> > >> From: David Holmes
> > >> Sent: Dienstag, 30.
Juli 2019 14:12 > >> To: Baesken, Matthias >; Jean Christophe > >> Beyler > > >> Cc: hotspot-dev at openjdk.java.net > ; serviceability-dev > >> dev at openjdk.java.net > > >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java > fails on fast > >> Linux machines with Total safepoint time 0 ms > >> > >> Hi Matthias, > >> > >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > >>> Hello? JC / David,?? here is a second webrev? : > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > >>> > >>> It moves?? the? thread dump execution into a? method > >>> executeThreadDumps(long)?? ??, and also adds? while loops > (but with a > >>> limitation? for the number of thread dumps, really don?t > >>> want to cause timeouts etc.).??? I removed a check for > >>> MAX_VALUE_FOR_PASS?? because we cannot go over Long.MAX_VALUE . > >> > >> I don't think executeThreadDumps is worth factoring out like out. > >> > >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it > >> remains a constant 100, and then you set a simple loop iteration > count > >> limit. Further with the proposed code when you get here: > >> > >>? ? 85? ? ? ? ?NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > >> > >> you don't even know what value you may be starting with. > >> > >> But I was thinking of simply: > >> > >> long value = 0; > >> do { > >>? ? ? ?Thread.getAllStackTraces(); > >>? ? ? ?value = mbean.getTotalSafepointTime(); > >> } while (value == 0); > >> > >> We'd only hit a timeout if something is completely broken - > which is fine. > >> > >> Overall tests like this are not very useful, yet very fragile. > >> > >> Thanks, > >> David > >> > >>> Hope you like this version ?better. > >>> > >>> Best regards, Matthias > >>> > >>> *From:*Jean Christophe Beyler > > >>> *Sent:* Dienstag, 30. 
Juli 2019 05:39 > >>> *To:* David Holmes > > >>> *Cc:* Baesken, Matthias >; > >>> hotspot-dev at openjdk.java.net > ; serviceability-dev > >>> > > >>> *Subject:* Re: RFR: [XS] 8228658: test > GetTotalSafepointTime.java fails > >>> on fast Linux machines with Total safepoint time 0 ms > >>> > >>> Hi Matthias, > >>> > >>> I wonder if you should not do what David is suggesting and then > put that > >>> whole code (the while loop) in a helper method. Below you have a > >>> calculation again using value2 (which I wonder what the added > value of > >>> it is though) but anyway, that value2 could also be 0 at some > point, no? > >>> > >>> So would it not be best to just refactor the getAllStackTraces and > >>> calculate safepoint time in a helper method for both value / value2 > >>> variables? > >>> > >>> Thanks, > >>> > >>> Jc > >>> > >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes > > >>> >> wrote: > >>> > >>>? ? ? Hi Matthias, > >>> > >>>? ? ? On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > >>>? ? ? ?> Hello , please review this small test fix . > >>>? ? ? ?> > >>>? ? ? ?> The test > >>> > >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. > >> java > >>>? ? ? fails sometimes on fast Linux machines with this error > message : > >>>? ? ? ?> > >>>? ? ? ?> java.lang.RuntimeException: Total safepoint time > illegal value: 0 > >>>? ? ? ms (MIN = 1; MAX = 9223372036854775807) > >>>? ? ? ?> > >>>? ? ? ?> looks like the total safepoint time is too low > currently on these > >>>? ? ? machines, it is < 1 ms. > >>>? ? ? ?> > >>>? ? ? ?> There might be several ways to handle this : > >>>? ? ? ?> > >>>? ? ? ?>? ? *? ?Change the test? in a way that it might generate > nigher > >>>? ? ? safepoint times > >>>? ? ? ?>? ? *? ?Allow? safepoint time? == 0 ms > >>>? ? ? ?>? ? *? ?Offer an additional interface that gives > safepoint times > >>>? ? ? with finer granularity ( currently the HS has safepoint > time values > >>>? ? ? in ns , see? 
jdk/src/hotspot/share/runtime/safepoint.cpp > >>>? ? ? ??SafepointTracing::end > >>>? ? ? ?> > >>>? ? ? ?> But it is converted on ms in this code > >>>? ? ? ?> > >>>? ? ? ?> 114jlong RuntimeService::safepoint_time_ms() { > >>>? ? ? ?> 115? return UsePerfData ? > >>>? ? ? ?> 116 > >>> > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > >>>? ? ? ?> 117} > >>>? ? ? ?> > >>>? ? ? ?> 064jlong Management::ticks_to_ms(jlong ticks) { > >>>? ? ? ?> 2065? assert(os::elapsed_frequency() > 0, "Must be > non-zero"); > >>>? ? ? ?> 2066? return (jlong)(((double)ticks / > >>>? ? ? (double)os::elapsed_frequency()) > >>>? ? ? ?> 2067? ? ? ? ? ? ? ? ?* (double)1000.0); > >>>? ? ? ?> 2068} > >>>? ? ? ?> > >>>? ? ? ?> > >>>? ? ? ?> > >>>? ? ? ?> Currently I go for? the first attempt (and try to generate > >>>? ? ? higher safepoint times in my patch) . > >>> > >>>? ? ? Yes that's probably best. Coarse-grained timing on very > fast machines > >>>? ? ? was bound to eventually lead to problems. > >>> > >>>? ? ? But perhaps a more future-proof approach is to just add a > do-while loop > >>>? ? ? around the stack dumps and only exit when we have a non-zero > >> safepoint > >>>? ? ? time? > >>> > >>>? ? ? Thanks, > >>>? ? ? David > >>>? ? ? ----- > >>> > >>>? ? ? ?> Bug/webrev : > >>>? ? ? ?> > >>>? ? ? ?> https://bugs.openjdk.java.net/browse/JDK-8228658 > >>>? ? ? ?> > >>>? ? ? ?> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > >>>? ? ? ?> > >>>? ? ? ?> > >>>? ? ? ?> Thanks, Matthias > >>>? ? ? 
?> > >>> > >>> > >>> -- > >>> > >>> Thanks, > >>> > >>> Jc > >>> > > > > -- > > Thanks, > Jc From tobias.hartmann at oracle.com Wed Jul 31 05:15:55 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 31 Jul 2019 07:15:55 +0200 Subject: CFV: New HotSpot Group Member: Jiangli Zhou In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com> Message-ID: Vote: yes Best regards, Tobias On 30.07.19 23:48, Calvin Cheung wrote: > Greetings, > > I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to Membership in the HotSpot Group. > > Jiangli is a JDK project and JDK update project reviewer. She is currently a member of Google Java > Platform team and has contributed over 100 changesets[3] to Hotspot JVM in various areas since 2011. > In recent years, she has been mainly focusing on the runtime memory footprint reduction and Class > Data Sharing. > > Votes are due by August 13, 2019, 15:00 PDT. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be > cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! 
> > Calvin
> >
> > [1] http://openjdk.java.net/census#hotspot
> >
> > [2] http://openjdk.java.net/groups/#member-vote
> >
> > [3]
> > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge()
> >

From david.holmes at oracle.com  Wed Jul 31 06:13:55 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 31 Jul 2019 16:13:55 +1000
Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage
 objects
In-Reply-To: <2a2a3aaa-8227-7455-2aa3-58b3aa3cd260@oracle.com>
References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com>
 <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com>
 <2a2a3aaa-8227-7455-2aa3-58b3aa3cd260@oracle.com>
Message-ID:

Hi Coleen,

I've sat on this for a few hours :)

On 31/07/2019 8:15 am, coleen.phillimore at oracle.com wrote:
> On 7/30/19 5:11 PM, David Holmes wrote:
>> On 31/07/2019 6:59 am, Kim Barrett wrote:
>>>> On Jul 29, 2019, at 10:27 PM, David Holmes wrote:
>>>>
>>>> Hi Kim,
>>>>
>>>> A meta-comment: "storages" is not a well formed term. Can we have
>>>> something clearer, perhaps OopStorageManager, or something like that?
>>>>
>>>> Thanks,
>>>> David
>>>
>>> Coleen suggested the name OopStorages, as the plural of OopStorage.
>>
>> "storage" doesn't really have a plural in common use.
>
> Well this isn't common use. There are more than one oopStorage things
> in oopStorages.
>>
>>> (Unpublished versions of the change had a different name that I didn't
>>> really like and Coleen actively disliked.) Coleen and I both have an
>>> antipathy toward "Manager" suffixed names, and I don't see how it's
>>> any clearer in this case. "Set" suggests a wider API.
>>>
>>> Also, drive-by name bikeshedding doesn't carry much weight.
>>
>> Okay how about it's really poor form to have classes and files that
>> differ by only one letter. I looked at this to see what it was about
>> and had to keep double-checking if I was looking at OopStorage or
>> OopStorages. In addition OopStorages conveys no semantic meaning to me.
>>
>
> This might be confusing to someone who doesn't normally look at the
> code.

The fact they differ by only one letter leads to an easy source of
mistakes in both reading and writing the code. The very first change I
saw in the webrev was:

#include "gc/shared/oopStorage.inline.hpp"
+ #include "gc/shared/oopStorages.hpp"

and I immediately thought it was a mistake because the .hpp would be
included by the .inline.hpp file - but I'd missed the 's'.

> If you come up with a better name than Manager, it might be okay
> to change. So far, our other name ideas weren't better than just the
> succinct "Storages". Meaning multiple oopStorage objects (they're not
> objects, that's a bad name because it could be confusing with oops which
> are also called objects).

OopStorageUnit
OopStorageDepot
OopStorageFactory
OopStorageHolder
OopStorageSet

Arguably this could/should be folded into OopStorage itself and avoid
the naming issues altogether.

Cheers,
David

P.S. What's so bad about Manager? :)

> Coleen
>
>> Thanks,
>> David
>

From thomas.stuefe at gmail.com  Wed Jul 31 08:11:10 2019
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 31 Jul 2019 10:11:10 +0200
Subject: CFV: New HotSpot Group Member: Jiangli Zhou
In-Reply-To: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com>
References: <30c09194-6557-5ff6-caa0-37f976e75618@oracle.com>
Message-ID:

Vote: yes.

On Tue, Jul 30, 2019 at 11:49 PM Calvin Cheung wrote:

> Greetings,
>
> I hereby nominate Jiangli Zhou (OpenJDK user name: jiangli) to
> Membership in the HotSpot Group.
>
> Jiangli is a JDK project and JDK update project reviewer. She is
> currently a member of Google Java Platform team and has contributed over
> 100 changesets[3] to Hotspot JVM in various areas since 2011. In recent
> years, she has been mainly focusing on the runtime memory footprint
> reduction and Class Data Sharing.
>
> Votes are due by August 13, 2019, 15:00 PDT.
> > Only current Members of the HotSpot Group [1] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks! > > Calvin > > [1] http://openjdk.java.net/census#hotspot > > [2] http://openjdk.java.net/groups/#member-vote > > [3] > > http://hg.openjdk.java.net/jdk/jdk/log?revcount=300&rev=(author(jiangli))+and+not+merge() > > From adam.farley at uk.ibm.com Wed Jul 31 08:59:07 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 31 Jul 2019 09:59:07 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> References: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> Message-ID: Hi All, Reviewers requested for the change below. @David - Agreed. Would you be prepared to sponsor the change? Best Regards Adam Farley IBM Runtimes David Holmes wrote on 30/07/2019 03:37:53: > From: David Holmes > To: Adam Farley8 > Cc: hotspot-dev at openjdk.java.net, serviceability-dev > > Date: 30/07/2019 03:38 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > On 25/07/2019 3:57 am, Adam Farley8 wrote: > > Hi David, > > > > Welcome back. :) > > Thanks. Sorry for the delay in getting back to this. > > I like .v2 as it is much simpler (notwithstanding freeing the already > allocated arrays adds some complexity - thanks for fixing that). > > I'm still not sure we can't optimise things better for unchangeable > properties like the boot libary path, but that's another RFE. > > Thanks, > David > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From adam.farley at uk.ibm.com Wed Jul 31 09:01:36 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 31 Jul 2019 10:01:36 +0100 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> References: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> Message-ID: Hi All, Reviewers requested for the change below. @David - Agreed. Would you be prepared to sponsor the change? Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 Webrev: http://cr.openjdk.java.net/~afarley/8227021.2/webrev/ Best Regards Adam Farley IBM Runtimes P.S. Remembered to add the links this time. :) David Holmes wrote on 30/07/2019 03:37:53: > From: David Holmes > To: Adam Farley8 > Cc: hotspot-dev at openjdk.java.net, serviceability-dev > > Date: 30/07/2019 03:38 > Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path > paths are longer than JVM_MAXPATHLEN > > Hi Adam, > > On 25/07/2019 3:57 am, Adam Farley8 wrote: > > Hi David, > > > > Welcome back. :) > > Thanks. Sorry for the delay in getting back to this. > > I like .v2 as it is much simpler (notwithstanding freeing the already > allocated arrays adds some complexity - thanks for fixing that). > > I'm still not sure we can't optimise things better for unchangeable > properties like the boot libary path, but that's another RFE. > > Thanks, > David > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From david.holmes at oracle.com Wed Jul 31 10:15:17 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 20:15:17 +1000 Subject: RFR: JDK-8227021: VM fails if any sun.boot.library.path paths are longer than JVM_MAXPATHLEN In-Reply-To: References: <25234969-2215-57e9-d8c5-d97b5669ebb1@oracle.com> <5ba808b0-ae52-0d0a-b84b-fc34df35475d@oracle.com> Message-ID: <27cd3f3d-2b5e-54f9-c690-3aa7f7ec33aa@oracle.com> On 31/07/2019 7:01 pm, Adam Farley8 wrote: > Hi All, > > Reviewers requested for the change below. > > @David - Agreed. Would you be prepared to sponsor the change? Sure I can sponsor once there is another reviewer. BTW could have dropped serviceability-dev as this no longer has any serviceability changes in it. :) Cheers, David > Bug: https://bugs.openjdk.java.net/browse/JDK-8227021 > Webrev: http://cr.openjdk.java.net/~afarley/8227021.2/webrev/ > > Best Regards > > Adam Farley > IBM Runtimes > > P.S. Remembered to add the links this time. :) > > > David Holmes wrote on 30/07/2019 03:37:53: > >> From: David Holmes >> To: Adam Farley8 >> Cc: hotspot-dev at openjdk.java.net, serviceability-dev >> >> Date: 30/07/2019 03:38 >> Subject: Re: RFR: JDK-8227021: VM fails if any sun.boot.library.path >> paths are longer than JVM_MAXPATHLEN >> >> Hi Adam, >> >> On 25/07/2019 3:57 am, Adam Farley8 wrote: >> > Hi David, >> > >> > Welcome back. :) >> >> Thanks. Sorry for the delay in getting back to this. >> >> I like .v2 as it is much simpler (notwithstanding freeing the already >> allocated arrays adds some complexity - thanks for fixing that). >> >> I'm still not sure we can't optimise things better for unchangeable >> properties like the boot libary path, but that's another RFE. >> >> Thanks, >> David >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From matthias.baesken at sap.com Wed Jul 31 12:05:21 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 31 Jul 2019 12:05:21 +0000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com> References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com> Message-ID: Hello, here is a version following the latest proposal of JC . Unfortunately attached as patch, sorry for that - the uploads / pushes currently do not work from here . Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 31. Juli 2019 05:04 > To: Jean Christophe Beyler > Cc: Baesken, Matthias ; hotspot- > dev at openjdk.java.net; serviceability-dev dev at openjdk.java.net> > Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast > Linux machines with Total safepoint time 0 ms > > On 31/07/2019 9:08 am, Jean Christophe Beyler wrote: > > FWIW, I would have done something like what David was suggesting, just > > slightly tweaked: > > > > public static long executeThreadDumps() { > > ?long value; > > ?long initial_value = mbean.getTotalSafepointTime(); > > ?do { > > ? ? ?Thread.getAllStackTraces(); > > ? ? ?value = mbean.getTotalSafepointTime(); > > ?} while (value == initial_value); > > ?return value; > > } > > > > This ensures that the value is a new value as opposed to the current > > value and if something goes wrong, as David said, it will timeout; which > > is ok. > > Works for me. > > > But I come back to not really understanding why we are doing this at > > this point of relaxing (just get a new value of safepoint time). 
> > Because, if we accept timeouts now as a failure here, then really the > > whole test becomes: > > > > executeThreadDumps(); > > executeThreadDumps(); > > > > Since?the first call will return when value > 0 and the second call will > > return when value2 > value (I still wonder why we want to ensure it > > works twice...). > > The test is trying to sanity check that we are actually recording the > time used by safepoints. So first check is that we can get a non-zero > value; second check is we get a greater non-zero value. It's just a > sanity test to try and catch if something gets unexpectedly broken in > the time tracking code. > > > So both failures and even testing for it is kind of redundant, once you > > have a do/while until a change? > > Yes - the problem with the tests that try to check internal VM behaviour > is that we have no specified way to do something, in this case execute > safepoints, that relates to internal VM behaviour, so we have to do > something we know will currently work even if not specified to do so - > e.g. dumping all thread stacks uses a global safepoint. The second > problem is that the timer granularity is so coarse that we then have to > guess how many times we need to do that something before seeing a > change. To make the test robust we can keep doing stuff until we see a > change and so the only way that will fail is if the overall timeout of > the test kicks in. Or we can try and second guess how long it should > take by introducing our own internal timeout - either directly or by > limiting the number of loops in this case. That has its own problems and > in general we have tried to reduce internal test timeouts (by removing > them) and let overall timeouts take charge. > > No ideal solution. And this has already consumed way too much of > everyone's time. 
> > Cheers, > David > > > Thanks, > > Jc > > > > > > On Tue, Jul 30, 2019 at 2:35 PM David Holmes > > wrote: > > > > On 30/07/2019 10:39 pm, Baesken, Matthias wrote: > > > Hi David,? ?"put that whole code (the while loop) in a helper > > method."? ?was JC's idea,? and I like the idea . > > > > Regardless I think the way you are using NUM_THREAD_DUMPS is really > > confusing. As an all-caps static you'd expect it to be a constant. > > > > Thanks, > > David > > > > > Let's see what others think . > > > > > >> > > >> Overall tests like this are not very useful, yet very fragile. > > >> > > > > > > I am also? fine with putting the test on the exclude list. > > > > > > Best regards, Matthias > > > > > > > > >> -----Original Message----- > > >> From: David Holmes > > > > >> Sent: Dienstag, 30. Juli 2019 14:12 > > >> To: Baesken, Matthias > >; Jean Christophe > > >> Beyler > > > >> Cc: hotspot-dev at openjdk.java.net > > ; serviceability-dev > > > >> dev at openjdk.java.net > > > >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java > > fails on fast > > >> Linux machines with Total safepoint time 0 ms > > >> > > >> Hi Matthias, > > >> > > >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > > >>> Hello? JC / David,?? here is a second webrev? : > > >>> > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > > >>> > > >>> It moves?? the? thread dump execution into a? method > > >>> executeThreadDumps(long)?? ??, and also adds? while loops > > (but with a > > >>> limitation? for the number of thread dumps, really don?t > > >>> want to cause timeouts etc.).??? I removed a check for > > >>> MAX_VALUE_FOR_PASS?? because we cannot go over > Long.MAX_VALUE . > > >> > > >> I don't think executeThreadDumps is worth factoring out like out. > > >> > > >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it > > >> remains a constant 100, and then you set a simple loop iteration > > count > > >> limit. 
Further with the proposed code when you get here: > > >> > > >>? ? 85? ? ? ? ?NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > > >> > > >> you don't even know what value you may be starting with. > > >> > > >> But I was thinking of simply: > > >> > > >> long value = 0; > > >> do { > > >>? ? ? ?Thread.getAllStackTraces(); > > >>? ? ? ?value = mbean.getTotalSafepointTime(); > > >> } while (value == 0); > > >> > > >> We'd only hit a timeout if something is completely broken - > > which is fine. > > >> > > >> Overall tests like this are not very useful, yet very fragile. > > >> > > >> Thanks, > > >> David > > >> > > >>> Hope you like this version ?better. > > >>> > > >>> Best regards, Matthias > > >>> > > >>> *From:*Jean Christophe Beyler > > > > >>> *Sent:* Dienstag, 30. Juli 2019 05:39 > > >>> *To:* David Holmes > > > > >>> *Cc:* Baesken, Matthias > >; > > >>> hotspot-dev at openjdk.java.net > > ; serviceability-dev > > >>> > > > > >>> *Subject:* Re: RFR: [XS] 8228658: test > > GetTotalSafepointTime.java fails > > >>> on fast Linux machines with Total safepoint time 0 ms > > >>> > > >>> Hi Matthias, > > >>> > > >>> I wonder if you should not do what David is suggesting and then > > put that > > >>> whole code (the while loop) in a helper method. Below you have a > > >>> calculation again using value2 (which I wonder what the added > > value of > > >>> it is though) but anyway, that value2 could also be 0 at some > > point, no? > > >>> > > >>> So would it not be best to just refactor the getAllStackTraces and > > >>> calculate safepoint time in a helper method for both value / value2 > > >>> variables? > > >>> > > >>> Thanks, > > >>> > > >>> Jc > > >>> > > >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes > > > > >>> > >> wrote: > > >>> > > >>>? ? ? Hi Matthias, > > >>> > > >>>? ? ? On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > >>>? ? ? ?> Hello , please review this small test fix . > > >>>? ? ? ?> > > >>>? ? ? 
> The test > >>> > >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. >> java > >>> fails sometimes on fast Linux machines with this error > > message : > >>> > > >>> > java.lang.RuntimeException: Total safepoint time > > illegal value: 0 > >>> ms (MIN = 1; MAX = 9223372036854775807) > >>> > > >>> > looks like the total safepoint time is too low > > currently on these > >>> machines, it is < 1 ms. > >>> > > >>> > There might be several ways to handle this : > >>> > > >>> > * Change the test in a way that it might generate > > higher > >>> safepoint times > >>> > * Allow safepoint time == 0 ms > >>> > * Offer an additional interface that gives > > safepoint times > >>> with finer granularity (currently the HS has safepoint > > time values > >>> in ns, see jdk/src/hotspot/share/runtime/safepoint.cpp > >>> SafepointTracing::end > >>> > > >>> > But it is converted to ms in this code > >>> > > >>> > 114 jlong RuntimeService::safepoint_time_ms() { > >>> > 115 return UsePerfData ? > >>> > 116 > >>> > > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > >>> > 117 } > >>> > > >>> > 2064 jlong Management::ticks_to_ms(jlong ticks) { > >>> > 2065 assert(os::elapsed_frequency() > 0, "Must be > > non-zero"); > >>> > 2066 return (jlong)(((double)ticks / > >>> (double)os::elapsed_frequency()) > >>> > 2067 * (double)1000.0); > >>> > 2068 } > >>> > > >>> > > >>> > > >>> > Currently I go for the first attempt (and try to generate > >>> higher safepoint times in my patch) . > >>> > >>> Yes that's probably best. Coarse-grained timing on very > > fast machines > >>> was bound to eventually lead to problems. 
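The truncation described above can be reproduced in isolation. The following standalone Java sketch restates the ticks_to_ms arithmetic quoted from the HotSpot sources; the 1 GHz tick frequency is an assumption for illustration, not a value taken from HotSpot:

```java
public class TicksToMs {
    // Mirrors the quoted Management::ticks_to_ms: scale raw ticks to
    // milliseconds through double arithmetic, then truncate to a long.
    static long ticksToMs(long ticks, long frequency) {
        return (long) (((double) ticks / (double) frequency) * 1000.0);
    }

    public static void main(String[] args) {
        long freq = 1_000_000_000L; // assumed 1 GHz tick frequency
        // 0.8 ms of accumulated safepoint time truncates to 0 ms --
        // the value the test rejects on fast Linux machines.
        System.out.println(ticksToMs(800_000, freq));   // prints 0
        System.out.println(ticksToMs(2_500_000, freq)); // prints 2
    }
}
```

So any total safepoint time below one millisecond reports as 0, which is why the test's MIN = 1 check fails whenever all the thread dumps complete in under a millisecond.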
> > >>> > > >>>? ? ? But perhaps a more future-proof approach is to just add a > > do-while loop > > >>>? ? ? around the stack dumps and only exit when we have a non-zero > > >> safepoint > > >>>? ? ? time? > > >>> > > >>>? ? ? Thanks, > > >>>? ? ? David > > >>>? ? ? ----- > > >>> > > >>>? ? ? ?> Bug/webrev : > > >>>? ? ? ?> > > >>>? ? ? ?> https://bugs.openjdk.java.net/browse/JDK-8228658 > > >>>? ? ? ?> > > >>>? ? ? ?> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > >>>? ? ? ?> > > >>>? ? ? ?> > > >>>? ? ? ?> Thanks, Matthias > > >>>? ? ? ?> > > >>> > > >>> > > >>> -- > > >>> > > >>> Thanks, > > >>> > > >>> Jc > > >>> > > > > > > > > -- > > > > Thanks, > > Jc From coleen.phillimore at oracle.com Wed Jul 31 12:56:23 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 08:56:23 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 Message-ID: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> Summary: give SurvivorAlignmentInBytes the same range as ObjAlignmentInBytes. Reran options validation against ParallelGC. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8228855 Thanks, Coleen From shade at redhat.com Wed Jul 31 13:04:13 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 31 Jul 2019 15:04:13 +0200 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> Message-ID: <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8228855 Looks good and trivial. 
-- Thanks, -Aleksey From david.holmes at oracle.com Wed Jul 31 13:12:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Jul 2019 23:12:02 +1000 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> Message-ID: <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> I haven't seen Coleen's original mail turn up yet, so I'll respond here. Shouldn't the range be handled by the constraint function: SurvivorAlignmentInBytesConstraintFunc ? David (signing off for the night) On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: > On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 > > Looks good and trivial. > From coleen.phillimore at oracle.com Wed Jul 31 13:25:02 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 09:25:02 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> Message-ID: <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> On 7/31/19 9:12 AM, David Holmes wrote: > I haven't seen Coleen's original mail turn up yet, so I'll respond here. > > Shouldn't the range be handled by the constraint function: It is not handled that way in ObjAlignmentInBytes. Coleen > > ?SurvivorAlignmentInBytesConstraintFunc > > ? 
> > David (signing off for the night) > > On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: >> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 >> >> Looks good and trivial. >> From coleen.phillimore at oracle.com Wed Jul 31 13:39:55 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 09:39:55 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> Message-ID: <84006e85-90a7-8c42-05d6-a26000d6d2e5@oracle.com> On 7/31/19 9:25 AM, coleen.phillimore at oracle.com wrote: > > > On 7/31/19 9:12 AM, David Holmes wrote: >> I haven't seen Coleen's original mail turn up yet, so I'll respond here. I haven't gotten the email yet either. >> >> Shouldn't the range be handled by the constraint function: > > It is not handled that way in ObjAlignmentInBytes. What I meant is that ObjectAlignmentInBytes has the constraint function AND the range. SurvivorAlignmentInBytes should be the same. The constraint function tests that it's > ObjectAlignmentInBytes. 158 lp64_product(intx, ObjectAlignmentInBytes, 8, \ 159 "Default object alignment in bytes, 8 is minimum") \ 160 range(8, 256) \ 161 constraint(ObjectAlignmentInBytesConstraintFunc, AtParse) \ Coleen > > Coleen >> >> SurvivorAlignmentInBytesConstraintFunc >> >> 
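The interplay Coleen describes — a range() giving hard bounds plus a constraint function for the relational checks — can be sketched in plain Java. The power-of-two and at-least-ObjectAlignmentInBytes rules below are assumptions inferred from the constraint function names in this thread, not a transcription of the HotSpot code:

```java
public class AlignmentCheck {
    // range(8, 256): hard bounds, checked before any constraint runs.
    // Constraint (assumed): value is a power of two and no smaller than
    // ObjectAlignmentInBytes.
    static boolean survivorAlignmentValid(long value, long objectAlignment) {
        boolean inRange = value >= 8 && value <= 256;
        boolean powerOfTwo = Long.bitCount(value) == 1;
        return inRange && powerOfTwo && value >= objectAlignment;
    }

    public static void main(String[] args) {
        System.out.println(survivorAlignmentValid(64, 8));  // true
        System.out.println(survivorAlignmentValid(512, 8)); // false: above the range
        System.out.println(survivorAlignmentValid(24, 8));  // false: not a power of two
    }
}
```

With both checks in place, a value like 512 is rejected by the range before the constraint function ever sees it — which is the behavior the failing TestOptionsWithRanges cases exercise.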
>> >> David (signing off for the night) >> >> On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: >>> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 >>> >>> Looks good and trivial. >>> > From matthias.baesken at sap.com Wed Jul 31 14:01:19 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 31 Jul 2019 14:01:19 +0000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com> Message-ID: Hi upload works again, now with webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/ Best regards, Matthias > -----Original Message----- > From: Baesken, Matthias > Sent: Mittwoch, 31. Juli 2019 14:05 > To: 'David Holmes' ; Jean Christophe Beyler > > Cc: hotspot-dev at openjdk.java.net; serviceability-dev dev at openjdk.java.net> > Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast > Linux machines with Total safepoint time 0 ms > > Hello, here is a version following the latest proposal of JC . > > Unfortunately attached as patch, sorry for that - the uploads / pushes > currently do not work from here . > > Best regards, Matthias > > > > -----Original Message----- > > From: David Holmes > > Sent: Mittwoch, 31. 
Juli 2019 05:04 > > To: Jean Christophe Beyler > > Cc: Baesken, Matthias ; hotspot- > > dev at openjdk.java.net; serviceability-dev > dev at openjdk.java.net> > > Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on > fast > > Linux machines with Total safepoint time 0 ms > > > > On 31/07/2019 9:08 am, Jean Christophe Beyler wrote: > > > FWIW, I would have done something like what David was suggesting, just > > > slightly tweaked: > > > > > > public static long executeThreadDumps() { > > > ?long value; > > > ?long initial_value = mbean.getTotalSafepointTime(); > > > ?do { > > > ? ? ?Thread.getAllStackTraces(); > > > ? ? ?value = mbean.getTotalSafepointTime(); > > > ?} while (value == initial_value); > > > ?return value; > > > } > > > > > > This ensures that the value is a new value as opposed to the current > > > value and if something goes wrong, as David said, it will timeout; which > > > is ok. > > > > Works for me. > > > > > But I come back to not really understanding why we are doing this at > > > this point of relaxing (just get a new value of safepoint time). > > > Because, if we accept timeouts now as a failure here, then really the > > > whole test becomes: > > > > > > executeThreadDumps(); > > > executeThreadDumps(); > > > > > > Since?the first call will return when value > 0 and the second call will > > > return when value2 > value (I still wonder why we want to ensure it > > > works twice...). > > > > The test is trying to sanity check that we are actually recording the > > time used by safepoints. So first check is that we can get a non-zero > > value; second check is we get a greater non-zero value. It's just a > > sanity test to try and catch if something gets unexpectedly broken in > > the time tracking code. > > > > > So both failures and even testing for it is kind of redundant, once you > > > have a do/while until a change? 
> > > > Yes - the problem with the tests that try to check internal VM behaviour > > is that we have no specified way to do something, in this case execute > > safepoints, that relates to internal VM behaviour, so we have to do > > something we know will currently work even if not specified to do so - > > e.g. dumping all thread stacks uses a global safepoint. The second > > problem is that the timer granularity is so coarse that we then have to > > guess how many times we need to do that something before seeing a > > change. To make the test robust we can keep doing stuff until we see a > > change and so the only way that will fail is if the overall timeout of > > the test kicks in. Or we can try and second guess how long it should > > take by introducing our own internal timeout - either directly or by > > limiting the number of loops in this case. That has its own problems and > > in general we have tried to reduce internal test timeouts (by removing > > them) and let overall timeouts take charge. > > > > No ideal solution. And this has already consumed way too much of > > everyone's time. > > > > Cheers, > > David > > > > > Thanks, > > > Jc > > > > > > > > > On Tue, Jul 30, 2019 at 2:35 PM David Holmes > > > wrote: > > > > > > On 30/07/2019 10:39 pm, Baesken, Matthias wrote: > > > > Hi David,? ?"put that whole code (the while loop) in a helper > > > method."? ?was JC's idea,? and I like the idea . > > > > > > Regardless I think the way you are using NUM_THREAD_DUMPS is > really > > > confusing. As an all-caps static you'd expect it to be a constant. > > > > > > Thanks, > > > David > > > > > > > Let's see what others think . > > > > > > > >> > > > >> Overall tests like this are not very useful, yet very fragile. > > > >> > > > > > > > > I am also? fine with putting the test on the exclude list. > > > > > > > > Best regards, Matthias > > > > > > > > > > > >> -----Original Message----- > > > >> From: David Holmes > > > > > > >> Sent: Dienstag, 30. 
Juli 2019 14:12 > > > >> To: Baesken, Matthias > > >; Jean Christophe > > > >> Beyler > > > > >> Cc: hotspot-dev at openjdk.java.net > > > ; serviceability-dev > > > > > >> dev at openjdk.java.net > > > > >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java > > > fails on fast > > > >> Linux machines with Total safepoint time 0 ms > > > >> > > > >> Hi Matthias, > > > >> > > > >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > > > >>> Hello? JC / David,?? here is a second webrev? : > > > >>> > > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > > > >>> > > > >>> It moves?? the? thread dump execution into a? method > > > >>> executeThreadDumps(long)?? ??, and also adds? while loops > > > (but with a > > > >>> limitation? for the number of thread dumps, really don?t > > > >>> want to cause timeouts etc.).??? I removed a check for > > > >>> MAX_VALUE_FOR_PASS?? because we cannot go over > > Long.MAX_VALUE . > > > >> > > > >> I don't think executeThreadDumps is worth factoring out like out. > > > >> > > > >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather > it > > > >> remains a constant 100, and then you set a simple loop iteration > > > count > > > >> limit. Further with the proposed code when you get here: > > > >> > > > >>? ? 85? ? ? ? ?NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > > > >> > > > >> you don't even know what value you may be starting with. > > > >> > > > >> But I was thinking of simply: > > > >> > > > >> long value = 0; > > > >> do { > > > >>? ? ? ?Thread.getAllStackTraces(); > > > >>? ? ? ?value = mbean.getTotalSafepointTime(); > > > >> } while (value == 0); > > > >> > > > >> We'd only hit a timeout if something is completely broken - > > > which is fine. > > > >> > > > >> Overall tests like this are not very useful, yet very fragile. > > > >> > > > >> Thanks, > > > >> David > > > >> > > > >>> Hope you like this version ?better. 
> > > >>> > > > >>> Best regards, Matthias > > > >>> > > > >>> *From:*Jean Christophe Beyler > > > > > > >>> *Sent:* Dienstag, 30. Juli 2019 05:39 > > > >>> *To:* David Holmes > > > > > > >>> *Cc:* Baesken, Matthias > > >; > > > >>> hotspot-dev at openjdk.java.net > > > ; serviceability-dev > > > >>> > > > > > > >>> *Subject:* Re: RFR: [XS] 8228658: test > > > GetTotalSafepointTime.java fails > > > >>> on fast Linux machines with Total safepoint time 0 ms > > > >>> > > > >>> Hi Matthias, > > > >>> > > > >>> I wonder if you should not do what David is suggesting and then > > > put that > > > >>> whole code (the while loop) in a helper method. Below you have a > > > >>> calculation again using value2 (which I wonder what the added > > > value of > > > >>> it is though) but anyway, that value2 could also be 0 at some > > > point, no? > > > >>> > > > >>> So would it not be best to just refactor the getAllStackTraces and > > > >>> calculate safepoint time in a helper method for both value / value2 > > > >>> variables? > > > >>> > > > >>> Thanks, > > > >>> > > > >>> Jc > > > >>> > > > >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes > > > > > > >>> > > >> wrote: > > > >>> > > > >>>? ? ? Hi Matthias, > > > >>> > > > >>>? ? ? On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > > >>>? ? ? ?> Hello , please review this small test fix . > > > >>>? ? ? ?> > > > >>>? ? ? ?> The test > > > >>> > > > >> > > > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. > > > >> java > > > >>>? ? ? fails sometimes on fast Linux machines with this error > > > message : > > > >>>? ? ? ?> > > > >>>? ? ? ?> java.lang.RuntimeException: Total safepoint time > > > illegal value: 0 > > > >>>? ? ? ms (MIN = 1; MAX = 9223372036854775807) > > > >>>? ? ? ?> > > > >>>? ? ? ?> looks like the total safepoint time is too low > > > currently on these > > > >>>? ? ? machines, it is < 1 ms. > > > >>>? ? ? ?> > > > >>>? ? ? ?> There might be several ways to handle this : > > > >>>? ? ? 
?> > > > >>>? ? ? ?>? ? *? ?Change the test? in a way that it might generate > > > nigher > > > >>>? ? ? safepoint times > > > >>>? ? ? ?>? ? *? ?Allow? safepoint time? == 0 ms > > > >>>? ? ? ?>? ? *? ?Offer an additional interface that gives > > > safepoint times > > > >>>? ? ? with finer granularity ( currently the HS has safepoint > > > time values > > > >>>? ? ? in ns , see? jdk/src/hotspot/share/runtime/safepoint.cpp > > > >>>? ? ? ??SafepointTracing::end > > > >>>? ? ? ?> > > > >>>? ? ? ?> But it is converted on ms in this code > > > >>>? ? ? ?> > > > >>>? ? ? ?> 114jlong RuntimeService::safepoint_time_ms() { > > > >>>? ? ? ?> 115? return UsePerfData ? > > > >>>? ? ? ?> 116 > > > >>> > > > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > > > >>>? ? ? ?> 117} > > > >>>? ? ? ?> > > > >>>? ? ? ?> 064jlong Management::ticks_to_ms(jlong ticks) { > > > >>>? ? ? ?> 2065? assert(os::elapsed_frequency() > 0, "Must be > > > non-zero"); > > > >>>? ? ? ?> 2066? return (jlong)(((double)ticks / > > > >>>? ? ? (double)os::elapsed_frequency()) > > > >>>? ? ? ?> 2067? ? ? ? ? ? ? ? ?* (double)1000.0); > > > >>>? ? ? ?> 2068} > > > >>>? ? ? ?> > > > >>>? ? ? ?> > > > >>>? ? ? ?> > > > >>>? ? ? ?> Currently I go for? the first attempt (and try to generate > > > >>>? ? ? higher safepoint times in my patch) . > > > >>> > > > >>>? ? ? Yes that's probably best. Coarse-grained timing on very > > > fast machines > > > >>>? ? ? was bound to eventually lead to problems. > > > >>> > > > >>>? ? ? But perhaps a more future-proof approach is to just add a > > > do-while loop > > > >>>? ? ? around the stack dumps and only exit when we have a non-zero > > > >> safepoint > > > >>>? ? ? time? > > > >>> > > > >>>? ? ? Thanks, > > > >>>? ? ? David > > > >>>? ? ? ----- > > > >>> > > > >>>? ? ? ?> Bug/webrev : > > > >>>? ? ? ?> > > > >>>? ? ? ?> https://bugs.openjdk.java.net/browse/JDK-8228658 > > > >>>? ? ? ?> > > > >>>? ? ? 
?> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > >>>? ? ? ?> > > > >>>? ? ? ?> > > > >>>? ? ? ?> Thanks, Matthias > > > >>>? ? ? ?> > > > >>> > > > >>> > > > >>> -- > > > >>> > > > >>> Thanks, > > > >>> > > > >>> Jc > > > >>> > > > > > > > > > > > > -- > > > > > > Thanks, > > > Jc From coleen.phillimore at oracle.com Wed Jul 31 14:17:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 10:17:48 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> Message-ID: <27cd5357-cd90-ee41-212f-05cbe4f9ee1f@oracle.com> Thanks Aleksey! Coleen On 7/31/19 9:04 AM, Aleksey Shipilev wrote: > On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 > Looks good and trivial. > From thomas.stuefe at gmail.com Wed Jul 31 15:56:28 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 31 Jul 2019 17:56:28 +0200 Subject: [11u] RFR: 8227041: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: Hi Christoph, Assuming it builds and the tests run through, I am fine with this change. Cheers, Thomas On Tue, Jul 23, 2019 at 5:30 PM Langer, Christoph wrote: > Hi, > > please review backport of this test fix. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8227041 > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8227041.11u-dev.0/ > > The test is a real resource drain and causes OOMs - while not really > testing something useful. It was already removed in jdk/jdk - so requesting > to remove it from JDK11u as well (to fix sporadic test failures). 
> > In jdk11 the relevant source files look a bit different, so I had to > modify the original changeset a bit. > > Thanks > Christoph > > From shade at redhat.com Wed Jul 31 16:16:39 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 31 Jul 2019 18:16:39 +0200 Subject: [11u] RFR: 8227041: runtime/memory/RunUnitTestsConcurrently.java has a memory leak In-Reply-To: References: Message-ID: <30b6f4f4-1241-ef77-68a1-f4f5ec842179@redhat.com> On 7/23/19 5:29 PM, Langer, Christoph wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8227041 > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8227041.11u-dev.0/ That looks good. -- Thanks, -Aleksey From jcbeyler at google.com Wed Jul 31 19:07:36 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Wed, 31 Jul 2019 12:07:36 -0700 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com> Message-ID: Hi Matthias, Looks good to me :) Jc On Wed, Jul 31, 2019 at 7:01 AM Baesken, Matthias wrote: > > Hi upload works again, now with webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/ > > Best regards, Matthias > > > > -----Original Message----- > > From: Baesken, Matthias > > Sent: Mittwoch, 31. Juli 2019 14:05 > > To: 'David Holmes' ; Jean Christophe Beyler > > > > Cc: hotspot-dev at openjdk.java.net; serviceability-dev > dev at openjdk.java.net> > > Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on > fast > > Linux machines with Total safepoint time 0 ms > > > > Hello, here is a version following the latest proposal of JC . > > > > Unfortunately attached as patch, sorry for that - the uploads / pushes > > currently do not work from here . 
> > > > Best regards, Matthias > > > > > > > -----Original Message----- > > > From: David Holmes > > > Sent: Mittwoch, 31. Juli 2019 05:04 > > > To: Jean Christophe Beyler > > > Cc: Baesken, Matthias ; hotspot- > > > dev at openjdk.java.net; serviceability-dev > > dev at openjdk.java.net> > > > Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails > on > > fast > > > Linux machines with Total safepoint time 0 ms > > > > > > On 31/07/2019 9:08 am, Jean Christophe Beyler wrote: > > > > FWIW, I would have done something like what David was suggesting, > just > > > > slightly tweaked: > > > > > > > > public static long executeThreadDumps() { > > > > long value; > > > > long initial_value = mbean.getTotalSafepointTime(); > > > > do { > > > > Thread.getAllStackTraces(); > > > > value = mbean.getTotalSafepointTime(); > > > > } while (value == initial_value); > > > > return value; > > > > } > > > > > > > > This ensures that the value is a new value as opposed to the current > > > > value and if something goes wrong, as David said, it will timeout; > which > > > > is ok. > > > > > > Works for me. > > > > > > > But I come back to not really understanding why we are doing this at > > > > this point of relaxing (just get a new value of safepoint time). > > > > Because, if we accept timeouts now as a failure here, then really the > > > > whole test becomes: > > > > > > > > executeThreadDumps(); > > > > executeThreadDumps(); > > > > > > > > Since the first call will return when value > 0 and the second call > will > > > > return when value2 > value (I still wonder why we want to ensure it > > > > works twice...). > > > > > > The test is trying to sanity check that we are actually recording the > > > time used by safepoints. So first check is that we can get a non-zero > > > value; second check is we get a greater non-zero value. It's just a > > > sanity test to try and catch if something gets unexpectedly broken in > > > the time tracking code. 
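JC's proposed loop can be expressed as a small reusable helper. This sketch swaps the MBean call for a LongSupplier so it runs standalone; in the real test the arguments would be mbean::getTotalSafepointTime and Thread::getAllStackTraces (that substitution is an assumption of this sketch, not part of the posted webrev):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class RetryUntilChange {
    // Trigger work repeatedly until the sampled reading moves past its
    // initial value, then return the new reading.
    static long waitForChange(LongSupplier sample, Runnable trigger) {
        long initial = sample.getAsLong();
        long value;
        do {
            trigger.run();              // real test: Thread.getAllStackTraces()
            value = sample.getAsLong(); // real test: mbean.getTotalSafepointTime()
        } while (value == initial);
        return value;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong();
        AtomicLong calls = new AtomicLong();
        // Stand-in workload: only every third trigger advances the counter,
        // mimicking a reading that takes several attempts to change.
        long result = waitForChange(counter::get, () -> {
            if (calls.incrementAndGet() % 3 == 0) {
                counter.incrementAndGet();
            }
        });
        System.out.println(result); // prints 1 after three triggers
    }
}
```

As David notes, the only failure mode left with this shape is the harness timeout, which fires only if safepoint time accounting is completely broken.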
> > > > > > > So both failures and even testing for it is kind of redundant, once > you > > > > have a do/while until a change? > > > > > > Yes - the problem with the tests that try to check internal VM > behaviour > > > is that we have no specified way to do something, in this case execute > > > safepoints, that relates to internal VM behaviour, so we have to do > > > something we know will currently work even if not specified to do so - > > > e.g. dumping all thread stacks uses a global safepoint. The second > > > problem is that the timer granularity is so coarse that we then have to > > > guess how many times we need to do that something before seeing a > > > change. To make the test robust we can keep doing stuff until we see a > > > change and so the only way that will fail is if the overall timeout of > > > the test kicks in. Or we can try and second guess how long it should > > > take by introducing our own internal timeout - either directly or by > > > limiting the number of loops in this case. That has its own problems > and > > > in general we have tried to reduce internal test timeouts (by removing > > > them) and let overall timeouts take charge. > > > > > > No ideal solution. And this has already consumed way too much of > > > everyone's time. > > > > > > Cheers, > > > David > > > > > > > Thanks, > > > > Jc > > > > > > > > > > > > On Tue, Jul 30, 2019 at 2:35 PM David Holmes < > david.holmes at oracle.com > > > > > wrote: > > > > > > > > On 30/07/2019 10:39 pm, Baesken, Matthias wrote: > > > > > Hi David, "put that whole code (the while loop) in a helper > > > > method." was JC's idea, and I like the idea . > > > > > > > > Regardless I think the way you are using NUM_THREAD_DUMPS is > > really > > > > confusing. As an all-caps static you'd expect it to be a > constant. > > > > > > > > Thanks, > > > > David > > > > > > > > > Let's see what others think . > > > > > > > > > >> > > > > >> Overall tests like this are not very useful, yet very > fragile. 
> > > > >> > > > > > > > > > > I am also fine with putting the test on the exclude list. > > > > > > > > > > Best regards, Matthias > > > > > > > > > > > > > > >> -----Original Message----- > > > > >> From: David Holmes > > > > > > > > >> Sent: Dienstag, 30. Juli 2019 14:12 > > > > >> To: Baesken, Matthias > > > >; Jean Christophe > > > > >> Beyler > > > > > >> Cc: hotspot-dev at openjdk.java.net > > > > ; serviceability-dev > > > > > > > >> dev at openjdk.java.net > > > > > >> Subject: Re: RFR: [XS] 8228658: test > GetTotalSafepointTime.java > > > > fails on fast > > > > >> Linux machines with Total safepoint time 0 ms > > > > >> > > > > >> Hi Matthias, > > > > >> > > > > >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: > > > > >>> Hello JC / David, here is a second webrev : > > > > >>> > > > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ > > > > >>> > > > > >>> It moves the thread dump execution into a method > > > > >>> executeThreadDumps(long) , and also adds while loops > > > > (but with a > > > > >>> limitation for the number of thread dumps, really don?t > > > > >>> want to cause timeouts etc.). I removed a check for > > > > >>> MAX_VALUE_FOR_PASS because we cannot go over > > > Long.MAX_VALUE . > > > > >> > > > > >> I don't think executeThreadDumps is worth factoring out like > out. > > > > >> > > > > >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd > rather > > it > > > > >> remains a constant 100, and then you set a simple loop > iteration > > > > count > > > > >> limit. Further with the proposed code when you get here: > > > > >> > > > > >> 85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; > > > > >> > > > > >> you don't even know what value you may be starting with. 
> > > > >> > > > > >> But I was thinking of simply: > > > > >> > > > > >> long value = 0; > > > > >> do { > > > > >> Thread.getAllStackTraces(); > > > > >> value = mbean.getTotalSafepointTime(); > > > > >> } while (value == 0); > > > > >> > > > > >> We'd only hit a timeout if something is completely broken - > > > > which is fine. > > > > >> > > > > >> Overall tests like this are not very useful, yet very > fragile. > > > > >> > > > > >> Thanks, > > > > >> David > > > > >> > > > > >>> Hope you like this version better. > > > > >>> > > > > >>> Best regards, Matthias > > > > >>> > > > > >>> *From:*Jean Christophe Beyler > > > > > > > > >>> *Sent:* Dienstag, 30. Juli 2019 05:39 > > > > >>> *To:* David Holmes > > > > > > > > >>> *Cc:* Baesken, Matthias > > > >; > > > > >>> hotspot-dev at openjdk.java.net > > > > ; serviceability-dev > > > > >>> > > > > > > > > >>> *Subject:* Re: RFR: [XS] 8228658: test > > > > GetTotalSafepointTime.java fails > > > > >>> on fast Linux machines with Total safepoint time 0 ms > > > > >>> > > > > >>> Hi Matthias, > > > > >>> > > > > >>> I wonder if you should not do what David is suggesting and > then > > > > put that > > > > >>> whole code (the while loop) in a helper method. Below you > have a > > > > >>> calculation again using value2 (which I wonder what the > added > > > > value of > > > > >>> it is though) but anyway, that value2 could also be 0 at > some > > > > point, no? > > > > >>> > > > > >>> So would it not be best to just refactor the > getAllStackTraces and > > > > >>> calculate safepoint time in a helper method for both value > / value2 > > > > >>> variables? > > > > >>> > > > > >>> Thanks, > > > > >>> > > > > >>> Jc > > > > >>> > > > > >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes > > > > > > > > >>> > > > >> wrote: > > > > >>> > > > > >>> Hi Matthias, > > > > >>> > > > > >>> On 29/07/2019 8:20 pm, Baesken, Matthias wrote: > > > > >>> > Hello , please review this small test fix . 
> > > > >>> > > > > > >>> > The test > > > > >>> > > > > >> > > > > > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. > > > > >> java > > > > >>> fails sometimes on fast Linux machines with this error > > > > message : > > > > >>> > > > > > >>> > java.lang.RuntimeException: Total safepoint time > > > > illegal value: 0 > > > > >>> ms (MIN = 1; MAX = 9223372036854775807) > > > > >>> > > > > > >>> > looks like the total safepoint time is too low > > > > currently on these > > > > >>> machines, it is < 1 ms. > > > > >>> > > > > > >>> > There might be several ways to handle this : > > > > >>> > > > > > >>> > * Change the test in a way that it might > generate > > > > nigher > > > > >>> safepoint times > > > > >>> > * Allow safepoint time == 0 ms > > > > >>> > * Offer an additional interface that gives > > > > safepoint times > > > > >>> with finer granularity ( currently the HS has safepoint > > > > time values > > > > >>> in ns , see > jdk/src/hotspot/share/runtime/safepoint.cpp > > > > >>> SafepointTracing::end > > > > >>> > > > > > >>> > But it is converted on ms in this code > > > > >>> > > > > > >>> > 114jlong RuntimeService::safepoint_time_ms() { > > > > >>> > 115 return UsePerfData ? > > > > >>> > 116 > > > > >>> > > > > Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; > > > > >>> > 117} > > > > >>> > > > > > >>> > 064jlong Management::ticks_to_ms(jlong ticks) { > > > > >>> > 2065 assert(os::elapsed_frequency() > 0, "Must be > > > > non-zero"); > > > > >>> > 2066 return (jlong)(((double)ticks / > > > > >>> (double)os::elapsed_frequency()) > > > > >>> > 2067 * (double)1000.0); > > > > >>> > 2068} > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > Currently I go for the first attempt (and try to > generate > > > > >>> higher safepoint times in my patch) . > > > > >>> > > > > >>> Yes that's probably best. Coarse-grained timing on very > > > > fast machines > > > > >>> was bound to eventually lead to problems. 
> > > > >>> > > > > >>> But perhaps a more future-proof approach is to just > add a > > > > do-while loop > > > > >>> around the stack dumps and only exit when we have a > non-zero > > > > >> safepoint > > > > >>> time? > > > > >>> > > > > >>> Thanks, > > > > >>> David > > > > >>> ----- > > > > >>> > > > > >>> > Bug/webrev : > > > > >>> > > > > > >>> > https://bugs.openjdk.java.net/browse/JDK-8228658 > > > > >>> > > > > > >>> > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ > > > > >>> > > > > > >>> > > > > > >>> > Thanks, Matthias > > > > >>> > > > > > >>> > > > > >>> > > > > >>> -- > > > > >>> > > > > >>> Thanks, > > > > >>> > > > > >>> Jc > > > > >>> > > > > > > > > > > > > > > > > -- > > > > > > > > Thanks, > > > > Jc > -- Thanks, Jc From coleen.phillimore at oracle.com Wed Jul 31 20:01:31 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 16:01:31 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 Message-ID: Summary: Use new SurvivorAlignmentInBytes range in tests, remove test cases that verify unnecessarily large values. Tested locally with: make test TEST=gc/arguments Will wait for hs-tier1-3 to finish before pushing after reviewed. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8228907 Thanks, Coleen From kim.barrett at oracle.com Wed Jul 31 20:35:21 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 31 Jul 2019 16:35:21 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: References: Message-ID: <8B50F889-6859-4F4C-9671-CBE365BDC247@oracle.com> > On Jul 31, 2019, at 4:01 PM, coleen.phillimore at oracle.com wrote: > > Summary: Use new SurvivorAlignmentInBytes range in tests, remove test cases that verify unnecessarily large values. 
> > Tested locally with: > > make test TEST=gc/arguments > > Will wait for hs-tier1-3 to finish before pushing after reviewed. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8228907 > > Thanks, > Coleen Looks good. From daniel.daugherty at oracle.com Wed Jul 31 20:41:34 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 31 Jul 2019 16:41:34 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: References: Message-ID: On 7/31/19 4:01 PM, coleen.phillimore at oracle.com wrote: > Summary: Use new SurvivorAlignmentInBytes range in tests, remove test > cases that verify unnecessarily large values. > > Tested locally with: > > make test TEST=gc/arguments > > Will wait for hs-tier1-3 to finish before pushing after reviewed. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev test/hotspot/jtreg/gc/arguments/TestSurvivorAlignmentInBytesOption.java No comments. test/hotspot/jtreg/gc/survivorAlignment/TestPromotionLABLargeSurvivorAlignment.java Your comment in the bug says this: > One test is testing 1k, 16k, etc, which seems extreme. but you also took out this test case: > -XX:SurvivorAlignmentInBytes=512 so your largest test case is now: > -XX:SurvivorAlignmentInBytes=256 You might want to update the bug report to match this fix. Thumbs up (modulo hs-tier[1-3] testing). Dan > bug link https://bugs.openjdk.java.net/browse/JDK-8228907 > > Thanks, > Coleen From coleen.phillimore at oracle.com Wed Jul 31 21:19:13 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 17:19:13 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: References: Message-ID: <19473db1-df42-77ca-c1fb-6a61ec41b6f4@oracle.com> Thanks Kim and Dan! On 7/31/19 4:41 PM, Daniel D.
Daugherty wrote: > On 7/31/19 4:01 PM, coleen.phillimore at oracle.com wrote: >> Summary: Use new SurvivorAlignmentInBytes range in tests, remove test >> cases that verify unnecessarily large values. >> >> Tested locally with: >> >> make test TEST=gc/arguments >> >> Will wait for hs-tier1-3 to finish before pushing after reviewed. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev > > test/hotspot/jtreg/gc/arguments/TestSurvivorAlignmentInBytesOption.java > No comments. > > test/hotspot/jtreg/gc/survivorAlignment/TestPromotionLABLargeSurvivorAlignment.java > > Your comment in the bug says this: > > > One test is testing 1k, 16k, etc, which seems extreme. > > but you also took out this test case: > > > -XX:SurvivorAlignmentInBytes=512 > > so your largest test case is now: > > > -XX:SurvivorAlignmentInBytes=256 > > You might want to update the bug report to match this fix. I fixed the bug comment to say this. 256 is max range now. I also ran make test TEST=open/test/hotspot/jtreg/gc/survivorAlignment locally. > > Thumbs up (modulo hs-tier[1-3] testing). > Mach5 hs-tier[1-3] finished and passed. Thanks!
Coleen > Dan > > >> bug link https://bugs.openjdk.java.net/browse/JDK-8228907 >> >> Thanks, >> Coleen From david.holmes at oracle.com Wed Jul 31 21:30:44 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 07:30:44 +1000 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <84006e85-90a7-8c42-05d6-a26000d6d2e5@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> <84006e85-90a7-8c42-05d6-a26000d6d2e5@oracle.com> Message-ID: On 31/07/2019 11:39 pm, coleen.phillimore at oracle.com wrote: > On 7/31/19 9:25 AM, coleen.phillimore at oracle.com wrote: >> On 7/31/19 9:12 AM, David Holmes wrote: >>> I haven't seen Coleen's original mail turn up yet, so I'll respond here. > > I haven't gotten the email yet either. >>> >>> Shouldn't the range be handled by the constraint function: >> >> It is not handled that way in ObjAlignmentInBytes. > > What I meant is that ObjectAlignmentInBytes has the constraint function > AND the range. SurvivorAlignmentInBytes should be the same. The > constraint function tests that it's > ObjectAlignmentInBytes. > > 158 lp64_product(intx, ObjectAlignmentInBytes, > 8, \ > 159 "Default object alignment in bytes, 8 is > minimum") \ > 160 range(8, > 256) \ > 161 > constraint(ObjectAlignmentInBytesConstraintFunc,AtParse) \ Okay, so specifying the range is reasonable and I guess specifying the same range as ObjectAlignmentInBytes is also reasonable. Note however that the default value for SurvivorAlignmentInBytes is 0 which is outside of the specified range, so I'm not sure if that makes sense.
AFAICS the constraint function only applies if the flag is explicitly set so that default value is ignored. I don't know if that is the case for the range check ?? But given the dependency between these two flags the test won't be able to adjust SurvivorAlignmentInBytes independently of ObjectAlignmentInBytes. David ----- > > Coleen >> >> Coleen >>> >>> SurvivorAlignmentInBytesConstraintFunc >>> >>> ? >>> >>> David (signing off for the night) >>> >>> On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: >>>> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 >>>> >>>> Looks good and trivial. >>>> >> > From david.holmes at oracle.com Wed Jul 31 21:41:48 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 07:41:48 +1000 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: References: Message-ID: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> On 1/08/2019 6:01 am, coleen.phillimore at oracle.com wrote: > Summary: Use new SurvivorAlignmentInBytes range in tests, remove test > cases that verify unnecessarily large values. As long as the GC team agree they are unnecessarily large the changes seem fine. Thanks, David > Tested locally with: > > make test TEST=gc/arguments > > Will wait for hs-tier1-3 to finish before pushing after reviewed.
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8228907 > > Thanks, > Coleen From coleen.phillimore at oracle.com Wed Jul 31 21:51:06 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 17:51:06 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> References: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> Message-ID: <2268a432-7aa8-6b83-4486-8ecc834ae86c@oracle.com> On 7/31/19 5:41 PM, David Holmes wrote: > On 1/08/2019 6:01 am, coleen.phillimore at oracle.com wrote: >> Summary: Use new SurvivorAlignmentInBytes range in tests, remove test >> cases that verify unnecessarily large values. > > As long as the GC team agree they are unnecessarily large the changes > seem fine. Yes, they do.? I linked related bugs.? The large sizes were only used in stress testing, not by customers. Thanks, Coleen > > Thanks, > David > >> Tested locally with: >> >> make test TEST=gc/arguments >> >> Will wait for hs-tier1-3 to finish before pushing after reviewed. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8228907 >> >> Thanks, >> Coleen From kim.barrett at oracle.com Wed Jul 31 21:56:40 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 31 Jul 2019 17:56:40 -0400 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> References: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> Message-ID: <922513B3-3DB2-410D-B282-9FDBEC7E1492@oracle.com> > On Jul 31, 2019, at 5:41 PM, David Holmes wrote: > > On 1/08/2019 6:01 am, coleen.phillimore at oracle.com wrote: >> Summary: Use new SurvivorAlignmentInBytes range in tests, remove test cases that verify unnecessarily large values. 
> > As long as the GC team agree they are unnecessarily large the changes seem fine. If we're not going to allow values that large (per JDK-8228855), then trying to test them here is pointless. We do end up with some intertwined dependencies though, and I'm not seeing a good way to deal with that. The range that makes sense to test here is limited by the valid range for SurvivorAlignmentInBytes. Maybe have the test take a special "MAX_VALUE" token and ask the VM for the max value? Do we have that capability at all right now (perhaps via WhiteBox)? I don't recall seeing anything like that, but might be forgetting. Even if we want to do something like that, I think it should be done as a followup. For now, getting this failure out of testing seems more urgent than improving the test. > > Thanks, > David > >> Tested locally with: >> make test TEST=gc/arguments >> Will wait for hs-tier1-3 to finish before pushing after reviewed. >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8228907 >> Thanks, >> Coleen From david.holmes at oracle.com Wed Jul 31 21:57:07 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 07:57:07 +1000 Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms In-Reply-To: References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com> <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com> <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com> Message-ID: <9868e92d-398b-be7c-5d45-020c19a61052@oracle.com> On 1/08/2019 12:01 am, Baesken, Matthias wrote: > > Hi upload works again, now with webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/ Could you please add, for diagnostic purposes: System.out.println("Total safepoint time (ms): " + value); after: 60 long value = executeThreadDumps(); and 68 long value2 = executeThreadDumps(); that way if the test fails we can check logs to see
what kind of safepoint times have been observed previously. No need to see an updated webrev just for that. I have one further suggestion, take it or leave it, that executeThreadDumps() takes a parameter to specify the initial value, so we'd have: 60 long value = executeThreadDumps(0); and 68 long value2 = executeThreadDumps(value); This might help detect getTotalSafepointTime() going backwards slightly better than current code. Thanks, David > Best regards, Matthias > > >> -----Original Message----- >> From: Baesken, Matthias >> Sent: Mittwoch, 31. Juli 2019 14:05 >> To: 'David Holmes' ; Jean Christophe Beyler >> >> Cc: hotspot-dev at openjdk.java.net; serviceability-dev > dev at openjdk.java.net> >> Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast >> Linux machines with Total safepoint time 0 ms >> >> Hello, here is a version following the latest proposal of JC . >> >> Unfortunately attached as patch, sorry for that - the uploads / pushes >> currently do not work from here . >> >> Best regards, Matthias >> >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Mittwoch, 31. Juli 2019 05:04 >>> To: Jean Christophe Beyler >>> Cc: Baesken, Matthias ; hotspot- >>> dev at openjdk.java.net; serviceability-dev >> dev at openjdk.java.net> >>> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on >> fast >>> Linux machines with Total safepoint time 0 ms >>> >>> On 31/07/2019 9:08 am, Jean Christophe Beyler wrote: >>>> FWIW, I would have done something like what David was suggesting, just >>>> slightly tweaked: >>>> >>>> public static long executeThreadDumps() { >>>> ?long value; >>>> ?long initial_value = mbean.getTotalSafepointTime(); >>>> ?do { >>>> ? ? ?Thread.getAllStackTraces(); >>>> ? ? 
value = mbean.getTotalSafepointTime(); >>>> } while (value == initial_value); >>>> return value; >>>> } >>>> >>>> This ensures that the value is a new value as opposed to the current >>>> value and if something goes wrong, as David said, it will timeout; which >>>> is ok. >>> >>> Works for me. >>> >>>> But I come back to not really understanding why we are doing this at >>>> this point of relaxing (just get a new value of safepoint time). >>>> Because, if we accept timeouts now as a failure here, then really the >>>> whole test becomes: >>>> >>>> executeThreadDumps(); >>>> executeThreadDumps(); >>>> >>>> Since the first call will return when value > 0 and the second call will >>>> return when value2 > value (I still wonder why we want to ensure it >>>> works twice...). >>> >>> The test is trying to sanity check that we are actually recording the >>> time used by safepoints. So first check is that we can get a non-zero >>> value; second check is we get a greater non-zero value. It's just a >>> sanity test to try and catch if something gets unexpectedly broken in >>> the time tracking code. >>> >>>> So both failures and even testing for it is kind of redundant, once you >>>> have a do/while until a change? >>> >>> Yes - the problem with the tests that try to check internal VM behaviour >>> is that we have no specified way to do something, in this case execute >>> safepoints, that relates to internal VM behaviour, so we have to do >>> something we know will currently work even if not specified to do so - >>> e.g. dumping all thread stacks uses a global safepoint. The second >>> problem is that the timer granularity is so coarse that we then have to >>> guess how many times we need to do that something before seeing a >>> change. To make the test robust we can keep doing stuff until we see a >>> change and so the only way that will fail is if the overall timeout of >>> the test kicks in.
Or we can try and second guess how long it should >>> take by introducing our own internal timeout - either directly or by >>> limiting the number of loops in this case. That has its own problems and >>> in general we have tried to reduce internal test timeouts (by removing >>> them) and let overall timeouts take charge. >>> >>> No ideal solution. And this has already consumed way too much of >>> everyone's time. >>> >>> Cheers, >>> David >>> >>>> Thanks, >>>> Jc >>>> >>>> >>>> On Tue, Jul 30, 2019 at 2:35 PM David Holmes >>> > wrote: >>>> >>>> On 30/07/2019 10:39 pm, Baesken, Matthias wrote: >>>> > Hi David,? ?"put that whole code (the while loop) in a helper >>>> method."? ?was JC's idea,? and I like the idea . >>>> >>>> Regardless I think the way you are using NUM_THREAD_DUMPS is >> really >>>> confusing. As an all-caps static you'd expect it to be a constant. >>>> >>>> Thanks, >>>> David >>>> >>>> > Let's see what others think . >>>> > >>>> >> >>>> >> Overall tests like this are not very useful, yet very fragile. >>>> >> >>>> > >>>> > I am also? fine with putting the test on the exclude list. >>>> > >>>> > Best regards, Matthias >>>> > >>>> > >>>> >> -----Original Message----- >>>> >> From: David Holmes >>> > >>>> >> Sent: Dienstag, 30. Juli 2019 14:12 >>>> >> To: Baesken, Matthias >>> >; Jean Christophe >>>> >> Beyler > >>>> >> Cc: hotspot-dev at openjdk.java.net >>>> ; serviceability-dev >>>> >>> >> dev at openjdk.java.net > >>>> >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java >>>> fails on fast >>>> >> Linux machines with Total safepoint time 0 ms >>>> >> >>>> >> Hi Matthias, >>>> >> >>>> >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: >>>> >>> Hello? JC / David,?? here is a second webrev? : >>>> >>> >>>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ >>>> >>> >>>> >>> It moves?? the? thread dump execution into a? method >>>> >>> executeThreadDumps(long)?? ??, and also adds? while loops >>>> (but with a >>>> >>> limitation? 
for the number of thread dumps, really don?t >>>> >>> want to cause timeouts etc.).??? I removed a check for >>>> >>> MAX_VALUE_FOR_PASS?? because we cannot go over >>> Long.MAX_VALUE . >>>> >> >>>> >> I don't think executeThreadDumps is worth factoring out like out. >>>> >> >>>> >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather >> it >>>> >> remains a constant 100, and then you set a simple loop iteration >>>> count >>>> >> limit. Further with the proposed code when you get here: >>>> >> >>>> >>? ? 85? ? ? ? ?NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; >>>> >> >>>> >> you don't even know what value you may be starting with. >>>> >> >>>> >> But I was thinking of simply: >>>> >> >>>> >> long value = 0; >>>> >> do { >>>> >>? ? ? ?Thread.getAllStackTraces(); >>>> >>? ? ? ?value = mbean.getTotalSafepointTime(); >>>> >> } while (value == 0); >>>> >> >>>> >> We'd only hit a timeout if something is completely broken - >>>> which is fine. >>>> >> >>>> >> Overall tests like this are not very useful, yet very fragile. >>>> >> >>>> >> Thanks, >>>> >> David >>>> >> >>>> >>> Hope you like this version ?better. >>>> >>> >>>> >>> Best regards, Matthias >>>> >>> >>>> >>> *From:*Jean Christophe Beyler >>> > >>>> >>> *Sent:* Dienstag, 30. Juli 2019 05:39 >>>> >>> *To:* David Holmes >>> > >>>> >>> *Cc:* Baesken, Matthias >>> >; >>>> >>> hotspot-dev at openjdk.java.net >>>> ; serviceability-dev >>>> >>> >>> > >>>> >>> *Subject:* Re: RFR: [XS] 8228658: test >>>> GetTotalSafepointTime.java fails >>>> >>> on fast Linux machines with Total safepoint time 0 ms >>>> >>> >>>> >>> Hi Matthias, >>>> >>> >>>> >>> I wonder if you should not do what David is suggesting and then >>>> put that >>>> >>> whole code (the while loop) in a helper method. Below you have a >>>> >>> calculation again using value2 (which I wonder what the added >>>> value of >>>> >>> it is though) but anyway, that value2 could also be 0 at some >>>> point, no? 
>>>> >>> >>>> >>> So would it not be best to just refactor the getAllStackTraces and >>>> >>> calculate safepoint time in a helper method for both value / value2 >>>> >>> variables? >>>> >>> >>>> >>> Thanks, >>>> >>> >>>> >>> Jc >>>> >>> >>>> >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes >>>> >>>> >>> >>> >> wrote: >>>> >>> >>>> >>>? ? ? Hi Matthias, >>>> >>> >>>> >>>? ? ? On 29/07/2019 8:20 pm, Baesken, Matthias wrote: >>>> >>>? ? ? ?> Hello , please review this small test fix . >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> The test >>>> >>> >>>> >> >>> >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. >>>> >> java >>>> >>>? ? ? fails sometimes on fast Linux machines with this error >>>> message : >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> java.lang.RuntimeException: Total safepoint time >>>> illegal value: 0 >>>> >>>? ? ? ms (MIN = 1; MAX = 9223372036854775807) >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> looks like the total safepoint time is too low >>>> currently on these >>>> >>>? ? ? machines, it is < 1 ms. >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> There might be several ways to handle this : >>>> >>>? ? ? ?> >>>> >>>? ? ? ?>? ? *? ?Change the test? in a way that it might generate >>>> nigher >>>> >>>? ? ? safepoint times >>>> >>>? ? ? ?>? ? *? ?Allow? safepoint time? == 0 ms >>>> >>>? ? ? ?>? ? *? ?Offer an additional interface that gives >>>> safepoint times >>>> >>>? ? ? with finer granularity ( currently the HS has safepoint >>>> time values >>>> >>>? ? ? in ns , see? jdk/src/hotspot/share/runtime/safepoint.cpp >>>> >>>? ? ? ??SafepointTracing::end >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> But it is converted on ms in this code >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> 114jlong RuntimeService::safepoint_time_ms() { >>>> >>>? ? ? ?> 115? return UsePerfData ? >>>> >>>? ? ? ?> 116 >>>> >>> >>>> Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1; >>>> >>>? ? ? ?> 117} >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> 064jlong Management::ticks_to_ms(jlong ticks) { >>>> >>>? ? ? ?> 2065? 
assert(os::elapsed_frequency() > 0, "Must be >>>> non-zero"); >>>> >>>? ? ? ?> 2066? return (jlong)(((double)ticks / >>>> >>>? ? ? (double)os::elapsed_frequency()) >>>> >>>? ? ? ?> 2067? ? ? ? ? ? ? ? ?* (double)1000.0); >>>> >>>? ? ? ?> 2068} >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> Currently I go for? the first attempt (and try to generate >>>> >>>? ? ? higher safepoint times in my patch) . >>>> >>> >>>> >>>? ? ? Yes that's probably best. Coarse-grained timing on very >>>> fast machines >>>> >>>? ? ? was bound to eventually lead to problems. >>>> >>> >>>> >>>? ? ? But perhaps a more future-proof approach is to just add a >>>> do-while loop >>>> >>>? ? ? around the stack dumps and only exit when we have a non-zero >>>> >> safepoint >>>> >>>? ? ? time? >>>> >>> >>>> >>>? ? ? Thanks, >>>> >>>? ? ? David >>>> >>>? ? ? ----- >>>> >>> >>>> >>>? ? ? ?> Bug/webrev : >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> https://bugs.openjdk.java.net/browse/JDK-8228658 >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/ >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> >>>> >>>? ? ? ?> Thanks, Matthias >>>> >>>? ? ? 
?> >>>> >>> >>>> >>> >>>> >>> -- >>>> >>> >>>> >>> Thanks, >>>> >>> >>>> >>> Jc >>>> >>> >>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc From coleen.phillimore at oracle.com Wed Jul 31 21:57:22 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 17:57:22 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> <84006e85-90a7-8c42-05d6-a26000d6d2e5@oracle.com> Message-ID: <09a2718b-d6db-bda1-1ff8-daa38cd9c74d@oracle.com> On 7/31/19 5:30 PM, David Holmes wrote: > On 31/07/2019 11:39 pm, coleen.phillimore at oracle.com wrote: >> On 7/31/19 9:25 AM, coleen.phillimore at oracle.com wrote: >>> On 7/31/19 9:12 AM, David Holmes wrote: >>>> I haven't seen Coleen's original mail turn up yet, so I'll respond >>>> here. >> >> I haven't gotten the email yet either. >>>> >>>> Shouldn't the range be handled by the constraint function: >>> >>> It is not handled that way in ObjAlignmentInBytes. >> >> What I meant is that ObjectAlignmentInBytes has the constraint >> function AND the range.? SurvivorAlignmentInBytes should be the >> same.? The constraint function tests that it's > ObjectAlignmentInBytes. >> >> ??158?? lp64_product(intx, ObjectAlignmentInBytes, >> 8,???????????????????????????? \ >> ??159?????????? "Default object alignment in bytes, 8 is >> minimum")??????????????? \ >> ??160?????????? range(8, >> 256)???????????????????????????????????????????????????? \ >> ??161 constraint(ObjectAlignmentInBytesConstraintFunc,AtParse) \ > > Okay, so specifying the range is reasonable and I guess specifying the > same range as ObjectAlignmentInBytes is also reasonable. 
> > Note however that the default value for SurvivorAlignmentInBytes is 0 > which is outside of the specified range, so I'm not sure if that makes > sense. AFAICS the constraint function only applies if the flag is > explicitly set so that default value is ignored. I don't know if that > is the case for the range check ?? I think the range check is applied after ergonomics, and the code in arguments has: if (SurvivorAlignmentInBytes == 0) { SurvivorAlignmentInBytes = ObjectAlignmentInBytes; } Probably the default should be changed to 8, and this code removed. Or maybe not, maybe it should be: if (FLAG_IS_DEFAULT(SurvivorAlignmentInBytes)) { SurvivorAlignmentInBytes = ObjectAlignmentInBytes; } But I don't know. Maybe we should have a P5 RFE to clean this up. > > But given the dependency between these two flags the test won't be > able to adjust SurvivorAlignmentInBytes independently of > ObjectAlignmentInBytes. If both options are supplied, there's a test that S >= O. Thanks, Coleen > > David > ----- > >> >> Coleen >>> >>> Coleen >>>> >>>> SurvivorAlignmentInBytesConstraintFunc >>>> >>>> ? >>>> >>>> David (signing off for the night) >>>> >>>> On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: >>>>> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 >>>>> >>>>> Looks good and trivial.
>>>>> >>> >> From coleen.phillimore at oracle.com Wed Jul 31 22:03:00 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 31 Jul 2019 18:03:00 -0400 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com> <2a2a3aaa-8227-7455-2aa3-58b3aa3cd260@oracle.com> Message-ID: On 7/31/19 2:13 AM, David Holmes wrote: > Hi Coleen, > > I've sat on this for a few hours :) > > On 31/07/2019 8:15 am, coleen.phillimore at oracle.com wrote: >> On 7/30/19 5:11 PM, David Holmes wrote: >>> On 31/07/2019 6:59 am, Kim Barrett wrote: >>>>> On Jul 29, 2019, at 10:27 PM, David Holmes >>>>> wrote: >>>>> >>>>> Hi Kim, >>>>> >>>>> A meta-comment: "storages" is not a well formed term. Can we have >>>>> something clearer, perhaps OopStorageManager, or something like that? >>>>> >>>>> Thanks, >>>>> David >>>> >>>> Coleen suggested the name OopStorages, as the plural of OopStorage. >>> >>> "storage" doesn't really have a plural in common use. >> >> Well this isn't common use.? There are more than one oopStorage >> things in oopStorages. >>> >>>> (Unpublished versions of the change had a different name that I didn't >>>> really like and Coleen actively disliked.)? Coleen and I both have an >>>> antipathy toward "Manager" suffixed names, and I don't see how it's >>>> any clearer in this case.? "Set" suggests a wider API. >>>> >>>> Also, drive-by name bikeshedding doesn't carry much weight. >>> >>> Okay how about its really poor form to have classes and files that >>> differ by only one letter. I looked at this to see what it was about >>> and had to keep double-checking if I was looking at OopStorage or >>> OopStorages. In addition OopStorages conveys no semantic meaning to me. >>> >> >> This might be confusing to someone who doesn't normally look at the >> code. 
> > The fact they differ by only one letter leads to an easy source of > mistakes in both reading and writing the code. The very first change I > saw in the webrev was: > > #include "gc/shared/oopStorage.inline.hpp" > + #include "gc/shared/oopStorages.hpp" > > and I immediately thought it was a mistake because the .hpp would be > included by the .inline.hpp file - but I'd missed the 's'. I was going to say that ideally the runtime code only needs oopStorages.hpp, and not the details of oopStorage.inline.hpp (except WeakProcessor) but there are some other cleanups that should happen first. > >> If you come up with a better name than Manager, it might be okay to >> change. So far, our other name ideas weren't better than just the >> succinct "Storages". Meaning multiple oopStorage objects (they're not >> objects, that's a bad name because it could be confusing with oops >> which are also called objects). > > OopStorageUnit > OopStorageDepot > OopStorageFactory > OopStorageHolder > OopStorageSet > > Arguably this could/should be folded into OopStorage itself and avoid > the naming issues altogether. oopStorage.hpp has different things in it. oopStorageCollection maybe? I don't like any of these other names. I don't like this name either. Coleen > > Cheers, > David > > P.S. What's so bad about Manager?
:) > >> Coleen >> >>> Thanks, >>> David >> From kim.barrett at oracle.com Wed Jul 31 22:05:43 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 31 Jul 2019 18:05:43 -0400 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> Message-ID: <2DFA82A5-9AF2-434D-B1BF-69AD5FA7AE07@oracle.com> > On Jul 31, 2019, at 9:12 AM, David Holmes wrote: > > I haven't seen Coleen's original mail turn up yet, so I'll respond here. > > Shouldn't the range be handled by the constraint function: > > SurvivorAlignmentInBytesConstraintFunc There are a number of options that have both ranges and constraint functions. The ranges specify bounds in isolation, and the constraint function checks for inter-option constraints, e.g. this option must not be greater than that one and the like. 
From david.holmes at oracle.com Wed Jul 31 22:20:49 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 08:20:49 +1000 Subject: RFR (T) 8228855: Test runtime/CommandLine/OptionsValidation/TestOptionsWithRanges fails after JDK-8227123 In-Reply-To: <09a2718b-d6db-bda1-1ff8-daa38cd9c74d@oracle.com> References: <9e4de49b-c346-cb42-ab62-9293b004a682@oracle.com> <8b6bbc12-654b-3ff1-058e-b92432784802@redhat.com> <56f8f704-dab0-0885-8af9-fe01a31c3a12@oracle.com> <6e7f086b-e4c7-027c-9475-9b3558c2bd4d@oracle.com> <84006e85-90a7-8c42-05d6-a26000d6d2e5@oracle.com> <09a2718b-d6db-bda1-1ff8-daa38cd9c74d@oracle.com> Message-ID: On 1/08/2019 7:57 am, coleen.phillimore at oracle.com wrote: > On 7/31/19 5:30 PM, David Holmes wrote: >> On 31/07/2019 11:39 pm, coleen.phillimore at oracle.com wrote: >>> On 7/31/19 9:25 AM, coleen.phillimore at oracle.com wrote: >>>> On 7/31/19 9:12 AM, David Holmes wrote: >>>>> I haven't seen Coleen's original mail turn up yet, so I'll respond >>>>> here. >>> >>> I haven't gotten the email yet either. >>>>> >>>>> Shouldn't the range be handled by the constraint function: >>>> >>>> It is not handled that way in ObjAlignmentInBytes. >>> >>> What I meant is that ObjectAlignmentInBytes has the constraint >>> function AND the range.? SurvivorAlignmentInBytes should be the >>> same.? The constraint function tests that it's > ObjectAlignmentInBytes. >>> >>> ??158?? lp64_product(intx, ObjectAlignmentInBytes, >>> 8,???????????????????????????? \ >>> ??159?????????? "Default object alignment in bytes, 8 is >>> minimum")??????????????? \ >>> ??160?????????? range(8, >>> 256)???????????????????????????????????????????????????? \ >>> ??161 constraint(ObjectAlignmentInBytesConstraintFunc,AtParse) \ >> >> Okay, so specifying the range is reasonable and I guess specifying the >> same range as ObjectAlignmentInBytes is also reasonable. 
>> >> Note however that the default value for SurvivorAlignmentInBytes is 0 >> which is outside of the specified range, so I'm not sure if that makes >> sense. AFAICS the constraint function only applies if the flag is >> explicitly set so that default value is ignored. I don't know if that >> is the case for the range check? > > I think the range check is applied after ergonomics, and the code in > arguments has: > > if (SurvivorAlignmentInBytes == 0) { > SurvivorAlignmentInBytes = ObjectAlignmentInBytes; > } Okay. Hard to remember how all this hangs together. > Probably the default should be changed to 8, and this code removed. Or > maybe not, maybe it should be: > > if (FLAG_IS_DEFAULT(SurvivorAlignmentInBytes)) { > SurvivorAlignmentInBytes = ObjectAlignmentInBytes; > } Yes that looks more reasonable - assuming explicitly setting to 0 is disallowed because it is outside the range. > But I don't know. Maybe we should have a P5 RFE to clean this up. > >> >> But given the dependency between these two flags the test won't be >> able to adjust SurvivorAlignmentInBytes independently of >> ObjectAlignmentInBytes. > > If both options are supplied, there's a test that S >= O. My point is that TestOptionsWithRanges can't just cycle SurvivorAlignmentInBytes through its specified range because it doesn't know about the additional constraints. I can't see how this test is supposed to work. David ----- > Thanks, > Coleen >> >> David >> ----- >> >>> >>> Coleen >>>> >>>> Coleen >>>>> >>>>> SurvivorAlignmentInBytesConstraintFunc >>>>> >>>>> ? >>>>> >>>>> David (signing off for the night) >>>>> >>>>> On 31/07/2019 11:04 pm, Aleksey Shipilev wrote: >>>>>> On 7/31/19 2:56 PM, coleen.phillimore at oracle.com wrote: >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8228855.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8228855 >>>>>> >>>>>> Looks good and trivial. 
>>>>>> >>>> >>> > From david.holmes at oracle.com Wed Jul 31 22:31:39 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 08:31:39 +1000 Subject: RFR (XXS) 8228907: Some gc argument checking tests fail after JDK-8228855 In-Reply-To: <922513B3-3DB2-410D-B282-9FDBEC7E1492@oracle.com> References: <004a91fb-aad4-4bd7-04f6-7191bdc3bb90@oracle.com> <922513B3-3DB2-410D-B282-9FDBEC7E1492@oracle.com> Message-ID: <55c00505-95ec-8172-86bf-5ef2081c600e@oracle.com> On 1/08/2019 7:56 am, Kim Barrett wrote: >> On Jul 31, 2019, at 5:41 PM, David Holmes wrote: >> >> On 1/08/2019 6:01 am, coleen.phillimore at oracle.com wrote: >>> Summary: Use new SurvivorAlignmentInBytes range in tests, remove test cases that verify unnecessarily large values. >> >> As long as the GC team agree they are unnecessarily large the changes seem fine. > > If we're not going to allow values that large (per JDK-8228855), then trying to test them here is pointless. Of course, but my question was really whether the way to fix the current problem is by limiting the test to the max value set by JDK-8228855, or whether JDK-8228855 was wrong to set such a low maximum and that it should be changed. The fact you were testing larger values suggests there was some expectation of using larger values, but Coleen already addressed this in her reply. > We do end up with some intertwined dependencies though, and I'm not seeing a good way to deal with that. > The range that makes sense to test here is limited by the valid range for SurvivorAlignmentInBytes. > Maybe have the test take a special "MAX_VALUE" token and ask the VM for the max value? Do we > have that capability at all right now (perhaps via WhiteBox)? I don't recall seeing anything like that, > but might be forgetting. I don't know of anything. I think the explicit SurvivorAlignmentInBytes test has to be aware of the constraints that apply to the flags it is using. 
I remain unclear how the more general TestOptionsWithRanges test is supposed to work, but that's being discussed in the JDK-8228855 review thread. > Even if we want to do something like that, I think it should be done as a followup. For now, getting this > failure out of testing seems more urgent than improving the test. Of course. David >> >> Thanks, >> David >> >>> Tested locally with: >>> make test TEST=gc/arguments >>> Will wait for hs-tier1-3 to finish before pushing after reviewed. >>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8228907.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8228907 >>> Thanks, >>> Coleen > > From david.holmes at oracle.com Wed Jul 31 22:36:17 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 1 Aug 2019 08:36:17 +1000 Subject: RFR: 8227054: ServiceThread needs to know about all OopStorage objects In-Reply-To: References: <2668bf38-162a-7b6f-404b-0c1a598a304e@oracle.com> <06be806b-158f-eb9c-b27b-9cf8d6aa549c@oracle.com> <2a2a3aaa-8227-7455-2aa3-58b3aa3cd260@oracle.com> Message-ID: <25a03787-c80d-ac50-371d-939229072d92@oracle.com> On 1/08/2019 8:03 am, coleen.phillimore at oracle.com wrote: > oopStorage.hpp has different things in it. oopStorageCollection maybe? I don't like any of these other names. I don't like this name either. OopStorageSet is better (and shorter) than OopStorageCollection. They are all better IMO than OopStorages. But you need a second reviewer for this anyway, so let's get some additional input from them. David ----- > > On 7/31/19 2:13 AM, David Holmes wrote: >> Hi Coleen, >> >> I've sat on this for a few hours :) >> >> On 31/07/2019 8:15 am, coleen.phillimore at oracle.com wrote: >>> On 7/30/19 5:11 PM, David Holmes wrote: >>>> On 31/07/2019 6:59 am, Kim Barrett wrote: >>>>>> On Jul 29, 2019, at 10:27 PM, David Holmes >>>>>> wrote: >>>>>> >>>>>> Hi Kim, >>>>>> >>>>>> A meta-comment: "storages" is not a well-formed term. 
Can we have >>>>>> something clearer, perhaps OopStorageManager, or something like that? >>>>>> >>>>>> Thanks, >>>>>> David >>>>> >>>>> Coleen suggested the name OopStorages, as the plural of OopStorage. >>>> >>>> "storage" doesn't really have a plural in common use. >>> >>> Well this isn't common use. There are more than one oopStorage >>> things in oopStorages. >>>> >>>>> (Unpublished versions of the change had a different name that I didn't >>>>> really like and Coleen actively disliked.) Coleen and I both have an >>>>> antipathy toward "Manager" suffixed names, and I don't see how it's >>>>> any clearer in this case. "Set" suggests a wider API. >>>>> >>>>> Also, drive-by name bikeshedding doesn't carry much weight. >>>> >>>> Okay, how about: it's really poor form to have classes and files that >>>> differ by only one letter. I looked at this to see what it was about >>>> and had to keep double-checking if I was looking at OopStorage or >>>> OopStorages. In addition OopStorages conveys no semantic meaning to me. >>>> >>> >>> This might be confusing to someone who doesn't normally look at the >>> code. >> >> The fact they differ by only one letter leads to an easy source of >> mistakes in both reading and writing the code. The very first change I >> saw in the webrev was: >> >>   #include "gc/shared/oopStorage.inline.hpp" >> + #include "gc/shared/oopStorages.hpp" >> >> and I immediately thought it was a mistake because the .hpp would be >> included by the .inline.hpp file - but I'd missed the 's'. > > I was going to say that ideally the runtime code only needs > oopStorages.hpp, and not the details of oopStorage.inline.hpp (except > WeakProcessor) but there are some other cleanups that should happen first. > >> >>> If you come up with a better name than Manager, it might be okay to >>> change. So far, our other name ideas weren't better than just the >>> succinct "Storages". 
Meaning multiple oopStorage objects (they're not >>> objects, that's a bad name because it could be confusing with oops >>> which are also called objects). >> >> OopStorageUnit >> OopStorageDepot >> OopStorageFactory >> OopStorageHolder >> OopStorageSet >> >> Arguably this could/should be folded into OopStorage itself and avoid >> the naming issues altogether. > > oopStorage.hpp has different things in it. oopStorageCollection maybe? > I don't like any of these other names. I don't like this name either. > > Coleen > >> >> Cheers, >> David >> >> P.S. What's so bad about Manager? :) >> >>> Coleen >>> >>>> Thanks, >>>> David >>> >